---- Begin Edit ----
User #user17732522 pointed out the flaw that invokes UB is from the fact pop_back() invalidates the references used according to the vector library documentation. And constexpr evaluation is not required to detect this when it occurs as it's not part of the C++ core.
However, the fix, which was also pointed out by #user17732522, is simple. Replace occurrences of these two consecutive lines of code:
v.pop_back();
v.emplace_back(...);
with these two lines:
std::destroy_at(&v[0]); // optional since A has a trivial destructor
std::construct_at<A, int>(&v[0], ...);
---- Begin Original ----
While references are invalidated upon the destroy_at, they are reified automatically by the construct_at. See: https://eel.is/c++draft/basic#life-8,
It's well established you can't modify const values unless they were originally non const. But there appears an exception. Vectors containing objects with const members.
Here's how:
#include <vector>
#include <iostream>
struct A {
constexpr A(int arg) : i{ arg } {}
const int i;
};
int main()
{
std::vector<A> v;
v.emplace_back(1); // vector of one A initialized to i:1
A& a = v[0];
// prints: 1 1
std::cout << v[0].i << " " << a.i << '\n';
//v.resize(0); // ending the lifetime of A and but now using the same storage
v.pop_back();
v.emplace_back(2); // vector of one A initialized to i:2
// prints: 2 2
std::cout << v[0].i << " " << a.i << '\n';
}
Now this seems to violate the general rule that you can't change the const values. But using a consteval to force the compiler to flag UB, we can see that it is not UB
consteval int foo()
{
std::vector<A> v;
v.emplace_back(1); // vector of one A initialized to i:1
A& a = v[0];
v.pop_back();
v.emplace_back(2);
return a.i;
}
// verification the technique doesn't produce UB
constexpr int c = foo();
So either this is an example of modifying a const member inside a vector w/o UB or the UB detection using consteval is flawed. Which is it or am I missing something else?
It is UB to modify a const object.
However it is generally not UB to place a new object into storage previously occupied by a const object. That is UB only if the storage was previously occupied by a const complete object (a member subobject is not a complete object) and also doesn't apply to dynamic storage, which a std::vector will likely use to store elements.
std::vector is specified to allow creating new objects in it after removing previous ones, no matter the type, so it must implement in some way that works in any case.
What you are doing has undefined behavior for a different reason. You are taking a reference to the vector element at v[0]; and then you pop that element from the vector. std::vector's specification says that this invalidates the reference. Consequently, reading from the reference afterwards with a.i has undefined behavior. (But not via v[0]).
So your code has undefined behavior (since you are doing this in both examples).
However, it is unspecified whether UB in standard library clauses of the standard such as using the invalidated reference needs to be diagnosed in constant expression evaluation. Only core language undefined behavior needs to be diagnosed (and even then there are exceptions that obviously shouldn't be required to be diagnosed, since it is impossible, e.g. order-of-evaluation undefined behavior). Therefore the compiler does not need to give you an error for the UB here.
Also, consteval is not required here. constexpr would have done the same since you already force constant evaluation with constexpr on the variable.
Related
A const vector can't be modified as it's a const object. So inserts, appends, erases, are not allowed. However, its contents are not part of that object but are owned by it. As a similar example:
int* const p = new int[10]{1,2,3,4};
p is a const object that owns non-const data which can be modified: p[1]=5;
Vector's operator[] is conditioned on whether vector is const and if so returns a const int& But if the underlying value wasn't const then a const cast removing const should be legal.
To test this I wrote the following program:
#include <vector>
constexpr int foo()
{
const std::vector<int> v{ 1,2,3 };
const int a[3]{ 1,2,3 };
*const_cast<int*>(&v[1]) = 21;
// However, this should fail and does on GCC and CLANG
//*const_cast<int*>(&a[1]) = 21;
return v[1];
}
int main()
{
constexpr int sb21 = foo();
const std::vector<int> v{ 1,2,3 };
*const_cast<int*>(&v[1]) = 21;
return v[1] + sb21;
}
compiler explorer
MSVC, CLANG, and GCC all compile and execute.
The code evaluates a constexpr function at compile time. Compilers are supposed to produce compile time errors on UB. For comparison if the array, which contains const elements, is uncommented, Clang and GCC both produce errors as expected. However, MSVC does not which appears to be a bug.
Use case is having a fixed size vector that can't be structurally altered but can have contents updated.
std::vector<T> uses std::allocator<T> and so long as the library implementation of vector doesn't use small sizes like std::string's short string optimization then this should be defined behavior.
Here's an example showing how a const std::string exhibits UB for small strings that are stored within the object while longer allocated ones do not:
#include <string>
consteval int foo()
{
const std::string v{ "1234" };
//const std::string v{ "123412341234123412341234" };
*const_cast<char*>(&v[1]) = 'A';
return v[1];
}
int main()
{
return foo();
}
Compiler Explorer
Is this defined behavior or are the compilers not flagging UB?
But if the underlying value wasn't const then a const cast removing const should be legal.
This is the weak point of your argument. It's not the underlying value that matters, but the owned object. If the owned object is not a const object, then removing the cast should be legal. However, can you prove that the owned object is not const?
I believe you cannot. Take your own example – a vector containing three ints. Hypothetically, suppose each int is 4 bytes, so the total data is 12 bytes. Also suppose the size of a vector is 24 bytes (allowing 8 bytes for each of a pointer, size, and capacity). It would not be unreasonable to optimize a bit and store the three ints in the vector itself, along with a flag to say that the data is inside the vector instead of being dynamically allocated (a similar approach is used in short string optimization).
Now that we have the possibility that the data is inside the vector itself, we have the possibility that the data is part of a const object, because the vector is const. Casting away this constness to change a value is undefined behavior.
The bottom line is that if you do not own the object, you cannot know for sure how it was created. If the owner tells you it is const, then you have to treat it as const.
Compilers are supposed to produce compile time errors on UB.
The above statement is incorrect unless explicitly specified by the standard.
Can contents of a const vector be modified w/o UB?
No. Trying to modify a const variable leads to undefined behavior.
From dcl.type.cv:
Except that any class member declared mutable can be modified, any attempt to modify a const object during its lifetime results in undefined behavior.
(end quote)
Undefined behavior means anything1 can happen including but not limited to the program giving your expected output. But never rely(or make conclusions based) on the output of a program that has undefined behavior. The program may just crash.
So the output that you're seeing(maybe seeing) is a result of undefined behavior. And as i said don't rely on the output of a program that has UB. The program may just crash.
So the first step to make the program correct would be to remove UB. Then and only then you can start reasoning about the output of the program.
1For a more technically accurate definition of undefined behavior see this where it is mentioned that: there are no restrictions on the behavior of the program.
const_cast from const T& to T& is not undefined behavior unless the underlying data is itself const.
AFAICT, there is no difference between vector<int> and const vector<int> in that aspect: in both cases it is not UB to const_cast v[0] to int&.
I see other answers here that speculate otherwise, but I don't see any evidence of that in the standard.
See also:
Modifying element of const std::vector<T> via const_cast
Also see a similar discussion here:
const vector implies const elements?
After reading about possible ways of rebinding a reference in C++, which should be illegal, I found a particularly ugly way of doing it. The reason I think the reference really gets rebound is because it does not modify the original referenced value, but the memory of the reference itself. After some more researching, I found a reference is not guaranteed to have memory, but when it does have, we can try to use the code:
#include <iostream>
using namespace std;
template<class T>
class Reference
{
public:
T &r;
Reference(T &r) : r(r) {}
};
int main(void)
{
int five = 5, six = 6;
Reference<int> reference(five);
cout << "reference value is " << reference.r << " at memory " << &reference.r << endl;
// Used offsetof macro for simplicity, even though its support is conditional in C++ as warned by GCC. Anyway, the macro can be hard-coded
*(reinterpret_cast<int**>(reinterpret_cast<char*>(&reference) + offsetof(Reference<int>, r))) = &six;
cout << "reference value changed to " << reference.r << " at memory " << &reference.r << endl;
// The value of five still exists in memory and remains untouched
cout << "five value is still " << five << " at memory " << &five << endl;
}
A sample output using GCC 8.1, but also tested in MSVC, is:
reference value is 5 at memory 0x7ffd1b4eb6b8
reference value changed to 6 at memory 0x7ffd1b4eb6bc
five value is still 5 at memory 0x7ffd1b4eb6b8
The questions are:
Is the method above considered undefined behavior? Why?
Can we technically say the reference gets rebound, even though it should be illegal?
In a practical situation, when the code has already worked using a specific compiler in a specific machine, is the code above portable (guaranteed to work in every operational system and every processor), assuming we use the same compiler version?
Above code has undefined behavior. The result of your reinterpret_cast<int**>(…) does not actually point to an object of type int*, yet you dereference and overwrite the stored value of the hypothetical int* object at that location, violating at least the strict aliasing rule in the process [basic.lval]/11. In reality, there is not even an object of any type at that location (references are not objects)…
Exactly one reference is being bound in your code and that happens when the constructor of Reference initializes the member r. At no point is a reference being rebound to another object. This simply appears to work due to the fact that the compiler happens to implement your reference member via a field that stores the address of the object the reference is refering to, which happens to be located at the location your invalid pointer happens to point to…
Apart from that, I would have my doubts whether it's even legal to use offsetof on a reference member to begin with. Even if it is, that part of your code would at best be conditionally-supported with effectively implementation-defined behavior [support.types.layout]/1, since your class Reference is not a standard-layout class [class.prop]/3.1 (it has a member of reference type).
Since your code has undefined behavior, it cannot possibly be portable…
As shown in the other answer, your code has UB. A reference cannot be re-boud - this is by language design and no matter what kind of casting trickery you try you cannot get around that, you will still end up with UB.
But you can have re-binding reference semantics with std::reference_wrapper:
int a = 24;
int b = 11;
auto r = std::ref(a); // bind r to a
r.get() = 5; // a is changed to 5
r = b; // re-bind r to b
r.get() = 13; // b is changed to 13
References can be rebound legally, if you jump through the right hoops:
#include <new>
#include <cassert>
struct ref {
int& value;
};
void test() {
int x = 1, y = 2;
ref r{x};
assert(&r.value == &x);
// overwrite the memory of r with a new ref referring to y.
ref* rebound_r_ptr = std::launder(new (&r) ref{y});
// rebound_r_ptr points to r, but you really have to use it.
// using r directly could give old value.
assert(&rebound_r_ptr->value == &y);
}
Edit: godbolt link. You can tell that it works because the function always returns 1.
According to Herb Sutter's article http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/, the following code is correct:
#include <iostream>
#include <vector>
using namespace std;
vector<vector<int>> f() { return {{1},{2},{3},{4},{5}}; }
int main()
{
const auto& v = f();
cout << v[3][0] << endl;
}
i.e. the lifetime of v is extended to the lifetime of the v const reference.
And indeed this compiles fine with gcc and clang and runs without leaks according to valgrind.
However, when I change the main function thusly:
int main()
{
const auto& v = f()[3];
cout << v[0] << endl;
}
it still compiles but valgrind warns me of invalid reads in the second line of the function due to the fact that the memory was free'd in the first line.
Is this standard compliant behaviour or could this be a bug in both g++ (4.7.2) and clang (3.5.0-1~exp1)?
If it is standard compliant, it seems pretty weird to me... oh well.
There's no bug here except in your code.
The first example works because, when you bind the result of f() to v, you extend the lifetime of that result.
In the second example you don't bind the result of f() to anything, so its lifetime is not extended. Binding to a subobject of it would count:
[C++11: 12.2/5]: The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except: [..]
…but you're not doing that: you're binding to the result of calling a member function (e.g. operator[]) on the object, and that result is not a data member of the vector!
(Notably, if you had an std::array rather than an std::vector, then the code† would be absolutely fine as array data is stored locally, so elements are subobjects.)
So, you have a dangling reference to a logical element of the original result of f() which has long gone out of scope.
† Sorry for the horrid initializers but, well, blame C++.
This question already has answers here:
Two different values at the same memory address
(7 answers)
Closed 5 years ago.
Consider this :
#include <iostream>
using namespace std;
int main(void)
{
const int a1 = 40;
const int* b1 = &a1;
char* c1 = (char *)(b1);
*c1 = 'A';
int *t = (int*)c1;
cout << a1 << " " << *t << endl;
cout << &a1 << " " << t << endl;
return 0;
}
The output for this is :
40 65
0xbfacbe8c 0xbfacbe8c
This almost seems impossible to me unless compiler is making optimizations. How ?
This is undefined behavior, you are modifying a const variable so you can have no expectation as to the results. We can see this by going to the draft C++ standard section 7.1.6.1 The cv-qualifiers paragraph 4 which says:
[...]any attempt to modify a const object during its lifetime (3.8) results in undefined behavior.
and even provides an example:
const int* ciq = new const int (3); // initialized as required
int* iq = const_cast<int*>(ciq); // cast required
*iq = 4; // undefined: modifies a const object
In the standard definition of undefined behaviour in section 1.3.24, gives the following possible behaviors:
[...] Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). [...]
Your code has undefined behaviour, because you are modifying a constant object. Anything could happen, nothing is impossible.
When you qualify them variables const the compiler can assume a few things and generate code, this works fine providing you respect that agreement and not break it. When you've broken it, you'll get undefined behaviour.
Note that when const is removed, it works as expected; here's a live example.
As has been explained by others, modifying a const value results in undefined behavior and nothing more needs to be said - any result is possible, including complete nonsense or a crash.
If you're curious as to how this particular result came about, it's almost certainly due to optimization. Since you defined a to be const, the compiler is free to substitute the value 40 that you assigned to it whenever it wants; after all, its value can't change, right? This is useful when you're using a to define the size of an array for example. Even in gcc, which has an extension for variable-sized arrays, it's simpler for the compiler to allocate a constant-size array. Once the optimization exists it's probably applied consistently.
The output of the following code:
const int i= 1;
(int&)i= 2; // or: const_cast< int&>(i)= 2;
cout << i << endl;
is 1 (at least under VS2012)
My question:
Is this behavior defined?
Would the compiler always use the defined value for constants?
Is it possible to construct an example where the compiler would use the value of the latest assignment?
It is totally undefined. You just cannot change the value of constants.
It so happens that the compiler transforms your code into something like
cout << 1 << endl;
but the program could just as well crash, or do something else.
If you set the warnings level high enough, the compiler will surely tell you that it is not going to work.
Is this behavior defined?
The behavior of this code is not defined by the C++ standard, because it attempts to modify a const object.
Would the compiler always use the defined value for constants?
What value the compiler uses in cases like this depends on the implementation. The C++ standard does not impose a requirement.
Is it possible to construct an example where the compiler would use the value of the latest assignment?
There might be cases where the compiler does modify the value and use it, but they would not be reliable.
As said by others, the behaviour is undefined.
For the sake of completeness, here is the quote from the Standard:
(§7.1.6.1/4) Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime (3.8) results in undefined behavior. [ Example:
[...]
const int* ciq = new const int (3); // initialized as required
int* iq = const_cast<int*>(ciq); // cast required
*iq = 4; // undefined: modifies a const object
]
Note that the word object is this paragraph refers to all kinds of objects, including simple integers, as shown in the example – not only class objects.
Although the example refers to a pointer to an object with dynamic storage, the text of the paragraph makes it clear that this applies to references to objects with automatic storage as well.
The answer is that the behavior is undefined.
I managed to set up this conclusive example:
#include <iostream>
using namespace std;
int main(){
const int i = 1;
int *p=const_cast<int *>(&i);
*p = 2;
cout << i << endl;
cout << *p << endl;
cout << &i << endl;
cout << p << endl;
return 0;
}
which, under gcc 4.7.2 gives:
1
2
0x7fffa9b7ddf4
0x7fffa9b7ddf4
So, it is like you have the same memory address as it is holding two different values.
The most probable explanation is that the compiler simply replaces constant values with their literal values.
You are doing a const_cast using the C-like cast operator.
Using const_cast is not guaranteeing any behaviour.
if ever you do it, it might work or it might not work.
(It's not good practice to use C-like operators in C++ you know)
Yes you can, but only if you initiate a const as a read-only but not compile-time const, as follows:
int y=1;
const int i= y;
(int&)i= 2;
cout << i << endl; // prints 2
C++ const keyword can be missleading, it's either a const or a read-only.