Can contents of a `const vector` be modified w/o UB? - c++

A const vector can't be modified as it's a const object. So inserts, appends, erases, are not allowed. However, its contents are not part of that object but are owned by it. As a similar example:
int* const p = new int[10]{1,2,3,4};
p is a const object that owns non-const data which can be modified: p[1]=5;
Vector's operator[] is conditioned on whether vector is const and if so returns a const int& But if the underlying value wasn't const then a const cast removing const should be legal.
To test this I wrote the following program:
#include <vector>
constexpr int foo()
{
const std::vector<int> v{ 1,2,3 };
const int a[3]{ 1,2,3 };
*const_cast<int*>(&v[1]) = 21;
// However, this should fail and does on GCC and CLANG
//*const_cast<int*>(&a[1]) = 21;
return v[1];
}
int main()
{
constexpr int sb21 = foo();
const std::vector<int> v{ 1,2,3 };
*const_cast<int*>(&v[1]) = 21;
return v[1] + sb21;
}
compiler explorer
MSVC, CLANG, and GCC all compile and execute.
The code evaluates a constexpr function at compile time. Compilers are supposed to produce compile time errors on UB. For comparison if the array, which contains const elements, is uncommented, Clang and GCC both produce errors as expected. However, MSVC does not which appears to be a bug.
Use case is having a fixed size vector that can't be structurally altered but can have contents updated.
std::vector<T> uses std::allocator<T> and so long as the library implementation of vector doesn't use small sizes like std::string's short string optimization then this should be defined behavior.
Here's an example showing how a const std::string exhibits UB for small strings that are stored within the object while longer allocated ones do not:
#include <string>
consteval int foo()
{
const std::string v{ "1234" };
//const std::string v{ "123412341234123412341234" };
*const_cast<char*>(&v[1]) = 'A';
return v[1];
}
int main()
{
return foo();
}
Compiler Explorer
Is this defined behavior or are the compilers not flagging UB?

But if the underlying value wasn't const then a const cast removing const should be legal.
This is the weak point of your argument. It's not the underlying value that matters, but the owned object. If the owned object is not a const object, then removing the cast should be legal. However, can you prove that the owned object is not const?
I believe you cannot. Take your own example – a vector containing three ints. Hypothetically, suppose each int is 4 bytes, so the total data is 12 bytes. Also suppose the size of a vector is 24 bytes (allowing 8 bytes for each of a pointer, size, and capacity). It would not be unreasonable to optimize a bit and store the three ints in the vector itself, along with a flag to say that the data is inside the vector instead of being dynamically allocated (a similar approach is used in short string optimization).
Now that we have the possibility that the data is inside the vector itself, we have the possibility that the data is part of a const object, because the vector is const. Casting away this constness to change a value is undefined behavior.
The bottom line is that if you do not own the object, you cannot know for sure how it was created. If the owner tells you it is const, then you have to treat it as const.

Compilers are supposed to produce compile time errors on UB.
The above statement is incorrect unless explicitly specified by the standard.
Can contents of a const vector be modified w/o UB?
No. Trying to modify a const variable leads to undefined behavior.
From dcl.type.cv:
Except that any class member declared mutable can be modified, any attempt to modify a const object during its lifetime results in undefined behavior.
(end quote)
Undefined behavior means anything1 can happen including but not limited to the program giving your expected output. But never rely(or make conclusions based) on the output of a program that has undefined behavior. The program may just crash.
So the output that you're seeing(maybe seeing) is a result of undefined behavior. And as i said don't rely on the output of a program that has UB. The program may just crash.
So the first step to make the program correct would be to remove UB. Then and only then you can start reasoning about the output of the program.
1For a more technically accurate definition of undefined behavior see this where it is mentioned that: there are no restrictions on the behavior of the program.

const_cast from const T& to T& is not undefined behavior unless the underlying data is itself const.
AFAICT, there is no difference between vector<int> and const vector<int> in that aspect: in both cases it is not UB to const_cast v[0] to int&.
I see other answers here that speculate otherwise, but I don't see any evidence of that in the standard.
See also:
Modifying element of const std::vector<T> via const_cast
Also see a similar discussion here:
const vector implies const elements?

Related

Apparently you can modify const values w/o UB. Or can you?

---- Begin Edit ----
User #user17732522 pointed out the flaw that invokes UB is from the fact pop_back() invalidates the references used according to the vector library documentation. And constexpr evaluation is not required to detect this when it occurs as it's not part of the C++ core.
However, the fix, which was also pointed out by #user17732522, is simple. Replace occurrences of these two consecutive lines of code:
v.pop_back();
v.emplace_back(...);
with these two lines:
std::destroy_at(&v[0]); // optional since A has a trivial destructor
std::construct_at<A, int>(&v[0], ...);
---- Begin Original ----
While references are invalidated upon the destroy_at, they are reified automatically by the construct_at. See: https://eel.is/c++draft/basic#life-8,
It's well established you can't modify const values unless they were originally non const. But there appears an exception. Vectors containing objects with const members.
Here's how:
#include <vector>
#include <iostream>
struct A {
constexpr A(int arg) : i{ arg } {}
const int i;
};
int main()
{
std::vector<A> v;
v.emplace_back(1); // vector of one A initialized to i:1
A& a = v[0];
// prints: 1 1
std::cout << v[0].i << " " << a.i << '\n';
//v.resize(0); // ending the lifetime of A and but now using the same storage
v.pop_back();
v.emplace_back(2); // vector of one A initialized to i:2
// prints: 2 2
std::cout << v[0].i << " " << a.i << '\n';
}
Now this seems to violate the general rule that you can't change the const values. But using a consteval to force the compiler to flag UB, we can see that it is not UB
consteval int foo()
{
std::vector<A> v;
v.emplace_back(1); // vector of one A initialized to i:1
A& a = v[0];
v.pop_back();
v.emplace_back(2);
return a.i;
}
// verification the technique doesn't produce UB
constexpr int c = foo();
So either this is an example of modifying a const member inside a vector w/o UB or the UB detection using consteval is flawed. Which is it or am I missing something else?
It is UB to modify a const object.
However it is generally not UB to place a new object into storage previously occupied by a const object. That is UB only if the storage was previously occupied by a const complete object (a member subobject is not a complete object) and also doesn't apply to dynamic storage, which a std::vector will likely use to store elements.
std::vector is specified to allow creating new objects in it after removing previous ones, no matter the type, so it must implement in some way that works in any case.
What you are doing has undefined behavior for a different reason. You are taking a reference to the vector element at v[0]; and then you pop that element from the vector. std::vector's specification says that this invalidates the reference. Consequently, reading from the reference afterwards with a.i has undefined behavior. (But not via v[0]).
So your code has undefined behavior (since you are doing this in both examples).
However, it is unspecified whether UB in standard library clauses of the standard such as using the invalidated reference needs to be diagnosed in constant expression evaluation. Only core language undefined behavior needs to be diagnosed (and even then there are exceptions that obviously shouldn't be required to be diagnosed, since it is impossible, e.g. order-of-evaluation undefined behavior). Therefore the compiler does not need to give you an error for the UB here.
Also, consteval is not required here. constexpr would have done the same since you already force constant evaluation with constexpr on the variable.

Is it UB to change a member of a const object via a constructor-bound reference?

In short: Does the following code have Undefined Behavior, or is this fine?
struct X
{
X(int b)
: value(b)
, ref(value)
{
}
int value;
int& ref;
void isThisUB() const
{
ref = 1;
}
};
int main()
{
const X x(2);
// Is either of these fine?
x.isThisUB();
x.ref = 3;
return x.value;
}
https://godbolt.org/z/1TE9a7M4a
X::value is const for x. According to my understanding of const semantics, this means that modifying it in any way is UB. Yet we can take a non-const reference to it in the constructor and then modify it through that, either in a const member function or directly.
The C++ (at least 17) standard gives an example of const-related UB in [dcl.type.cv] that looks mostly the same, except it employs const_cast. Note how p->x.j = 99 is denoted as UB. I do not see a fundamental difference between achieving this with const_cast vs my above code.
So, is the code above UB? Are non-const reference members/pointers really this big of a footgun?
(If you can come up with search keywords that yield a related question and not just random const stuff, I'll be mighty impressed.)
Does the following code have Undefined Behavior, or is this fine?
It has UB. Standard says:
[dcl.type.cv]
Except that any class member declared mutable can be modified, any attempt to modify a const object during its lifetime results in undefined behavior.
x is const and you modify its non-mutable member.
I do not see a fundamental difference between achieving this with const_cast vs my above code.
Indeed. Both are UB for the same reason.
Are non-const reference members/pointers really this big of a footgun?
The trigger for the footgun is the issue that the object is temporarily non-const while it is within its constructor. Hence pointers and references to non-const "this" and its subobjects are readily available wthin the constructor regardless of whether the object is going to be const or not. Thus we can conclude that storing those pointers/references for later use is ill-advised.
Storing pointers and references as members referring to "this" are a footgun for several other reasons as well. They require storage that's otherwise unnecessary if you were to access the referred member through its name directly. Furthermore, you'll find that the copy-semantics of the class will likely not be what you had in mind.
If you want to point to a member out of several alternatives, then use a member-pointer, not an object pointer / reference (using storage cannot be avoided for such case). This solves both copying and accidental const violation.

Are const arguments "real" constants?

AFAIK removing constness from const variables is undefined behavior:
const int i = 13;
const_cast<int&>(i) = 42; //UB
std::cout << i << std::endl; //out: 13
But are const function arguments "real" constants? Let's consider following example:
void foo(const int k){
const_cast<int&>(k) = 42; //UB?
std::cout << k << std::endl;
}
int main(){
foo(13); //out: 42
}
Seems like compiler doesn't apply the same optimizations to const int k as to const int i.
Is there UB in the second example? Does const int k help compiler to optimize code or compiler just checks const correctness and nothing more?
Example
The i in const int i = 13; can be used as constant expression (as template argument or case label) and attempts to modify it are undefined behavior. It is so for backwards compatibility with pre-C++11 code that did not have constexpr.
The declarations void foo(const int k); and void foo(int k); are declaring same function; the top level const of parameters does not participate in function's signature. Parameter k must be passed by value and so can't be "real" constant. Casting its constness away is not undefined behavior. Edit: But any attempt to modify it is still undefined because it is const object [basic.type.qualifier] (1.1):
A const object is an object of type const T or a non-mutable subobject of such an object.
By [dcl.type.cv] 4 const object can't be modified:
Except that any class member declared mutable (10.1.1) can be modified, any attempt to modify a const object during its lifetime (6.8) results in undefined behavior.
And since function parameters are with automatic storage duration a new object within its storage can't be also created by [basic.life] 10:
Creating a new object within the storage that a const complete object with static, thread, or automatic storage duration occupies, or within the storage that such a const object used to occupy before its lifetime ended, results in undefined behavior.
It is unclear why k was declared const at the first place if there is plan to cast its constness away? The only purpose of it feels to be to confuse and to obfuscate.
Generally we should favor immutability everywhere since it helps people to reason. Also it may help compilers to optimize. However where we only declare immutability, but do not honor it, there it works opposite and confuses.
Other thing that we should favor are pure functions. These do not depend on or modify any external state and have no side-effects. These also are easier to reason about and to optimize both for people and for compilers. Currently such functions can be declared constexpr. However declaring the by-value parameters as const does not help any optimizations to my knowledge, even in context of constexpr functions.
But are const function arguments "real" constants?
I think the answer is yes.
They won't be stored in ROM (for example), so casting away const will at least appear to work OK. But my reading of the standard is that the the parameter's type is const int and therefore it is a const object ([basic.type.qualifier]), and so modifying it is undefined ([dcl.type.cv]).
You can confirm the parameter's type fairly easily:
static_assert( std::is_same_v<const int, decltype(k)> );
Is there UB in the second example?
Yes, I think there is.
Does const int k help compiler to optimize code or compiler just checks const correctness and nothing more?
In theory the compiler could assume that in the following example k is not changed by the call to g(const int&), because modifying k would be UB:
extern void g(const int&);
void f(const int k)
{
g(k);
return k;
}
But in practice I don't think compilers take advantage of that, instead they assume that k might be modified (compiler explorer demo).
I'm not sure what you mean by a "real" const, but here goes.
Your const int i is a variable declaration outside of any function. Since modifying that variable would cause Undefined Behaviour, the compiler gets to assume that its value will never change. One easy optimization would be that anywhere you read from i, the compiler doesn't have to go and read the value from main memory, it can emit the assembly to use the value directly.
Your const int k (or the more interesting const int & k) is a function parameter. All it promises is that this function won't be changing the value of k. That doesn't mean that it cannot be changed somewhere else. Each invocation of the function could have a different value for k.

Does const on value different const on array

It seems that const on array will cause the memory used to store the array mark read-only in memory. But why const on int won't do the same thing?
Code is here:
int main(int argc, const char * argv[])
{
const int vv = 10 ;
int * p = (int *)&vv ;
*p = 5 ; // work well
const int aa[3] = {11, 12, 13} ;
int * pp = (int *)&aa[1] ;
*pp = 100 ; // EXC_BAD_ACCESS
return 0;
}
Both the first and the second, as stated in §7.1.6.1/4, results in undefined behavior:
Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime (3.8) results in undefined behavior.
This is, in fact, one of those cases where a C++-style cast (like static_cast), except for const_cast, would have warned you.
In the C standard the same reference is made at §6.7.3/5:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.
Any attempt to modify data that is const results in undefined behaviour (UB). That means that your code might seem to "work well", but it cannot be relied on for anything.
Both your examples are UB.
If you get a bad access on the second access it means your compiler doesn't copy the initializer list to aa but have aa point to the process's const data section (e.g. where the string literals are stored).
As others have pointed out, changing a const produces undefined behavior. In Visual Studio, your code will not produce any errors.
The problem surfaces when multiple references to the same const object (or array in C) occur. Changing one will change all of them and only in optimized builds. This sort of bug is hard to track.
look ISO/IEC 9899 TC3 ->§6.7.3 part 5 there is said:
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.
So I guess this is clear enough for C, isn't it?
Your first part of code is pretty correct. You are modifying the value stored at location using its reference. This is defined behavior. The compiler checks this variable won't be on left side of an expression.
Use the following to keep value pointed by it to be constant.
const int * p = (int *)&vv

const cast to a global var and program crashed (C++)

int main()
{
const int maxint=100;//The program will crash if this line is put outside the main
int &msg=const_cast<int&>(maxint);
msg=200;
cout<<"max:"<<msg<<endl;
return 0;
}
The function will run ok if the 'const int maxint=100;' definition is put inside the main function but crash and popup a error message said "Access Violation" if put outside.
Someone says it's some kind of 'undefined behavior', and i want to know the exact answer and how i can use the const cast safely?
They are correct, it is undefined behaviour. You're not allowed to modify the value of a const variable, which is the danger of casting away the constness of something: you better know it's not really const.
The compiler, seeing that maxint is const and should never be modified, doesn't even have to give it an address. It can just replace all the uses of maxint with 100 if it sees fit. Also it might just put the constant in to a portion of memory that is read-only, as Matteo Italia points out, which is probably what's happening for you. That's why modifying it produces undefined behaviour.
The only way you can safely cast away the constness of a variable is if the variable is not actually const, but the const qualifier was added to a non-const variable, like this:
int f(const int& x) {
int& nonconst = const_cast<int&>(x);
++nonconst;
}
int blah = 523;
f(blah); // this is safe
const int constblah = 123;
f(constblah); // this is undefined behaviour
Think about this example, which compiles perfectly:
int f(const int& x) {
int& nonconst = const_cast<int&>(x);
++nonconst;
}
int main() {
f(4); // incrementing a number literal???
}
You can see how using const_cast is pretty dangerous because there's no way to actually tell whether a variable is originally const or not. You should avoid using const_cast when possible (with functions, by just not accepting const parameters).
Modifying an object that is const (with the exception of mutable members) results in undefined behavior (from the C++03 standard):
7.1.5.1/4 "The cv-qualifiers"
Except that any class member declared mutable (7.1.1) can be modified,
any attempt to modify a const object during its lifetime (3.8) results
in undefined behavior.
The above undefined behavior is specifically called out in the standard's section on const_cast:
5.2.11/7 "Const cast"
[Note: Depending on the type of the object, a write operation through
the pointer, lvalue or pointer to data member resulting from a
const_cast that casts away a const-qualifier68) may produce undefined
behavior (7.1.5.1). ]
So, if you have a const pointer or reference to an object that isn't actually const, you're allowed to write to that object (by casting away the constness), but not if the object really is const.
The compiler is permitted to place const objects in read-only storage, for example. It doesn't have to though, and apparently doesn't for your test code that doesn't crash.
You are only allowed to cast away constness of an object which is known not to be const. For example, an interface may pass objects through using a const pointer or a const reference but you passed in an object which isn't const and want/need to modify it. In this case it may be right thing to cast away constness.
On the other hand, casting away constness of an object which was const all the way can land you in deep trouble: when accessing this object, in particular when writing to it, the system may cause all kinds of strange things: the behavior is not defined (by the C++ standard) and on a particular system it may cause e.g. an access violation (because the address of the object is arranged to be in a read-only area).
Note that despite another response I saw const objects need to get an address assigned if the address is ever taken and used in some way. In your code the const_cast<int&>(maxint) expression essentially obtains the address of your constant int which is apparently stored in a memory area which is marked to be read-only. The interesting aspect of your code snippet is that it is like to apparently work, especially when turning on optimization: the code is simple enough that the compiler can tell that the changed location isn't really used and doesn't actually attempt to change the memory location! In this case, no access violation is reported. This is what apparently is the case when the constant is declared inside the function (although the constant may also be located on the stack which typically can't be marked as read-only). Another potential outcome of your code is (independent of whether the constant is declared inside the function or not) is that is actually changed and sometimes read as 100 and in other contexts (which in some way or another involve the address) as 200.