Non-trivial example of undefined behavior with const_cast - c++

The following code is, as far as I understand it, undefined behavior according to the C++ standard (section 7.1.5.1.4 [dcl.type.cv]/4 in particular).
#include <iostream>
struct F;
F* g;
struct F {
F() : val(5)
{
g = this;
}
int val;
};
const F f;
int main() {
g->val = 8;
std::cout << f.val << std::endl;
}
However, this prints '8' with every compiler and optimization setting I have tried.
Question: Is there an example that will exhibit unexpected results with this type of "implicit const_cast"?
I am hoping for something as spectacular as the results of
#include <iostream>
int main() {
for (int i = 0; i <=4; ++i)
std::cout << i * 1000000000 << std::endl;
}
on, e.g., gcc 4.8.5 with -O2.
EDIT: the relevant section from the standard
7.1.5.1.4: Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime
(3.8) results in undefined behavior.
In reply to the comment suggesting a duplicate; it is not a duplicate because I am asking for an example where "unexpected" results occur.

Not as spectacular:
f.h (guards omitted):
struct F;
extern F* g;
struct F {
F() : val(5)
{
g = this;
}
int val;
};
extern const F f;
void h();
TU1:
#include "f.h"
// definitions
F* g;
const F f;
void h() {}
TU2:
#include "f.h"
#include <iostream>
int main() {
h(); // ensure that globals have been initialized
int val = f.val;
g->val = 8;
std::cout << (f.val == val) << '\n';
}
Prints 1 when compiled with g++ -O2, and 0 when compiled with -O0.
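For contrast, here is a minimal sketch of the case that is well defined: casting away const is only a problem when the object itself was declared const. The names set_through_const_ref and demo_non_const_object are made up for this illustration.

```cpp
struct F {
    int val = 5;
};

// Hypothetical helper: writes through a const reference by casting const away.
// This is only well defined when the object referred to is NOT itself const.
int set_through_const_ref(const F& ref, int new_val) {
    const_cast<F&>(ref).val = new_val;
    return ref.val;
}

int demo_non_const_object() {
    F f;                                 // not declared const
    return set_through_const_ref(f, 8);  // well defined, the object is mutable
}
```

Had f been declared as const F f; the same call would be undefined behavior, and the optimizer would be free to keep using the old value, as in the -O2 result above.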

The main case of "undefined" behavior would be that typically, if someone sees const, they will assume that it does not change. So, const_cast intentionally does something that many libraries and programs would either not expect to be done or consider as explicit undefined behavior. It's important to remember that not all undefined behavior comes from the standard alone, even if that is the typical usage of the term.
That said, I was able to locate a place in the standard library where such thinking can be applied to do something I believe would more narrowly be considered undefined behavior: generating an std::map with "duplicate keys":
#include <iostream>
#include <map>
int main( )
{
std::map< int, int > aMap;
aMap[ 10 ] = 1;
aMap[ 20 ] = 2;
*const_cast< int* >( &aMap.find( 10 )->first ) = 20;
std::cout << "Iteration:" << std::endl;
for( std::map< int,int >::iterator i = aMap.begin(); i != aMap.end(); ++i )
std::cout << i->first << " : " << i->second << std::endl;
std::cout << std::endl << "Subscript Access:" << std::endl;
std::cout << "aMap[ 10 ]" << " : " << aMap[ 10 ] << std::endl;
std::cout << "aMap[ 20 ]" << " : " << aMap[ 20 ] << std::endl;
std::cout << std::endl << "Iteration:" << std::endl;
for( std::map< int,int >::iterator i = aMap.begin(); i != aMap.end(); ++i )
std::cout << i->first << " : " << i->second << std::endl;
}
The output is:
Iteration:
20 : 1
20 : 2
Subscript Access:
aMap[ 10 ] : 0
aMap[ 20 ] : 1
Iteration:
10 : 0
20 : 1
20 : 2
Built with g++.exe (Rev5, Built by MSYS2 project) 5.3.0.
Obviously, there is a mismatch between the access keys and the key values in the stored pairs. It also seems that the 20:2 pair is not accessible except via iteration.
My guess is that this is happening because map is implemented as a tree. Changing the key in place leaves the node where it initially was (where 10 would go), so it does not overwrite the other 20 key. At the same time, adding an actual 10 does not overwrite the old 10 because, on checking the key value, it is not actually the same.
I do not have a standard to look at right now, but I would expect this violates the definition of map on a few levels.
It might also lead to worse behavior, but with my compiler/OS combo I was unable to get it to do anything more extreme, like crash.
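For comparison, a sketch of how a key can be changed without touching the const key in place, assuming C++17 node handles (std::map::extract) are available. The helper name rekey is hypothetical.

```cpp
#include <map>
#include <utility>

// Hypothetical helper: move the entry stored under `from` to the key `to`.
// Returns false and leaves the map unchanged if `from` is absent or `to` is taken.
bool rekey(std::map<int, int>& m, int from, int to) {
    auto node = m.extract(from);      // unlinks the node from the tree
    if (node.empty()) return false;   // `from` was not in the map
    node.key() = to;                  // allowed: the node is outside the tree
    auto result = m.insert(std::move(node));
    if (result.inserted) return true;
    // `to` already existed: restore the original key and put the node back.
    result.node.key() = from;
    m.insert(std::move(result.node));
    return false;
}
```

Because the node is re-inserted through the normal insert path, the tree ordering stays consistent and no duplicate keys can appear.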

Related

Values in optional<map<string, string>> getting "corrupted" in very specific cases

Sorry for the poor title, but what I'm seeing is bizarre and hard to explain succinctly.
Basically, we have an optional<map<string, string>> in our code accessed through a getter/setter, and sometimes when we inspect the values we get very strange results. Here is simplified code which repros the issue:
#include <optional>
#include <map>
#include <iostream>
using namespace std;
optional<map<string, string>> optmap;
static void Set(optional<map<string, string>> m);
static optional<map<string, string>> Get();
static void PrintMap(map<string, string> m);
int main(int const argc, char const * const *argv)
{
map<string, string> sample;
sample.emplace("testtesttesttest1", "testtesttesttest1");
sample.emplace("testtesttesttest2", "testtesttesttest2");
sample.emplace("testtesttesttest3", "testtesttesttest3");
cout << "sample:" << endl;
PrintMap(sample);
Set(sample);
map<string, string> result = Get().value();
cout << "result:" << endl;
PrintMap(result);
cout << "function call:" << endl;
PrintMap(Get().value());
cout << "inline iteration:" << endl;
for (auto &item : Get().value())
{
cout << item.first << ", " << item.second << endl;
}
}
static void Set(optional<map<string, string>> m)
{
optmap = m;
}
static optional<map<string, string>> Get()
{
return optmap;
}
static void PrintMap(map<string, string> m)
{
for (auto &item : m)
{
cout << item.first << ", " << item.second << endl;
}
}
I compiled using g++ -std=c++17 and got this output on my most recent run:
$ ./a.out
sample:
testtesttesttest1, testtesttesttest1
testtesttesttest2, testtesttesttest2
testtesttesttest3, testtesttesttest3
result:
testtesttesttest1, testtesttesttest1
testtesttesttest2, testtesttesttest2
testtesttesttest3, testtesttesttest3
function call:
testtesttesttest1, testtesttesttest1
testtesttesttest2, testtesttesttest2
testtesttesttest3, testtesttesttest3
inline iteration:
#�M�OVtesttest1, ��M�OVtesttest1
��M�OVtesttest2, ��M�OVtesttest2
��M�OVtesttest3, testtest3
Note that the values only get "corrupted" in the last case where we iterate using for (auto &item : Get().value()). What's even stranger is that this only appears to happen for strings of a certain length. If the values are less than 16 characters long, we have no problem. If I change the map to contain the following:
sample.emplace("fifteencharokay", "15");
sample.emplace("sixteencharweird", "16");
I get this output:
$ ./a.out
sample:
fifteencharokay, 15
sixteencharweird, 16
result:
fifteencharokay, 15
sixteencharweird, 16
function call:
fifteencharokay, 15
sixteencharweird, 16
inline iteration:
fifteencharokay, 15
harweird, 16
(Notice that "sixteencharweird" has been truncated to "harweird" in the last line)
What is happening here? Why is it that we're having issues in this one very specific case (long strings and iterating directly over the result of a function call)? Is there some sort of C++ rule I'm breaking here by iterating this way?
In this loop:
for (auto &item : Get().value())
you are invoking undefined behavior, because the temporary returned by Get() will die at the end of the full expression, and the .value() that your range for loop will be iterating over refers to memory that no longer exists.
The fact that strings shorter than 16 characters still appear intact is possibly due to the small-string optimization: a short string keeps its characters in a buffer inside the string object itself, so the stale bytes may still be readable, whereas a longer string lives in separately allocated memory that has already been released. Of course, this is UB either way, and you can't rely on it.
You can fix this issue by doing:
auto const &g = Get();
for (auto &item : g.value())
In fact, C++20 adds the range-for with initializer construct for exactly this purpose:
for (auto const &g = Get(); auto &item : g.value())
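As a self-contained sketch of the fix, the returned optional can be held in a named object for the duration of the loop. Get and Set mirror the question's functions. StringMap, CollectKeys and the variable held are names invented for this example.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using StringMap = std::map<std::string, std::string>;

static std::optional<StringMap> optmap;

static void Set(std::optional<StringMap> m) { optmap = std::move(m); }
static std::optional<StringMap> Get() { return optmap; }

// Hypothetical helper: iterate over the map safely and collect its keys.
std::vector<std::string> CollectKeys() {
    std::vector<std::string> keys;
    const auto held = Get();  // named object: lives until the end of this function
    if (held) {
        for (const auto& item : *held) {
            keys.push_back(item.first);
        }
    }
    return keys;
}
```

Checking the optional before dereferencing also avoids the exception that value() throws when the optional is empty.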

Printing a vector of reference_wrapper<int>

I have the following test code that's runnable under clang.
#include <algorithm>
#include <vector>
#include <iostream>
int main() {
std::vector<int> vs{1, 2, 4, 5};
std::vector<std::reference_wrapper<int>> vs1;
for (int i : vs) {
std::cout << "loop: " << i << std::endl;
vs1.emplace_back(i);
}
for (auto p : vs1) {
std::cout << p << std::endl;
}
return 0;
}
You can plug that into https://rextester.com/l/cpp_online_compiler_clang (or locally). The result is:
loop: 1
loop: 2
loop: 4
loop: 5
5
5
5
5
I'd expect 1,2,4,5, not 5 all the way.
The code won't work on non-clang. Where's the problem?
i is a local variable inside its declaring for loop. It is a copy of each int in the vs vector. You are thus (via the emplace_back() call) creating reference_wrapper objects that refer to a local variable, keeping the references alive after the lifetime of the referred-to variable (i) has ended. This is undefined behavior.
The fix is to make i be a reference to each int, not a copy, that way the reference_wrappers refer to the ints in vs as expected:
for (int& i : vs)
First, you forgot to include the <functional> header.
Second, reference_wrapper<int> stores a reference to an int, not its value. So in this loop:
for (int i : vs) {
std::cout << "loop: " << i << std::endl;
vs1.emplace_back(i);
}
You are changing the value of i but not its place in memory. In practice every iteration reuses the same storage for i, so every wrapper refers to that one location. That's why it prints the last value stored there, which is 5.
You may imagine this range-based for loop
for (int i : vs) {
std::cout << "loop: " << i << std::endl;
vs1.emplace_back(i);
}
the following way
for ( auto first = vs.begin(); first != vs.end(); ++first )
{
int i = *first;
vs1.emplace_back(i);
}
That is, within the loop you are dealing with the local variable i, which is no longer alive once the loop has exited.
You need to use a reference to the elements of the vector, like this:
for (int &i : vs) {
std::cout << "loop: " << i << std::endl;
vs1.emplace_back(i);
}
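The fix can be sketched as a small function that binds each wrapper to the vector's own element. The name make_refs is hypothetical.

```cpp
#include <functional>
#include <vector>

// Hypothetical helper: build wrappers that refer to the elements of `values`.
// The wrappers stay valid only while `values` is alive and is not reallocated.
std::vector<std::reference_wrapper<int>> make_refs(std::vector<int>& values) {
    std::vector<std::reference_wrapper<int>> refs;
    refs.reserve(values.size());
    for (int& v : values) {   // int&, not int: no per-iteration copy
        refs.emplace_back(v);
    }
    return refs;
}
```

Writing through a wrapper, for example refs[1].get() = 7, then changes the corresponding element of the original vector.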

Is this a bug in std::gcd?

I've come across this behavior of std::gcd that I found unexpected:
#include <cstdlib>
#include <iostream>
#include <numeric>
#include <type_traits>
int main()
{
int a = -120;
unsigned b = 10;
//both a and b are representable in type C
using C = std::common_type<decltype(a), decltype(b)>::type;
C ca = std::abs(a);
C cb = b;
std::cout << a << ' ' << ca << '\n';
std::cout << b << ' ' << cb << '\n';
//first one should equal second one, but doesn't
std::cout << std::gcd(a, b) << std::endl;
std::cout << std::gcd(std::abs(a), b) << std::endl;
}
According to cppreference both calls to std::gcd should yield 10, as all preconditions are satisfied.
In particular, it is only required that the absolute values of both operands are representable in their common type:
If either |m| or |n| is not representable as a value of type std::common_type_t<M, N>, the behavior is undefined.
Yet the first call returns 2.
Am I missing something here?
Both gcc and clang behave this way.
Looks like a bug in libstdc++. If you add -stdlib=libc++ to the CE command line, you'll get:
-120 120
10 10
10
10
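One way the value 2 could arise, offered as an assumption rather than a confirmed reading of the library source, is that the signed argument gets converted to the unsigned common type before its absolute value is taken. The sketch below reproduces that arithmetic for a 32-bit unsigned. Both function names are hypothetical.

```cpp
#include <numeric>

// Assumed failure mode: convert first, so -120 wraps to 2^32 - 120.
unsigned gcd_convert_first(int a, unsigned b) {
    return std::gcd(static_cast<unsigned>(a), b);
}

// Take the magnitude first, using unsigned arithmetic so INT_MIN is handled too.
unsigned gcd_abs_first(int a, unsigned b) {
    const unsigned ua = static_cast<unsigned>(a);
    const unsigned abs_a = a < 0 ? 0u - ua : ua;
    return std::gcd(abs_a, b);
}
```

With a 32-bit unsigned, -120 converts to 4294967176, whose greatest common divisor with 10 is 2, which matches the surprising output. Taking the magnitude first gives 10.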

Moving objects from one unordered_map to another container

My question is that of safety. I've searched cplusplus.com and cppreference.com and they seem to be lacking on iterator safety during std::move. Specifically: is it safe to call std::unordered_map::erase(iterator) with an iterator whose object has been moved? Sample code:
#include <unordered_map>
#include <string>
#include <vector>
#include <iostream>
#include <memory>
class A {
public:
A() : name("default ctored"), value(-1) {}
A(const std::string& name, int value) : name(name), value(value) { }
std::string name;
int value;
};
typedef std::shared_ptr<const A> ConstAPtr;
int main(int argc, char **argv) {
// containers keyed by shared_ptr are keyed by the raw pointer address
std::unordered_map<ConstAPtr, int> valued_objects;
for ( int i = 0; i < 10; ++i ) {
// creates 5 objects named "name 0", and 5 named "name 1"
std::string name("name ");
name += std::to_string(i % 2);
valued_objects[std::make_shared<A>(std::move(name), i)] = i * 5;
}
// Later somewhere else we need to transform the map to be keyed differently
// while retaining the values for each object
typedef std::pair<ConstAPtr, int> ObjValue;
std::unordered_map<std::string, std::vector<ObjValue> > named_objects;
std::cout << "moving..." << std::endl;
// No increment since we're using .erase() and don't want to skip objects.
for ( auto it = valued_objects.begin(); it != valued_objects.end(); ) {
std::cout << it->first->name << "\t" << it->first->value << "\t" << it->second << std::endl;
// Get named_vec.
std::vector<ObjValue>& v = named_objects[it->first->name];
// move object :: IS THIS SAFE??
v.push_back(std::move(*it));
// And then... is this also safe???
it = valued_objects.erase(it);
}
std::cout << "checking... " << named_objects.size() << std::endl;
for ( auto it = named_objects.begin(); it != named_objects.end(); ++it ) {
std::cout << it->first << " (" << it->second.size() << ")" << std::endl;
for ( auto pair : it->second ) {
std::cout << "\t" << pair.first->name << "\t" << pair.first->value << "\t" << pair.second << std::endl;
}
}
std::cout << "double check... " << valued_objects.size() << std::endl;
for ( auto it : valued_objects ) {
std::cout << it.first->name << " (" << it.second << ")" << std::endl;
}
return 0;
}
The reason I ask is that it strikes me that moving the pair from the unordered_map's iterator may (?) therefore *re*move the iterator's stored key value and therefore invalidate its hash; therefore any operations on it afterward could result in undefined behavior. Unless that's not so?
I do think it's worth noting that the above appears to successfully work as intended in GCC 4.8.2 so I'm looking to see if I missed documentation supporting or explicitly not supporting the behavior.
// move object :: IS THIS SAFE??
v.push_back(std::move(*it));
Yes, it is safe, because this doesn't actually modify the key. It cannot, because the key is const. The type of *it is std::pair<const ConstAPtr, int>. When it is moved, the first member (the const ConstAPtr) is not actually moved. It is converted to an r-value by std::move, and becomes const ConstAPtr&&. But that doesn't match the move constructor, which expects a non-const ConstAPtr&&. So the copy constructor is called instead.
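The copy can be observed through the shared_ptr use count. The sketch below uses a simplified key type (shared_ptr<const int> instead of the question's ConstAPtr), and the names Key, Entry, use_count_after_move and drain are invented for the illustration. The second function shows one way to really move the key out, assuming C++17 node handles are available.

```cpp
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

using Key = std::shared_ptr<const int>;
using Entry = std::pair<Key, int>;

// "Moving" the element out of the map copies the const key.
long use_count_after_move() {
    std::unordered_map<Key, int> m;
    m.emplace(std::make_shared<const int>(1), 5);
    std::vector<Entry> v;
    auto it = m.begin();
    v.push_back(std::move(*it));    // key is const, so it is copied
    return it->first.use_count();   // map and vector now share ownership
}

// Hypothetical helper: empty the map, moving keys out via node handles.
std::vector<Entry> drain(std::unordered_map<Key, int>& m) {
    std::vector<Entry> out;
    out.reserve(m.size());
    while (!m.empty()) {
        auto node = m.extract(m.begin());   // node is no longer in the table
        out.emplace_back(std::move(node.key()), node.mapped());
    }
    return out;
}
```

In the first function the key in the map stays valid, so the later erase(it) hashes an intact key. In the second, the key is only modified after the node has left the container.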

Why this data member is initialized? [duplicate]

I'm doing some testing...
Firstly I post my source code
the .h file
class Complex{
private:
int r = 0;//initializer
int i ;
public:
Complex(int , int I = 0);
Complex();
void print();
void set(int, int I = 1);
static void print_count();
static int count;
};
the .cpp file
#include <iostream>
#include "complex.h"
int Complex::count = 1;
Complex::Complex(int R , int I){
r = R;
i = I;
count++;
std::cout << "constructing Complex object...count is " << Complex::count << std::endl;
}
Complex::Complex(){//default constructor
std::cout << "default constructor is called..." << std::endl;
}
void Complex::print(){
std::cout << "r = " << r << ';' << "i = " << i << std::endl;
return;
}
void Complex::set(int R, int I /*= 2*/){//will be "redefaulting", an error
r = R;
i = I;
return;
}
void Complex::print_count(){//static
Complex::count = -1;//just for signaling...
std::cout << "count is " << count << std::endl;
return;
}
the main function
#include <iostream>
#include "complex.h"
int main(){
Complex d;//using default constructor
d.print();
/*Complex c(4, 5);*/
Complex c(4);
//c.print();
/*c.set(2, 3)*/
c.print();
c.set(2 );
c.print();
std::cout << "count is " << c.count << std::endl;//c can access member data
c.print_count();
c.count++;//
return 0;
}
Consider the Complex object d constructed with the default constructor.
Because the data member r has a default member initializer of 0, r is expected to be 0 when d.print() executes.
i has no initializer, so I expected it to hold a garbage value.
But when I tested this, something strange happened.
If I remove this line and the ones after it from the main file:
std::cout << "count is " << c.count << std::endl;//c can access member data
then d.print() gives the value of i as 32767 on my system, which I guess is a garbage value.
But once that line is added back, d.print() gives i's value as 0 on my system.
I don't get it. I haven't set, modified, or initialized i's value, so why should it be 0?
Or is it also a garbage value?
Or does calling one of those functions corrupt the value of i?
What is going on behind the scenes here?
Thanks for helping.
0 is just as much a garbage value as any other. Don't make the mistake of thinking otherwise.
Formally, reading an uninitialized variable is undefined behavior, so there's no point in wondering about it: just fix it by initializing the variable properly.
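A minimal sketch of "initializing the variable properly", using default member initializers for both members. The accessors real() and imag() and the trailing-underscore member names are assumptions added for the example; the question's class has no such accessors.

```cpp
class Complex {
public:
    Complex() = default;                          // both members get their defaults
    Complex(int r, int i = 0) : r_(r), i_(i) {}

    int real() const { return r_; }
    int imag() const { return i_; }

private:
    int r_ = 0;   // default member initializer
    int i_ = 0;   // no longer left indeterminate
};
```

With this change, a default-constructed object always reports 0 for both parts, regardless of what the rest of main does.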