Undefined behaviour accessing const ptr sometimes - c++

I have a header file defined as
#pragma once
#include <iostream>
template<int size>
struct B
{
    double arr[size * size];
    constexpr B() : arr()
    {
        arr[0] = 1.;
    }
};

template<int size>
struct A
{
    const double* arr = B<size>().arr;
    void print()
    {
        // including this statement also causes undefined behaviour on subsequent lines
        //printf("%i\n", arr);
        printf("%f\n", arr[0]);
        printf("%f\n", arr[0]); // ???
        // prevent optimisation
        for (int i = 0; i < size * size; i++)
            printf("%f ", arr[i]);
    }
};
and call it with
auto a = A<8>();
a.print();
Now this code only produces the expected output when compiled with msvc in release mode (everything compiled as C++17).
expected output:
1.000000
1.000000
msvc debug:
1.000000
-92559631349317830736831783200707727132248687965119994463780864.000000
gcc via mingw (with and without -g):
1.000000
0.000000
However, this behaviour is inconsistent: I get the expected output if I replace double arr[size * size] with double arr[size]. And of course there are no problems if I allocate arr on the heap.
I looked at the assembly of the msvc debug build but I don't see anything out of the ordinary. Why does this undefined behaviour only occur sometimes?
asm output
decompiled msvc release

In this declaration
const double* arr = B<size>().arr;
the pointer is initialized with the address of (the first element of) a temporary array, and that temporary is destroyed at the end of the declaration.
So dereferencing the pointer afterwards results in undefined behavior.
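One way to avoid the dangling pointer, if A is meant to keep using the table, is to store the B<size> object itself as a member, so the array lives as long as A does. A minimal sketch of that idea (not the original code):
template<int size>
struct A
{
    B<size> table;                  // A owns the array now
    const double* arr = table.arr;  // valid for the lifetime of this A object
    void print()
    {
        for (int i = 0; i < size * size; i++)
            printf("%f ", arr[i]);
        printf("\n");
    }
};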

When you wrote:
const double* arr = B<size>().arr;
The above statement initializes a pointer to const double (i.e., a const double*) named arr with the address of a temporary array object. Since that temporary array is destroyed at the end of the full-expression, using arr afterwards leads to undefined behavior.
Why does this undefined behaviour only occur sometimes?
Undefined behavior means anything[1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing (or may be seeing) is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB. The program may just crash.
So the first step to make the program correct would be to remove the UB. Then, and only then, can you start reasoning about the output of the program.
[1] For a more technically accurate definition of undefined behavior see this, where it is mentioned that: there are no restrictions on the behavior of the program.

It seems it was completely coincidental that smaller allocations always ended up at an address that would not get overwritten by the rep stosd instruction inside printf. It was not caused by strange compiler optimisations, as I first thought.
What does the "rep stos" x86 assembly instruction sequence do?
I also have no idea why I decided to do it this way. Not exactly the question I asked, but I ultimately wanted a compile-time lookup table, so the real solution was static inline constexpr auto arr = B<size>(); in C++20, which is why the code looks strange.
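For reference, a minimal sketch of that compile-time lookup table approach, assuming the same B<size> from the question; the table sits in static storage, so nothing can dangle:
template<int size>
struct A
{
    static inline constexpr auto table = B<size>(); // built at compile time
    void print()
    {
        for (int i = 0; i < size * size; i++)
            printf("%f ", table.arr[i]);
        printf("\n");
    }
};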

Related

Same array giving garbage value at one place and an unrelated value at the other place

In the following code:
#include<iostream>
using namespace std;
int main()
{
    int A[5] = {10, 20, 30, 40, 50};
    // Let us try to print A[5] which does NOT exist but still
    cout << "First A[5] = " << A[5] << endl << endl;
    // Now let us print A[5] inside the for loop
    for (int i = 0; i <= 5; i++)
    {
        cout << "Second A[" << i << "]" << " = " << A[i] << endl;
    }
}
Output:
The first A[5] prints one value (is that called a garbage value?) and the second A[5], inside the for loop, prints a different one (in this case A[5] prints the value of i). Can anyone explain why?
Also, inside the for loop, if I declare a random variable like int sax = 100; then A[5] takes the value 100, and I don't have the slightest clue why this is happening.
I am on Windows, Code::Blocks, GNU GCC compiler.
Well, you invoke undefined behaviour, so the behaviour is, err... undefined, and anything can happen, including what you show here.
In common implementations, the memory just past the end of the array may be used by a different variable, and only implementation details of the compiler determine which one.
Here your implementation has placed the next variable (i) just after the array, so A[5] is an (invalid) accessor for i.
But please do not rely on that. Different compilers or different compilation options could give a different result. And as a compiler is free to assume that your code does not invoke UB, an optimizing compiler could just optimize out all of your code, and only you would be to blame.
TL/DR: never, ever experiment with UB: anything can happen, from consistent behaviour to an immediate crash, by way of various inconsistent outputs. And what you see will not be reproduced in a different context (a context here can even just be a different run of the same code).
In your program I think there is no syntax issue, because when I execute this same code with my compiler there is no issue like yours.
It gives the same garbage value for the direct access as well as in the loop.
The problem is that when you wrote:
cout <<"First A[5] = "<< A[5] << endl<<endl;//this is Undefined behavior
In the above statement you're going out of bounds. This is because array index starts from 0 and not 1.
Since your array's size is 5, you can safely access A[0], A[1], A[2], A[3] and A[4].
On the other hand, you cannot access A[5]; if you try to do so, you get undefined behavior.
Undefined behavior means anything[1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB.
So the first step to make the program correct would be to remove the UB. Then, and only then, can you start reasoning about the output of the program.
For the same reason, in your for loop you should replace i<=5 with i<5.
[1] For a more technically accurate definition of undefined behavior see this, where it is mentioned that: there are no restrictions on the behavior of the program.
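For completeness, a sketch of the corrected loop; a range-based for loop sidesteps the off-by-one mistake entirely:
for (int i = 0; i < 5; i++)   // i < 5, not i <= 5
    cout << "Second A[" << i << "] = " << A[i] << endl;

// or let the compiler iterate over exactly the elements that exist
for (int value : A)
    cout << value << endl;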

Why change UB that will always work as intended? [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Undefined behavior and sequence points
(5 answers)
Closed 5 years ago.
In the legacy code base I'm working on, I discovered the line
n = ++n % size;
that is just a bad phrasing of the intended
n = (n+1) % size;
as deduced from the surrounding code and runtime-proved. (The latter now replaces the former.)
But since this code was marked as an error by Cppcheck and caused a warning in GCC, without ever having caused any malfunction, I didn't stop thinking there. I reduced the line to
n = ++n;
still getting the original error/warning messages:
Cppcheck 1.80:
Id: unknownEvaluationOrder
Summary: Expression 'n=++n' depends on order of evaluation of side effects
Message: Expression 'n=++n' depends on order of evaluation of side effects
GCC (mingw32-g++.exe, version 4.9.2, C++98):
warning: operation on 'n' may be undefined [-Wsequence-point]
I already learned that assignment expressions in C/C++ can be heavily affected by undefined evaluation order, but in this very case I just can't imagine how.
Can the undefined evaluation order of n = ++n; really be relevant for the resulting program, especially for the intended value of n? Here is what I imagine might happen.
Scenario #1
++n;
n=n;
Scenario #2
n=n;
++n;
I know that the meaning and implications of relying on undefined behaviour in C++ are hard to understand and hard to teach.
I know that the behaviour of n=++n; is undefined by the C++ standards before C++11. But it has defined behaviour from C++11 on, and this (now standard-defined) behaviour is exactly what I'm observing with several compilers[1] for this small demo program
#include <iostream>
using namespace std;

int main()
{
    int n = 0;
    cout << "n before: " << n << endl;
    n = ++n;
    cout << "n after: " << n << endl;
    return 0;
}
that has the output
n before: 0
n after: 1
Is it reasonable to expect that the behaviour is actually the same for all compilers, regardless of whether it is defined by the standards? Can you (a) show one counterexample or (b) give an easy-to-understand explanation of how this code could produce wrong results?
[1] The compilers I used:
Borland-C++ 5.3.0 (pre-C++98)
Borland-C++ 5.6.4 (C++98)
C++ (vc++)
C++ (gcc 6.3)
C++14 (gcc 6.3)
C++14 clang
The order of these operations has been precisely defined since C++11. The evaluation-order rules state that
i = ++i + 2; // undefined behavior until C++11
Since you use a C++11 compiler, you can leave your code as it is. Nevertheless, I think that the expressiveness of
n = (n+1) % size;
is higher. You can more easily figure out what was intended by the writer of this code.
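A small annotated sketch of the two forms under those sequencing rules (the values shown assume n starts at 3 and size is 4):
int n = 3;
const int size = 4;

n = ++n % size;     // well-defined since C++11: the increment of n is
                    // sequenced before the assignment; n becomes 0
n = (n + 1) % size; // equivalent result, and well-defined in every revision; n becomes 1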
According to cppreference:
If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object, the behavior is undefined:
i = ++i + 2; // undefined behavior until C++11
i = i++ + 2; // undefined behavior until C++17
f(i = -2, i = -2); // undefined behavior until C++17
f(++i, ++i); // undefined behavior until C++17, unspecified after C++17
i = ++i + i++; // undefined behavior
For the case n = ++n; the behaviour was undefined before C++11, but in practice it does not matter which write to n happens first, the one from = or the one from ++, since both store the same value.

Dealing with int arrays in c++ 2

Hi guys, could anyone explain why this program works correctly even though it is a bit strange:
int main()
{
    int array[7] = {5, 7, 57, 77, 55, 2, 1};
    for (int i = 0; i < 10; i++)
        cout << i[array] << ", "; // array[i]
    cout << endl;
    return 0;
}
Why does the program compile correctly?
An expression (involving fundamental types) such as this:
x[y]
is converted at compile time to this:
*(x + y)
x + y is the same as y + x
Therefore: *(x + y) is the same as *(y + x)
Therefore: x[y] is the same as y[x]
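A tiny sketch of that equivalence, using the array from the question:
int array[7] = {5, 7, 57, 77, 55, 2, 1};
// all four expressions below name the same element, whose value is 57
int a = array[2];
int b = *(array + 2);
int c = *(2 + array);
int d = 2[array];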
In your program, you are trying to index an array outside its bounds. This may lead to a segmentation violation, meaning there is an attempt by the CPU to access memory that cannot be addressed (think of it as memory not allocated for the array, since the access is out of bounds). This is a runtime error: it is not the compiler's responsibility to check it; it is raised by the operating system, which is notified by the hardware. The compiler's error-checking responsibilities are lexical and syntactical, so that your code can be correctly compiled into machine code and finally a binary.
For more information about Segmentation Violation error or Segmentation Fault, as commonly known, look here:
http://en.wikipedia.org/wiki/Segmentation_fault
You've come across Undefined Behavior. This means that the compiler is allowed to do whatever it wants with your program -- including compiling it without warnings or errors. Furthermore, it can produce any code it wants to for the case of undefined behavior, including assuming that it does not occur (a common optimization). Accessing an array out-of-bounds is an example of undefined behavior. Signed integer overflow, data races, and invalid pointer creation/use are others.
Theoretically, the compiler could emit code that invoked the shell and performed rm -rf /* (delete every file you have permission to delete)! Of course, no reasonable compiler would do this, but you get the idea.
Simply put, a program with undefined behavior is not a valid C++ program. This is true for the entirety of the program, not just after the undefined behavior. A compiler would have been perfectly free to compile your program to a no-op.
Adding to Benjamin Lindley's answer, compile the code below and you will see how the addresses are calculated:
int main()
{
    int array[7] = {5, 7, 57, 77, 55, 2, 1};
    cout << &(array[0]) << endl;
    cout << &(array[1]) << endl;
    return 0;
}
Output (for me ;-)):
0x28ff20
0x28ff24
It's just array + 0 and array + 1; the two addresses differ by sizeof(int), which is 4 bytes here.

Is it legal to initialize a possibly invalid reference without using it?

I would like to save typing in some loop by creating a reference to an array element which might not exist. Is it legal to do so? A short example:
#include <vector>
#include <iostream>
#include <initializer_list>
using namespace std;

int main(void){
    vector<int> nn = {0, 1, 2, 3, 4};
    for(size_t i = 0; i < 10; i++){
        int& n(nn[i]); // this is just to save typing, and is not used if invalid
        if(i < nn.size()) cout << n << endl;
    }
};
https://ideone.com/nJGKdW compiles and runs the code just fine (I tried locally with both g++ and clang++), but I am not sure if I can count on that.
PS: Neither gcc nor clang complains, even when compiled and run with -Wall and -g.
EDIT 2: The discussion focuses on array indexing. The real code actually uses std::list and a fragment would look like this:
std::list<int> l;
// the list contains something or not, don't know yet
const int& i(*l.begin());
if(!l.empty()) /* use i here */ ;
EDIT 3: A legal solution to what I was doing is to use an iterator:
std::list<int> l;
const std::list<int>::iterator I(l.begin()); // if empty, I==l.end()
if(!l.empty()) /* use (*I) here */ ;
No, it's not legal. You are reading data out of bounds from the vector in the declaration of n, and therefore your program has undefined behavior.
No, for two reasons:
The standard states (8.3.2):
A reference shall be initialized to refer to a valid object or function
std::vector::operator[] never throws exceptions (no-throw guarantee) and does no bounds checking (unlike at()). If N exceeds the container size, the behavior is undefined.
Therefore, your program is not well-formed (bullet point 1) and invokes undefined behaviour (bullet point 2).
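A legal way to keep the short name is to bind the reference only after the bounds/emptiness check has succeeded; a sketch for both the vector loop and the list fragment from the question:
// vector: loop only over valid indices, so the reference is always bound to a real element
for (size_t i = 0; i < nn.size(); i++) {
    int& n = nn[i];
    cout << n << endl;
}

// list: test before binding the reference
if (!l.empty()) {
    const int& i = *l.begin();
    /* use i here */
}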
I'd be surprised if this is "allowed" by the specification. However, what it does is store the address of an element outside the range of its allocation, which shouldn't in itself cause a problem in most cases; in extreme cases it may overflow the pointer type, which could cause problems, I suppose.
In other words, if i is WAY outside the size of nn, it could be a problem. That's not to say i has to be enormous: if each element in the vector is several megabytes (or gigabytes on a 64-bit machine), you can quite quickly run into problems with the address range.
But don't ask me to quote the specification - someone else will probably do that.
Edit: As per comment, since you are requesting the address of a value outside of the valid size, at least in debug builds, this may well cause the vector implementation to assert or otherwise "warn you that this is wrong".

issue related to const and pointers

I have written two programs. Please go through both of them and help me understand why variable 'i' and '*ptr' give different values.
//Program I:
//Assumption: address of i = 100, address of ptr = 500
int i = 5;
int *ptr = (int *) &i;
*ptr = 99;
cout << i;    // 99
cout << &i;   // 100
cout << ptr;  // 100
cout << *ptr; // 99
cout << &ptr; // 500
//END_Program_I===============

//Program II:
//Assumption: address of i = 100, address of ptr = 500
const int i = 5;
int *ptr = (int *) &i;
*ptr = 99;
cout << i;    // 5
cout << &i;   // 100
cout << ptr;  // 100
cout << *ptr; // 99
cout << &ptr; // 500
//END_PROGRAM_II===============
The confusion is: why does variable i still come out as 5, even though *ptr == 99?
In the following three lines, you are modifying a constant:
const int i = 5;
int *ptr = (int *) &i;
*ptr = 99;
This is undefined behavior. Anything can happen. So don't do it.
As for what's happening underneath in this particular case:
Since i is const, the compiler assumes it will not change. Therefore, it simply inlines the 5 to each place where it is used. That's why printing out i shows the original value of 5.
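Conceptually, the optimizer treats the second program roughly like this (a sketch of the effect, not actual compiler output):
const int i = 5;
int *ptr = (int *) &i;
*ptr = 99;      // writes 99 into the storage of i (undefined behavior)

cout << 5;      // every use of i was replaced by the literal 5 at compile time
cout << *ptr;   // this still reads the storage, so it prints 99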
All answers will probably talk about "undefined behavior", since you are attempting the logical nonsense of modifying a constant.
Although that is technically correct, let me give you some hints about why this happens (about "how", see Mysticial's answer).
It happens because C++ is by design an "imperfectly specified language". The "imperfection" consists of a number of "undefined behaviors" that pervade the language specification.
In fact, the language designers deliberately chose that, in some circumstances, instead of saying "if you do this, you will get that" (where "that" may be: you get this code, or you get this error), they prefer to say "we don't define what will happen".
This leaves compiler manufacturers free to decide what to do. And since there are many compilers working on many platforms, the optimal solution for one is not necessarily the optimal solution for another (which may target a machine with a different instruction set). Hence you, as a programmer, are left in the dramatic situation of never knowing what to expect, and even if you test it, you cannot trust the result of the test, since in another situation (compiling the same code with a different compiler, or just a different version of it, or for a different platform) it will be different.
The "bad" thing here is that a compiler should warn when undefined behavior is hit (casting away const should be flagged as a potential bug, especially if the compiler does const-inlining optimizations, since it is nonsense to let a const be changed), and most likely it does, if you specify the proper flag (maybe -W4 or -Wall or -pedantic or similar, depending on the compiler you have).
In particular the line
int *ptr = (int *) &i;
should issue a warning like:
warning: removing cv-qualifier from &i.
So that, if you correct your program as
const int *ptr = (const int *) &i;
to satisfy the warning, you will get an error at
*ptr = 99;
as
error: *ptr is const
thus making the problem evident.
Moral of the story:
From a legal point of view, you wrote bad code, since it is, by language definition, relying on undefined behavior.
From a moral point of view, the compiler behaved unfairly: performing const-inlining (replacing cout << i with cout << 5) after accepting (int*)&i is a self-contradiction, and incoherent behavior should at least be warned about.
If it wants to do one thing, it must not accept the other, or vice versa.
So check if there is a flag you can set to be warned, and if not, report to the compiler manufacturer its unfairness: it didn't warn about its own contradiction.
const int i = 5;
Implies that the variable i is const and cannot/should not be changed; it is immutable, and changing it through a pointer results in undefined behavior.
Undefined behavior means the program is invalid and any behavior is possible. Your program might seem to work as desired, or not, or it might even crash. All bets are off.
Remember the Rule:
It is undefined behavior to modify a const variable. Don't ever do it.
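A sketch of where the line is: the undefined behavior comes from the object itself being const, not from the cast; writing through the pointer is only legal when the pointed-to object was not defined const:
int x = 5;
const int* px = &x;
*const_cast<int*>(px) = 99;   // OK: x itself is not const

const int y = 5;
const int* py = &y;
*const_cast<int*>(py) = 99;   // undefined behavior: y really is const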
You're attempting to modify a constant through a pointer, which is undefined. This means anything unexpected can happen, from the correct output, to the wrong output, to the program crashing.