I have a piece of code:
#include <iostream>
using namespace std;

int main() {
    char a[5][5] = {{'x','y','z','a','v'}, {'d','g','h','v','x'}};
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 6; j++) {   // j reaches 5, one past the last valid column index
            cout << a[i][j];
        }
    }
    return 0;
}
As you can see, the first and second dimensions are of size 5 elements each. With the double for loop I am just printing what was initialized for variable a.
As the bound on the loop variable j increases, the output changes dramatically.
Why does this happen?
Are pointers the solution to this? If yes, how? If not, what can we do to avoid runtime errors caused by this incorrect access?
You might be treating this issue like an out-of-bounds error in Java, where the behavior is strictly defined: you'll get an ArrayIndexOutOfBoundsException, and the program will immediately terminate, unless the exception is caught and handled.
In C++, this kind of out-of-bounds error is undefined behavior, which means the compiler is allowed to do whatever silly thing it thinks will achieve the best performance. Generally speaking, this results in the compiler just blindly performing the same pointer arithmetic it would perform on array accesses that are in-bounds, regardless of whether the memory is valid or not.
In your case, because you've allocated 25 chars' worth of memory, you'll access valid memory (in most environments, UB notwithstanding) at least until i * 5 + j >= 25, at which point any number of things could happen:
You could get garbage data off the stack
You could crash the program with a Segmentation Fault (Access Violation in Windows/Visual Studio)
The loop could fail to terminate at the index you expect it to.
That last one is an incredible bug: If aggressive loop optimization is occurring, you could get some very odd behavior when you make mistakes like this in your code.
What's almost certainly happening in the code you wrote is milder than any of those: you allocated space for 25 chars but only gave values to 10 of them. With aggregate initialization, the elements you don't mention are zero-initialized, so while j stays below 6 your reads remain inside the 25-char object; they simply walk across a row boundary and pick up the neighbouring (or zeroed) elements. Only once i * 5 + j >= 25 would you hit genuinely invalid memory.
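For reference, a minimal corrected sketch of that loop. The only real fix is keeping j strictly below 5; shrinking the array to the two rows that are actually given values is my choice, not a requirement:

#include <iostream>

int main() {
    char a[2][5] = {{'x','y','z','a','v'}, {'d','g','h','v','x'}};
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 5; j++) {  // j < 5: stay inside the row
            std::cout << a[i][j];
        }
        std::cout << '\n';
    }
    return 0;
}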
Related
In the following code:
#include <iostream>
using namespace std;

int main()
{
    int A[5] = {10, 20, 30, 40, 50};

    // Let us try to print A[5], which does NOT exist, but still
    cout << "First A[5] = " << A[5] << endl << endl;

    // Now let us print A[5] inside the for loop
    for (int i = 0; i <= 5; i++)
    {
        cout << "Second A[" << i << "]" << " = " << A[i] << endl;
    }
}
The first A[5] prints what looks like a garbage value (is that what it's called?), and the second A[5], inside the for loop, prints different output again: there, A[i] gives the output i. Can anyone explain to me why?
Also, inside the for loop, if I declare a random variable like int sax = 100;, then A[5] takes the value 100, and I don't have the slightest clue why that is happening.
I am on Windows, using Code::Blocks with the GNU GCC compiler.
Well, you invoke undefined behaviour, so the behaviour is, err... undefined, and anything can happen, including what you show here.
In common implementations, the memory past the end of the array could be used by a different variable, and only implementation details in the compiler could tell which one.
Here your implementation has placed the next variable (i) just after the array, so A[5] is an (invalid) accessor for i.
But please do not rely on that. Different compilers or different compilation options could give a different result. And as a compiler is free to assume that your code does not invoke UB, an optimizing compiler could just optimize out all of your code, and only you would be to blame.
TL/DR: Never, ever experiment with UB: anything can happen, from consistent behaviour to an immediate crash, by way of various inconsistent outputs. And what you see will not be reproduced in a different context (a context here can even be just a different run of the same code).
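If you are curious whether your particular build really did place i right after A, you can print the addresses and look. A small sketch; the layout it reveals is an implementation detail of one compiler, one set of options and one platform, never something to rely on:

#include <iostream>

int main() {
    int A[5] = {10, 20, 30, 40, 50};
    int i = 42;
    // A + 5 is the one-past-the-end address (where A[5] would live).
    // Forming and printing this pointer is legal; dereferencing it is not.
    std::cout << "one past A: " << static_cast<void*>(A + 5) << '\n';
    std::cout << "&i:         " << static_cast<void*>(&i) << '\n';
    return 0;
}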
In your program I think there is no syntax issue, because when I execute the same code in my compiler there is no issue like yours.
It gives the same garbage value both directly and in the loop.
The problem is that when you wrote:
cout <<"First A[5] = "<< A[5] << endl<<endl;//this is Undefined behavior
In the above statement you're going out of bounds, because array indexes start from 0, not 1.
Since your array's size is 5, you can safely access A[0], A[1], A[2], A[3] and A[4].
On the other hand you cannot access A[5]. If you try to do so, you will get undefined behavior.
Undefined behavior means anything¹ can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB.
So the first step to making the program correct is to remove the UB. Then, and only then, can you start reasoning about the output of the program.
For the same reason, in your for loop you should replace i<=5 with i<5.
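A UB-free version of the loop, for reference; only the bound changes, and the stray A[5] read is dropped:

#include <iostream>
using namespace std;

int main()
{
    int A[5] = {10, 20, 30, 40, 50};
    for (int i = 0; i < 5; i++)   // i stops at 4, the last valid index
    {
        cout << "A[" << i << "] = " << A[i] << endl;
    }
}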
¹ For a more technically accurate definition of undefined behavior, see this, where it is mentioned that there are no restrictions on the behavior of the program.
I was practicing an array manipulation question. While solving it I declared an array (array A in the code).
For some test cases I got a segmentation fault. I replaced the array with a vector and got AC. I don't know the reason for this; please explain.
#include <bits/stdc++.h>
using namespace std;

int main()
{
    int n, m, a, b, k;
    cin >> n >> m;
    vector<long int> A(n + 2);
    //long int A[n+2]={0};
    for (int i = 0; i < m; i++)
    {
        cin >> a >> b >> k;
        A[a] += k;
        A[b+1] -= k;
    }
    long res = 0;
    for (int i = 1; i < n + 2; i++)
    {
        A[i] += A[i-1];
        if (res < A[i])
            res = A[i];
    }
    cout << res;
    return 0;
}
Since it looks like you haven't been programming in C++ for very long, I will try to break it down to make it simpler to understand.
First of all, C++ does not initialize values for you; this is not Java. So please do not declare
int n, m, a, b, k;
and then use
A[a] += k;
A[b+1] -= k;
without checking what you actually read. If an extraction from cin fails, a and b may be left holding indeterminate values (-300 for all we know), and even when it succeeds, nothing guarantees that a and b+1 fall within the bounds of A. Hence, occasionally you get lucky and the index happens to land in valid memory, and other times you are not so lucky and it causes a segmentation fault.
long int A[n+2]={0}; is not legal in Standard C++. There are a bunch of reasons for this and I think you stumbled over one of them.
Compilers that allow variable length arrays follow the example of C99, and the array is allocated on the stack. The stack is a limited resource, usually between 1 and 10 MB on a desktop computer. If the user inputs an n of sufficient size, the array will take up too much of the stack or breach the stack's bounds, resulting in undefined behaviour. Often this behaviour manifests as a segmentation fault from accessing memory so far off the end of the stack that it is not controlled by the program. There are typically no warnings when you overflow the stack; often a program crash or corrupted data is the way you find out, and by then it is too late to salvage the program.
On the other hand, a vector allocates its internal buffer from the free store, and on a modern PC with virtual memory and 64-bit addressing the free store is fantastically huge; the vector throws an exception if you attempt to exceed what it can allocate.
Another important difference is
long int A[n+2]={0};
likely did not zero-initialize the array. This is the case with g++: the first byte is set to zero and the remainder are uninitialized. Such is the curse of using non-Standard extensions: you cannot count on behaviour the Standard would otherwise guarantee.
std::vector will zero-initialize the whole array, or set every element to whatever value you tell it to use.
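Combining the two points, here is a sketch of the vector version with bounds-checked element access; using at() instead of [] is my addition, not part of the original code:

#include <iostream>
#include <vector>

int main()
{
    int n, m, a, b, k;
    std::cin >> n >> m;
    std::vector<long int> A(n + 2, 0);   // heap-allocated and zero-initialized
    for (int i = 0; i < m; i++)
    {
        std::cin >> a >> b >> k;
        // at() throws std::out_of_range instead of silently corrupting
        // memory when an index from the input is out of bounds.
        A.at(a) += k;
        A.at(b + 1) -= k;
    }
    long res = 0;
    for (int i = 1; i < n + 2; i++)
    {
        A[i] += A[i - 1];
        if (res < A[i])
            res = A[i];
    }
    std::cout << res << '\n';
    return 0;
}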
Debugging an application and experimenting a bit, I came across some quite strange behaviour that can be reproduced with the following code:
#include <iostream>
#include <memory>

int main()
{
    std::unique_ptr<int> p(new int);
    *p = 10;
    int& ref = *p;
    int* direct_p = &(*p);
    p.reset();
    std::cout << *p << "\n";        // a) SIGSEGV
    std::cout << ref << "\n";       // b) 0
    std::cout << *direct_p << "\n"; // c) 0
    return 0;
}
As I see it, all three variants should cause undefined behaviour. Keeping that in mind, I have these questions:
Why do ref and direct_p nevertheless read zero (not 10)? (I mean, the mechanism of the int's destruction seems strange to me: what's the point of the compiler overwriting unused memory?)
Why don't b) and c) fire SIGSEGV?
Why does behaviour of a) differ from b) and c)?
p.reset(); is the equivalent of p.reset(nullptr);. So the unique_ptr's internal pointer is being set to null. Consequently doing *p ends up with the same result as trying to dereference a raw pointer that's null.
On the other hand, ref and direct_p are still left pointing at the memory formerly occupied by that int. Trying to use them to read that memory gets into Undefined Behavior territory, so in principle we can't conclude anything...
But in practice, there are a few things we can make educated assumptions and guesses about.
Since that memory location was valid shortly before, it's most likely still present (hasn't been unmapped from the address space, or other such implementation-specific things) when your program accesses it through ref and direct_p. C++ doesn't demand that the memory should become completely inaccessible. So in this case you simply end up "successfully" reading whatever happens to be at that memory location at that point during the program's execution.
As for why the value happens to be 0, well, there are a couple of possibilities. One is that you could be running in a debug mode which purposefully zeroes out deallocated memory. Another is that by the time you access that memory through ref and direct_p, something else has already reused it for a different purpose, which left it holding that value. Your std::cout << *p << "\n"; line could potentially have done that.
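The practical lesson is not to hold references or raw pointers to an object beyond its owner's lifetime. A minimal sketch of a safer pattern, copying the value out and null-checking the smart pointer before dereferencing:

#include <iostream>
#include <memory>

int main()
{
    std::unique_ptr<int> p(new int(10));
    int value = *p;  // copy the value out while the object is still alive
    p.reset();       // the int is destroyed; any leftover alias would now dangle
    if (p)           // reset() left p null, so this dereference is skipped
        std::cout << *p << "\n";
    std::cout << value << "\n";  // safe: we kept a copy, not an alias
    return 0;
}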
Undefined behaviour does not mean that the code must trigger an abnormal termination. It means that anything can happen. Abnormal termination is only one possible result. Inconsistency between different instances of undefined behaviour is another. Another possibility (albeit rare in practice) is appearing to "work correctly" (however one defines "work correctly") until the next full moon, and then mysteriously behaving differently.
From a perspective of increasing average programmer skill and increasing software quality, electrocuting the programmer whenever they write code with undefined behaviour might be considered desirable.
As others have said, undefined behavior means quite literally anything can happen; the code is unpredictable. But let me try to shed some light on question b) with an example.
SIGSEGV is a hardware fault reported by hardware with an MMU (memory management unit). The level of memory protection, and therefore the kind of SIGSEGV raised, can depend greatly on the MMU your hardware is using (source). If your unallocated pointer happens to point to an OK address you will be able to read the memory there; if it points somewhere bad, your MMU will freak out and raise a SIGSEGV against your program.
Take for example the MPC5200. This processor is quite old and has a somewhat rudimentary MMU; it can be quite difficult to get it to crash with a segfault.
For example the following will not necessarily cause a SIGSEGV on the MPC5200:
int *p = NULL;
*p;
*p = 1;
printf("%d", *p); // This actually prints 1 which is insane
The only way I could get it to throw a segfault was with the following code:
int *p = NULL;
while (true) {
*(--p) = 1;
}
To wrap up, undefined behavior really does mean undefined.
Why do ref and direct_p nevertheless point to zero (not 10)? (I mean, the mechanism of int's destruction seems strange to me; what's the point of the compiler rewriting unused memory?)
It's not the compiler, it's the C/C++ runtime libraries that change the memory. In your particular case, libc does something funny, as it rewrites the heap data when the value is freed:
Hardware watchpoint 3: *direct_p
_int_free (have_lock=0, p=0x614c10, av=0x7ffff7535b20 <main_arena>) at malloc.c:3925
3925 while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2)) != old2);
Why don't b) and c) fire SIGSEGV?
SIGSEGV is triggered by the kernel if an attempt is made to access memory outside of the allocated address space. Normally, libc won't actually return pages to the kernel after you deallocate memory; that would be too expensive. You are writing to an address that libc considers free, but the kernel doesn't know that. You can use a memory-fencing library (e.g. ElectricFence, great for debugging) to make that happen.
Why does the behaviour of a) differ from b) and c)?
You made p point to some memory address, say 100. You then effectively created aliases for that memory location, so direct_p and ref refer to address 100. Note that they are not references to the variable p but to the memory it pointed at, so changes you make to p have no effect on them. You then deallocated the int via p.reset(), and p's value became 0 (i.e. it now points to memory address 0). Attempting to read from memory address 0 is guaranteed to SIGSEGV. Reading from memory address 100 is a bad idea, but it is not fatal (as explained above).
Can someone please explain to me why this works? I thought arrays were static and couldn't expand; this piece of code defies my prior knowledge.
#include <iostream>
using namespace std;

int main() {
    int test[10];
    int e = 14;
    for (int i = 0; i < e; i++) {
        test[i] = i;
        cout << " " << test[i];
    }
    return 0;
}
This code outputs this:
0 1 2 3 4 5 6 7 8 9 10 11 12 13
So basically this program uses array spaces that shouldn't exist.
Tried setting 'e' to 15; it doesn't work.
The array's size is fixed, it is not expanding, and going beyond its bounds is undefined behaviour. What you observed is one possible outcome of undefined behaviour (UB). You were unlucky that in this case the UB suggests a pattern consistent with the array expanding.
It's undefined behaviour. You still have only 10 ints allocated legally. Though it seems to work in this case, your program is ill-formed.
You are basically writing beyond the boundary of the memory allocated for your array, but C (and C++) is compiled directly to machine code (as opposed to "managed" code executed by a virtual machine, like Java or .NET), so there is nothing between your program and the OS verifying whether you access memory you did not explicitly ask for. Memory is allocated in chunks: when a process requests some portion of memory from the OS, it does not get that precise number of bytes, but may get slightly more. In your case, instead of 40 bytes you got 56. Why you did not get 60 depends on the OS's memory allocation and verification mechanism. What was the symptom of it not working when e was set to 15: a program crash?
It is a runtime error, not a compilation error. The reason it fails at 15 and not 14 is that once you reach position 15 you have hit memory that has been allocated to another pointer or application. It just so happens that indexes 11, 12, 13 and 14 are contiguous memory locations that have not yet been malloc'ed.
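If you would rather have the overflow caught deterministically than silently scribble past the array, one option is std::array with the bounds-checked at() accessor; a sketch (the std::array substitution is mine, not the asker's code):

#include <array>
#include <iostream>
#include <stdexcept>

int main()
{
    std::array<int, 10> test{};  // fixed size, all elements zero-initialized
    int e = 14;
    for (int i = 0; i < e; i++)
    {
        try
        {
            test.at(i) = i;      // at() validates the index on every access
            std::cout << " " << test.at(i);
        }
        catch (const std::out_of_range&)
        {
            std::cout << "\nindex " << i << " is out of bounds\n";
            break;
        }
    }
    return 0;
}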
Let's look at the following piece of code which I unintentionally wrote:
#include <iostream>
using namespace std;

void test() {
    for (int i = 1; i <= 5; ++i) {
        float newNum;   // uninitialized: holds an indeterminate value
        newNum += i;    // reads that indeterminate value: undefined behaviour
        cout << newNum << " ";
    }
}
Now, this is what happened in my head:
I had always thought that float newNum would create a new variable newNum with a brand-new value on each iteration, since the line is inside the loop. And since float newNum doesn't throw a compile error, C++ must be assigning some default value (hmm, must be 0). I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me understand what's wrong with my expectation that float newNum would create a new variable for each iteration.
By the way, in Java this piece of code won't compile because newNum is not initialized, and that's probably better for me, since I would know I need to set it to 0 to get the expected output.
Since newNum is not initialized explicitly, it will have a random value (determined by the garbage data contained in the memory block it is allocated to), at least on the first iteration.
On subsequent iterations it may have its earlier values reused (the compiler may allocate it to the same memory location each time; this is entirely at the compiler's discretion). Judging from the output, this is what actually happened here: on the first iteration newNum held the value 0 (by pure chance), then 1, 3, 6 and 10, respectively.
So to get the desired output, initialize the variable explicitly inside the loop:
float newNum = 0.0;
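Applied to the original function, the fix looks like this and reliably prints the expected "1 2 3 4 5":

#include <iostream>
using namespace std;

void test() {
    for (int i = 1; i <= 5; ++i) {
        float newNum = 0.0f;   // fresh, explicitly zeroed variable each iteration
        newNum += i;
        cout << newNum << " ";
    }
}

int main() {
    test();
}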
C++ must be assigning some default value (huhm, must be 0)
This is the mistake in your assumptions. C++ doesn't attempt to assign default values, you must explicitly initialise everything.
Most likely it will be assigned the same location in memory each time around the loop, and so (in this simple case) newNum will probably seem to persist from one iteration to the next.
In a more complicated scenario the memory assigned to newNum would be in an essentially random state and you could expect weird behaviour.
http://www.cplusplus.com/doc/tutorial/variables/
The float you are creating is not initialised at all. Looks like you got lucky and it turned out to be zero on the first pass, though it could have had any value in it.
In each iteration of the loop a new float is created, but it occupies the same bit of memory as the last one, so it ends up with the old value you had.
To get the effect you wanted, you will need to initialise the float on each pass.
float newNum = 0.0;
The mistake in the thoughts you've expressed is "C++ must be assigning some default value". It will not; newNum contains garbage.
You are using an uninitialized automatic stack variable. In each loop iteration it is located at the same place on the stack, so even though it has an indeterminate value, in your case it holds the value from the previous iteration.
Also beware that on the first iteration it could potentially have any value, not only 0.0.
You might have got your expected answer in a debug build but not in release, as debug builds sometimes initialise variables to 0 for you. I think this is undefined behaviour: C++ doesn't auto-initialise variables for you, so every time around the loop it creates a new variable, but that variable keeps landing in the same memory that was just released, and nothing scrubs out the previous value. As other people have said, you could have ended up with complete nonsense printing out.
It is not a compile error to use an uninitialised variable, but there should usually be a warning about it. It's always good to turn warnings on and try to remove all of them, in case something nastier is hidden amongst them.
Using a garbage value in your code invokes undefined behaviour in C++. Undefined behaviour means anything can happen, i.e. the behaviour of the code is not defined.
float newNum; //uninitialized (may contain some garbage value)
newNum +=i; // same as newNum=newNum+i
^^^^^
Whoa!!
So better try this:
float newNum = 0; // initialize the variable
for (int i = 1; i <= 5; ++i) {
    newNum += i;
    cout << newNum << " ";
}
C++ must be assigning some default value (huhm, must be 0).
C++ doesn't initialize things that have no default constructor (a debug build may set the memory to something like 0xCCCCCCCC, though), because, like any proper tool, the compiler "thinks" that if you haven't provided initialization then that is what you wanted. float doesn't have a default constructor, so its value is unknown.
I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me know what's wrong with my expectation that float newNum would create a new variable for each iteration?
A variable is a block of memory. In this case the variable is allocated on the stack. You didn't initialize it, and on each iteration it just happens to be placed at the same memory address, which is why it still holds the previous value. Of course, you shouldn't rely on such behaviour. If you want a value to persist across iterations, declare it outside the loop.
Btw, in Java, this piece of code won't compile due to newNum not initialized
BTW, in C++ a normal compiler would give you a warning that the variable is not initialized (example: "warning C4700: uninitialized local variable 'f' used"). And in a debug build you would get a CRT debug error (example: "Run-Time Check Failure #3 - The variable 'f' is being used without being initialized").
and that's probably better
Negative. You DO need uninitialized variables from time to time (normally in order to initialize them without the "standard" assignment operator, e.g. by passing them into a function by pointer/reference), and forcing me to initialize every one of them would be a waste of my time.
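And if the running sums (1 3 6 10 15) are what you actually want, the reliable way to get them is what this answer suggests: declare and initialize the variable outside the loop. A minimal sketch:

#include <iostream>
using namespace std;

int main() {
    float newNum = 0.0f;        // initialized once, persists across iterations
    for (int i = 1; i <= 5; ++i) {
        newNum += i;
        cout << newNum << " ";  // now guaranteed to print: 1 3 6 10 15
    }
    cout << endl;
}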
I haven't done much C++ recently, but:
The float is newly allocated for each iteration. I'm a bit surprised about the initial zero value, though. The thing is, after each iteration the float goes out of scope; re-entering the loop scope then reallocates the float's memory, and this will often return the same memory block that was just freed.
(I'm waiting for any bashing :-P)