C++11: Why does cout print large integers from a boolean array?

#include <iostream>
using namespace std;
int main() {
    bool *a = new bool[10];
    cout << sizeof(bool) << endl;
    cout << sizeof(a[0]) << endl;
    for (int i = 0; i < 10; i++) {
        cout << a[i] << " ";
    }
    delete[] a;
}
The above code outputs:
1
1
112 104 151 0 0 0 0 0 88 1
The last line should contain garbage values, but why are they not all 0 or 1? The same thing happens for a stack-allocated array.
Solved: I forgot that sizeof counts bytes, not bits as I thought.

You have an array of default-initialized bools. Default-initialization of primitive types entails no initialization, so they all have indeterminate values.
You can zero-initialize them by providing a pair of parentheses:
bool *a = new bool[10]();
Booleans are 1-byte integral types, so the reason you're seeing this output is probably that this is the data on the stack at that moment, viewed one byte at a time. Notice how they are all values no greater than 255 (the largest number an unsigned 1-byte integer can hold).
OTOH, printing out an indeterminate value is Undefined Behavior, so there really is no logic to consider in this program.

sizeof(bool) on your machine returns 1.
That's 1 byte, not 1 bit, so the values you show can certainly be present.

What you are seeing is uninitialized values; different compilers generate different code. With GCC I see everything as 0; on Windows I see junk values.
Generally, char is the smallest byte-addressable unit; even though a bool only holds a 1/0 value, memory-access-wise it is a char. Thus you will never see a junk value greater than 255.
The following initialization (memset) fixes things for you:
#include <iostream>
#include <cstring>   // for memset
using namespace std;
int main() {
    bool* a = new bool[10];
    memset(a, 0, 10 * sizeof(bool));
    cout << sizeof(bool) << endl;
    cout << sizeof(a[0]) << endl;
    for (int i = 0; i < 10; ++i)
    {
        bool b = a[i];
        cout << b << " ";
    }
    delete[] a;   // don't leak the allocation
    return 0;
}

Formally speaking, as pointed out in this answer, reading any uninitialized variable is undefined behaviour, which basically means everything is possible.
More practically, the memory used by those bools is filled with what you called garbage. ostream's operator<< inserts booleans via std::num_put::put(), which, if boolalpha is not set, converts the value present to an int and outputs the result.

I do not know why you put a * sign before the variable a. Is it a pointer holding the address of the array's first element?

Related

std::array and operator []

The reference for std::array::operator[] states:
Returns a reference to the element at specified location pos. No
bounds checking is performed.
I wrote this small program to check the behavior of operator[]:
#include <array>
#include <cstddef>
#include <iostream>
using std::array;
using std::size_t;
using std::cout;
using std::endl;
#define MAX_SZ 5
int main (void){
    array<int,MAX_SZ> a;
    size_t idx = MAX_SZ - 1;
    while(idx < MAX_SZ){
        cout << idx << ":" << a[idx] << endl;
        --idx;
    }
    cout << idx << ":" << a[idx] << endl;
    return 0;
}
When compiled and run, the above program produces the following output:
4:-13104
3:0
2:-12816
1:1
0:-2144863424
18446744073709551615:0
Based on the above output, my question is:
Why doesn't above code give a segmentation fault error, when the value of idx assumes the value 18446744073709551615?
operator[] is not required to do bounds checks. Thus, this is an out-of-bounds access. Out-of-bounds access causes undefined behavior, meaning anything could happen. I really do mean anything; for example, it could order pizza.
As already said you face undefined behaviour.
Nevertheless, if you like to have a boundary check you can use .at() instead of the operator []. This will be a bit slower, since it performs the check for every access to the array.
Also a memory checker such as valgrind is able to find errors like this at runtime of the program.
>> Why doesn't the above code give a segmentation fault error, when the value of idx assumes the value 18446744073709551615?
Because this large number is 2**64 - 1, that is 2 raised to the 64th power, minus 1.
And as far as the array indexing logic is concerned, this is exactly the same thing as -1, because the value 2**64 is beyond what 64-bit hardware can represent. So you are accessing (illegally) a[-1], and it happens to contain 0 on your machine.
In your memory, this is the word just before a[0]. It is memory in your stack, which you are perfectly allowed by the hardware to access, so no segmentation fault is expected to occur.
Your while loop uses a size_t index, which is essentially an unsigned 64-bit quantity. So when the index is decremented and goes from 0 to -1, the -1 is interpreted by the loop control test as 18446744073709551615 (a bit pattern consisting of 64 bits all set to 1), which is way bigger than MAX_SZ = 5, so the test fails and the while loop stops there.
If you have the slightest doubt about that, you can check by controlling the memory values around array a[]. To do this, you can "sandwich" array a between 2 smaller arrays, say magica and magicb, which you properly initialize. Like this:
#include <array>
#include <cstddef>
#include <iostream>
using std::array;
using std::size_t;
using std::cout;
using std::endl;
#define MAX_SZ 5
int main (void){
    array<int,2> magica;
    array<int,MAX_SZ> a;
    size_t idx = MAX_SZ - 1;
    array<int,2> magicb;
    magica[0] = 111222333;
    magica[1] = 111222334;
    magicb[0] = 111222335;
    magicb[1] = 111222336;
    cout << "magicb[1] : " << magicb[1] << endl;
    while (idx < MAX_SZ) {
        cout << idx << ":" << a[idx] << endl;
        --idx;
    }
    cout << idx << ":" << a[idx] << endl;
    return 0;
}
My machine is a x86-based one, so its stack grows towards numerically lower memory addresses. Array magicb is defined after array a in the source code order, so it is allocated last on the stack, so it has a numerically lower address than array a.
Hence, the memory layout is: magicb[0], magicb[1], a[0], ... , a[4], magica[0], magica[1]. So you expect the hardware to give you magicb[1] when you ask for a[-1].
This is indeed what happens:
magicb[1] : 111222336
4:607440832
3:0
2:4199469
1:0
0:2
18446744073709551615:111222336
As other people have pointed out, the C++ language rules do not define what you are expected to get from negative array indexes, and hence the people who wrote the compiler were at liberty to return whatever value suited them as a[-1]. Their sole concern was probably to write machine code that does not decrease the performance of well-behaved source code.

Size of byte when accessed via pointer

I'm working on an Arduino project. I'm trying to pass a byte pointer to a function, and let that function calculate the size of the data that the pointer refers to. But when I let the pointer refer to a byte, sizeof() returns 2. I wrote the following snippet to try to debug:
byte b;
byte *byteptr;
byteptr = &b;
print("sizeof(b): ");
println(sizeof(b));
print("sizeof(*byteptr) pointing to byte: ");
println(sizeof(*byteptr));
print("sizeof(byteptr) pointing to byte: ");
println(sizeof(byteptr));
the printed result is:
sizeof(b): 1
sizeof(*byteptr) pointing to byte: 1
sizeof(byteptr) pointing to byte: 2
So the size of a byte is 1, but via the pointer it's 2??
It appears that on Arduino, pointers are 16-bit. I believe your confusion stems from what * means in this context.
sizeof(*byteptr) is equivalent to sizeof(byte). The * does not indicate a pointer type; it indicates dereferencing the pointer stored in byteptr. Ergo, it is 1 byte, which you would expect from the type byte.
sizeof(byteptr) does not dereference the pointer, and as such, is the size of the pointer itself, which on this system seems to be 2 bytes/16 bits.
Consider the following:
#include <iostream>
using namespace std;
int main()
{
    char a = 1;
    char* aP = &a;
    cout << "sizeof(char): " << sizeof(char) << endl;
    cout << "sizeof(char*): " << sizeof(char*) << endl;
    cout << "sizeof(a): " << sizeof(a) << endl;
    cout << "sizeof(aP): " << sizeof(aP) << endl;
    cout << "sizeof(*aP): " << sizeof(*aP) << endl;
}
Output (on a 64 bit OS/compiler):
sizeof(char): 1
sizeof(char*): 8
sizeof(a): 1
sizeof(aP): 8
sizeof(*aP): 1
@Maseb I think you've gotten a good discussion of the differences between the size of a dereferenced pointer and the size of the pointer itself. I'll just add that sizeof(byte_pointer) must be large enough that every address in the memory space where a byte value could potentially be stored fits into the pointer. For example, if there are about 32,000 bytes of storage on your Arduino, then a pointer may need to hold an address near 32,000. Since 2^15 is 32,768, you need 15 bits to give each memory location a unique address; pointer sizes are rounded up to whole bytes, so your Arduino uses a 16-bit address space for pointers and sizeof(byte_pointer) is 2 bytes, or 16 bits.
With that said, I'll go ahead and answer your other question too. If you need to pass an array and its size, just create your own struct that includes both of those data elements. Then you can pass a pointer to this templated struct, which includes the size (this is the basic idea behind the C++ std::array container).
I've written the short code sample below to demonstrate how to create your own template for an array with a size element and then use that size element to iterate over the elements.
template<int N>
struct My_Array{
    int size = N;
    int elem[N];
};
//create the pointer to the struct
My_Array<3>* ma3 = new My_Array<3>;
void setup() {
    //now fill the array elements
    for(int i=0; i < ma3->size; i++) {
        ma3->elem[i] = i;
    }
    Serial.begin(9600);
    //now you can use the size value to iterate over the elements
    Serial.print("ma3 is this big: ");
    Serial.println(ma3->size);
    Serial.println("The array values are:");
    Serial.print("\t[");
    for(int i=0; i<ma3->size; i++) {
        Serial.print(ma3->elem[i]);
        if(i < ma3->size-1) Serial.print(", ");
    }
    Serial.println("]");
}
void loop() {
    while(true) { /* do nothing */ }
}

Why do I need to initialize an int variable to 0?

I just made this program, which asks you to enter a number between 5 and 10 and then counts the sum of the first that many numbers. Here is the code:
#include <iostream>
#include <cstdlib>
using namespace std;
int main()
{
    int a, i, c;
    cout << "Enter the number between 5 and 10" << endl;
    cin >> a;
    if (a < 5 || a > 10)
    {
        cout << "Wrong number" << endl;
        system("PAUSE");
        return 0;
    }
    for (i = 1; i <= a; i++)
    {
        c = c + i;
    }
    cout << "The sum of the first " << a << " numbers are " << c << endl;
    system("PAUSE");
    return 0;
}
If I enter the number 5 it should display
The sum of the first 5 numbers are 15
but it displays
The sum of the first 5 numbers are 2293687
But when I set c to 0, it works correctly.
So what is the difference?
Because C++ doesn't automatically set it to zero for you. So, you should initialize it yourself:
int c = 0;
An uninitialized variable holds an arbitrary value such as 2293687, -21, 99999, ... (if reading it doesn't invoke undefined behavior outright).
Also, static variables will be set to their default value, in this case 0.
If you don't set c to 0, it can take any value (technically, an indeterminate value). If you then do this
c = c + i;
then you are adding the value of i to something that could be anything. Technically, this is undefined behaviour. What happens in practice is that you cannot rely on the result of that calculation.
In C++, non-static local variables of built-in type have no initialization performed when "default initialized". In order to zero-initialize an int, you need to be explicit:
int i = 0;
or you can use value initialization:
int i{};
int j = int();
Non-static local variables are, by default, uninitialized; their initial values are indeterminate.
On another compiler, you might get the right answer, another wrong answer, or a different answer each time.
C/C++ don't do extra work (initialization to zero involves at least an instruction or two) that you didn't ask them to do.
The sum of the first 5 numbers are 2293687
This is because, without initializing c, you are getting whatever value was previously stored at that location (a garbage value). This makes your program's behavior undefined. You must initialize c before using it in your program.
int c= 0;
Because when you do:
int a,i,c;
you declare c without saying what you want it initialized to. The rules here are somewhat complex, but what it boils down to is two things:
For integral types, if you don't specify an initializer, the variable's value is indeterminate
When you try to read an uninitialized variable, you evoke Undefined Behavior

For loop condition is a variable without comparison

int a[5] = {1,2,3,4,5};
for (int i = 0; a[i]; i++)
{
    cout << i;
}
This code produces the output of "0 1 2 3 4".
What does it compare a[i] against, and how does it know to stop at the end of the array and not go over?
Your code causes undefined behaviour. The expression a[i] will evaluate as true if non-zero and as false if zero. When you run it, you're getting lucky that there is a 0 word immediately following your array in memory, so the loop stops.
It's reading past the array and the memory there just happens to be zero, by sheer luck. Reading past the end of that array is undefined behavior and the outcome might change at any time, so never rely on it.
You can think of a[i] as being compared against 0: it simply fetches the value stored at that location in memory, and if 0 is the value that lives there, the loop exits; if it is any other number, the loop continues.
Suppose an int is 4 bytes on the system. a is given an address; let's pretend it is 0xFF00. When we evaluate a[0] we retrieve the value stored at memory 0xFF00; a[1] would retrieve the value at 0xFF04, and so on. Your program only assigns values to the first 5 locations, so when we read beyond them the memory could hold any bit pattern an int can hold. If it happens to be 0 then the loop exits; if it happens to be something else, the loop continues.
You could adjust your program like so to see it better:
#include <iostream>
using namespace std;
int main() {
    int a[5] = {1,2,3,4,5};
    int i;
    for (i = 0; a[i]; i++)
    {
        cout << "At memory address: " << &a[i]
             << " lives the value: " << a[i] << endl;
    }
    cout << "At memory address: " << &a[i]
         << " lives the value: " << a[i]
         << ", and this is why the loop ended." << endl;
    return 0;
}

Undefined behaviour with usage of union in C++?

I'm working with unions in C++; the following is a code snippet:
#include <iostream>
using namespace std;
typedef union myunion
{
    double PI;
    int B;
} MYUNION;
int main()
{
    MYUNION numbers;
    numbers.PI = 3;
    numbers.B = 50;
    cout << " numbers.PI :" << numbers.PI << endl;
    if (numbers.PI == 3.0)
    {
        cout << "True";
        if (numbers.B == 50)
        {
            cout << " numbers.PI :" << numbers.PI << endl;
            cout << " numbers.B :" << numbers.B << endl;
        }
    }
    return 0;
}
Output is:
numbers.PI :3
Even though the value of numbers.PI was set to 3, the first "if" condition evaluates to false.
What is the reason for this behavior?
The reason is that there's no reason.
Your code invokes undefined behavior because you are setting the B member of the union:
numbers.B = 50;
but immediately after setting it, you read out the other member, PI:
cout <<" numbers.PI :" << numbers.PI << endl;
Maybe you are confusing unions and structures - unless the floating-point number 3 and the integer 50 have the very same bit representation on your architecture (which is very unlikely), the behavior you expect from your program would be reasonable only if you used a struct instead.
(union members reside at the same place in memory - setting one overwrites the other too. This is not true for a struct, which has each of its members stored at a different memory location.)
Remember that all members of a union share the same memory. When you assign to B, you change the value of PI as well.
To be safe, you should only "read" from the last field you "write" to.
It seems to me that what you want is a structure.
You're getting undefined behavior, but here's what is happening behind the scenes:
You're using a little-endian machine with sizeof(int) < sizeof(double), such as an x86. Almost certainly, the machine uses IEEE 754 format for floats/double (pretty much all machines do these days).
When you write into the B field, it overwrites the low order bits of the double in PI. So
when you initially store 3.0 in PI, that sets it to something like 0x4008000000000000. Then when you store 50 in B that changes PI to 0x4008000000000032, which happens to be 3.00000000000002220446049250313080847263336181640625. So that's not equal to 3.0, but when you print it with default precision, it rounds it to 3.0