C++ dynamic array of ints sometimes causes crash - c++

I wrote a simple code as follows:
#include <iostream>
using namespace std;
void show(const int a[], unsigned elements);
int main()
{
    show(new int[]{1, 2, 3, 45}, 4); //does not work
}
void show(const int a[], unsigned elements)
{
    cout << "{ ";
    for (int i = 0; i < elements; i++)
    {
        cout << a[i];
        if (i != elements - 1)
            cout << ",";
        cout << " ";
    }
    cout << "}";
}
It should just output { 1, 2, 3, 45 }. If I include a size in the brackets
show(new int[4]{1, 2, 3, 45}, 4);
then it works. So naturally I would assume that if I write the new this way I have to specify the size (although I thought that giving it an initialization list would imply the size). But the odd thing is that when I set a breakpoint at the show function call and step through it in the debugger, the program outputs everything correctly and terminates at the end of main like it should. If I don't use the debugger, it either crashes after outputting a '{' or it outputs the whole thing "{ 1, 2, 3, 45 }" followed by an assertion failure: "Program: ... Expression: _CrtIsValidHeapPointer(pUserData) ..."
I'm curious to know why it is behaving this way. Also, I am using Visual Studio on Windows 8.
EDIT: I am using namespace std. Please don't comment about using namespaces or about how to better write this code. I'm solely interested in the cause of this issue.

EDIT Responding to additional question in comment.
To be quick, yes it would "still" be a pointer, and yes it compiles with clang and gcc when you add the 4.
There are a couple things going on, however, and my initial answer was a simplification. The problem is that your expression is not well-formed to begin with, so it's not clear what it should evaluate to or what the type should be. Consider
If type is an array type, all dimensions other than the first must be specified as a positive integral constant expression (until C++14) or a converted constant expression of type std::size_t (since C++14), but the first dimension may be any expression convertible to std::size_t.
Source: http://en.cppreference.com/w/cpp/language/new
As it says, either way there must be an expression in the brackets. This makes it difficult to say whether the expression would still evaluate to a pointer. A well-formed new expression would indeed evaluate to a pointer, no matter how many dimensions it has, even if it has zero. When I say pointer here, I strictly mean the representation, not the type.
The point is that the type, at least "inside" new, is different depending on how many dimensions you have. So, whether you do
new int
new int[6]
new int[12][14]
the representation is the same (a pointer), but the type new sees is different in each case. The compiler is able to respond to the different types in new (think by analogy with function overloading). In particular, when the type is an array type, it is possible to initialize the new memory with the braced initializer list containing multiple elements.
My best guess is, since VS was accepting the brackets without an expression, it was allocating memory for either a single int or int[0]. In the former case, it was wrongly allowing you to brace initialize it as if it was an array type, and in the latter case the allocated memory was not enough anyway. Your main then wrote over a heap guard that is there to catch this sort of thing in debug mode. When this was checked at the end of main or at program termination, you saw the symptoms. The flakiness in the output was either due to different heap layouts or due to buffering in the output stream.
Original answer
Your new expression, if it was well-formed, would have scalar type, meaning that the result is a "single value". That single value is a pointer to an integer, specifically to the one at the beginning of the array you are trying to create. That is how "dynamic arrays" are represented in C++. The type system does not "know" their size.
You are trying to initialize this single pointer value with an initializer list of 4 values. This shouldn't work. I am not sure that this should compile at all. It certainly didn't compile with clang or gcc, and I'm surprised that it worked in Visual Studio.

Related

What's the meaning of -2[array]

I recently stumbled upon following code
int array[] = {10, 20, 30};
cout << -2[array];
I understand that array is a pointer to the first element of the array, but then what does [pointer] mean? And then we write -2 in front of it, which looks very alien to me.
It is the same as if you'd write
cout << -(2[array]);
which is the same as
cout << -(array[2]);
In C++ operator [] on an array simply offsets the address of the array by the number specified in the brackets. As with any addition, you can swap operands and the result will remain the same.
For example -0[array] will give you -10 in this case.
There are two ways to access an array element via a pointer offset. One is the common
int array[] = {10, 20, 30};
cout << -array[2]; // prints -30
and one is the unusual one that you posted. Both versions are equivalent.
Note that as the unary minus operator has a lower precedence compared to the subscript operator, -2[array] does not involve a negative index, but instead is the same as -(2[array]).
In C and C++ the array name and the index can be swapped without changing the meaning, i.e. -2[array] and -array[2] are the same. Both will compile and do the same thing.
However, if you try this in a language like C# or Java you will get a compiler error saying Cannot apply indexing with [] to an expression of type 'int' or something similar. This is a good example to understand how code works if the language supports pointers vs if it doesn't.
Note, as pointed out in the comment, negation operator has lower precedence over array index operator so it will compute to -array[2] instead of array[-2]

Using std::sort with pointers to integer variables produces unexpected output

I decided to compile and run this piece of code (out of curiosity) and the G++ compiler successfully compiled the program. I was expecting to see a compile error or a runtime error, or at least the values of a and b swapped (as 5 > 1), since the std::sort() function is being called with two pointers to integers.
(Please note that I know this is not a good practice and I was basically just playing with pointers)
#include <iostream>
#include <algorithm>
int main() {
    int a{5};
    int b{4};
    int c{1};
    int* aptr = &a;
    int* bptr = &b;
    std::sort(aptr, bptr);
    std::cout << a << ' ' << b << ' ' << c << '\n';
    return 0;
}
However, upon executing the program, the output I got was this:
5 4 1
My question is, how did C++ allow this call to the std::sort() function? And how did it not end up actually sorting everything between the memory addresses of a and b (potentially including even garbage values in memory)?
I mean, if we tried this with C-style arrays like this (std::sort(arr, arr+n)) it would successfully sort the C-style array, because arr and arr+n are basically just pointers where n is the size of the array and arr is the pointer to the first element.
(I'm sorry if this question sounds stupid. I'm still learning C++.)
Your program has undefined behavior: you passed pointers that do not form a valid range to a standard algorithm.
Any behaviour whatsoever by the program is conforming to the C++ standard.
Compilers optimize around the fact that pointers to unrelated objects are incomparable and their difference is undefined. A sort here would trip over so much UB that the optimizer could eliminate branches like crazy: any branch that leads to UB can be removed and replaced with the alternative branch, because whatever the alternative does is a legal outcome of UB.
Good C++ coding style thus focuses on avoiding UB and IL-NDR code.
C++ accepts your code because it is syntactically valid and the types match. But it doesn't work, because sort(it1, it2) expects it1 to be the starting position of an array and it2 to be the end position (one past the last element to sort) of the same array. You have passed pointers to two unrelated objects, which is undefined behavior. In practice, one of the following two situations is likely:
positionof(it1) < positionof(it2): suppose a and b sit in memory like this: 5(a), -1, -2, 10, 4(b). The sort function will then sort from a up to, but not including, b, resulting in: -2(a), -1, 5, 10, 4(b). If b happens to sit directly after a, the range contains only a and nothing visibly changes, which would match your output.
positionof(it1) > positionof(it2): the range is invalid from the start. Depending on the implementation, the sort function may do nothing, or it may run far out of bounds.

Don't we have to assign return values of the functions to variables? C/C++

I've been using C/C++ for about three years and I can't believe I've never encountered this issue before!
This following code compiles (I've just tried using gcc):
#include <iostream>
int change_i(int i) {
    int j = 8;
    return j;
}
int main() {
    int i = 10;
    change_i(10);
    std::cout << "i = " << i << std::endl;
}
And, the program prints i = 10, as you might expect.
My question is -- why does this compile? I would have expected an error, or at least a warning, saying there was a value returned which is unused.
Naively, I would consider this a similar case to when you accidentally forget the return call in a non-void function. I understand it's different and I can see why there's nothing inherently wrong with this code, but it seems dangerous. I've just spotted a similar error in some very old code of mine, representing a bug which goes back a long time. I obviously meant to do:
i = change_i(10);
But forgot, so it was never changed (I know this example is silly, the exact code is much more complicated). Any thoughts would be much appreciated!
It compiles because calling a function and ignoring the return result is very common. In fact, the last line of main does so too.
std::cout << "i = " << i << std::endl;
is actually short for:
operator<<(std::cout, "i = ").operator<<(i).operator<<(std::endl);
... and you are not using the value returned from the final operator<<.
Some static checkers have options to warn when function returns are ignored (and then options to annotate a function whose returns are often ignored). Gcc has an option to mark a function as requiring the return value be used (__attribute__((warn_unused_result))) - but it only works if the return type doesn't have a destructor :-(.
Ignoring the return value of a function is perfectly valid. Take this for example:
printf("hello\n");
We're ignoring the return value of printf here, which returns the number of characters printed. In most cases, you don't care how many characters are printed. If compilers warned about this, everyone's code would show tons of warnings.
This is actually a specific case of ignoring the value of an expression, where in this case the value of the expression is the return value of a function.
Similarly, if you do this:
i++;
You have an expression whose value is discarded (i.e. the value of i before being incremented), however the ++ operator still increments the variable.
An assignment is also an expression:
i = j = k;
Here, you have two assignment expressions. One is j = k, whose value is the value of k (which was just assigned to j). This value is then used as the right-hand side of another assignment, to i. The value of the i = (j = k) expression is then discarded.
This is very different from not returning a value from a non-void function. In that case, the value returned by the function is undefined, and attempting to use that value results in undefined behavior.
There is nothing undefined about ignoring the value of an expression.
The short reason it is allowed is because that's what the standard specifies.
The statement
change_i(10);
discards the value returned by change_i().
The longer reason is that most expressions both have an effect and produce a result. So
i = change_i(10);
will set i to be 8, but the assignment expression itself also has a result of 8. This is why (if j is of type int)
j = i = change_i(10);
will cause both j and i to have the value of 8. This sort of logic can continue indefinitely - which is why expressions can be chained, such as k = i = j = 10. So - from a language perspective - it does not make sense to require that a value returned by a function is assigned to a variable.
If you want to explicitly discard the result of a function call, it is possible to do
(void)change_i(10);
and a statement like
j = (void)change_i(10);
will not compile, typically due to a mismatch of types (an int cannot be assigned the value of something of type void).
All that said, several compilers (and static code analysers) can actually be configured to give a warning if the caller does not use a value returned by a function. Such warnings are turned off by default - so it is necessary to compile with appropriate settings (e.g. command line options).
I've been using C/C++ for about three years
I can suppose that during these three years you used standard C function printf. For example
#include <stdio.h>
int main( void )
{
printf( "Hello World!\n" );
}
The function has return type that differs from void. However I am sure that in most cases you did not use the return value of the function.:)
If the compiler were required to issue an error whenever the return value of a function is not used, then code like the above would not compile. The compiler does not have access to the source code of the function and cannot determine whether the function has a side effect.:)
Consider some other standard C functions, the string functions.
For example function strcpy is declared like
char * strcpy( char *destination, const char *source );
If you have for example the following character arrays
char source[] = "Hello World!";
char destination[sizeof( source )];
then the function usually is called like
strcpy( destination, source );
There is no point in using its return value when you just need to copy a string. Moreover, for the example shown you may not even write
destination = strcpy( destination, source );
The compiler will issue an error.
So as you can see there is sense to ignore sometimes return values of functions.
For your own example the compiler could issue a message that the function does not have a side effect, so the call is redundant. In any case it should issue a message that the function parameter is not used.:)
Take into account that sometimes the compiler does not see a function definition that is present in some other compilation unit or in a library. So the compiler is unable to determine whether a function has a side effect.
In most cases compilers deal with function declarations. Sometimes the function definitions are not available for compilers in C and C++.

Array of structs on heap not properly initialized

I thought I knew how to deal with memory management in c++ but this confused me:
Consider the following code:
#include <iostream>
struct A {
    int i;
};
int main(int argc, char* argv[]) {
    A a{ 5 }; //Constructs an A object on the stack
    A* b = new A{ 7 }; //Constructs an A object on the heap and stores a pointer to it in b
    A* c = new A[] { //Construct an array of A objects on the heap and stores a pointer to it in c
        { 3 },
        { 4 },
        { 5 },
        { 6 }
    };
    std::cout << "a: " << a.i << "\n"; //Prints 'a: 5'
    std::cout << "b: " << b->i << "\n"; //Prints 'b: 7'
    std::cout << "c: " << c[0].i << "; " << c[1].i << "; " << c[2].i << "; " << c[3].i << "\n";
    //Prints 'c: -33686019; -1414812757; -1414812757; -1414812757'
    delete b;
    delete[] c;
    return 0;
}
I don't understand why the last print-out of c prints those weird numbers. If I add a constructor to A like so:
struct A {
A(int i) : i{i} {}
int i;
};
Then the output of the last print-out becomes:
'c: 3; 4; 5; 6'
as it should be. But now delete[] c; will give me a runtime error (not an exception it seems) that says MyGame.exe has triggered a breakpoint. (I'm working in VS2013).
Furthermore, if I change the line A* c = new A[] { to A* c = new A[4] { the error disappears and everything works as expected.
So my questions are:
Why the weird numbers? Won't the A objects in the array get properly constructed somehow if I don't define a constructor?
And why do I need to specify the array size explicitly even though it will compile and link just fine without? Initializing arrays on the stack this way does not give me a runtime error (I tested it to be sure).
This is an error:
A* c = new A[] { {3}, {4}, {5}, {6} };
You must put the dimension inside the []. With new the array dimension cannot be deduced from the initializer list.
Putting 4 in here makes your code work correctly for me.
Your compiler apparently has an "extension" that treats new A[] as new A[1].
If you compile in standard mode (with gcc or clang, -std=c++14 -pedantic), which is always a good idea, the compiler will tell you about things like this. Treat warnings as errors unless you are really sure they are not errors :)
Why the weird numbers?
Because no memory was allocated to back them. The pointer is pointing at Crom knows what. That structure should not compile.
Won't the A objects in the array get properly constructed somehow if I don't define a constructor?
Without a constructor all of the members will be initialized to their defaults. ints and most Plain Old Datatypes have no defined default value. In a typical implementation they get whatever value happens to already be in their allocated memory block. If a member object is of a type that doesn't have a default constructor, and the compiler is unable to generate one, you get a compiler error.
And why do I need to specify the array size explicitly even though it will compile and link just fine without?
It shouldn't compile, mismatch between the size of the array (unspecified and an error unto itself) and the number of elements in the initializer list, so the compiler has a bug. Linker is not involved at this point.
Initializing arrays on the stack this way does not give me a runtime error (I tested it to be sure).
In the static version the compiler can count the number of elements in initialization list. Why the dynamic version with new can't, gotta say I have no good answer. You'd think it would be a simple bit of counting that initializer list, so there's something deeper preventing it. The folk who debated and then approved the standard either never considered allocating a dynamic array that way or couldn't find a good way to make it work in all cases. Same reason variable length arrays still aren't in the standard.
"And why do I need to specify the array size explicitly even though it will compile and link just fine without? It shouldn't compile, ...." To be clear: If I add the constructor to A and run it, it runs just fine up until the delete[] statement. Only then it crashes but cout << c[0] works as 'expected'
This is because you are unlucky. That constructor is writing into memory that your program owns, but didn't allocate to c. Printing those values works, but whatever was supposed to be in memory at that point has been overwritten. This will probably cause your program to crash sooner or later. This time it's later.
My suspicion, and this is guesswork because you've ventured far into the realm of the undefined, is that the crash on delete[] happens because
A* c = new A[]
Allocated A[1] and assigned it to c rather than failing to compile. c has one A to work with. The initializer list tries to stuff in 4 and writes 3 into c[0] and the 4,5, and 6 over the heap control information that delete needs to put the data back. All looks good until delete tries to use that overwritten information.
Oh and this:"Without a constructor all of the members will be initialized to their defaults. int's and most Plain Old Datatypes have no defined default value.". For structs a user defined ctor seems optional because you can initialize a struct by providing arguments corresponding to its data fields.
A struct has a much more permissive attitude toward data encapsulation than a class and defaults to public access where a class defaults to private. I've never tried it, but I'm betting that you can use the same struct trick to init all the public members of a class.
OK. Just tried it. Works in GCC 4.8.1. Not going to make that claim in general without looking it up in the standard. Got to get a copy of it.

Something about a completely empty class

#include <iostream>
using namespace std;
class Empty{
    char omg[0];
};
int main()
{
    Empty em1, em2;
    Empty set[100];
    cout << sizeof(Empty) << " " << sizeof(em1) << " " << sizeof(em2) << endl;
    cout << (long*)&em1 << " " << (long*)&em2 << endl;
    cout << "total numbers of element is: " << sizeof(set)/sizeof(*set) << endl;
    return 0;
}
Its output is:
0 0 0
0xbff36ad0 0xbff36ac8
total numbers of element is: 4
The results are so surprising.
As shown above, Empty is a class, the size of it and its objects are all 0, why?
My guess is this: an empty class's size is 1, and when the class is not empty, its size is decided by its members. But here the member is special: it is an array of length zero, whose size is 0, so the size of the class and of its objects is 0.
That's just my guess. When the program runs, we can see that the two objects both have addresses, and the addresses are different.
Here is my question: if objects of size 0 can be implemented, why does the C++ standard state that empty objects have sizeof() = 1? It is "to ensure that the addresses of two different objects will be different" (see "Why is the size of an empty class not zero?"). But here we do get different addresses in the output. How does this happen?
Further more, no matter what the size of the array set is, the last line output is always 4, why?
Thanks :)
PS: I run this program on MacOS, and the compiler is Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
I'll take a stab since no one more experienced has:
As shown above, Empty is a class, the size of it and its objects are all 0, why?
Zero-sized arrays are prohibited by the standard, therefore as far as the standard is concerned sizeof(Empty) is a meaningless expression, you are already in the realm of undefined behaviour.
Here is my question: if object of 0 size can be implemented, [...] Why is the size of an empty class not zero? , but now, we do have different address as the output,how does this happen?
As above, an object of size 0 cannot exist in a valid standard c++ program (with the exception of base class subobjects).
Your compiler allows this as an extension to the standard, and as long as you use this extension within the scope it was intended for (i.e. as a pre-flexible array member hack) you shouldn't have any problems, although your code is not portable. Your example above however is not how zero-sized arrays are meant to be used (not to mention there are better constructs in c++ for handling these situations anyway).
Your compiler is intelligent enough to provide separate addresses for em1 and em2, but you should find that all elements of set have in fact the same address.
Further more, no matter what the size of the array set is, the last line output is always 4, why?
Since your compiler considers sizeof(Empty) and arrays of Empty to be zero, you are dividing by zero, which is undefined behavior. You might find your program crashes if you disable optimizations, with GCC for instance your program crashes with -O0 but not with -O1.