C++ initializing the dynamic array elements

const size_t size = 5;
int *i = new int[size]();
for (int* k = i; k != i + size; ++k)
{
cout << *k << endl;
}
Even though I have value-initialized the dynamic array elements with the () initializer, the output I get is
135368
0
0
0
0
Not sure why the first array element is initialized to 135368.
Any thoughts?

My first thought is: "NO...just say NO!"
Do you have some really, truly, unbelievably good reason not to use vector?
std::vector<int> i(5, 0);
Edit: Of course, if you want it initialized to zeros, that'll happen by default...
Edit2: As mentioned, what you're asking for is value initialization -- but value initialization was only added in C++03, and probably doesn't work quite right with some compilers, especially older ones.

I agree with litb's comment. It would appear to be a compiler bug.
Putting your code in a main function and prefixing with:
#include <iostream>
#include <ostream>
using std::cout;
using std::endl;
using std::size_t;
I got five zeros with both gcc 4.1.2 and a gcc 4.4.0 on a linux variant.
Edit:
Just because it's slightly unusual with array type: in a new expression, an initializer of () means that the dynamically allocated object(s) are value initialized. This is perfectly legal even with array new[...] expressions. It's not valid to have anything other than a pair of empty parentheses as an initializer for an array new expression, although non-empty initializers are common for non-array new expressions.
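To make that concrete, here's a minimal sketch contrasting default-initialization and value-initialization in an array new expression (variable names are mine):
#include <cstddef>
#include <iostream>
int main()
{
    const std::size_t size = 5;
    int *plain  = new int[size];    // default-initialized: element values are indeterminate
    int *zeroed = new int[size]();  // value-initialized: every element is 0
    for (std::size_t j = 0; j != size; ++j)
        std::cout << zeroed[j] << '\n';  // prints five zeros on a conforming compiler
    delete[] plain;
    delete[] zeroed;
}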

Ensuring data is properly stored in c++ array

I have a large piece of code I'm trying to integrate into an existing program. For this, I need to work with C++ 2D and 1D arrays. I'm most familiar with Python; if I tried
import numpy as np
x = np.zeros(10)
x[20] = 3
print(x[15])
both lines 3 and 4 would raise an error. In C++, this doesn't cause an error; instead, a segfault will likely occur somewhere in the code (or the computed answer will be meaningless). My question is: how can I ensure memory assignment/access in a C++ array is correct? Are there compiler options or debugging tools that would help with this? I am using g++ to compile the code.
With raw arrays, there is no way except if you keep track of the actual size yourself.
In C++, you only pay for what you use. In other words, the bounds checking is up to the developer to implement so that it won't penalize someone who does not actually need it.
A good practice is to not use raw arrays but use standard containers instead (std::array, std::vector, ...). They own their data and keep track of the size for you.
With standard containers, there is an at() member function that throws an std::out_of_range exception if you try to access an out of bounds index.
On the other hand, if you don't want to handle the exception, you still can do the bounds checking manually using the size() member function.
One valid implementation of your example may be:
std::vector<int> x(10, 0); // Vector of 10 elements initialized with value 0
Bounds checking with exception handling:
try
{
x.at(12) = 3; // at()
}
catch(std::exception & e)
{
std::cout << e.what() << std::endl;
}
Manual bounds checking:
if(12 < x.size())
x[12] = 3; // operator[]
Note: std::vector requires #include <vector> and std::array requires #include <array>.
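Putting those pieces together, a complete program might look like this (just a sketch composing the snippets above; the out-of-bounds index 12 is the same example value):
#include <iostream>
#include <stdexcept>
#include <vector>
int main()
{
    std::vector<int> x(10, 0); // Vector of 10 elements initialized with value 0
    // Checked access: at() throws std::out_of_range for a bad index.
    try
    {
        x.at(12) = 3;
    }
    catch (const std::exception& e)
    {
        std::cout << e.what() << std::endl;
    }
    // Manual check: operator[] is only reached for a valid index.
    if (12 < x.size())
        x[12] = 3;
    return 0;
}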
The compiler will warn you if you use an out-of-range literal index, as in your question:
#include <stdio.h>
int main() {
int thing[2] = {0};
printf("%d", thing[3]);
return 0;
}
test.cpp:5:15: warning: array index 3 is past the end of the array (which contains 2 elements) [-Warray-bounds]
printf("%d", thing[3]);
^ ~
test.cpp:4:2: note: array 'thing' declared here
int thing[2] = {0};
^
1 warning generated.
It will not, however, generate an error if you use a variable to index into the array, and the value of that variable is out-of-range. For example:
#include <stdio.h>
int main() {
int thing[2] = {0};
int index = 3;
printf("%d", thing[index]);
return 0;
}
The behaviour here is undefined, and the compiler won't let you know. The best you can do in these cases is put a check in place before the array access.
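For instance, such a guard might look like this (a sketch in the style of the snippet above; the element count uses the usual sizeof idiom):
#include <stdio.h>
int main() {
    int thing[2] = {0};
    int index = 3;
    // Compute the element count once and test the index before using it.
    const int count = sizeof(thing) / sizeof(thing[0]);
    if (index >= 0 && index < count)
        printf("%d", thing[index]);
    else
        printf("index %d is out of range", index);
    return 0;
}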

Why can a pointer avoid the -Warray-bounds warning

For code like the following (full demo):
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
test->ch[0] = 'a';
test->ch[1] = 'b';
test->ch[2] = 'c';
test->ch[3] = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
I need to ignore the compilation warning
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
which is raised by the gcc 8.2 compiler:
g++ -O2 -Warray-bounds=2 main.cpp
A way to avoid this warning is to use a pointer to access the four characters, like this:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
// Use pointer to avoid the warning
volatile char *ptr = test->ch;
*ptr = 'a';
*(ptr + 1) = 'b';
*(ptr + 2) = 'c';
*(ptr + 3) = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
But I cannot figure out why using a pointer instead of an array subscript works. Is it because the compiler does no bounds checking on what a pointer points to? Can anyone explain that?
Thanks.
Background:
Due to padding and alignment of the struct's memory, even though ch[1]-ch[3] in struct A are outside the declared array bounds, they do not overflow the allocated memory.
Why don't we just declare ch as ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by another script at build time. The design rule for structs in our app is that if we do not know the length of an array, we declare it with one element, place it at the end of the struct, and use another member, like int a in struct A, to record the array length.
Due to padding and alignment of the struct's memory, even though ch[1]-ch[3] in struct A are outside the declared array bounds, they do not overflow the allocated memory, so we want to ignore this warning.
C++ does not work the way you think it does. You are triggering undefined behavior, and when your code triggers undefined behavior, the C++ standard places no requirement on its behavior. A version of GCC attempted to start some video games when a certain kind of undefined behavior was encountered. Anthony Williams also knows of at least one case where a particular instance of undefined behavior caused someone's monitor to catch on fire. (C++ Concurrency in Action, page 106) Your code may appear to be working right now, in this particular situation, but that is just one possible manifestation of undefined behavior and you cannot count on it. See Undefined, unspecified and implementation-defined behavior.
The correct way to suppress this warning is to write correct C++ code with well-defined behavior. In your case, declaring ch as char ch[4]; solves the problem.
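Concretely, the fix looks like this (sized to match the accesses ch[0] through ch[3] in the question):
struct A
{
    int a;
    char ch[4]; // room for 'a', 'b', 'c' and the terminating '\0'
};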
The standard specifies this as undefined behavior in [expr.add]/4:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- Otherwise, if P points to array element i of an array object x with n elements ([dcl.array]) [footnote 78], the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n, and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
- Otherwise, the behavior is undefined.
Footnote 78: An object that is not an array element is considered to belong to a single-element array for this purpose; see [expr.unary.op]. A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n for this purpose; see [basic.compound].
I want to avoid the warning:
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
Well, it is probably better to fix the warning, not just avoid it.
The warning is actually telling you something: what you are doing is undefined behavior. Undefined behavior is really bad (it allows your program to do literally anything!) and should be fixed.
Let's look at your struct again:
struct A
{
int a;
char ch[1];
};
In C++, your array has only one element in it. The standard only guarantees array elements of 0 through N-1, where N is the size of the array:
[dcl.array]
...If the value of the constant expression is N, the array
has N elements numbered 0 to N-1...
So ch only has elements 0 through 1-1, i.e. elements 0 through 0, which is just element 0. That means accessing ch[1], ch[2], or ch[3] overruns the buffer, which is undefined behavior.
Due to padding and alignment of the struct's memory, even though ch[1]-ch[3] in struct A are outside the declared array bounds, they do not overflow the allocated memory, so we want to ignore this warning.
Umm, if you say so. The example you gave only allocated 1 A, so as far as we know, there is still only space for the 1 character. If you do allocate more than 1 A at a time in your real program, then I suppose this is possible. But that's still probably not a good thing to do. Especially since you might run into int a of the next A if you're not careful.
A way to avoid this warning is to use a pointer... But I cannot figure out why that works. Is it because the compiler does no bounds checking on what a pointer points to?
Probably. That would be my guess too. Pointers can point to anything (including destroyed data or even nothing at all!), so the compiler probably won't check them for you. The compiler may not even have a way of knowing whether the memory you point to is valid or not (or may just not care), and thus may not even have a way to warn you, much less actually warn you. Its only choice is to trust you, so I'm guessing that's why there's no warning.
Why don't we just declare ch as ch[4] in struct A to avoid this warning?
Side issue: actually std::string is probably a better choice here if you don't know how many characters you want to store in here ahead of time--assuming it's different for every instance of A. Anyway, moving on:
Why don't we just declare ch as ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by another script at build time. The design rule for structs in our app is that if we do not know the length of an array, we declare it with one element, place it at the end of the struct, and use another member, like int a in struct A, to record the array length.
I'm not sure I understand your design principle completely, but it sounds like std::vector might be a better option. Then, size is kept track of automatically by the std::vector, and you know that everything is stored in ch. To access it, it would be something like:
myVec[i].ch[0]
I don't know all your constraints for your situation, but it sounds like a better solution instead of walking the line around undefined behavior. But that's just me.
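If it helps, here's a rough sketch of that idea (my own reading of the suggestion, with made-up names; not the questioner's generated code):
#include <iostream>
#include <vector>
struct A
{
    std::vector<char> ch; // owns its storage and tracks its own size
};
int main()
{
    std::vector<A> myVec(1);              // one A, as in the original example
    myVec[0].ch = {'a', 'b', 'c', '\0'};  // grows as needed, no fixed bound
    std::cout << myVec[0].ch.size() << std::endl  // 4
              << myVec[0].ch[0] << std::endl;     // a
}
Note that the int a length field becomes unnecessary, since the vector already knows its size.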
Finally, if you are still really intent on ignoring our advice, I should mention that you always have the option of turning off the warning, but again, I'd advise against that. It'd be better to fix A if you can, or find a better usage strategy if you can't.
There really is no way to work with this cleanly in C++, and IIRC the type (a dynamically sized struct with a flexible array member) isn't actually well-formed in C++. But you can work with it, because compilers still try to preserve compatibility with C. So it works in practice.
You can't have a value of the struct, only references or pointers to it, and they must be allocated with malloc() and released with free(); you can't use new and delete. Below I show a way that only lets you allocate pointers to variable-sized structs given the desired payload size. Sizing the allocation is the tricky bit: you can't rely on sizeof(Buf) for it, which is why alloc() computes offsetof(Buf, buf) + size. So here we go:
#include <cstddef>   // offsetof, size_t
#include <cstdlib>   // std::malloc, std::free
#include <new>       // std::bad_alloc
#include <iostream>
#include <memory>    // std::construct_at (C++20)
struct Buf {
    size_t size {0};
    char buf[];      // flexible array member -- a compiler extension in C++
    [[nodiscard]]
    static Buf * alloc(size_t size) {
        // Allocate the header plus the requested payload in one block.
        void *mem = std::malloc(offsetof(Buf, buf) + size);
        if (!mem) throw std::bad_alloc();
        // Begin the object's lifetime in the raw storage (C++20).
        return std::construct_at(reinterpret_cast<Buf*>(mem), AllocGuard{}, size);
    }
private:
    class AllocGuard {};   // stops anyone constructing a Buf except through alloc()
public:
    Buf(AllocGuard, size_t size_) noexcept : size(size_) {}
};
int main() {
    Buf *buf = Buf::alloc(13);
    std::cout << "buffer has size " << buf->size << std::endl;
    std::free(buf);   // released with free(), never delete
}
You should delete or implement the assign/copy/move constructors and operators as desired. Another good idea would be to use std::unique_ptr or std::shared_ptr with a deleter that calls free() instead of returning a naked pointer. But I leave that as an exercise for the reader.
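For the impatient, one possible shape of that exercise, reusing the Buf and the includes from the listing above (a sketch, not the only way to do it):
// Deleter that releases the malloc()'d block. Buf is trivially
// destructible, so there is no destructor that needs to run first.
struct FreeDeleter {
    void operator()(Buf *p) const noexcept { std::free(p); }
};
using BufPtr = std::unique_ptr<Buf, FreeDeleter>;
BufPtr make_buf(size_t size) {
    return BufPtr{Buf::alloc(size)};  // ownership passes to the smart pointer
}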

Using std::sort with pointers to integer variables produces unexpected output

I decided to compile and run this piece of code (out of curiosity) and the G++ compiler successfully compiled the program. I was expecting to see a compile error or a runtime error, or at least the values of a and b swapped (as 5 > 4), since the std::sort() function is being called with two pointers to integers.
(Please note that I know this is not a good practice and I was basically just playing with pointers)
#include <iostream>
#include <algorithm>
int main() {
int a{5};
int b{4};
int c{1};
int* aptr = &a;
int* bptr = &b;
std::sort(aptr, bptr);
std::cout << a << ' ' << b << ' ' << c << '\n';
return 0;
}
However, upon executing the program, the output I got was this:
5 4 1
My question is, how did C++ allow this call to the std::sort() function? And how did it not end up actually sorting everything between the memory addresses of a and b (potentially including even garbage values in memory)?
I mean, if we tried this with C-style arrays like this (std::sort(arr, arr+n)) it would successfully sort the C-style array, because arr and arr+n are basically just pointers where n is the size of the array and arr is the pointer to the first element.
(I'm sorry if this question sounds stupid. I'm still learning C++.)
Your program is ill formed, no diagnostic required. You passed pointers that do not form a range to a std algorithm.
Any behaviour whatsoever by the program is conforming to the C++ standard.
Compilers optimize around the fact that pointers to unrelated objects are incomparable and their difference is undefined. A sort here would trip over so much UB that the optimizer could eliminate branches like crazy: any branch that exhibits UB can be eliminated, and whatever code the alternative branch runs instead is a legal result of the UB.
Good C++ coding style thus focuses on avoiding UB and IFNDR (ill-formed, no diagnostic required) code.
C++ accepts your code because it is syntactically valid. But it doesn't work, because sort(it1, it2) expects it1 to be the starting position of a range and it2 to be the ending position of the same range. You have provided pointers into two unrelated objects, which can yield either of the following situations:
positionof(it1) < positionof(it2): suppose that in the computer's memory a and b are stored like this: 5(a), -1, -2, 10, 4(b). Then the sort function will sort everything from 5 to 4, resulting in: -2(a), -1, 4, 5, 10(b).
positionof(it1) > positionof(it2) (your machine's case): the sort function will do nothing, as left_position > right_position.
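For contrast, a minimal sketch of a valid call, where both pointers point into the same array:
#include <algorithm>
#include <iostream>
int main() {
    int arr[] = {5, 4, 1};
    std::sort(arr, arr + 3);   // arr and arr + 3 bound one valid range
    for (int v : arr)
        std::cout << v << ' '; // prints: 1 4 5
    std::cout << '\n';
    return 0;
}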

myArray[N] where N = 1,000,000 returns an error whereas myArray[1,000,000] doesn't

File extension: .cpp
I have the following code:
int main() {
int N; cin >> N;
int myArray[N];
return 0;
}
I get an error when I try to run that program and input N as 1,000,000. However, when I change myArray[N] to myArray[1000000], it doesn't fail. Why does this happen?
int myArray[N]; is not valid C++. Variable-length arrays were introduced in C99 but never made it into C++, possibly because they cause a lot of ugly stuff to happen behind the scenes to make them work, and they would make the generated code less efficient as a result. In fact, the feature was even scaled back in C11, where its support is merely optional and no longer mandatory. Use std::vector<int> instead, or any similar standard container of your choice.
First of all, VLAs (variable-length arrays) are an extension in C++. Compilers support them because they usually also support C, which has this functionality in its standard.
The second problem is that this array is allocated on the stack.
The stack has a very limited size, so when N has a very big value the application may crash because the stack overflows.
In this case you should use std::vector, which allocates its data on the heap.
The question is: why does the array with a static size not crash?
There can be a couple of reasons:
The compiler notices that the array is unused and, based on the "as-if" rule, removes it.
The compiler knows the size of the array at compile time, so the required stack size is known. This information may be propagated to the linker, and the application may be built with a bigger stack size than the default value (for a single-source-file application this may be possible). Disclaimer: this is my guess; I haven't verified it in any form (by testing or in compiler documentation), but I've found this SO answer which confirms my suspicions.
The size of a static array array[N] must be known at compile time.
Use std::vector for dynamic arrays:
// Example program
#include <iostream>
#include <string>
#include <vector>
int main()
{
int N; std::cin >> N;
std::cout << N << std::endl;
std::vector<int> myArray(N);
std::cout << myArray.size() << std::endl;
return 0;
}
That happens because the size of a static array must be known at compile time.
It is strongly recommended to use std::vector instead of arrays for more flexibility and safety (this is always the answer: use a vector if possible). You may use std::vector::reserve to request that the capacity be at least the length you want it to be, and std::vector::capacity to see the current capacity.
#include <iostream>
#include <vector>
int main () {
std::vector<int> ivec;
ivec.reserve(100);
std::cout << ivec.capacity() << std::endl;
return 0;
}
Output:
100
Only if you have a very good reason to prefer arrays over vectors should you dynamically allocate an array. Using std::shared_ptr makes this process much safer and more convenient. Here's how it's done the way you want:
#include <iostream>
#include <memory>
int main () {
int N;
std::cin >> N;
std::shared_ptr<int> arr_ptr (new int[N], std::default_delete<int[]>());
for (int i = 0; i != N; ++i) {
arr_ptr.get()[i] = i * 2;
}
for (int i = 0; i != N; ++i) {
std::cout << arr_ptr.get()[i] << std::endl;
}
return 0;
}
Input:
10
Output:
0
2
4
6
8
10
12
14
16
18
That happens because, in C++, the size of a static array declared with array[N] must be known at compile time; your error is probably your compiler telling you that it must know the size beforehand. As stated above, use std::vector when you need dynamic arrays.

C++ dynamically sized static array puzzler

While trying to explain to someone why a C++ static array could not be dynamically sized, I found gcc disagreeing with me. How does the following code even compile, given that the dimension argc of array is not known at compile time?
#include <iostream>
int main(int argc, char* argv[]) {
int array[argc];
for(int i = 0; i < argc; i++) array[i] = argv[i][0];
for(int i = 0; i < argc; i++) std::cout << i << ": " << char(array[i]) << std::endl;
//for(int i = 0; i < 100; i++) { std::cout << i << " "; std::cout.flush(); array[i] = 0; }
return 0;
}
I tested this with gcc 4.2.1, and specified -Wall, without getting so much as a dirty look from the compiler. If I uncomment the last loop, I get a segfault when I assign to array[53].
I had previously placed guard arrays before and after the declaration of array, and had filled them with zeros, certain that the program must be trashing part of its stack, but gcc reordered the variables on the stack, such that I was unable to observe any data corruption.
Obviously I am not trying to get this code to "work." I'm just trying to understand why gcc even thinks it can compile the code. Any hints or explanations would be much appreciated.
Update: Thanks to all for your helpful and ridiculously fast responses!
Variable-length arrays (VLAs) are part of C99 and have been supported by gcc for a long time:
http://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
Note that the use of VLAs in C90 and C++ code is non-standard, but is supported by gcc as an extension.
That's a Variable Length Array, which is part of the C99 standard. It isn't part of C++ though.
You can also use the alloca function, which also isn't standard C++, but is widely supported:
#include <alloca.h> // declares alloca() on many platforms; elsewhere it may be in <stdlib.h> or <malloc.h>
int main(int argc, char* argv[])
{
    int* array = (int*) alloca( argc * sizeof(int) );
    array[0] = 123;
    // array automatically deallocated here. Don't call free(array)!
}
These are called variable-length arrays (available since C99) and can be declared only as an automatic variable – try putting static in front and the compiler will reject it. Supporting them just involves adjusting the stack pointer by a variable amount rather than by a constant offset, nothing more than that.
Before the introduction of variable-length arrays, allocation of variable-sized objects on the stack was done with the alloca function.
Variable-sized stack-based arrays are a G++ extension and perfectly legitimate there. They are, however, not Standard. Stack-based arrays can indeed be variably-sized on most implementations, but the Standard does not mandate this.
Arrays in C++ cannot be sized except by using a constant expression. Arrays sized via a non-const are either part of C99 or a horrible extension foisted on us by GCC. You can get rid of most of the GCC crap by using the -pedantic flag when you compile C++ code.