I have long used pointers to arrays in C programs of the form:
int (*myarray)[2] = (int (*)[2]) malloc(n*sizeof(int[2]));
However, how can I do this in C++ using new? Can I do this?
int (*myarray)[2] = (int (*)[2]) new int[n][2];
EDIT:
Looks like my original post was incomplete and confusing. Here is a code snippet that I compiled and tested which appears to do the right thing but I wanted to confirm from C++ experts that I was using an appropriate C++ construct.
#include <iostream>
int main() {
int n=5;
int (*A)[2] = new int[n][2];
for (int i = 0; i < n; i++)
for (int j = 0; j < 2; j++)
A[i][j] = 2*i+j;
for (int i = 0; i < n; i++)
std::cout << A[i][0] << " " << A[i][1] << "\n";
delete[] A;
}
One of the joys of using C++ and the STL is that you get a vector class that provides array-like behaviour.
This also makes it easier to manage and read...
std::vector< std::vector<int> > myarray(n, std::vector<int>(2));
If you don't want to use the STL then there is always...
typedef int intarray[2];
intarray* ints = new intarray[n];
ints[0][0] = 1;
...
ints[n-1][1] = 6;
I personally would write an extra line of code if it made the code easier to read.
I think the clearest way to do what you ask for is to use a typedef for the array:
typedef int array_t[2];
array_t* yourarray = new array_t[n];
Don't do that though, because it requires doing manual memory management and that tends to be tedious, error-prone and brittle, in particular with respect to exception safety. Instead, take a look at the std::array class template (new in C++11, but otherwise available via Boost) and at the std::vector class template.
To clarify the difference between storing std::vector and std::array in a container, the latter is typically more efficient when there is a small and fixed number of elements involved. The reason is that the array class doesn't allocate things dynamically as the vector does. For that, vector needs three pointers (beginning, end of used storage and end of allocated storage) plus of course the storage for the data itself (plus maybe some overhead induced by the allocator), all of which need to be loaded into the CPU cache for use. Considering an LP64 system, that would require 32 bytes to store 8 bytes of data, compared to just 8 bytes using std::array.
I am currently working with code that at the moment requires me to make an array of vectors (I am new to C++ - if this is an absolutely terrible idea, I would greatly appreciate the feedback).
Let's say I allocate memory on the heap for my vectors like so:
#include <iostream>
#include <vector>
#include <random>
int main() {
typedef std::vector<double> doubleVec;
long N = 1000;
long M = 1000;
doubleVec *array = new doubleVec[N];
for (long i = 0; i < N; i++) {
doubleVec currentVec = array[i];
currentVec.resize(M);
for (long j = 0; j < M; j++)
currentVec[j] = std::rand();
}
// ... do something with the data structure
delete [] array;
}
When I've done everything I need to do with the data, how should I safely deallocate this data structure?
NOTE: There were other things I did wrong in my initial post that I didn't intend to be the focus of the discussion (uninitialized variables, didn't resize vectors, etc). I fixed those now. Thank you all for pointing those out.
if this is an absolutely terrible idea, I would greatly appreciate the feedback
Yes, this is a terribly bad idea. To be specific, owning bare pointers are a bad idea. Instead of manually allocating a dynamic array, it is usually better to use a container such as std::vector.
How to safely deallocate a heap-allocated array of vectors?
By using a vector instead of manual dynamic array. In this case, a simple solution is to use a vector of vectors.
A potentially better solution would be to allocate a single flat vector of doubles of size 1000*1000 in which the elements of each "subvector" are stored one after another. This requires a bit of simple math to calculate the index at which each subvector starts, but is faster in most use cases.
Other notes:
typedef std::vector<double> doubleVec;
Avoid obfuscating the program by hiding type names like this.
for (long j; j < M; j++)
^^^^^^
You leave this variable uninitialised. When the indeterminate value is used later, the behaviour of the program is undefined.
Furthermore, you forgot to include the standard headers which define std::vector and std::rand.
I got a seg fault
See the other answer regarding you not actually adding any elements to the vectors that are in the array. This, and the uninitialised variables are the most likely reason for your segfault depending on what "do something" does.
The problem is not in deallocating but in each vector allocation. Where in your code do you use the M value (except while accessing the elements)? There are other problems in your code, so the quick fix is:
for (long i; i < N; i++) {
doubleVec &currentVec = array[i];
currentVec.resize(M);
for (long j; j < M; j++)
currentVec[j] = std::rand();
}
Pay special attention that currentVec is a reference: otherwise no changes would be stored in the array.
Anyway, the main question everybody would have is: why do you need to have an array of vectors?.. The vector of vectors is a much more elegant solution.
Update: I've missed the fact that you have forgotten to initialize both i and j. In addition to the advice to initialize them I would recommend to use the auto keyword that would make it impossible to leave the variable uninitialized:
for (auto i=0UL; i < N; i++) {
doubleVec &currentVec = array[i];
currentVec.resize(M);
for (auto j=0UL; j < M; j++)
currentVec[j] = std::rand();
}
0UL means zero of the type unsigned long.
I am essentially trying to declare something like this but I am unable to because of "too many initializer variables".
int** a = { {1},{2,3},{3,4,5} };
As a side question, if this were to work with some slight modification would it have the size of 9 (3x3) or 6 (1+2+3)?
I can implement this behavior with vectors such as the following, but I am curious as to why I can't do it more directly.
vector<int*>a = vector<int*>();
for (int i = 0; i < 20; i++)
{
a.push_back(new int[i + 1]); // row i holds i + 1 elements, matching the j <= i loop
for (int j = 0; j <= i; j++)
a[i][j] = i+j;
}
A statically declared multidimensional array has a different memory arrangement than one built dynamically with a double pointer and new. The difference is that a static multidimensional array is laid out in contiguous memory at compile time, whereas a dynamic one is not. Static multidimensional arrays are stored contiguously, as discussed here.
Related: my question here.
Since your array cannot be stored contiguously, it cannot be declared statically.
I think it will be very easy when I do it with this:
int n = 4;
int matrix[n][n];
rather than:
p = new int *[n];
for (int i = 0; i < n; i++)
p[i] = new int [n];
So, which is better? When do we use ** to create a matrix or an array?
int n = 4;
int matrix[n][n];
Your first example does not conform to the C++ standard, which doesn't support variable-length arrays.
int** p = new int *[n];
for (int i = 0; i < n; i++)
p[i] = new int [n];
For your second example you would be better off using a std::vector<int> instead and organizing the matrix rows and columns as sections of the vector:
int n = 4;
std::vector<int> matrix(n*n);
Using new and delete yourself is usually not necessary in C++ and peppered with pitfalls and obstacles, which are taken care of in the appropriate standard library container and smart pointer classes.
First declaration is non-standard: n must be known at compile time in order for the code to compile. Some compilers offer variable-length arrays as an extension, but the code remains non-standard.
The standard approach when you need a matrix in C++ is to use std::vector<std::vector<T>> if the size is not known until runtime. When the size is known at compile time and you prefer allocation in the automatic area, use std::array<std::array<T,N>,N> instead of vectors.
Both these approaches let you construct objects that behave exactly like arrays of arrays, but you don't need to manage their memory explicitly.
Inside a function, I make a 2d array that fills itself from a text file and needs to get returned to main. The array stays a constant size through the whole program.
I know this is something that gets asked a lot, but I always seem to get one of two answers:
Use std::vector or std::array or some other STD function. I don't really understand how these work, is there any site actually explaining them and how they act compared to normal arrays? Are there any special #includes that I need?
Or
Use a pointer to the array, and return the pointer. First, on some of the answers to this it apparently doesn't work because of local arrays. How do I tell when it does and doesn't work? How do I use this array back in the main function?
I'm having more trouble with the concept of pointers and std::things than with the actual code, so if there's a website you know explains it particularly well, feel free to just put that.
Not necessarily the best solution, but the easiest way to get it working with vectors. The advantages are that you don't need to delete memory (happens automatically) and the array is bounds-checked in debug mode on most compilers.
#include <vector>
#include <iostream>
using array2D = std::vector< std::vector< int > >;
array2D MyFunc(int x_size, int y_size)
{
array2D array(y_size, std::vector< int >(x_size));
int i = 0;
for (int y = 0; y < array.size(); y++)
{
for (int x = 0; x < array[y].size(); x++)
{
// note the order of the index
array[y][x] = i++;
}
}
return array;
}
int main()
{
array2D bob = MyFunc(10, 5);
for (int y = 0; y < bob.size(); y++)
{
for (int x = 0; x < bob[y].size(); x++)
{
std::cout << bob[y][x] << "\n";
}
}
}
Live example:
http://ideone.com/K4ilfX
Sounds like you are new to C++. If this is indeed the case, I would suggest using arrays for now because you probably won't be using any of the stuff that STL containers give you. Now, let's talk about pointers.
You are correct that if you declare a local array in your function, the main function won't have access to it. However, this is not the case if you dynamically allocate the array using the new keyword. When you use new to allocate your array, you essentially tell the compiler to reserve a chunk of memory for your program. You can then access it using a pointer, which is really just the address of that chunk of memory you reserved. Therefore, instead of passing the entire array to the main function, all you need to do is pass a pointer (address) to that array.
Here are some relevant explanations. I will add to them as I find more:
Dynamic Memory
The easiest way to create a 2d array is as follows:
char (*array)[10];
array = new char[5][10];
Two-dimensional arrays can be tricky to declare. The parentheses in the variable declaration above are important: they tell the compiler that array is a pointer to an array of 10 characters.
It is really essential to understand pointers with C and C++ unless using the std:: collections. Even then, pointers are widely prevalent, and incorrect use can be devastating to a program.
In C++ you can easily allocate one dimensional array like this:
T *array=new T[N];
And you can delete it with one statement too:
delete[] array;
The compiler knows how to deallocate the correct number of bytes.
But why can't you alloc 2-dimensional arrays like this?
T *array=new T[N,M];
Or even like this?
T *array=new T[N,M,L];
If you want a multidimensional array, you have to do it like this:
T **array=new T*[N];
for(int i=0;i<N;i++) array[i]=new T[M];
If you want a fast program that uses matrices (matrix operations, eigenvalue algorithms, etc...) you might want to make good use of the cache for top performance, and this requires the data to be contiguous in memory, which separately allocated rows are not. Using vector<vector<T> > has the same problem. In C you can use variable length arrays on the stack, but you can't allocate them on the heap (and stack space is quite limited). Some C++ compilers accept variable length arrays too, but they won't be present in C++0x.
The only workaround is quite hackish and error-prone:
T *array=new T[N*M];
for(int i=0;i<N;i++)
for(int j=0;j<M;j++)
{
T[i*N+j]=...;
}
Your workaround of doing T *array=new T[N*M]; is the closest you can get to a true multi-dimensional array. Notice that to locate the elements in this array, you need the value of M (I believe your example is wrong, it should be T[i*M+j]) which is known only at run-time.
When you allocate a 2D array at compile-time, say array[5][10], the value 10 is a constant, so the compiler simply generates code to compute i*10+j. But if you did new T[N,M], the expression i*M+j depends on the value of M at the time the array was allocated. The compiler would need some way to store the value of M along with the actual array itself, and things are only going to get messy from here. I guess this is why they decided not to include such a feature in the language.
As for your workaround, you can always make it less "hackish" by writing a wrapper class that overloads operator (), so that you could do something like array(i, j) = ....
Because a multidimensional array is something different than an array of arrays/pointers.
use std::vector
Why can't a multidimensional array be allocated with one new call in C++?
Because when the ISO wrote the C++ language standard, they didn't decide to add that feature to the language. I don't know why they decided not to.
If you don't like that, you can create helper functions to allocate/free multidimensional arrays, or you can switch to a language like C# or Java that does support easily allocating multidimensional arrays.
What you can do, however, is allocate an object containing a two-dimensional array off the heap. I would just write a wrapper class for it.
I was thinking about this question last night, and this solution came to me.
T * raw = new T[N*M];
T ** array = new T*[N];
for(int i=0; i<N; i++)
array[i] = raw + i * M;
Now "array" acts just like a contiguous static sized two dimensional array. You just have to take care of deleting both the raw array, and the multi-dimensional array.
I would recommend that you use boost::multi_array, from the Boost library of the same name, which provides a simple interface to a multidimensional array. It can be allocated in one line, and at a sufficiently high optimization level is usually as fast as a native array.
Here's some example code from the library's website:
#include "boost/multi_array.hpp"
#include <cassert>
int
main () {
// Create a 3D array that is 3 x 4 x 2
typedef boost::multi_array<double, 3> array_type;
typedef array_type::index index;
array_type A(boost::extents[3][4][2]);
// Assign values to the elements
int values = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
A[i][j][k] = values++;
// Verify values
int verify = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
assert(A[i][j][k] == verify++);
return 0;
}
Because the comma is an operator.
int a = (3, 5, 7, 9);
The program will evaluate 3, discard the result,
evaluate 5, discard the result,
evaluate 7, discard the result,
evaluate 9, and assign it to a.
Hence the syntax you are looking for can't be used
while retaining backward compatibility with C.