CUDA C++: declare a vector with length - c++

I recently found this won't work in my global CUDA C++ code that I plan to compile and later to be called in Matlab:
int M = 10; float V[M];
or if I were to import M value from the matlab host code.
But this works:
float V[10];
I was told there is an operator called new that I can use to avoid this problem, but after reading about it online I am still quite confused about how to use it, and it seems to apply only to host code. Is that right? If so, it won't apply to my case, since my host code is in MATLAB. Is there a way to get around this, so that I don't have to change the vector lengths one by one? Thank you!

I don't know anything about MATLAB or CUDA, but your problem is in C++. Arrays declared like that must have sizes fixed at compile-time.
Solution 1: Fix the size
Declare your variable M const. These are equivalent:
int const M = 10;
const int M = 10;
The compiler would then know that it can assume these variables will always have the same value no matter how you run the program.
Solution 2: C-style dynamic allocation
Dynamic allocation with new and delete. Arrays allocated on the abstract section of memory called the "free-store" (rather than on the "stack", like those arrays you have) can determine their sizes on the fly. You use it like this:
float * V = new float[M]; //V is a pointer to some freestore memory
//You use it and pass it like you would a normal array:
V[2] = 5.5;
int x = some_func(V);
//But after you're done, you should manually free the memory
delete [] V; //don't forget the [], since you used [] in the allocation
I don't recommend this, because of the possibility of forgetting to delete the memory.
Solution 3: Automatic memory management with C++'s vector
In C++, the work of memory management can be hidden behind structures called classes.
#include<vector>
using std::vector;
vector<float> V(M); //V is a vector of floats, with initial size M
//You use it like a normal array
V[2] = 5.5;
//But to pass it as an array, you need to pass a pointer to the first element
int x = some_func(&V[0]); //read as &(V[0]): pass in the address of V[0]
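To make Solution 3 concrete, here is a minimal host-side sketch. The names sum_array and fill_and_sum are hypothetical stand-ins (they are not from the question), and the example assumes plain C++11 with no CUDA involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C-style function that expects a raw array.
float sum_array(const float* values, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

// Builds a vector whose length is only known at run time and hands its
// contiguous storage to the pointer-based function above.
float fill_and_sum(std::size_t m) {
    assert(m > 2);                          // we write to index 2 below
    std::vector<float> v(m);                // m elements, all initialized to 0.0f
    v[2] = 5.5f;
    return sum_array(v.data(), v.size());   // v.data() is the same address as &v[0]
}
```

The vector frees its memory automatically when it goes out of scope, so there is no delete to forget.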
Solution 3b: CUDA-compatible vector
Thrust is a C++ template library for CUDA based on the Standard Template Library (STL). Thrust allows you to implement high performance parallel applications with minimal programming effort through a high-level interface that is fully interoperable with CUDA C.
http://docs.nvidia.com/cuda/thrust/#vectors
Conclusion
If you're using fixed sizes, I recommend solution 1. If you're using sizes determined during runtime, I recommend vector.
(By the way, when you pass an ordinary array to a function, you are actually passing a pointer to the first element, NOT the array. The name of the array is automatically converted to a pointer type.)

Related

How to pass dynamic and static 2d arrays as void pointer?

for a project using Tensorflow's C API I have to pass a void pointer (void*) to a method of Tensorflow. In the examples the void* points to a 2d array, which also worked for me. However now I have array dimensions which do not allow me to use the stack, which is why I have to use a dynamic array or a vector.
I managed to create a dynamic array with the same entries like this:
float** normalizedInputs;//
normalizedInputs = new float* [noCellsPatches];
for(int i = 0; i < noCellsPatches; ++i)
{
normalizedInputs[i] = new float[no_input_sizes];
}
for(int i=0;i<noCellsPatches;i++)
{
for(int j=0;j<no_input_sizes;j++)
{
normalizedInputs[i][j]=inVals.at(no_input_sizes*i+j);
////
////
//normalizedInputs[i][j]=(inVals.at(no_input_sizes*i+j)-inputMeanValues.at(j))/inputVarValues.at(j);
}
}
The function call needing the void* looks like this:
TF_Tensor* input_value = TF_NewTensor(TF_FLOAT,in_dims_arr,2,normalizedInputs,num_bytes_in,&Deallocator, 0);
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong. When I go back to the static array they are right again. What do I have to change?
Greets and thanks in advance!
Edit: I also noted that the TF_Tensor* input_value holds totally different values for both cases (for dynamic it has many 0 and nan entries). Is there a way to solve this by using a std::vector<std::vector<float>>?
Respectively: is there any valid way pass a consecutive dynamic 2d data structure to a function as void*?
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong.
The reason this doesn't work is because you are passing the pointers array as data. In this case you would have to use normalizedInputs[0] or the equivalent more explicit expression &normalizedInputs[0][0]. However there is another bigger problem with this code.
Since you are using new inside a loop you won't have contiguous data which TF_NewTensor expects. There are several solutions to this.
If you really need a 2d-array you can get away with two allocations. One for the pointers and one for the data. Then set the pointers into the data array appropriately.
float **normalizedInputs = new float* [noCellsPatches]; // allocate pointers
normalizedInputs[0] = new float [noCellsPatches*no_input_sizes]; // allocate data
// set pointers
for (int i = 1; i < noCellsPatches; ++i) {
normalizedInputs[i] = &normalizedInputs[i-1][no_input_sizes];
}
Then you can use normalizedInputs[i][j] as normal in C++ and the normalizedInputs[0] or &normalizedInputs[0][0] expression for your TF_NewTensor call.
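A slightly fuller sketch of the two-allocation idea, including the matching cleanup, might look like this. Rows2D, make_rows and free_rows are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Row-pointer table over one contiguous data block (two allocations in total).
struct Rows2D {
    float** rows;        // rows[i] points at the start of row i
    std::size_t nrows;
    std::size_t ncols;
};

Rows2D make_rows(std::size_t nrows, std::size_t ncols) {
    assert(nrows > 0 && ncols > 0);
    Rows2D r{new float*[nrows], nrows, ncols};
    r.rows[0] = new float[nrows * ncols]();   // one zero-initialized data block
    for (std::size_t i = 1; i < nrows; ++i)
        r.rows[i] = r.rows[0] + i * ncols;    // point each row into the block
    return r;
}

void free_rows(Rows2D& r) {
    delete[] r.rows[0];   // the data block
    delete[] r.rows;      // the pointer table
    r.rows = nullptr;
}
```

With this layout, r.rows[0] is the address to hand to an API that expects contiguous data, and exactly two delete[] calls release everything.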
Here is a mechanically simpler solution, just use a flat 1d array.
float * normalizedInputs = new float [noCellsPatches*no_input_sizes];
You access the i,j-th element by normalizedInputs[i*no_input_sizes+j] and you can use it directly in the TF_NewTensor call without worrying about any addresses.
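The flat layout also works with std::vector, which answers the edit in the question. The sketch below is an illustration under assumptions: read_element is a hypothetical stand-in for any API that receives a void* to contiguous floats, and make_flat is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an API that takes a void* to contiguous floats.
float read_element(const void* data, std::size_t index) {
    return static_cast<const float*>(data)[index];
}

// Copies a flat input into a rows x cols row-major buffer.
std::vector<float> make_flat(const std::vector<float>& in_vals,
                             std::size_t rows, std::size_t cols) {
    assert(in_vals.size() >= rows * cols);
    std::vector<float> flat(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            flat[i * cols + j] = in_vals[cols * i + j];   // element (i, j)
    return flat;
}
```

You would pass flat.data() as the pointer and flat.size() * sizeof(float) as the byte count. The vector has to stay alive for as long as the callee uses the pointer.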
The C++ standard does its best to discourage programmers from using raw arrays, especially multi-dimensional ones.
From your comment, your statically declared array is declared as:
float normalizedInputs[noCellsPatches][no_input_sizes];
If noCellsPatches and no_input_sizes are both compile-time constants, you have a correct program declaring a true 2D array. If they are not constants, you are declaring a 2D Variable Length Array, which does not exist in the C++ standard. gcc and clang accept it as an extension, but MSVC does not.
If you want to declare a dynamic 2D array with non constant rows and columns, and use gcc, you can do that:
int (*arr0)[cols] = (int (*) [cols]) new int [rows*cols];
(the naive int (*arr0)[cols] = new int [rows][cols]; was rejected by my gcc 5.4.0)
It is definitely not correct C++ but is accepted by gcc and does what is expected.
The trick is that the size of an array of n elements is n times the size of one element. A 2D array of rows rows and cols columns therefore occupies rows times the size of one row, and one row is cols underlying elements (here int). So we ask gcc to allocate a 1D array with the size of the 2D array and take enough liberties with the strict aliasing rule to process it as the 2D array we wanted. As previously said, this violates the strict aliasing rule and uses a VLA in C++, but gcc accepts it.

How to properly work with dynamically-allocated multi-dimensional arrays in C++ [duplicate]

How do I define a dynamic multi-dimensional array in C++? For example, two-dimensional array? I tried using a pointer to pointer, but somehow it is failing.
The first thing to realize is that C++ has no support for dynamically sized multi-dimensional arrays, either as a language feature or in the standard library. So anything we do is an emulation of one. How can we emulate, say, a 2-dimensional array of integers? Here are different options, from the least suitable to the most suitable.
Improper attempt #1. Use pointer to pointer
If an array is emulated with pointer to the type, surely two-dimensional array should be emulated with a pointer to pointer to the type? Something like this?
int** dd_array = new int[x][y];
That's a compiler error right away. There is no new [][] operator, so compiler gladly refuses. Alright, how about that?
int** dd_array = new int*[x];
dd_array[0][0] = 42;
That compiles. When being executed, it crashes with unpleasant messages. Something went wrong, but what? Of course! We did allocate the memory for the first pointer - it now points to a memory block which holds x pointers to int. But we never initialized those pointers! Let's try it again.
int** dd_array = new int*[x];
for (std::size_t i = 0; i < x; ++i)
dd_array[i] = new int[y];
dd_array[0][0] = 42;
That doesn't give any compilation errors, and program doesn't crash when being executed. Mission accomplished? Not so fast. Remember, every time we did call a new, we must call a delete. So, here you go:
for (std::size_t i = 0; i < x; ++i)
delete[] dd_array[i]; // each row came from new[], so it needs delete[]
delete[] dd_array; // same for the array of row pointers
Now, that's just terrible. Syntax is ugly, and manual management of all those pointers... Nah. Let's drop it all over and do something better.
Less improper attempt #2. Use std::vector of std::vector
Ok. We know that in C++ we should not really use manual memory management, and there is a handy std::vector lying around here. So, maybe we can do this?
std::vector<std::vector<int> > dd_array;
That's not enough, obviously - we never specified the size of those arrays. So, we need something like that:
std::vector<std::vector<int> > dd_array(x);
for(auto&& inner : dd_array)
inner.resize(y);
dd_array[0][0] = 42;
So, is it good now? Not so much. Firstly, we still have this loop, and it is a sore to the eye. What is even more important, we are seriously hurting performance of our application. Since each individual inner vector is independently allocated, a loop like this:
int sum = 0;
for (auto&& inner : dd_array)
for (auto&& data : inner)
sum += data;
will iterate over many independently allocated inner vectors. CPU caches and prefetchers work best on contiguous memory, so those small, scattered vectors tend to cause more cache misses than a single block would. That hurts performance!
So, how do we do it right?
Proper attempt #3 - single-dimensional!
We simply don't! When the situation calls for a 2-dimensional vector, we use a single-dimensional vector and access its elements with computed offsets! This is how we do it:
std::vector<int> dd_array(x * y);
dd_array[k * y + j] = 42; // equivalent of 2d dd_array[k][j]; each of the x rows holds y elements
This gives us wonderful syntax, performance and all the glory. To make our life slightly better, we can even build an adaptor on top of a single-dimensional vector - but that's left for the homework.
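One possible shape for that adaptor is sketched below. Grid2D is a hypothetical name, and this is a minimal illustration rather than a complete matrix class: it stores everything in one std::vector and converts (row, column) into an offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal 2D view over a single contiguous std::vector (row-major order).
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Element access as grid(r, c); the offset is r * cols + c.
    int& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }
    const int& operator()(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const int* data() const { return data_.data(); }   // contiguous storage

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};
```

Usage is then Grid2D g(x, y); g(k, j) = 42; and the index arithmetic lives in exactly one place.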

How can I allocate memory for a data structure that contains a vector?

If I have a struct instanceData:
struct InstanceData
{
unsigned usedInstances;
unsigned allocatedInstances;
void* buffer;
Entity* entity;
std::vector<float> *vertices;
};
And I allocate enough memory for an Entity and std::vector:
newData.buffer = size * (sizeof(Entity) + sizeof(std::vector<float>)); // Pseudo code
newData.entity = (Entity *)(newData.buffer);
newData.vertices = (std::vector<float> *)(newData.entity + size);
And then attempt to copy a vector of any size to it:
SetVertices(unsigned i, std::vector<float> vertices)
{
instanceData.vertices[i] = vertices;
}
I get an Access Violation Reading location error.
I've chopped up my code to make it concise, but it's based on Bitsquid's ECS. so just assume it works if I'm not dealing with vectors (it does). With this in mind, I'm assuming it's having issues because it doesn't know what size the vector is going to scale to. However, I thought the vectors might increase along another dimension, like this?:
Am I wrong? Either way, how can I allocate memory for a vector in a buffer like this?
And yes, I know vectors manage their own memory. That's besides the point. I'm trying to do something different.
It looks like you want InstanceData.buffer to have the actual memory space which is allocated/deallocated/accessed by other things. The entity and vertices pointers then point into this space. But by trying to use std::vector, you are mixing up two completely incompatible approaches.
1) You can do this with the language and the standard library, which means no raw pointers, no "new", no "sizeof".
struct Point {float x; float y;}; // usually this is int, not float
struct InstanceData {
Entity entity;
std::vector<Point> vertices; // the vector owns and manages its own storage
};
This is the way I would recommend. If you need to output to a specific binary format for serialization, just handle that in the save method.
2) You can manage the memory internal to the class, using oldschool C, which means using N*sizeof(float) for the vertices. Since this will be extremely error prone for a new programmer (and still rough for vets), you must make all of this private to class InstanceData, and do not allow any code outside InstanceData to manage them. Use unit tests. Provide public getter functions. I've done stuff like this for data structures that go across the network, or when reading/writing files with a specified format (Tiff, pgp, z39.50). But just to store in memory using difficult data structures -- no way.
Some other questions you asked:
How do I allocate memory for std::vector?
You don't. The vector allocates its own memory, and manages it. You can tell it to resize() or reserve() space, or push_back, but it will handle it. Look at http://en.cppreference.com/w/cpp/container/vector
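As a small illustration of those three calls, here is a sketch. build_vertices is a hypothetical helper name and the values are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> build_vertices(std::size_t expected) {
    std::vector<float> vertices;
    vertices.reserve(expected);          // sets capacity only; size() is still 0
    assert(vertices.empty());
    for (std::size_t i = 0; i < expected; ++i)
        vertices.push_back(static_cast<float>(i));   // size grows by one each time
    vertices.resize(expected + 2);       // appends two elements initialized to 0.0f
    return vertices;
}
```

reserve() avoids repeated reallocation when the final size is roughly known, while resize() actually changes the number of elements.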
How do I allocate memory for a vector [sic] in a buffer like this?
You seem to be thinking of an array. You're way off with your pseudo code so far, so you really need to work your way up through a tutorial. You have to allocate with "new". I could post some starter code for this, if you really need, which I would edit into the answer here.
Also, you said something about vector increasing along another dimension. Vectors are one dimensional. You can make a vector of vectors, but let's not get into that.
edit addendum:
The basic idea with a megabuffer is that you allocate all the required space in the buffer, then you initialize the values, then you use it through the getters.
The data layout is "Header, Entity1, Entity2, ..., EntityN"
// I did not check this code in a compiler, sorry, need to get to work soon
MegaBuffer::MegaBuffer() : buffer(nullptr) {AllocateBuffer(0);}
MegaBuffer::~MegaBuffer() {ReleaseBuffer();}
void MegaBuffer::AllocateBuffer(size_t size /*, whatever is needed for the header*/){
if (nullptr!=buffer)
ReleaseBuffer();
size_t total_bytes = sizeof(Header) + size * sizeof(Entity);
buffer = new unsigned char [total_bytes];
header = reinterpret_cast<Header*>(buffer);
// need to set up the header
header->count = 0;
header->allocated = size;
// set up internal pointer to the first entity, just past the header
entity = reinterpret_cast<Entity*>(buffer + sizeof(Header));
}
void MegaBuffer::ReleaseBuffer(){
delete [] buffer;
buffer = nullptr; // avoid a double delete if this is called again
}
Entity* MegaBuffer::operator[](int n) {return &entity[n];}
The header is always a fixed size, appears exactly once, and tells you how many entities you have. In your case there's no header because you are using the member variables "usedInstances" and "allocatedInstances" instead. So you do sort of have a header, but it is not part of the allocated buffer. But you don't want to allocate 0 bytes, so just set usedInstances=0; allocatedInstances=0; buffer=nullptr;
I did not code for changing the size of the buffer, because the bitsquid ECS example covers that, but he doesn't show the first time initialization. Make sure you initialize n and allocated, and assign meaningful values for each entity before you use them.
You are not doing the bitsquid ECS the same way as the link you posted. In that, he has several different objects of fixed size in parallel arrays. There is an entity, its mass, its position, etc. So entity[4] is an entity which has mass equal to "mass[4]" and its acceleration is "acceleration[4]". This uses pointer arithmetic to access array elements. (built-in arrays, NOT std::array, NOT std::vector)
The data layout is "Entity1, Entity2, ..., EntityN, mass1, mass2, ..., massN, position1, position2, ..., positionN, velocity1 ... " you get the idea.
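A minimal sketch of that parallel-array layout, cut down to two components, could look like the following. InstanceArrays, allocate_instances and free_instances are hypothetical names, the entity is simplified to an unsigned id, and the code assumes both element types are trivially constructible and that unsigned and float have compatible size and alignment (checked by the static_assert).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// n entity ids followed by n masses, all inside one allocation.
struct InstanceArrays {
    void* buffer = nullptr;       // the single allocation
    unsigned* entity = nullptr;   // entity[0..n)
    float* mass = nullptr;        // mass[0..n), stored right after the entities
    std::size_t allocated = 0;
};

InstanceArrays allocate_instances(std::size_t n) {
    static_assert(sizeof(unsigned) % alignof(float) == 0,
                  "mass array would be misaligned");
    assert(n > 0);
    InstanceArrays a;
    a.buffer = ::operator new(n * (sizeof(unsigned) + sizeof(float)));
    a.entity = static_cast<unsigned*>(a.buffer);
    a.mass = reinterpret_cast<float*>(a.entity + n);   // starts after the last entity
    a.allocated = n;
    return a;
}

void free_instances(InstanceArrays& a) {
    ::operator delete(a.buffer);
    a = InstanceArrays{};   // reset every pointer to nullptr
}
```

This only works because the elements are plain fixed-size values. A std::vector member does not fit this scheme without explicitly constructing and destroying it.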
If you read the article, you'll notice he says basically the same thing everyone else said about the standard library. You can use an std container to store each of these arrays, OR you can allocate one megabuffer and use pointers and "built in array" math to get to the exact memory location within that buffer for each item. In the classic faux-pas, he even says "This avoids any hidden overheads that might exist in the Array class and we only have a single allocation to keep track of." But you don't know if this is faster or slower than std::Array, and you're introducing a lot of bugs and extra development time dealing with raw pointers.
I think I see what you are trying to do.
There are numerous issues. First, you are making a buffer of uninitialized data and telling C++ that a std::vector-sized piece of it is a std::vector. But at no time do you actually call the std::vector constructor, which would initialize its internal pointers and bookkeeping to viable values.
This has already been answered here: Call a constructor on a already allocated memory
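For reference, constructing objects in already allocated memory is done with placement new, and such objects must later be destroyed with an explicit destructor call. The sketch below shows one way this could look. construct_vectors and destroy_vectors are hypothetical helper names, and the storage is assumed to be large enough and suitably aligned (memory from ::operator new is).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

using FloatVec = std::vector<float>;

// Constructs `count` empty vectors inside raw storage; returns the first one.
FloatVec* construct_vectors(void* storage, std::size_t count) {
    unsigned char* bytes = static_cast<unsigned char*>(storage);
    FloatVec* first = nullptr;
    for (std::size_t i = 0; i < count; ++i) {
        // Placement new runs the constructor at the given address.
        FloatVec* p = new (bytes + i * sizeof(FloatVec)) FloatVec();
        if (i == 0) first = p;
    }
    return first;
}

// Runs the destructors (releasing each vector's own heap memory)
// without freeing the raw storage itself.
void destroy_vectors(FloatVec* first, std::size_t count) {
    for (std::size_t i = count; i > 0; --i)
        first[i - 1].~FloatVec();
}
```

Only after construct_vectors has run is an assignment such as vecs[i] = vertices well defined. Note that each vector still allocates its elements on the heap, outside the buffer.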
The second issue is the line
instanceData.vertices[i] = vertices;
instanceData.vertices is a pointer to a std::vector, so you actually need to write
(*(instanceData.vertices))[i]
The third issue is that the contents of *(instanceData.vertices) are floats, not vectors, so you should not be able to do the assignment there.

Passing a 3-dimensional variable size array by reference in C++

I've been working off of Passing a 2D array to a C++ function , as well as a few other similar articles. However, I'm running into a problem wherein the array I'm creating has two dimensions of variable size.
The initialization looks like:
int** mulePosition;
mulePosition = new int *[boardSize][boardSize][2];
The function looks like:
int moveMule (int boardSize, int ***mulePosition)
And the references look like
moveMule (boardSize, mulePosition)
Boardsize is defined at the beginning of the function, but may change per execution.
The array, properly sized, would be int [boardSize][boardSize][2].
Either use a plain '3-dimensional' array via
int* mulePosition = new int[boardSize*boardSize*2];
and address its elements calculating the offset from the beginning: mulePosition[a][b][c] is mulePosition[boardSize*2*a + 2*b + c],
or use array of arrays of arrays (which would correspond to your int*** declaration) or better (and simpler) vector of vectors of vectors, although the initialization would be a little more complex (you would need to initialize every array/vector).
Either use a std::vector<std::vector<std::array<int, 2>>> if boardSize is not a compile-time constant, or std::array<std::array<std::array<int, 2>, boardSize>, boardSize> if it is (see Multidimensional std::array for how to initialize the std::array).
That being said, it looks like a good idea to hide this in a class Board which provides a nice interface.
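A rough sketch of such a class is shown below. The Board interface and the body of moveMule are invented for illustration; only the index formula boardSize*2*a + 2*b + c comes from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Board: boardSize x boardSize cells with two ints per cell,
// stored in one flat vector.
class Board {
public:
    explicit Board(std::size_t boardSize)
        : size_(boardSize), cells_(boardSize * boardSize * 2) {}

    // Equivalent of mulePosition[a][b][c].
    int& at(std::size_t a, std::size_t b, std::size_t c) {
        assert(a < size_ && b < size_ && c < 2);
        return cells_[size_ * 2 * a + 2 * b + c];
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::vector<int> cells_;
};

// Passing the board by reference avoids a copy and needs no int*** at all.
int moveMule(Board& board) {
    board.at(0, 1, 0) = 3;   // placeholder move, purely illustrative
    return board.at(0, 1, 0);
}
```

The caller writes Board board(boardSize); moveMule(board); and the size travels with the object instead of being passed separately.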

What does a C++ vector relate to in Objective-C?

I'm moving from Objective-C to C++ and am not sure what vectors are. I've read documentation about them, I'm not quite clear though. How would you explain C++ vectors using Objective-C analogies?
They're pretty similar to NSMutableArrays but vector is a template class and so can be instantiated for any (standard-template-library compatible) type. NSArrays always hold NSObjects.
That is, assuming you mean std::vector.
They're like NSMutableArrays but can hold any data type - pointer or not. However, each vector can only ever hold one type at a time. Also as it's C++ there are fewer functions e.g. no plist loading/saving.
A C++ vector (presumably you mean something like std::vector) is basically an NSArray, except it holds any type you want (which is the template parameter, e.g. a std::vector<int> holds ints). Of course, it doesn't do memory management (which NSArray does), because arbitrary types aren't retain-counted. So for example a std::vector<id> would be rather inappropriate (assuming Obj-C++).
NSArray is a wrapper around CFArray. CFArray can store any data type.
I don't know much about C++, but it sounds like CFArray can do everything you'd use a vector for? When creating a CFArray you give it a CFArrayCallBacks pointer, which contains any logic that is specific to the data type being stored.
Of course, you could always just rename your Obj-C file to *.mm, and mix C++ into your objective-c.
http://developer.apple.com/library/mac/#documentation/CoreFOundation/Reference/CFArrayRef/Reference/reference.html
In C++ a built-in array is basically just a contiguous block of data, a series of elements, and its name converts to a pointer to the first element in most expressions. It offers no built-in methods or higher-level functionality.
int intArr[] = {0,1,2,3};
holds the same values as
int *intArr = (int *)malloc(4*sizeof(int));
for(int i = 0; i < 4; i++) { intArr[i] = i; }
A vector (std::vector), on the other hand, is a container for elements (basically like an array) which also offers additional built in methods (see: http://www.cplusplus.com/reference/stl/vector/) such as
vector<int> intArr;
for(int i = 0; i < 4; i++) { intArr.push_back(i); }
// this yields the same content; i.e. intArr = {0,1,2,3}
Both arrays and vectors can be used on any type of objects, int, double, 'MySpacePirateWizardClass' etc. The big bonus of vectors is the additional functionality from built-in functions like:
int arrSize = intArr.size(); // vector also includes useful information
int *firstElement = intArr.data(); // pointer to the first element (begin() returns an iterator instead)
intArr.erase(intArr.begin()); // remove the first element
intArr.insert(intArr.begin(), 2); // insert 2 at the front
// now: intArr = {2,1,2,3}
etc etc.
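For anyone who wants to try this out, here is a small self-contained sketch using only std::vector members that exist in the standard library. The function name demo_vector_ops is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> demo_vector_ops() {
    std::vector<int> intArr;
    for (int i = 0; i < 4; ++i) intArr.push_back(i);   // {0,1,2,3}

    intArr.erase(intArr.begin());        // remove the first element: {1,2,3}
    intArr.insert(intArr.begin(), 2);    // insert 2 at the front:    {2,1,2,3}

    assert(intArr.size() == 4);
    assert(intArr.front() == 2 && intArr.back() == 3);
    assert(intArr.at(1) == 1);           // at() is the bounds-checked form of []
    return intArr;
}
```

Unlike NSMutableArray, these positions are given as iterators (intArr.begin() + n) rather than integer indexes.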
When I know I'm not going to be short on space (or looking at massive amounts of data), I always use vectors because they're so much more convenient (even just the size() method alone is reason enough).
Think about vectors as advanced arrays.
If you are new to C++, this page will be your best friend:
http://www.cplusplus.com/reference/stl/vector/