I am having a tough time getting my head wrapped around how to initialize a vector of vectors.
typedef vector< vector < vector < vector< float > > > > DataContainer;
I want this to conform to
level_1 (2 elements/vectors)
level_2 (7 elements/vectors)
level_3 (480 elements/vectors)
level_4 (31 elements of float)
Addressing the elements isn't the issue. That should be as simple as something like
dc[0][1][2][3];
The problem is that I need to fill it with data coming in out of order from a file such that successive items need to be placed something like
dc[0][3][230][22];
dc[1][3][110][6]; //...etc
So I need to initialize the V of V beforehand.
Am I psyching myself out or is this as simple as
for 0..1
for 0..6
for 0..479
for 0..30
dc[i][j][k][l] = 0.0;
It doesn't seem like that should work. Somehow the top level vectors must be initialized first.
Any help appreciated. I am sure this must be simpler than I am imagining.
Please do not use nested vectors if the size of your storage is known ahead of time, i.e. there is a specific reason why e.g. the first index must be of size 6, and will never change. Just use a plain array. Better yet, use boost::array. That way, you get all the benefits of having a plain array (save huge amounts of space when you go multi-dimensional), and the benefits of having a real object instantiation.
Please do not use nested vectors if your storage must be rectangular, i.e. you might resize one or more of the dimensions, but every "row" must be the same length at some point. Use boost::multi_array. That way, you document "this storage is rectangular", save huge amounts of space and still get the ability to resize, benefits of having a real object, etc.
The thing about std::vector is that it (a) is meant to be resizable and (b) doesn't care about its contents in the slightest, as long as they're of the correct type. This means that if you have a vector<vector<int> >, then all of the "row vectors" must maintain their own separate book-keeping information about how long they are - even if you want to enforce that they're all the same length. It also means that they all manage separate memory allocations, which hurts performance (cache behaviour), and wastes even more space because of how std::vector reallocates. boost::multi_array is designed with the expectation that you may want to resize it, but won't be constantly resizing it by appending elements (rows, for a 2-dimensional array / faces, for a 3-dimensional array / etc.) to the end. std::vector is designed to (potentially) waste space to make sure that operation is not slow. boost::multi_array is designed to save space and keep everything neatly organized in memory.
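To illustrate the "one allocation, neatly organized in memory" idea without pulling in Boost, here is a minimal sketch of a fixed-shape 4-D grid backed by a single std::vector. The class name Grid4 and its members are hypothetical, chosen only for this illustration, and this is not how boost::multi_array is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a fixed-shape 4-D grid stored in one contiguous buffer.
class Grid4 {
public:
    Grid4(std::size_t n1, std::size_t n2, std::size_t n3, std::size_t n4)
        : n1_(n1), n2_(n2), n3_(n3), n4_(n4), data_(n1 * n2 * n3 * n4, 0.0f) {}

    // Row-major offset: the last index varies fastest in memory.
    float& at(std::size_t i, std::size_t j, std::size_t k, std::size_t l) {
        assert(i < n1_ && j < n2_ && k < n3_ && l < n4_);
        return data_[((i * n2_ + j) * n3_ + k) * n4_ + l];
    }

    // Total number of stored floats.
    std::size_t size() const { return data_.size(); }

private:
    std::size_t n1_, n2_, n3_, n4_;
    std::vector<float> data_;
};
```

With the sizes from the question, Grid4 g(2, 7, 480, 31) holds every element in one allocation, and g.at(1, 3, 110, 6) = 2.5f can be written in any order.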
That said:
Yes, you do need to do something before you can index into the vector. std::vector will not magically cause the indexes to pop into existence because you want to store something there. However, this is easy to deal with:
You can default-initialize the vector with the appropriate amount of zeros first, and then replace them, by using the (size_t n, const T& value = T()) constructor. That is,
std::vector<int> foo(10); // makes a vector of 10 ints, each of which is 0
because a value-initialized int (which is what T() produces) has the value 0.
In your case, we need to specify the size of each dimension, by creating sub-vectors that are of the appropriate size and letting the constructor copy them. This looks like:
typedef vector<float> d1;
typedef vector<d1> d2;
typedef vector<d2> d3;
typedef vector<d3> d4;
d4 result(2, d3(7, d2(480, d1(31))));
That is, an unnamed d1 is constructed of size 31, which is used to initialize the default d2, which is used to initialize the default d3, which is used to initialize result.
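As a rough sketch of how this fits the out-of-order loading described in the question, the container can be built once and then written at arbitrary indices. The helper names make_container and store are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> d1;
typedef std::vector<d1> d2;
typedef std::vector<d2> d3;
typedef std::vector<d3> d4;

// Hypothetical helper: build the fully sized, zero-filled container.
d4 make_container() {
    return d4(2, d3(7, d2(480, d1(31))));
}

// Hypothetical helper: store one value that arrived out of order.
void store(d4& dc, std::size_t i, std::size_t j, std::size_t k, std::size_t l,
           float value) {
    // at() throws std::out_of_range instead of silently corrupting memory.
    dc.at(i).at(j).at(k).at(l) = value;
}
```

After d4 dc = make_container();, a call such as store(dc, 0, 3, 230, 22, 1.5f) is valid immediately, because every level already has its final size.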
There are other approaches, but they're much clumsier if you just want a bunch of zeroes to start. If you're going to read the entire data set from a file, though:
You can use .push_back() to append to a vector. Make an empty d1 just before the inner-most loop, in which you repeatedly .push_back() to fill it. Just after the loop, you .push_back() the result onto the d2 which you created just before the next-innermost loop, and so on.
You can resize a vector beforehand with .resize(), and then index into it normally (up to the amount that you resized to).
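The push_back approach might look like the following two-level sketch. It assumes the data arrives strictly in order; the values parameter stands in for whatever is read from the file, and build_table is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> Row;
typedef std::vector<Row> Table;

// Hypothetical helper: 'values' stands in for data read sequentially from a file.
Table build_table(const std::vector<float>& values, std::size_t rows,
                  std::size_t cols) {
    assert(values.size() == rows * cols);
    Table table;
    table.reserve(rows);
    std::size_t next = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        Row row;                // empty vector created just before the inner loop
        row.reserve(cols);
        for (std::size_t c = 0; c < cols; ++c) {
            row.push_back(values[next++]);
        }
        table.push_back(row);   // append the finished row to the outer vector
    }
    return table;
}
```

The same pattern nests for more levels. It does not help when items arrive out of order, which is the case where sizing the container up front is needed.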
You would need to set the size of each vector first. Reserving memory is not enough, because reserve() does not create any elements.
You could use a nested for loop that calls
myVector.resize(x); //or size
on each level.
EDIT: I admit this code is not elegant. I like @Karl's answer, which is the right way to go.
This code is compiled and tested. It printed 208320 zeroes which is expected (2 * 7 * 480 * 31)
#include <iostream>
#include <vector>
using namespace std;
typedef vector< vector < vector < vector< float > > > > DataContainer;
int main()
{
const int LEVEL1_SIZE = 2;
const int LEVEL2_SIZE = 7;
const int LEVEL3_SIZE = 480;
const int LEVEL4_SIZE = 31;
DataContainer dc;
dc.resize(LEVEL1_SIZE);
for (int i = 0; i < LEVEL1_SIZE; ++i) {
dc[i].resize(LEVEL2_SIZE);
for (int j = 0; j < LEVEL2_SIZE; ++j) {
dc[i][j].resize(LEVEL3_SIZE);
for (int k = 0; k < LEVEL3_SIZE; ++k) {
dc[i][j][k].resize(LEVEL4_SIZE);
}
}
}
for (int i = 0; i < LEVEL1_SIZE; ++i) {
for (int j = 0; j < LEVEL2_SIZE; ++j) {
for (int k = 0; k < LEVEL3_SIZE; ++k) {
for (int l = 0; l < LEVEL4_SIZE; ++l) {
dc[i][j][k][l] = 0.0;
}
}
}
}
for (int i = 0; i < LEVEL1_SIZE; ++i) {
for (int j = 0; j < LEVEL2_SIZE; ++j) {
for (int k = 0; k < LEVEL3_SIZE; ++k) {
for (int l = 0; l < LEVEL4_SIZE; ++l) {
cout << dc[i][j][k][l] << " ";
}
}
}
}
cout << endl;
return 0;
}
I want to use std::vector to process 2D array data obtained by calling a third-party library.
I could simply assign the values one by one in a loop, but I would prefer to use methods such as insert and copy.
I found that reserve doesn't seem to work here. So I used resize instead.
double **a = new double *[1024];
for (int i = 0; i < 1024; ++i) {
a[i] = new double[512];
}
std::vector<std::vector<double>> a_v;
a_v.resize(1024, std::vector<double>(512));
// Copy a -> a_v
I made these attempts:
// Not Working, just 0 in vector
for (int i = 0; i < 1024; ++i){
a_v[i].insert(a_v[i].end(), a[i], a[i] + 512);
}
Is there a good way to solve this problem?
For a 1D array I write it like this:
double *b = new double[1024];
std::vector<double> b_v;
b_v.reserve(1024);
b_v.insert(b_v.end(), b, b + 1024);
If the size of the source array is fixed, it is strongly recommended to use std::array instead of std::vector. A nested std::array has a contiguous memory layout, so std::memcpy can be used for the copy if the source array is also contiguous in memory.
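A minimal sketch of the std::array plus std::memcpy idea follows. It assumes the source is a true contiguous 2D array, which the double** in the question is not, since each of its rows is allocated separately. The sizes and the name copy_from are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t ROWS = 4;   // small sizes chosen for illustration
const std::size_t COLS = 3;
typedef std::array<std::array<double, COLS>, ROWS> Matrix;

// Hypothetical helper: copy a contiguous C array into a nested std::array.
void copy_from(const double (&src)[ROWS][COLS], Matrix& dst) {
    // Guard the assumption that std::array adds no padding around its elements.
    static_assert(sizeof(Matrix) == sizeof(double) * ROWS * COLS,
                  "unexpected padding in std::array");
    std::memcpy(dst.data(), src, sizeof(Matrix));
}
```

For the 1024 x 512 case, a Matrix would occupy 4 MiB, so it would probably need static or heap storage rather than living on the stack. With a row-by-row double** source, one std::memcpy (or std::copy) per row would be needed instead.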
Back to the original question: if you want to construct a std::vector<std::vector<double>> from the source array, use a single loop to construct 1D vectors from the source:
std::vector<std::vector<double>> a_v;
a_v.reserve(1024);
for (int i = 0; i < 1024; ++i) {
a_v.emplace_back(a[i], a[i] + 512); // constructs the row in place from the pointer range
}
If there is already a std::vector<std::vector<double>> with the proper size, and you literally just want to do a copy from the source, use the assign member function:
for (int i = 0; i < 1024; ++i) {
a_v[i].assign(a[i], a[i] + 512); // replaces the row's contents with the source range
}
std::vector<std::vector<double>> a_v;
a_v.resize(1024, std::vector<double>(512));
is just
std::vector<std::vector<double>> a_v{1024, std::vector<double>(512)};
Unfortunately there is no vector constructor that takes over ownership of a C-style array. So you have to copy all 1024 * 512 doubles. And with the above definition of the vector you needlessly initialize all the doubles before you overwrite them.
You can do it with reserve, so that none of the doubles get initialized before you overwrite them and no vector gets copied or moved:
std::vector<std::vector<double>> a_v;
a_v.reserve(1024);
for (std::size_t i = 0; i < 1024; ++i) {
a_v.emplace_back();
std::vector<double> &b_v = a_v.back();
b_v.reserve(512);
b_v.insert(b_v.end(), a[i], a[i] + 512);
}
I have a 2048x2048 matrix holding a grayscale image. I want to find the points whose value is > 0 and store their positions in an array of 2 columns and n rows (n is the number of points found). Here is my algorithm:
int icount;
icount = 0;
for (int i = 0; i < 2048; i++)
{
for (int j = 0; j < 2048; j++)
{
if (iout.at<double>(i, j) > 0)
{
icount++;
temp[icount][1] = i;
temp[icount][2] = j;
}
}
}
I have two problems:
temp is an array whose number of rows is unknown, because it grows as the loops find more points. How should I define temp? I need the exact number of rows for a later step, so I can't pick an arbitrary size.
My algorithm above doesn't work. The result is
temp[1][1]=0 , temp[1][2]=0 , temp[2][1]=262 , temp[2][2]=655
which is completely wrong. The correct result is:
temp[1][1]=1779 , temp[1][2]=149 , temp[2][1]=1780 , temp[2][2]=149
I know this is the correct result because I implemented it in Matlab:
[a,b]=find(iout>0);
How about a std::vector of std::pair:
std::vector<std::pair<int, int>> temp;
Then add (i, j) pairs to it using push_back. No size needed to be known in advance:
temp.push_back(std::make_pair(i, j));
We'll need to know more about your problem and your code to be able to tell what's wrong with the algorithm.
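Put together, the vector-of-pairs approach could look like the sketch below. It uses a plain vector of rows as a stand-in for the image, since the real matrix type is not shown in full here; Image and find_positive are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the image: a vector of rows of pixel values.
typedef std::vector<std::vector<double> > Image;

// Collect the (row, column) position of every strictly positive pixel.
std::vector<std::pair<int, int> > find_positive(const Image& image) {
    std::vector<std::pair<int, int> > positions;
    for (std::size_t i = 0; i < image.size(); ++i) {
        for (std::size_t j = 0; j < image[i].size(); ++j) {
            if (image[i][j] > 0) {
                positions.push_back(
                    std::make_pair(static_cast<int>(i), static_cast<int>(j)));
            }
        }
    }
    return positions;
}
```

The number of points found is simply positions.size(). When comparing against MATLAB, keep in mind that find reports 1-based indices in column-major order, whereas this loop is 0-based and row-major, so the two lists will be ordered differently.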
When you define a variable of pointer type, you need to allocate memory and have the pointer point to that memory address. In your case, you have a multidimensional pointer so it requires multiple allocations. For example:
int **temp = new int *[100]; // room for 100 rows (first dimension)
int icount = 0;
for (int i = 0; i < 2048; i++) {
    for (int j = 0; j < 2048; j++) {
        if (iout.at<double>(i, j) > 0) {
            temp[icount] = new int[2]; // each row holds 2 values, at indices 0 and 1
            temp[icount][0] = i;
            temp[icount][1] = j;
            icount++; // must never exceed 100, the number of rows allocated above
        }
    }
}
This will work for you, but it's only good if you know for sure you're not going to need any more than the pre-allocated array size (100 in this example). If you know exactly how much you need, this method is ok. If you know the maximum possible, it's also ok, but could be wasteful. If you have no idea what size you need in the first dimension, you have to use a dynamic collection, for example std::vector as suggested by IVlad. In case you do use the method I suggested, don't forget to free the allocated memory using delete []temp[i]; and delete []temp;
I am trying to understand how a 3-dimensional array is stored in memory and how that differs from the way std::vector is stored.
As I understand it, arrays are stored as shown below, and std::vector is stored the same way, except that it makes full use of memory blocks:
a[0][0][0] a[0][0][1] a[0][0][2]... a[0][1][0] a[0][1][1] ... a[1][0][0] a[1][0][1]...
My goal is to find the most efficient way to traverse an array.
For example, I have array:
v[1000][500][3];
So which is the more efficient way to traverse it?
for(i = 0; i < 1000; i++)
{
for(j = 0; j < 500; j++)
{
for(k = 0; k < 3; ++k)
{
//some operation
}
}
}
Or maybe it would be more efficient to declare the array as
v[3][500][1000]
and to traverse as
for(i = 0; i < 3; i++) {
for(j = 0; j < 500; j++)
{
for(k = 0; k < 1000; ++k)
{
//some operation
}
} }
Is there any CL tool to visualize how arrays are stored?
Your representation of arrays in memory is right: the values are contiguous. So an int v[2][2][2] initialized to 0 would look like:
[[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
As far as performance goes, you want consecutive accesses to be as close to each other in memory as possible to avoid data cache misses. Iterating over the first dimension in the outermost loop and the last dimension in the innermost loop is therefore a good thing, since elements that differ only in the last index are located next to each other.
Something that might happen with your first example, though, is that the compiler optimizes the inner loop (if the right conditions are met) and unrolls it, so you would save some time there by skipping branches.
Since both of your examples already iterate in the right order, I would say profile them and see which is faster.
std::vector also stores its elements contiguously in memory, but since it has one dimension, locality applies by default (provided you aren't iterating randomly). The good side of a vector is that it can grow, whereas an array can't (automatically, anyway).
When the memory is contiguous (e.g., a compile-time array a[][][]), an efficient way to traverse a multidimensional array is to use a pointer. For an array declared as a[D1][D2][D3], the element a[i][j][k] is located at &a[0][0][0] + (i*D2*D3 + j*D3 + k). Thus, initialize a pointer p to the beginning address, then repeatedly read *(p++):
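#include <iostream>
using namespace std;
int main() {
    int a[2][3] = {{1,2,3},{4,5,6}};
    int *p = &a[0][0]; // address of the first element
    for (int i = 0; i < 6; ++i) {
        cout << *(p++) << endl; // visits 1 2 3 4 5 6 in memory order
    }
    return 0;
}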
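The row-major offset of a[i][j][k] in an array declared as a[D1][D2][D3] is i*D2*D3 + j*D3 + k. The sketch below compares that formula with the distance actually measured between element addresses. The dimensions and both function names are illustrative assumptions, and the integer conversion of pointers relies on the flat address model of mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t D1 = 2, D2 = 3, D3 = 4;  // illustrative dimensions

// Row-major offset of element [i][j][k] in an array declared as a[D1][D2][D3].
std::size_t flat_offset(std::size_t i, std::size_t j, std::size_t k) {
    assert(i < D1 && j < D2 && k < D3);
    return i * D2 * D3 + j * D3 + k;
}

// Distance, in elements, between a[i][j][k] and a[0][0][0], measured from
// the actual addresses (assumes pointers convert to linear integers).
std::size_t measured_offset(const int (&a)[D1][D2][D3], std::size_t i,
                            std::size_t j, std::size_t k) {
    std::uintptr_t first = reinterpret_cast<std::uintptr_t>(&a[0][0][0]);
    std::uintptr_t elem = reinterpret_cast<std::uintptr_t>(&a[i][j][k]);
    return static_cast<std::size_t>((elem - first) / sizeof(int));
}
```

For example, flat_offset(1, 2, 3) is 1*12 + 2*4 + 3 = 23, the last element of a 2 x 3 x 4 array.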
To make it visible:
#include <iostream>
int main()
{
int a[][3] = { { 0, 1, 2 }, { 3, 4, 5 } };
int* p = reinterpret_cast<int*>(a);
for(unsigned i = 0; i < 6; ++i) {
std::cout << *(p + i);
}
std::cout << std::endl;
return 0;
}
This shows row-major order; see: http://en.wikipedia.org/wiki/Row-major_order
Given this, you should iterate row by row to make good use of the cache. In N dimensions the picture is similar: each element of the outermost dimension is a block of data with N-1 dimensions.
In my algorithm, I need to keep all the combinations of 3 bytes of extended ASCII characters. My code is below, but when I run it, the program gets killed in the terminal when the last step (BigVector.push_back) occurs. Why is this, and what alternative is there in my case?
vector<set<vector<int> > > BigVector;
set<vector<int> > SmallSet;
for(int k=0; k <256; k++)
{
for(int j=0; j <256; j++)
{
for(int m=0; m <256; m++)
{
vector<int> temp;
temp.push_back(k);
temp.push_back(j);
temp.push_back(m);
SmallSet.insert(temp);
}
}
}
BigVector.push_back(SmallSet);
P.S: I have to keep the ascii characters like this:
{ {(a,b,c) ,(a,b,d),...... (z,z,z)} }
Please note that 256^3 = 16,777,216. This is huge, especially when you use vector and set!
Because each component takes one of only 256 = 2^8 values, you can store it in an unsigned char (one byte). You can store each combination as a tuple of three chars. The memory needed is then 16,777,216 * 3 bytes / 1024 / 1024 = 48 MB. On my computer, it finishes in 1 second.
If you accept C++11, I would suggest using std::array, instead of writing a helper struct like Info in my old code.
C++11 code using std::array.
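vector<array<unsigned char, 3>> bs;
bs.reserve(256 * 256 * 256); // one allocation up front
for (int k = 0; k < 256; k++)
    for (int j = 0; j < 256; j++)
        for (int m = 0; m < 256; m++) {
            array<unsigned char, 3> temp;
            temp[0] = k; temp[1] = j; temp[2] = m;
            bs.push_back(temp);
        }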
C++98 code using home-made struct.
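struct Info {
    unsigned char chrs[3]; // unsigned, so values 128..255 stay non-negative
    Info(unsigned char c1, unsigned char c2, unsigned char c3) {
        // C++98 cannot initialize an array member in the initializer list.
        chrs[0] = c1; chrs[1] = c2; chrs[2] = c3;
    }
};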
int main() {
vector<Info> bs;
for (int k = 0; k < 256; k++) {
for (int j = 0; j < 256; j++) {
for (int m = 0; m < 256; m++) {
bs.push_back(Info(k,j,m));
}
}
}
return 0;
}
Ways to use the combinations. (You can write wrapper method for Info).
// Suppose s[256] contains the 256 extended chars.
for( auto b : bs){
cout<< s[b.chrs[0]] << " " << s[b.chrs[1]] << " "<< s[b.chrs[2]] << endl;
}
First: your example doesn't correspond with the actual code.
You are creating ( { (a,a,a), ..., (z,z,z) } )
As already mentioned you will have 16'777'216 different vectors. Every vector will hold the 3 characters and typically ~20 bytes[1] overhead because of the vector object.
In addition a typical vector implementation will reserve memory for future push_backs.
You can avoid this by specifying the correct size during initialization or using reserve():
vector<int> temp(3);
(capacity() tells you the "real" size of the vector)
push_back makes a copy of the object you are pushing [2], which might require too much memory and therefore crash your program.
16'777'216 * (3 characters + 20 overhead) * 2 copy = ~736MiB.
(This assumes that the vectors are already initialized with the correct size!)
See [2] for a possible solution to the copying problem.
I do agree with Potatoswatter: your data structure is very inefficient.
[1] What is the overhead cost of an empty vector?
[2] Is std::vector copying the objects with a push_back?
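To make the overhead argument concrete, here is a rough sketch comparing one packed vector of 3-byte triples with many small vector<int> objects. All names are hypothetical, and the nested figure is only a lower bound, since it ignores allocator bookkeeping and any spare capacity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::array<unsigned char, 3> Triple;

// Bytes of payload for n triples stored in one contiguous vector.
std::size_t packed_bytes(std::size_t n) { return n * sizeof(Triple); }

// Lower bound on bytes for n separate vector<int> objects of 3 ints each:
// the vector object itself plus its heap payload.
std::size_t nested_bytes_lower_bound(std::size_t n) {
    return n * (sizeof(std::vector<int>) + 3 * sizeof(int));
}

// Build all n^3 combinations of values 0..n-1 (n = 256 gives the full set).
std::vector<Triple> all_triples(int n) {
    assert(n >= 0 && n <= 256);
    std::vector<Triple> out;
    out.reserve(static_cast<std::size_t>(n) * n * n);
    for (int k = 0; k < n; ++k)
        for (int j = 0; j < n; ++j)
            for (int m = 0; m < n; ++m) {
                Triple t = {{static_cast<unsigned char>(k),
                             static_cast<unsigned char>(j),
                             static_cast<unsigned char>(m)}};
                out.push_back(t);
            }
    return out;
}
```

If the set-based structure has to stay, BigVector.push_back(std::move(SmallSet)) (C++11) would at least avoid the final deep copy, at the cost of leaving SmallSet in a moved-from state.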
I need help in using the boost multidimensional array. I have to construct a two dimensional array where: (0 <= j <= 1) and (i) grows dynamically according to:
long boostArray[i][j];
Thus, It's like constructing a table of (unknown) columns and two rows.
I started already with the example provided at the Boost Library website:
#include "boost/multi_array.hpp"
#include <cassert>
int main () {
// 3 x 4 x 2
typedef boost::multi_array<double, 3> array_type;
typedef array_type::index index;
array_type A(boost::extents[3][4][2]);
int values = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
A[i][j][k] = values++;
int verify = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
assert(A[i][j][k] == verify++);
return 0;
}
The problem is that I don't understand the above code well enough to adapt its structure and build my desired array. In particular, I don't know how to add or delete elements when using the Boost library, especially since the array grows dynamically as described above.
For example, when dealing with vectors, I tend to use push_back and pop_back after resizing the vector.
For your particular use case, you're probably better off using vector<pair<T,T>> or vector<array<T,2>>. You can then use push_back, and it's efficient. boost::multi_array sounds like overkill. If you do want to use it, on the other hand:
You can't use something like push_back there, because whenever you extend one dimension of an N-dimensional array, you'd need to supply a slice of N-1 dimensions of initial data. That is usually not very efficient, esp. since you can only add to the dimension with the largest stride in this way. What you need to use instead is resize and assignment.
// std::vector<> equivalent (with vector<>, it's considered bad style)
v.resize( v.size() + 1 );
v[v.size()-1] = newElement;
// boost::multi_array (from the tutorial)
typedef boost::multi_array<int, 3> array_type;
array_type::extent_gen extents;
array_type A(extents[3][3][3]);
A[0][0][0] = 4;
A[2][2][2] = 5;
// here, it's the only way:
A.resize(extents[2][3][4]);
assert(A[0][0][0] == 4);
// A[2][2][2] is no longer valid.
To reiterate: N-dimensional arrays, N>2, are inherently much less dynamic than one-dimensional ones (because of the stride factor). The above resize requires a lot of copying of the data, unlike the vector case, which only needs to copy data when size()>capacity().
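For the table with two rows and a growing number of columns, the vector<array<T,2>> suggestion might be sketched as follows. The typedefs and function names are hypothetical, and long is used only because the question declares a long array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::array<long, 2> Column;  // j is 0 or 1
typedef std::vector<Column> Table;   // i grows dynamically

// Append one column holding both row values.
void append_column(Table& t, long first, long second) {
    Column c = {{first, second}};
    t.push_back(c);
}

// Remove the most recently added column.
void drop_last_column(Table& t) {
    assert(!t.empty());
    t.pop_back();
}
```

Elements are then addressed as table[i][j], which matches the boostArray[i][j] indexing in the question, and growth costs the usual amortized push_back rather than a full multi-dimensional resize.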