I have a vector of vector in C++ defined with: vector < vector<double> > A;
Let's suppose that A has been filled with some values. Is there a quick way to extract a row vector from A ?
For instance, A[0] will give me the first column vector, but how can I get quicky the first row vector?
There is no "quick" way with that data structure, you have to iterate each column vector and get the value for desired row and add it to temporary row vector. Wether this is fast enough for you or not depends on what you need. To make it as fast as possible, be sure to allocate right amount of space in the target row vector, so it doesn't need to be resized while you add the values to it.
Simple solution to performance problem is to use some existing matrix library, such as Eigen suggested in comments.
If you need to do this yourself (because it is assignment, or because of licensing issues, or whatever), you should probably create your own "Matrix 2D" class, and hide implementation details in it. Then depending on what exactly you need, you can employ tricks like:
have a "cache" for rows, so if same row is fetched many times, it can be fetched from the cache and a new vector does not need to be created
store data both as vector of row vectors, and vector of column vectors, so you can get either rows or columns at constant time, at the cost of using more memory and making changes twice as expensive due to duplication of data
dynamically change the internal representation according to current needs, so you get the fixed memory usage, but need to pay the processing cost when you need to change the internal representation
store data in flat vector with size of rows*columns, and calculate the correct offset in your own code from row and column
But it bears repeating: someone has already done this for you, so try to use an existing library, if you can...
There is no really fast way to do that. Also as pointed out, I would say that the convention is the other way around, meaning that A[0] is actually the first row, rather than the first column. However even trying to get a column is not really trivial, since
{0, 1, 2, 3, 4}
{0}
{0, 1, 2}
is a very possible vector<vector<double>> A, but there is no real column 1, 2, 3 or 4. If you wish to enforce behavior like same length columns, creating a Matrix class may be a good idea (or using a library).
You could write a function that would return a vector<double> by iterating over the rows, storing the appropriate column value. But you would have to be careful about whether you want to copy or point to the matrix values (vector<double> / vector<double *>). This is not very fast as the values are not next to each other in memory.
The answer is: in your case there is no corresponding simple option as for columns. And one of the reasons is that vector> is a particular poorly suited container for multi-dimensional data.
In multi-dimension it is one of the important design decisions: which dimension you want to access most efficiently, and based on the answer you can define your container (which might be very specialized and complex).
For example in your case: it is entirely up to you to call A[0] a 'column' or a 'row'. You only need to do it consistently (best define a small interface around which makes this explicit). But STOP, don't do that:
This brings you to the next level: for multi-dimensional data you would typically not use vector> (but this is a different issue). Look at smart and efficient solutions already existing e.g. in ublas https://www.boost.org/doc/libs/1_65_1/libs/numeric/ublas/doc/index.html or eigen3 https://eigen.tuxfamily.org/dox/
You will never be able to beat these highly optimized libraries.
Related
pros, I need some performance-opinions with the following:
1st Question:
I want to store objects in a 3D-Grid-Structure, overall it will be ~33% filled, i.e. 2 out of 3 gridpoints will be empty.
Short image to illustrate:
Maybe Option A)
vector<vector<vector<deque<Obj>> grid;// (SizeX, SizeY, SizeZ);
grid[x][y][z].push_back(someObj);
This way I'd have a lot of empty deques, but accessing one of them would be fast, wouldn't it?
The Other Option B) would be
std::unordered_map<Pos3D, deque<Obj>, Pos3DHash, Pos3DEqual> Pos3DMap;
where I add&delete deques when data is added/deleted. Probably less memory used, but maybe less fast? What do you think?
2nd Question (follow up)
What if I had multiple containers at each position? Say 3 buckets for 3 different entities, say object types ObjA, ObjB, ObjC per grid point, then my data essentially becomes 4D?
Another illustration:
Using Option 1B I could just extend Pos3D to include the bucket number to account for even more sparse data.
Possible queries I want to optimize for:
Give me all Objects out of ObjA-buckets from the entire structure
Give me all Objects out of ObjB-buckets for a set of
grid-positions
Which is the nearest non-empty ObjC-bucket to
position x,y,z?
PS:
I had also thought about a tree based data-structure before, reading about nearest neighbour approaches. Since my data is so regular I had thought I'd save all the tree-building dividing of the cells into smaller pieces and just make a static 3D-grid of the final leafs. Thats how I came to ask about the best way to store this grid here.
Question associated with this, if I have a map<int, Obj> is there a fast way to ask for "all objects with keys between 780 and 790"? Or is the fastest way the building of the above mentioned tree?
EDIT
I ended up going with a 3D boost::multi_array that has fortran-ordering. It's a little bit like the chunks games like minecraft use. Which is a little like using a kd-tree with fixed leaf-size and fixed amount of leaves? Works pretty fast now so I'm happy with this approach.
Answer to 1st question
As #Joachim pointed out, this depends on whether you prefer fast access or small data. Roughly, this corresponds to your options A and B.
A) If you want fast access, go with a multidimensional std::vector or an array if you will. std::vector brings easier maintenance at a minimal overhead, so I'd prefer that. In terms of space it consumes O(N^3) space, where N is the number of grid points along one dimension. In order to get the best performance when iterating over the data, remember to resolve the indices in the reverse order as you defined it: innermost first, outermost last.
B) If you instead wish to keep things as small as possible, use a hash map, and use one which is optimized for space. That would result in space O(N), with N being the number of elements. Here is a benchmark comparing several hash maps. I made good experiences with google::sparse_hash_map, which has the smallest constant overhead I have seen so far. Plus, it is easy to add it to your build system.
If you need a mixture of speed and small data or don't know the size of each dimension in advance, use a hash map as well.
Answer to 2nd question
I'd say you data is 4D if you have a variable number of elements a long the 4th dimension, or a fixed large number of elements. With option 1B) you'd indeed add the bucket index, for 1A) you'd add another vector.
Which is the nearest non-empty ObjC-bucket to position x,y,z?
This operation is commonly called nearest neighbor search. You want a KDTree for that. There is libkdtree++, if you prefer small libraries. Otherwise, FLANN might be an option. It is a part of the Point Cloud Library which accomplishes a lot of tasks on multidimensional data and could be worth a look as well.
It seems that in general, vectors are to be preferred over lists, see for example here, when appending simple types.
What if I want to fill a matrix with simple types? Every vector is a column, so I am going to go through the outer vector, and append 1 item to each vector, repeatedly.
Do the latter vectors of the outer vector always have to be moved when the previous vectors increase their reserved space? As in is the whole data in one continuous space? Or do the vectors all just hold a pointer to their individual memory regions, so the outer vector's memory size remains unchanged even as the individual vectors grow?
Taken from the comments, it appears vectors of vectors can happily be used.
For small to medium applications, the efficiency of the vectors will seldom be anything to worry about.
There a couple of cases where you might worry, but they will be uncommon.
class CData {}; // define this
typedef std::vector<CData> Column;
typedef std::vector<Column> Table;
Table tab;
To add a new row, you will append an item to every column. In a worst case, you might cause a reallocation of each column. That could be a problem if CData is extremely complex and the columns currently hold a very large number of CData cells (I'd say 10s of thousands, at least)
Similarly, if you add a new column and force the Table vector to reallocate, it might have to copy each column and again, for very large data sets, that might be a bit slow.
Note, however, that a new-ish compiler will probably be able to move the columns from the old table to the new (rather than copying them), making that trivially fast.
As #kkuryllo said in a comment, it generally isn't anything to worry about.
Work on making your code as clean, simple and correct as possible. Only if profiling reveals a performance problem should you worry about optimising for speed.
I'm using a the Yale representation of sparse-matrix in power iteration algorithm, everything goes well and fast.
But, now I have a problem, my professor will send the sparse-matrix in a data file unordered, and since the matrix is symmetric only one pair of index will be there.
The problem is, in my implementation I need to insert the elements in order.
I tried somethings to read and after that insert into my sparse-matrix:
1) Using a dense matrix.
2) Using another sparse-matrix implementation, I tried with std::map.
3) Priority queue, I made a array of priority_queues. I insert the element i,j in the priority_queue[i], so when I pop the priority_queue[i] I take the lowest j-index of the row i.
But I need something really fast and memory efficient, because the largest matrix I'll use will be like 100k x 100k, and the tries I made was so slow, almost 200 times slower than the power iteration itself.
Any suggestions? Sorry for the poor english :(
The way many sparse loaders work is that you use an intermediate pure triples structure. I.e. whatever the file looks like, you load it into something like vector< tuple< row, column, value> >.
You then build the sparse structure from that. The reason is precisely what you're running into. Your sparse matrix structure likely has constraints, like you need to know the number of elements in each row/column, or the input needs to be sorted, etc. You can massage your triples array into whatever you need (i.e. by sorting it).
This also makes it trivial to solve your symmetry dilemma. For every triple in the source file, you insert both (row, column, value) and (column, row, value) into your intermediate structure.
Another option is to simply write a script that will sort your professor's file.
FYI, in the sparse world the number of elements (nonzeros) is what matters, not the dimensions of the matrix. 100k-by-100k is a meaningless piece of information. That entire matrix could be totally empty, for example.
i want to store some kind of distance-matrix (2D), where each entry has some alternatives (different coordinates). So i want to access the distance for example x=1 with x_alt=3 and y=3 with y_alt=1, looking in a 4-dim multi-array with array[1][3][3][1].
The important thing to notice is the following: the 2 most inner arrays/vectors don't have the same size for different values of the outer ones.
After a first init step, where i calculate the values, no more modifying is needed!
This should be easily possible with the use of stl-vectors:
vector<vector<vector<vector<double> > > >`extended_distance_matrix;
where i can dynamically iterate over the outer 2 dimensions and fill only as much alternatives to the inner 2 dimensions as i need (e.g. with push_back()).
Questions:
Is this kind of data-structure definition possible with Boost.MultiArray? How?
Is it a good idea to use Boost.MultiArray instead of the nested vectors? Performance (especially lookups! (Memory-layout))? Easy of use?
Thanks for any input!
sascha
PS: The boost documentation didn't help me. Maybe one can use multi_array_ref to get already sized arrays into the whole 4D-structure?
Edit:
At the moment i'm thinking of another approach: flattening the alternatives -> one bigger matrix with all the distances between the alternatives. Then i only need to calc the number of alternatives per node, build up the prefix sum (which is describing the matrix position/shift) and can then access the information in a 2-step-way.
But my questions are still open.
it sounds like you need:
multi_array<ublas::matrix<type>,2>
Boost.MultiArray deals with contiguous memory (arranged logically in many dimensions) so it is difficult to add elements in the inner dimensions. MultiArrays can be dynamically resized, e.g. to add elements in any dimension, but this is a costly operation that almost certainly need (internally) reallocation and copying.
Because of that requirement MultiArray is not the best option. But from what you say it looks like a combination of the two would be appropriate to you.
boost::multi_array<std::vector<std::vector<type>>, 2> data
The very nice thing is that the indexing interface doesn't change with respect to boost::multi_array<type, 4>. For example data[1][2][3][4] still makes sense.
I don't know from your post how you handle the inner dimension but it could even make sense to use this:
boost::multi_array<boost::multi_array<type>, 2>, 2> data
In any case, unless you really need to do linear algebra I would stay away from boost::ublas::array, or at most use it for the internal array if type is numeric. boost::multi_array<boost::ublas::array<type>, 2> data which is mentioned in the other answer.
I have a large matrix M implemented as vector<vector<double> with m rows, i.e. the matrix is a vector of m vectors of n column elements.
I have to create two subsets of the rows of this matrix, i.e. A holds k rows, and B the other m-k rows. The rows must be selected at random.
I do not want to use any libraries other than STL, so no boost either.
Two approaches that I considered are:
generate a std::random_shuffle of row indices, copy the rows indicated by the first k indices to A and the rows indicated by the other m-k to B
do a std::random_shuffle of M. copy k rows to A, and m-k rows to B
Are there other options, and how do the two options above compare in terms of memory consumption and processing time?
Thanks!
If you don't need B to be in random order, then random_shuffle does more work than you need.
If by "STL" you mean SGI's STL, then use random_sample.
If by "STL" you mean the C++ standard libraries, then you don't have random_sample. You might want to copy the implementation, except stop after the first n steps. This will reduce the time.
Note that these both modify a sequence in place. Depending where you actually want A and B to end up, and who owns the original, this might mean that you end up doing 2 copies of each row - once to get it into a mutable container for the shuffle, then again to get it into its final destination. This is more memory and processing time than is required. To fix this you could maybe swap rows out of the temporary container, and into A and B. Or copy the algorithm, but adapt it to:
Make a list of the indexes of the first vector
Partially shuffle the list of indexes
Copy the rows corresponding to the first n indexes to A, and the rest to B.
I'm not certain this is faster or uses less memory, but I suspect so.
The standard for random_shuffle says that it performs "swaps". I hope that means it's efficient for vectors, but you might want to check that it is actually using an optimised swap, not doing any copying. I think it should mean that, especially since the natural implementation is as Fisher-Yates, but I'm not sure whether the language in the standard should be taken to guarantee it. If it is copying, then your second approach is going to be very slow. If it's using swap then they're roughly comparable. swap on a vector is going to be slightly slower than swap on an index, but there's not a whole lot in it. Swapping either a vector or an index is very quick compared with copying a row, and there are M of each operation, so I doubt it will make a huge difference to total run time.
[Edit: Alex Martelli was complaining recently about misuse of the term "STL" to mean the C++ standard libraries. In this case it does make a difference :-)]
I think that the random_shuffle of indices makes sense.
If you need to avoid the overhead of copying the individual rows, and don't mind sharing data, you might be able to make the A and B matrices be vectors of pointers to rows in the original matrix.
Easiest way: use a random whole number generator, and queue up the offsets of each row in a separate container (assuming that a row is of the same offset in each column vector). The container you use will depend more on its eventual use. (Remember to take care of size_t limit, and tying in the offset container's life to the Matrix itself).
Edit: replaced pointers with offsets - makes more sense and is safer.
Orig: Quick Q:is each (inner) vector a row or a column?
i.e. is M a vector of columns or a vector of rows?