struct Face
{
// Matrixd is 1D representation of 2D matrix
std::array < Matrixd<5,5>, 2 > M;
};
std::vector <Face> face;
I have a for-loop distributed among nodes. After all nodes finish working on their elements, I would like to exchange the corresponding elements among the nodes. But AFAIK, to use MPI_Allgatherv the data should be contiguous. First of all, I switched to a 1D representation of the 2D matrices (I was using [][] notation before). Now I want to make face.M contiguous. I am thinking of copying all elements of, say, M[0] into an std::array and transferring that among the nodes. Is this efficient? To give an idea of how much data I work with: with 20k cells, I have at most 20k*3=60k faces. I might have a million cells, too.
A true 2D array in C/C++, e.g. int foo[5][5] is already contiguous in memory; it's basically just syntactic sugar for int foo[25] where accesses like foo[3][2] implicitly look up foo[3*5 + 2] in the flat equivalent. Switching to a Matrixd defined in a single dimension won't change the actual memory layout.
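To make the flat-layout claim concrete, here is a minimal sketch that measures where foo[row][col] sits relative to the start of the array. The helper name byte_offset is made up for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: distance in bytes from the start of a 5x5 int array
// to the element foo[row][col].
std::size_t byte_offset(const int (&foo)[5][5], std::size_t row, std::size_t col) {
    const char* base = reinterpret_cast<const char*>(&foo);
    const char* elem = reinterpret_cast<const char*>(&foo[row][col]);
    return static_cast<std::size_t>(elem - base);
}
```

For foo[3][2] the result should equal (3*5 + 2) * sizeof(int), i.e. the same position as element 17 of a flat int[25].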
std::array is (mostly) just a wrapper for C-style arrays as well; with no virtual members, and compile time defined size with no internal pointers (just the raw array), it's also going to be contiguous. I strongly suspect if you checked the assembly produced, you'd find that the array of Matrixds is already contiguous.
In short, I don't think you need to change anything; you're already contiguous, so MPI should be fine.
Let's say I have a 2D array and I want to pass its i-th column to a sort function that takes a 1D array and sorts it.
Can this be done without copying the column to another array in C/C++? I am concerned about reducing the time and space used. (Of course, the complexity remains the same.)
I suppose that by sort you mean std::sort from STL, which takes random access iterators. So all you need to do is provide column iterators.
You can either implement one yourself, use an iterator library (e.g. Boost.Iterator), or use a matrix implementation that provides row/column iterators.
If you can write your own sort function, it's rather easy; you just make the interface like this:
void Sort (T a [], size_t n, size_t stride);
The key is in the stride parameter, which is the distance between the elements of this "virtual" array. For example, if you have a float x [10][20]; and you want to send its column #2, you'd do this: (some casts omitted for clarity)
Sort (x[0] + 2, 10, 20); // Usually, stride is the width of the 2D array
Inside the Sort function, you access the ith element of an array that has a stride like this:
a[i * stride] = 42;
That's it.
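For completeness, here is one way such a Sort could look. This is only a sketch: insertion sort is picked for brevity, not because the interface requires it, and any in-place algorithm can be adapted the same way by multiplying every index by stride.

```cpp
#include <cstddef>

// Sorts a "virtual" array whose logical element i lives at a[i * stride].
// Insertion sort is an arbitrary choice made to keep the example short.
template <typename T>
void Sort(T a[], std::size_t n, std::size_t stride) {
    for (std::size_t i = 1; i < n; ++i) {
        T key = a[i * stride];
        std::size_t j = i;
        // Shift larger elements one logical slot to the right.
        while (j > 0 && key < a[(j - 1) * stride]) {
            a[j * stride] = a[(j - 1) * stride];
            --j;
        }
        a[j * stride] = key;
    }
}
```

With float x[10][20], the call Sort(x[0] + 2, 10, 20) would then sort column #2 in place and leave the other columns untouched.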
You can use the same principle to write your own MatrixColumnView class that wraps up this concept and can be passed into templated functions that take arrays.
If you want to work with STL or STL-like libraries, you can simply write your own MatrixColumnIterator iterator class that uses a stride internally and iterates over a column of a 2D array.
As far as I know, multidimensional array storage in C/C++ is actually a 1D array,
which is explained very well in this post: How to get column of a multidimensional array in C/C++?
Therefore I do not think there's any default / easy method to extract a particular column of a 2D array and pass it to another function.
Simple question:
For my assignment I am asked to count the words in a file and keep track of their frequency. I am to create a parallel int array for the frequency.
Is a parallel array a special data structure, or does it simply mean I am creating 2 arrays, where one is dependent on the other? For example, I create 2 dynamic arrays and update both inside the loop with respect to my i variable from the for loop.
A parallel array is basically what you posit in your question. It's two distinct arrays connected by the index.
For example, a parallel array counting frequencies of temperatures may be:
int tempVal [100];
size_t tempCount[100];
and the temperature value at index 42 has a frequency given by tempCount[42].
Purists will argue (and they do have a point) that it's better to provide a single array of a structure such as:
typedef struct {
int val;
size_t count;
} tFreq;
tFreq tempFreq[100];
and C++ has collections that will do this for you, such as std::pair. But, if your assignment specifically calls for parallel arrays, I suspect std::pair would not be considered thus.
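Applied to the word-counting assignment, the parallel-array idea might look like the sketch below. The function name count_word and its parameters are invented for this example, and the linear search is only there to keep it short.

```cpp
#include <cstddef>
#include <string>

// Parallel arrays: words[i] is a distinct word, freq[i] is how often it was seen.
// Returns false when a new word does not fit into the arrays any more.
bool count_word(std::string words[], int freq[], std::size_t& used,
                std::size_t capacity, const std::string& w) {
    for (std::size_t i = 0; i < used; ++i) {
        if (words[i] == w) {   // already known: bump the matching counter
            ++freq[i];
            return true;
        }
    }
    if (used == capacity) return false;
    words[used] = w;           // new word: the same index is used in both arrays
    freq[used] = 1;
    ++used;
    return true;
}
```

The only thing tying the two arrays together is the shared index, which is exactly what makes them "parallel".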
There isn't a parallel array data structure as such.
You can create two arrays and address them in parallel.
There are some alternatives, such as creating an array of std::pair, or (probably the "right" one for the task at hand) an std::unordered_map (or possibly an std::map instead).
No structure is special, it's always composed of primitives.
I have an assignment in which I must read a list of 4000 names from a text file and sort them into a C-style array as they're being read in (rather than reading them all and then sorting). Since this is going to involve a lot of elements changing indices, would it be possible to use bitshifting to rearrange large quantities of elements simultaneously? For example:
declare a heap-based array of size 20
place variable x at index 10
perform a bitshift on index 9 by the size of the array's data type so that x is now at index 11
Also, if you have any tips on the task in general I'd appreciate it.
No, that doesn't sound at all like something you'd use bitshifting for.
You will have distinct elements (the names) stored in an array, and you need to change the order of entire elements. This is not what bitshifting is used for; it is used to move the bits in a single integer to the left or to the right.
You should just learn qsort().
Not sure about the "sort as they're being read in" requirement, but the easiest solution would be to just call qsort() as each name is added. If that's not allowed or deemed too expensive, think about how to do a "sorted insert" against an array.
By the way, a typical approach in C would be to work with an array of pointers to strings, rather than an array of actual strings. This is good, since sorting an array of pointers is much easier.
So you would have:
char *names[4000];
instead of
char names[4000][64 /* or whatever */];
This would require you to dynamically allocate space for each name as it's loaded though, which isn't too hard. Especially not if you have strdup(). :)
If using qsort() is not allowed (and it would be pretty stupid to call it after every insert anyway), you could write your own insertion sort. It's not a very efficient way of sorting large arrays, but I suppose it's what your teacher is expecting.
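A "sorted insert" into an array of string pointers could be sketched as follows. The name sorted_insert is made up, and the code assumes the caller has already made a private copy of each name (for example with strdup()) and that the array has room for one more pointer.

```cpp
#include <cstddef>
#include <cstring>

// Inserts name into names[0..count), which must already be sorted by strcmp.
// The caller guarantees room for count + 1 pointers. Returns the new count.
std::size_t sorted_insert(const char* names[], std::size_t count, const char* name) {
    std::size_t pos = count;
    // Walk left past every entry that sorts after the new name.
    while (pos > 0 && std::strcmp(names[pos - 1], name) > 0) {
        --pos;
    }
    // Open a gap by moving the tail of the pointer array one slot right.
    std::memmove(&names[pos + 1], &names[pos], (count - pos) * sizeof names[0]);
    names[pos] = name;
    return count + 1;
}
```

Only pointers are moved, never the characters of the names themselves, which is why the array-of-pointers layout makes this cheap. Calling it once per line read keeps the array sorted at all times.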
This is my little big question about containers, in particular, arrays.
I am writing a physics code that mainly manipulates a big (> 1 000 000) set of "particles" (with 6 double coordinates each). I am looking for the best way (in terms of performance) to implement a class that will contain a container for these data and that will provide manipulation primitives for these data (e.g. instantiation, operator[], etc.).
There are a few restrictions on how this set is used:
its size is read from a configuration file and won't change during execution
it can be viewed as a big two dimensional array of N (e.g. 1 000 000) lines and 6 columns (each one storing the coordinate in one dimension)
the array is manipulated in a big loop, each "particle / line" is accessed and computation takes place with its coordinates, and the results are stored back for this particle, and so on for each particle, and so on for each iteration of the big loop.
no new elements are added or deleted during the execution
First conclusion, as the access on the elements is essentially done by accessing each element one by one with [], I think that I should use a normal dynamic array.
I have explored a few things, and I would like to have your opinion on the one that can give me the best performances.
As I understand there is no advantage to use a dynamically allocated array instead of a std::vector, so things like double** array2d = new ..., loop of new, etc are ruled out.
So is it a good idea to use std::vector<double> ?
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > my_array that can be indexed like my_array[i][j], or is it a bad idea and it would be better to use std::vector<double> other_array and access it with other_array[6*i+j].
Maybe this can give better performance, especially as the number of columns is fixed and known from the beginning.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
Another option, the one that I am using so far is to use Blitz, in particular blitz::Array:
typedef blitz::Array<double,TWO_DIMENSIONS> store_t;
store_t my_store;
Where my elements are accessed like that: my_store(line, column);.
I think there is not much advantage in using Blitz in my case, because I am accessing each element one by one; Blitz would be interesting if I were using operations directly on arrays (like matrix multiplication), which I am not.
Do you think that Blitz is OK, or is it useless in my case ?
These are the possibilities I have considered so far, but maybe the best one is still another one, so don't hesitate to suggest other things.
Thanks a lot for your help on this problem !
Edit:
From the very interesting answers and comments below, a good solution seems to be the following:
Use a structure particle (containing 6 doubles) or a static array of 6 doubles (this avoids the use of two dimensional dynamic arrays)
Use a vector or a deque of this particle structure or array. It is then good to traverse them with iterators, and that will allow to change from one to another later.
In addition I can also use a Blitz::TinyVector<double,6> instead of a structure.
So is it a good idea to use std::vector<double> ?
Usually, a std::vector should be the first choice of container. You could use either std::vector<>::reserve() or std::vector<>::resize() to avoid reallocations while populating the vector. Whether any other container is better can be found by measuring. And only by measuring. But first measure whether anything the container is involved in (populating, accessing elements) is worth optimizing at all.
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > [...]?
No. IIUC, you are accessing your data per particle, not per row. If that's the case, why not use a std::vector<particle>, where particle is a struct holding six values? And even if I understood incorrectly, you should rather write a two-dimensional wrapper around a one-dimensional container. Then align your data either in rows or columns - what ever is faster with your access patterns.
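A two-dimensional wrapper around a one-dimensional container can be quite small. The class below is a sketch with an invented name (Grid2D) and a row-major layout chosen as an assumption. Because the accessors are defined inline, a compiler will normally inline them, so no function-call overhead should remain at each access.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper: rows x cols doubles in one contiguous std::vector,
// stored row by row.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols) {}

    // g(i, j) maps to the flat index i * cols + j.
    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return cols_ ? data_.size() / cols_ : 0; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t cols_;
    std::vector<double> data_;
};
```

Switching to a column-major layout would only mean changing the index expression inside the two operator() overloads; calling code would stay the same.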
Do you think that Blitz is OK, or is it useless in my case?
I have no practical knowledge about blitz++ and the areas it is used in. But isn't blitz++ all about expression templates to unroll loop operations and optimizing away temporaries when doing matrix manipulations? ICBWT.
First of all, you don't want to scatter the coordinates of one given particle all over the place, so I would begin by writing a simple struct:
struct Particle { /* coords */ };
Then we can make a simple one dimensional array of these Particles.
I would probably use a deque, because that's the default container, but you may wish to try a vector; it's just that 1,000,000 particles of six doubles each means a single contiguous chunk of about 48 MB (assuming 8-byte doubles). It should hold, but it might strain your system if this ever grows, while the deque will allocate several chunks.
WARNING:
As Alexandre C remarked, if you go the deque road, refrain from using operator[] and prefer to use iteration style. If you really need random access and it's performance sensitive, the vector should prove faster.
The first rule when choosing from containers is to use std::vector. Then, only after your code is complete and you can actually measure performance, you can try other containers. But stick to vector first. (And use reserve() from the start)
Then, you shouldn't use an std::vector<std::vector<double> >. You know the size of your data: it's 6 doubles. No need for it to be dynamic. It is constant and fixed. You can define a struct to hold your particle members (the six doubles), or use a fixed-size array type such as std::array<double, 6>. (A raw array typedef like typedef double particle[6] is not usable as a vector element, because built-in arrays cannot be copied or assigned.) Then, use a vector of particles: std::vector<particle>.
Furthermore, as your program uses the particle data contained in the vector sequentially, you will take advantage of the modern CPU cache read-ahead feature at its best performance.
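As an illustration of that sequential access pattern, here is a minimal sketch. The meaning of the six coordinates (three positions followed by three velocities) and the update rule are assumptions made up for the example, not something stated in the question.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One particle = six doubles stored inline, so a std::vector<Particle>
// is a single contiguous block of memory.
struct Particle {
    std::array<double, 6> c;   // assumed layout: c[0..2] position, c[3..5] velocity
};

// Hypothetical update: advance every position by velocity * dt.
// The loop walks memory strictly front to back.
void step(std::vector<Particle>& particles, double dt) {
    for (Particle& p : particles) {
        for (std::size_t k = 0; k < 3; ++k) {
            p.c[k] += p.c[k + 3] * dt;
        }
    }
}
```

Since the size is known from the configuration file, the vector can be created with its final size up front, e.g. std::vector<Particle> particles(n);.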
You could go several ways. But in your case, don't declare a std::vector<std::vector<double> >. You'd be allocating a vector (and copying it around) for every 6 doubles. That's way too costly.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
(other_array[i,j] won't work too well, as i,j employs the comma operator to evaluate the value of "i", then discards that and evaluates and returns "j", so it's equivalent to other_array[j]).
You will need to use one of:
other_array[i][j]
other_array(i, j) // if other_array implements operator()(int, int),
// but std::vector<> et al don't.
other_array[i].identifier // identifier is a member variable
other_array[i].identifier() // member function getting value
other_array[i].identifier(double) // member function setting value
You may or may not prefer to put get_ and set_ or similar on the last two functions should you find them useful, but from your question I think you won't: functions are preferred in APIs between parts of large systems involving many developers, or when the data items may vary and you want the algorithms working on the data to be independent thereof.
So, a good test: if you find yourself writing code like other_array[i][3] where you've decided "3" is the double with the speed in it, and other_array[i][5] because "5" is the acceleration, then stop doing that and give them proper identifiers so you can say other_array[i].speed and .acceleration. Then other developers can read and understand it, and you're much less likely to make accidental mistakes. On the other hand, if you are iterating over those 6 elements doing exactly the same things to each, then you probably do want Particle to hold a double[6], or to provide an operator[](int). There's no problem doing both:
struct Particle
{
double x[6];
double& speed() { return x[3]; }
double speed() const { return x[3]; }
double& acceleration() { return x[5]; }
...
};
BTW / the reason that vector<vector<double> > may be too costly is that each set of 6 doubles will be allocated on the heap, and for fast allocation and deallocation many heap implementations use fixed-size buckets, so your small request will be rounded up to the next size: that may be a significant overhead. The outside vector will also need to record an extra pointer to that memory. Further, heap allocation and deallocation is relatively slow - in your case, you'd only be doing it at startup and shutdown, but there's no particular point in making your program slower for no reason. Even more importantly, the areas on the heap may be scattered around in memory, so your operator[] may incur cache misses, pulling in more distinct memory pages than necessary and slowing the entire program. Put another way, vectors store elements contiguously, but the pointed-to vectors may not be contiguous with each other.