First of all, I should say that I am a newbie in C++.
As part of my master's thesis I am writing a program in C++ that takes two integer parameters, m and d. Here d is an exponent of 2, so there are 2^d elements. The parameter m determines the number of possible interactions between one element and the whole group of 2^d elements.
The number of possible interactions is computed as follows:
\kappa = \sum_{i=0}^m\binom{d}{i}
At present I generate a vector of vectors of size 2^d x \kappa, but my professor would like me to produce separate statistics for different values of m. My first thought was to generate a dynamic array of m arrays of different sizes. Then I thought of defining a 3-dimensional array sized for the largest 2D array needed, but program speed is also important (e.g. d = 20).
I would like to ask for your advice on how to define this kind of dynamic array so that it is also fast.
Regards
Use Boost multidimensional arrays (Boost.MultiArray):
http://www.boost.org/doc/libs/1_41_0/libs/multi_array/doc/index.html
For example, look at the following code; it is very easy to use:
#include "boost/multi_array.hpp"
#include <cassert>
int main () {
// Create a 3D array that is 3 x 4 x 2
typedef boost::multi_array<double, 3> array_type;
typedef array_type::index index;
array_type A(boost::extents[3][4][2]);
// Assign values to the elements
int values = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
A[i][j][k] = values++;
// Verify values
int verify = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
assert(A[i][j][k] == verify++);
return 0;
}
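If Boost is not an option, the question's idea of one 2^d x \kappa table per value of m could be sketched with plain std::vector as below. This is only a sketch under assumptions: kappa, Table and make_tables are hypothetical names that do not appear in the question, and the element type double is a guess.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// kappa(d, m) = sum_{i=0}^{m} C(d, i), accumulated with the recurrence
// C(d, i+1) = C(d, i) * (d - i) / (i + 1), which stays exact in integers.
std::uint64_t kappa(std::uint64_t d, std::uint64_t m) {
    std::uint64_t sum = 0;
    std::uint64_t c = 1;  // C(d, 0)
    for (std::uint64_t i = 0; i <= m && i <= d; ++i) {
        sum += c;
        c = c * (d - i) / (i + 1);
    }
    return sum;
}

// Hypothetical helper type: one contiguous rows x cols block, row-major.
struct Table {
    std::size_t rows, cols;
    std::vector<double> data;
    Table(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
};

// One table per m in 0..max_m, each with 2^d rows and kappa(d, m) columns.
std::vector<Table> make_tables(unsigned d, unsigned max_m) {
    std::vector<Table> tables;
    tables.reserve(max_m + 1);
    for (unsigned m = 0; m <= max_m; ++m)
        tables.emplace_back(std::size_t{1} << d,
                            static_cast<std::size_t>(kappa(d, m)));
    return tables;
}
```

Each table is a single allocation, so rows sit next to each other in memory, unlike a vector of vectors. Memory is the main limit: for d = 20 and m = 2 one table of doubles has 2^20 x 211 entries, which is roughly 1.8 GB.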
I want to be able to concatenate nested vectors in c++ efficiently in a specific way as shown below:
std::vector<std::vector<float>> a = {{1,2,3},{6,7,8},{9,10,11}};
std::vector<float> b;
//wanted: b = {1,6,9,
// 2,7,10,
// 3,8,11};
uint32_t inside = 3;
//this variable would also be known
This result would be achieved doing the following:
uint32_t inside = 3;
std::vector<float> b(inside*a.size());
uint32_t counter = 0;
for(uint32_t i = 0; i < inside; i++){
for(uint32_t j = 0; j < a.size(); j++){
b.at(counter) = a.at(j).at(i);
counter++;
}
}
However, I would like it if I could achieve it using something faster than nested for loops, and that is more memory efficient. Something like the following, (but this obviously would not work):
std::vector<float> b;
b.reserve(inside*a.size());
for(uint32_t i = 0; i < inside; i++){
std::move(a.begin()[i], a.end()[i], std::back_inserter(b));
}
Is there anything built-in to c++ vectors that might be able to do this more efficiently than the code I have above?
~EDIT~
So to explain better, I basically want to take the first element from each vector inside a, and then add it to b. Then, I want to do that for the second element inside each vector in a, and so on. As shown with the desired result in the first block of code.
First of all: at() checks whether the index is in range. If the indices are known to be valid, we can use operator[], which performs no such check.
Next it's important to realize what the memory structure of nested vectors looks like. The inner elements will be stored sequentially so we should swap the loops, to avoid cache misses:
uint32_t inside = 3;
std::vector<float> b(inside*a.size());
for(uint32_t j = 0; j < a.size(); j++){
for(uint32_t i = 0; i < inside; i++){
b[j + i * a.size()] = a[j][i];
}
}
I didn't time it, but I suspect this should already be faster.
Finally, if the inner size is really only 3, we could unroll the inner loop altogether.
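Put together, the two suggestions above (operator[] instead of at(), and walking each inner vector sequentially) might look like the sketch below. The function name interleave is made up for illustration, and the code assumes every inner vector holds at least inside elements. The destination index is i * a.size() + j, which gives the layout requested in the question even when the outer and inner sizes differ.

```cpp
#include <cstddef>
#include <vector>

// Flattens `a` so that element i of every inner vector comes before
// element i + 1 of any inner vector (a transposed, row-major copy).
// Assumes every inner vector has at least `inside` elements.
std::vector<float> interleave(const std::vector<std::vector<float>>& a,
                              std::size_t inside) {
    const std::size_t outer = a.size();
    std::vector<float> b(inside * outer);
    for (std::size_t j = 0; j < outer; ++j) {
        const float* row = a[j].data();  // inner elements are contiguous
        for (std::size_t i = 0; i < inside; ++i) {
            b[i * outer + j] = row[i];
        }
    }
    return b;
}
```

Whether this beats the original loop order depends on the sizes involved, since the writes into b are now strided; it is worth timing both on realistic data.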
I'm focusing on 3x3 matrices for now as my code is volatile. I read the matrix from a text file and print to the console, based on its dimensions I generate the identity matrix.
const int m = 3;
const int n = 3;
int ID[m][n] = {};
for (i = 1; i <= n; ++i){
ID[i][i] = 1;
}
For some reason ID[2][3] gets printed as 4227276, so I have to force it to zero manually after the fact.
Aside from other elementary row operations like swapping rows based on leading entry position, the main chunk of my code consists of the following:
float matrix[m][n];
int i,j,k,p,s;
for(s = 1;s <= m;++s){
j = s;
k = j + 1;
p = j;
for(i = n;i >= j;--i){ // makes leading entries 1
ID[j][i] = ID[j][i]/matrix[j][j];
matrix[j][i] = matrix[j][i]/matrix[j][j];
}
for(j = k;j <= m;++j){ //converts to upper triangular
for(i = n;i >= 1;--i){
ID[j][i] = ID[j][i] - matrix[j][i]*matrix[p][i];
matrix[j][i] = matrix[j][i] - matrix[j][i]*matrix[p][i];
}
}
}
for(j = (m-1);j >= 1;--j){ //makes entries above diagonal zero
for(i = n;i > j;--i){
ID[j][i] = ID[j][i] - matrix[j][i]*matrix[i][i];
matrix[j][i] = matrix[j][i] - matrix[j][i]*matrix[i][i];
}
}
I'm basically doing to the identity matrix whatever I do to matrix[m][n] to reduce it to row echelon form as you would with the augmented matrix. The row operations are pretty haphazard as I was just doing whatever worked to make matrix[m][n] an identity matrix. Afterwards, I just slotted ID[m][n] in there... not really sure what's happening but the result is half right.
my result
right answer
I realize that the term I subtract from ID might need to be a multiple of ID, but that makes it even worse. What mistakes have I made?
In C++ the indices of an array of n elements run from 0 to n-1: the first element of the array a is a[0], the second element is a[1], ..., and the n-th element is a[n-1].
When you use the loop
for (i = 1; i <= n; ++i){
ID[i][i] = 1;
}
you skip the first element of each row and each column, and you also access memory positions that do not belong to ID (e.g. ID[n][n]), which contain unknown values.
You have to iterate over your arrays using loops such as
for (i = 0; i < n; ++i){
ID[i][i] = 1;
}
or, if you prefer,
for (i = 1; i <= n; ++i){
ID[i-1][i-1] = 1;
}
but I find the last solution quite confusing.
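For reference, here is a minimal 0-based sketch of the whole inversion, not a drop-in fix for the code in the question. It makes several assumptions of its own: the names kDim, Mat and invert are invented, both matrices use double (an inverse usually has fractional entries, which an int identity matrix cannot hold), and partial pivoting is added. The point to notice is that every row operation uses the same factor on both matrices.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <utility>

constexpr std::size_t kDim = 3;
using Mat = std::array<std::array<double, kDim>, kDim>;

// Inverts `a` by Gauss-Jordan elimination with partial pivoting.
// Returns false if a pivot is (nearly) zero, i.e. `a` looks singular.
bool invert(Mat a, Mat& inv) {
    // Start from the identity; note the 0-based loop bounds.
    for (std::size_t r = 0; r < kDim; ++r)
        for (std::size_t c = 0; c < kDim; ++c)
            inv[r][c] = (r == c) ? 1.0 : 0.0;

    for (std::size_t col = 0; col < kDim; ++col) {
        // Pick the row with the largest entry in this column as pivot row.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < kDim; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        if (std::fabs(a[pivot][col]) < 1e-12) return false;
        std::swap(a[col], a[pivot]);
        std::swap(inv[col], inv[pivot]);

        // Scale the pivot row so the leading entry becomes 1.
        const double p = a[col][col];
        for (std::size_t c = 0; c < kDim; ++c) {
            a[col][c] /= p;
            inv[col][c] /= p;
        }
        // Subtract a multiple of the pivot row from every other row,
        // using the same factor for both matrices.
        for (std::size_t r = 0; r < kDim; ++r) {
            if (r == col) continue;
            const double f = a[r][col];
            for (std::size_t c = 0; c < kDim; ++c) {
                a[r][c] -= f * a[col][c];
                inv[r][c] -= f * inv[col][c];
            }
        }
    }
    return true;
}
```

The factor f is read from the matrix being reduced before the row is modified, and then applied unchanged to the corresponding row of inv.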
I have a struct:
struct xyz{
int x,y,z;
};
and I initialize a struct xyz type vector:
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
{
for (int k = 0; k < N; k++)
{
v.x=i;
v.y=j;
v.z=k;
vect.push_back(v);
}
}
}
Then I want to convert that vector to an array, because an array is 2 times faster to manipulate than a vector, so I do
xyz arr[vect.size()];
std::copy(vect.begin(), vect.end(), arr);
When I run this program I get a segmentation fault, which I think is because vect.size() is too large.
So I am wondering whether there is any way to convert such a large vector to an array without that problem.
I would appreciate any help.
My overly pedantic comment got too big, so instead I'll try to make this a somewhat roundabout answer. The short answer is probably just to stick with vector but make sure to use reserve; oh, and benchmark.
You didn't say what compiler or C++ version you're using, so I'll just go with my current gcc.godbolt.org default of gcc 4.9.2, C++14. I'm also assuming that you really want this as a 1-dimension array, rather than the more natural (for your example) 3.
If you know N at compile time, you could do something like this (assuming I got the array offset calculation correct):
#include <array>
...
std::array<xyz, N*N*N> xyzs;
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
for (int k = 0; k < N; k++) {
xyzs[i*N*N+j*N+k] = {i, j, k};
}
}
}
The biggest downsides, IMO:
error-prone offset calculation
depending on N, where the code is run, etc, this can blow the stack
On the compilers I tried this on, the optimizers seem to understand that we're moving through the array in contiguous order, and the generated machine code is more sensible, but it could also be written like so, if you prefer:
#include <array>
...
std::array<xyz, N*N*N> xyzs;
auto p = xyzs.data();
for (int i = 0; i < N; ++i) {
for (int j = 0; j < N; ++j) {
for (int k = 0; k < N; ++k) {
(*p++) = {i, j, k};
}
}
}
Of course, if you actually know N at compile time, and it won't blow the stack, you might consider a 3-dimensional array xyz xyzs[N][N][N]; since this might be more natural for the way these things are ultimately being used.
As pointed out in comments, variable length arrays aren't legal C++, but they are legal in C99; if you don't know N at compile time you should be allocating off the heap.
A vector and an array will wind up being identical in terms of memory layout; they differ in that a vector allocates its memory from the heap, while the array (as you are writing it) would be on the stack. The only recommendation I'd make is to call reserve before entering your loop:
vect.reserve(N*N*N);
This means you'll only be doing a single memory allocation up front, rather than the grow-and-copy mechanism that you'll get from a default-constructed vector.
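As a complete sketch of that recommendation (the wrapper function make_grid is a name made up here, and it assumes n is non-negative and that n*n*n elements fit in memory):

```cpp
#include <cstddef>
#include <vector>

struct xyz {
    int x, y, z;
};

// Builds the n*n*n grid points with a single up-front allocation.
// Assumes n >= 0.
std::vector<xyz> make_grid(int n) {
    std::vector<xyz> vect;
    const auto count = static_cast<std::size_t>(n);
    vect.reserve(count * count * count);  // one allocation, no regrowth
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int k = 0; k < n; ++k)
                vect.push_back({i, j, k});
    return vect;
}
```

The elements end up in the same order as in the question, so the point (i, j, k) is at index i*n*n + j*n + k, and vect.data() gives a plain pointer to the contiguous storage if an array-style interface is needed.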
Assuming xyz is as simple as you declare here, you could also do something like the pointer-based example above:
std::vector<xyz> xyzs(N*N*N);
auto p = xyzs.data();
for (int i = 0; i < N; ++i) {
for (int j = 0; j < N; ++j) {
for (int k = 0; k < N; ++k) {
(*p++) = {i, j, k};
}
}
}
You lose the safety of push_back, and it is less efficient if xyz's default constructor needs to do anything (for example, if the xyz members were changed to have default values).
Having said all that, you really should benchmark. But then, you should probably be benchmarking the code that ultimately uses this array, rather than the code to construct it; I'd have other concerns if construction was dominating usage.
I am trying to follow the Gaussian elimination algorithm in https://courses.engr.illinois.edu/cs554/fa2015/notes/06_lu_8up.pdf in order to implement LU factorization and eventually parallelize it with OpenMP. Does the following algorithm look correct, where l is the multiplier and m is the matrix?
void decompose2(double **m) {
begin =clock();
int i=0, j=0, k=0;
for(k = 1; k < size - 1; k++)
{
for(i = k + 1; i < size; i++)
{
l[i][k] = m[i][k]/m[k][k];
}
for(j = k + 1; j < size; j++)
{
for(i = k + 1; k < size; k++)
{
m[i][j] = m[i][j] - (l[i][k]*m[k][j]);
}
}
}
end = clock();
}
I don't think it is correct because according to a different paper the times I am getting after parallelization on the same number of processors are completely different.
"Does the following algorithm look correct, …" -- No, because
arrays are 0-indexed in C++,
double[size][size] (which you are likely using) is not convertible to double**,
int is not a good type for array indices (use size_t instead),
you don't check if m[k][k] might be (close to) zero, when you might have to swap rows.
Please note that I only looked at the obvious implementation errors, not at possible ways to improve the code, e.g. increasing the numerical stability of the calculation.
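A sketch that addresses the points listed above might look like this. It is an illustration under assumptions, not the algorithm from the linked notes: the Matrix alias and the bool return value are invented here, the matrix is assumed square, and a near-zero pivot is only detected rather than handled by a row swap. Note also that the innermost loop in the question tests and increments k where i appears to be intended.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Doolittle LU factorization without pivoting, using 0-based indices.
// On success `m` holds U (upper triangular) and `l` holds the unit lower
// triangular multipliers. Returns false if a pivot is too close to zero.
// Assumes `m` is square.
bool decompose(Matrix& m, Matrix& l) {
    const std::size_t size = m.size();
    l.assign(size, std::vector<double>(size, 0.0));
    for (std::size_t i = 0; i < size; ++i) l[i][i] = 1.0;

    for (std::size_t k = 0; k + 1 < size; ++k) {
        if (std::fabs(m[k][k]) < 1e-12) return false;  // would need a row swap
        for (std::size_t i = k + 1; i < size; ++i)
            l[i][k] = m[i][k] / m[k][k];
        // Update the trailing submatrix one row at a time.
        for (std::size_t i = k + 1; i < size; ++i) {
            for (std::size_t j = k + 1; j < size; ++j)
                m[i][j] -= l[i][k] * m[k][j];
            m[i][k] = 0.0;  // eliminated entry below the pivot
        }
    }
    return true;
}
```

The row updates for different i within one step k are independent of each other, so that loop over i is the natural candidate for parallelization.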
I have a vector of N objects, and I would like to iterate through all neighbor permutations of this vector. What I call a neighbor permutation is a permutation where only two elements of the original vector would be changed :
if I have a vector with 'a','b','c','d' then :
'b','a','c','d' //is good
'a','c','b','d' //is good
'b','a','d','c' //is not good (2 permutations)
If I use std::next_permutation(myVector.begin(), myVector.end()) then I will get all the possible permutations, not only the "neighbor" ones...
Do you have any idea how that could be achieved ?
Initially, I thought I would filter out the permutations that have a Hamming distance greater than 2 from the original.
However, if you really only need to generate all the vectors resulting by swapping one pair, it would be more efficient if you do like this:
for(int i = 0; i < n; i++)
for(int j = i + 1; j < n; j++)
// swap i and j
Depending on whether you need to collect all the results or not, you should either make a copy of the vector before the swap, or swap i and j back after you have processed the current permutation.
Collect all the results:
std::vector< std::vector<T> > neighbor_permutations;
for(int i = 0; i < n; i++) {
for(int j = i + 1; j < n; j++) {
std::vector<T> perm(v);
std::swap(perm[i], perm[j]);
neighbor_permutations.push_back(perm);
}
}
Faster version - do not collect results:
for(int i = 0; i < n; i++) {
for(int j = i + 1; j < n; j++) {
std::swap(v[i], v[j]);
process_permutation(v);
std::swap(v[i], v[j]);
}
}
Perhaps it's a good idea to divide this into two parts:
How to generate the "neighbor permutations"
How to iterate over them
Regarding the first, it's easy to write a function:
std::vector<T> make_neighbor_permutation(
const std::vector<T> &orig, std::size_t i, std::size_t j);
which returns a copy of orig with the elements at positions i and j swapped. I did not understand from your question whether there is an additional constraint that j = i + 1, in which case you could drop a parameter.
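One possible body for that function is sketched below, written as a template because the declaration above leaves T unspecified. It assumes i and j are valid indices and does no bounds checking.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns a copy of `orig` with the elements at positions i and j swapped.
// Assumes i and j are valid indices into `orig`.
template <typename T>
std::vector<T> make_neighbor_permutation(const std::vector<T>& orig,
                                         std::size_t i, std::size_t j) {
    std::vector<T> result(orig);
    std::swap(result[i], result[j]);
    return result;
}
```

Because the original is taken by const reference and copied, the caller's vector is left untouched, at the cost of one copy per generated permutation.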
Armed with this function, you now need an iterator that iterates over all legal combinations of i and j (again, I'm not sure of the interpretation of your question. It might be that there are n - 1 values).
This is very easy to do using boost::iterator_facade. You simply need to define an iterator that takes in the constructor your original iterator, and sets i (and possibly j) to initial values. As it is incremented, it needs to update the index (or indices). The dereference method needs to call the above function.
Another way to do it; this is just an attempt.
#include <algorithm>
#include <iostream>
#include <vector>
int main()
{
std::vector<char> vec={'b','a','c','d'};
std::vector<int> vec_in={1,1,0,0};
do{
auto it =std::find(vec_in.begin(),vec_in.end(),1);
if( *(it++) ==1)
{
for(auto &x : vec)
{
std::cout<<x<<" ";
}
std::cout<<"\n";
}
} while(std::next_permutation(vec_in.begin(),vec_in.end()),
std::next_permutation(vec.begin(),vec.end()) );
}