I want to dynamically allocate an array in a for loop using pointers. As the loop proceeds, the size of the array should grow by one and a new element should be added. The usual method uses the new operator, but that always allocates a fixed amount of memory at the point of allocation. Is there any way to do this?
I tried the following code (simplified to explain the problem):
sameCombsCount = 0;
int **matchedIndicesArray;
for(int i = 0; i<1000; i++) //loop condition a variable
{
sameCombsCount++;
matchedIndicesArray = new int*[sameCombsCount]; // ??
// Now add an element in the new block created...
}
The thing is, I do not know how many times the loop will run until execution time; it can vary depending on execution conditions and the inputs given. I don't think this is the correct way to do it. Can someone suggest a better way?
std::vector handles the resizing for you:
sameCombsCount = 0;
std::vector<int> matchedIndicesArray;
for(int i = 0; i<1000; i++) //loop condition a variable
{
sameCombsCount++;
#if 0
matchedIndicesArray.resize(sameCombsCount);
matchedIndicesArray.back() = someValue;
#else
matchedIndicesArray.push_back(someValue);
#endif
}
The first version does what you wanted and resizes the vector then sets the value. The second version just adds the element directly at the end of the array and should be marginally more efficient.
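For a case where the number of elements really is unknown before the loop runs, here is a minimal sketch. The function name collectMatchedIndices and the "divisible by step" condition are made up for illustration; they stand in for whatever condition decides that an element should be added.

```cpp
#include <vector>

// Hypothetical helper: collects every index in [0, limit) that is divisible
// by `step`. How many indices match is not known before the loop runs.
// Assumes step != 0.
std::vector<int> collectMatchedIndices(int limit, int step) {
    std::vector<int> matched;          // starts empty, no size given up front
    for (int i = 0; i < limit; ++i) {
        if (i % step == 0) {
            matched.push_back(i);      // the vector grows by one element here
        }
    }
    return matched;                    // matched.size() is the final count
}
```

A separate counter such as sameCombsCount is then no longer needed, because size() on the returned vector gives the same number.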
In the constructor of a class called Stamp I have two vectors:
vector<double> data has a size of data.size()
vector<vector <double>> collectedData has a size of [nsamples][data.size()/nsamples]
I need to cycle them in order to have something like this:
collectedData[0][0->nsamples] has the first "nsamples" elements of data
'collectedData[1][0->nsamples]' has the second "nsamples" elements of data
...
'collectedData[i][0->nsamples]' has the i'th "nsamples" elements of data (and '0' if I went over data.size())
This is the C++ code I'm trying, but I get a segmentation fault, and I can't tell whether it is an algorithmic problem or a misuse of vector:
Stamp(vector<double> data, int nsamples) : data(data), nsamples(nsamples){
long ll = data.size()/nsamples;
int row;
//Reserve space in collectedData:
collectedData.reserve(nsamples);
for (int i=0;i<nsamples;i++){
collectedData[i].reserve(ll);
}
for (int i=0;i<=data.size();i+=nsamples){
for (int j=0;(j<=nsamples)&&((row*i)<data.size());j++){
collectedData[i].push_back(data[i]);
}
row++;
}
}
Firstly, the reserve() function only reserves space without actually adding elements. You should use resize() to add (or remove) elements.
Secondly, the condition i<=data.size() in the for loop is wrong. You cannot use data[data.size()]. It should be i<data.size().
Thirdly, there is no guarantee that data.size() <= nsamples, and the second for loop indexes collectedData[i] with i up to data.size() - 1, so you should resize again to avoid out-of-range access.
Finally, as @molbdnilo pointed out in the comments, the value of the non-static local variable row is used without initialization. You should assign a proper value to the variable before using it.
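To make the reserve()/resize() difference from the first point concrete, here is a small sketch. The struct SizeAndCapacity and the two function names are invented for this illustration only.

```cpp
#include <cstddef>
#include <vector>

// Illustrative pair of values so both numbers can be returned together.
struct SizeAndCapacity {
    std::size_t size;
    std::size_t capacity;
};

// reserve() only allocates storage: the vector still has zero elements,
// so indexing it with operator[] would be out of range.
SizeAndCapacity afterReserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    return {v.size(), v.capacity()};
}

// resize() actually creates n value-initialized elements, so v[0] .. v[n-1]
// are valid afterwards.
SizeAndCapacity afterResize(std::size_t n) {
    std::vector<int> v;
    v.resize(n);
    return {v.size(), v.capacity()};
}
```

After reserve(5) the size is still 0 while the capacity is at least 5; after resize(5) the size is 5.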
Stamp(vector<double> data, int nsamples) : data(data), nsamples(nsamples){
long ll = data.size()/nsamples;
int row = 0; // initialize row
//Reserve space in collectedData:
if (collectedData.size() < nsamples){
collectedData.resize(nsamples); // use resize() to add elements
}
if (collectedData.size() < data.size()){
collectedData.resize(data.size()); // use resize() again to allocate enough size
}
for (int i=0;i<nsamples;i++){
collectedData[i].reserve(ll); // using reserve() here because elements are added via push_back() later
}
for (int i=0;i<data.size();i+=nsamples){ // use correct condition
for (int j=0;(j<=nsamples)&&((row*i)<data.size());j++){
collectedData[i].push_back(data[i]);
}
row++;
}
}
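If the goal is what the question describes (row i holds the i'th group of nsamples values, padded with 0 past the end of data), one way to sketch it is shown here. This is not the code above; splitIntoRows is a hypothetical free function, and it assumes every row should have exactly nsamples columns.

```cpp
#include <cstddef>
#include <vector>

// Splits `data` into consecutive rows of `nsamples` values each.
// The last row is padded with 0.0 if data.size() is not a multiple
// of nsamples. Returns an empty result when nsamples is 0.
std::vector<std::vector<double>> splitIntoRows(const std::vector<double>& data,
                                               std::size_t nsamples) {
    std::vector<std::vector<double>> rows;
    if (nsamples == 0) {
        return rows;
    }
    // Round up so a partial final group still gets its own row.
    const std::size_t rowCount = (data.size() + nsamples - 1) / nsamples;
    rows.assign(rowCount, std::vector<double>(nsamples, 0.0));
    for (std::size_t k = 0; k < data.size(); ++k) {
        rows[k / nsamples][k % nsamples] = data[k];
    }
    return rows;
}
```

Because every row is created with its final size before any element is written, no index can run past the end of a row or of the outer vector.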
I'm working with a huge amount of data stored in an array, and am trying to optimize the amount of time it takes to access and modify it. I'm using Windows, C++ and VS2015 (Release mode).
I ran some tests and don't really understand the results I'm getting, so I would love some help optimizing my code.
First, let's say I have the following class:
class foo
{
public:
int x;
foo()
{
x = 0;
}
void inc()
{
x++;
}
int X()
{
return x;
}
void addX(int &_x)
{
_x++;
}
};
I start by initializing 10 million pointers to instances of that class into a std::vector of the same size.
#include <vector>
int count = 10000000;
std::vector<foo*> fooArr;
fooArr.resize(count);
for (int i = 0; i < count; i++)
{
fooArr[i] = new foo();
}
When I run the following code, and profile the amount of time it takes to complete, it takes approximately 350ms (which, for my purposes, is far too slow):
for (int i = 0; i < count; i++)
{
fooArr[i]->inc(); //increment all elements
}
To test how long it takes to increment an integer that many times, I tried:
int x = 0;
for (int i = 0; i < count; i++)
{
x++;
}
Which returns in <1ms.
I thought maybe the number of integers being changed was the problem, but the following code still takes 250ms, so I don't think it's that:
for (int i = 0; i < count; i++)
{
fooArr[0]->inc(); //only increment first element
}
I thought maybe the array index access itself was the bottleneck, but the following code takes <1ms to complete:
int x;
for (int i = 0; i < count; i++)
{
x = fooArr[i]->X(); //set x
}
I thought maybe the compiler was doing some hidden optimizations on the loop itself for the last example (since the value of x will be the same during each iteration of the loop, so maybe the compiler skips unnecessary iterations?). So I tried the following, and it takes 350ms to complete:
int x;
for (int i = 0; i < count; i++)
{
fooArr[i]->addX(x); //increment x inside foo function
}
So that one was slow again, but maybe only because I'm incrementing an integer with a pointer again.
I tried the following too, and it returns in 350ms as well:
for (int i = 0; i < count; i++)
{
fooArr[i]->x++;
}
So am I stuck here? Is ~350ms the absolute fastest that I can increment an integer, inside of 10million pointers in a vector? Or am I missing some obvious thing? I experimented with multithreading (giving each thread a different chunk of the array to increment) and that actually took longer once I started using enough threads. Maybe that was due to some other obvious thing I'm missing, so for now I'd like to stay away from multithreading to keep things simple.
I'm open to trying containers other than a vector too, if it speeds things up, but whatever container I end up using, I need to be able to easily resize it, remove elements, etc.
I'm fairly new to c++ so any help would be appreciated!
Let's look from the CPU point of view.
Incrementing an integer means I have it in a CPU register and just increment it. This is the fastest option.
If instead I'm given an address (vector->member), I must copy the value to a register, increment it, and copy the result back to that address. Worse: my CPU cache is filled with the vector's pointers, not with the members they point to. Too few hits, too much cache "refueling".
If I could manage to have all those members just in a vector, CPU cache hits would be much more frequent.
Try the following:
int count = 10000000;
std::vector<foo> fooArr;
fooArr.resize(count, foo());
for (auto it= fooArr.begin(); it != fooArr.end(); ++it) {
it->inc();
}
The new is killing you, and you don't actually need it, because resize appends elements if the requested size is greater than the current one (check the docs: std::vector::resize).
The other thing is the use of pointers, which IMHO should be avoided until the last moment and is unnecessary in this case. Performance should be a bit better without them, since you get better locality of reference (see cache locality). If the objects were polymorphic or something more complicated, it might be different.
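To compare the two layouts on your own machine, a rough harness could look like the sketch below. Counter is a stand-in for foo, and the function names are made up; no timings are claimed here, since they depend on the compiler, optimization flags and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the class from the question.
struct Counter {
    int x = 0;
    void inc() { ++x; }
};

// Objects stored by value: all Counter instances sit in one contiguous block.
long long incrementByValue(std::size_t count) {
    std::vector<Counter> items(count);
    for (Counter& c : items) c.inc();
    long long total = 0;
    for (const Counter& c : items) total += c.x;
    return total;  // returned so the work is observable
}

// Objects stored behind pointers: each Counter is a separate heap allocation.
long long incrementThroughPointers(std::size_t count) {
    std::vector<std::unique_ptr<Counter>> items;
    items.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        items.push_back(std::unique_ptr<Counter>(new Counter()));
    }
    for (auto& p : items) p->inc();
    long long total = 0;
    for (const auto& p : items) total += p->x;
    return total;
}

// Returns the wall-clock time in milliseconds taken by `work()`.
template <typename Work>
double millisecondsFor(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling millisecondsFor with a lambda that runs each function on the same count, and using the returned totals, keeps the optimizer from discarding the loops entirely. Note that both functions also time the allocation, so the increment loop alone would need to be measured separately if that is the only part of interest.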
I have two large 2D arrays, each hundreds by hundreds of cells, and one big loop that repeats an operation several times. Inside it there are three loops: the first stores in arr1 the sum of the cells in arr2 multiplied by a number, the second streams the two arrays to a file, and the third stores in arr2 the sum of the two arrays divided by a number.
The code explains better:
for(int i=1;i<x+1;i++) {//initialize
for(int j=1;j<y+1;j++) {
arr1[i][j]=i*j*5.5;
arr2[i][j]=0.;
}
}
for (int i=0;i<x+2;i++) {//padding
vi[i][0]=5;
vi[i][y+1]=-5;
}
for (int j=0;j<y+2;j++) {//padding
vi[0][j]=10.;
vi[x+1][j]=-10.;
}
for(int t=0;t<times;++t) {
for(int i=1;i<x+1;++i) {
for(int j=1;j<y+1;j++) {
arr2[i][j]=(arr1[i+1][j]+arr1[i-1][j]+arr1[i][j-1]+arr1[i][j+1])*1.5;
}
}
arr2[1][1]=arr2[1][y]=arr2[x][1]=arr2[x][y]=0.;
for(int i=1;i<x+1;++i) {
for(int j=1;j<y+1;j++) {
arr1[i][j]=(arr1[i][j]+arr2[i][j])*0.5;
if(arr2[i][j]+arr1[i][j]>5.)
cout<<"\n"<<t<<" "<<i-1<<" "<<j-1<<" "<<arr1[i][j]<<" "<<arr2[i][j];
}
}
}
The whole code takes more than 14 s to run. How can I optimize it to run as fast as possible?
You could use a third array to temporarily store the values of arr2 for the next run.
After the first loop is done, you overwrite arr2 with the temporary array; that way you don't need the second loop, which should save roughly half of the time.
for (n=0;n<x;n++)
{
for (i=0;i<maxi;i++)
{
for (j=0;j<maxj;j++)
{
arr1[i][j]=(arr2[i+1][j]+arr2[i-1][j]+arr2[i][j+1]+arr2[i][j-1])*1.5;
arr_tmp[i][j] = (arr1[i][j]+arr2[i][j])*0.5;
}
}
arr2 = arr_tmp;
}
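If the arrays are std::vector based, the final assignment can be replaced by a swap, which exchanges the two buffers in constant time instead of copying every cell. The sketch below is an assumption-laden illustration, not the OP's computation: the Grid alias, the function name relax and the 0.25 averaging weights are all invented here, and the border cells are left untouched.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Runs `steps` passes in which every interior cell becomes the average of
// its four neighbours. Border cells keep their original values.
void relax(Grid& grid, int steps) {
    if (grid.size() < 3 || grid[0].size() < 3) {
        return;  // no interior cells to update
    }
    Grid next = grid;  // second buffer with the same shape and border
    for (int s = 0; s < steps; ++s) {
        for (std::size_t i = 1; i + 1 < grid.size(); ++i) {
            for (std::size_t j = 1; j + 1 < grid[i].size(); ++j) {
                next[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                     grid[i][j - 1] + grid[i][j + 1]);
            }
        }
        std::swap(grid, next);  // O(1): only internal pointers are exchanged
    }
}
```

After each swap, grid holds the newest values and next becomes the scratch buffer for the following pass.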
Note: The OP's code has changed dramatically in response to comments about padding and such. There wasn't really anything wrong with the original code -- which is what I have based this answer on.
Assuming that your 2D arrays are indexed row-major (the first index is the row, and the second index is the column), your memory accesses are already in the correct order for best cache utilization (you are accessing nearby elements as you progress). Your latest code calls this assumption into question, since you seem to have renamed 'maxi' to 'x', which would suggest that you are indexing a column-major 2D array (which is very non-standard for C/C++).
It wasn't specified how you were declaring your 2D arrays, and that could make a difference, but I got a big improvement by converting your implementation to use raw pointers. I also eliminated the second loop (from your original post) by combining the operations and alternating the direction for each iteration. I changed the weighting coefficients so that they added up to 1.0 so that I could test this more easily (by generating an image output).
typedef std::vector< std::vector<double> > Array2D;
void run( int x, Array2D & arr2 )
{
Array2D temp = arr2; // easy way to create temporary array of the correct size
int maxi=arr2.size(), maxj=arr2[0].size();
for (int n=0;n<x;n++)
{
Array2D const & src = (n&1)?temp:arr2; // alternate direction
Array2D & dst = (n&1)?arr2:temp;
for (int i=1;i<maxi-1;i++)
{
double const * sp0=&src[i-1][1], * sp1=&src[i][1], * sp2=&src[i+1][1];
double * dp=&dst[i][1];
for (int j=1;j<maxj-1;j++)
{
dp[0]=(sp0[0]+sp1[-1]+4*sp1[0]+sp1[+1]+sp2[0])*0.125;
dp++, sp0++, sp1++, sp2++;
}
}
}
if ( (x&1) ) arr2=temp; // copy the result back if the iteration count was odd
} /**/
Other things you could look into (somewhat platform-dependent):
restrict keyword for pointers (not standard C++)
prefetch requests -- a compiler/processor specific way of reducing memory access latency
make sure you have enabled optimizations when you compile
depending on the size of the array, you might find it advantageous to columnize your algorithm to make better use of available cache
Take advantage of available compute resources (very platform-dependent):
Create a SIMD-based implementation
Take advantage of your multi-core CPU -- OpenMP
Take advantage of your GPU -- OpenCL
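Since the way the 2D arrays are declared can matter, one option worth trying is a single contiguous row-major buffer, so that a whole row is adjacent in memory and the next row follows directly. The class FlatGrid and the helper sumAll below are hypothetical names for a minimal sketch of that layout; whether it helps in practice has to be measured.

```cpp
#include <cstddef>
#include <vector>

// A rows x cols grid stored in one contiguous std::vector, row after row.
class FlatGrid {
public:
    FlatGrid(std::size_t rows, std::size_t cols, double fill = 0.0)
        : rows_(rows), cols_(cols), cells_(rows * cols, fill) {}

    // Row-major indexing: element (r, c) lives at offset r * cols + c.
    double& at(std::size_t r, std::size_t c) { return cells_[r * cols_ + c]; }
    double at(std::size_t r, std::size_t c) const { return cells_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> cells_;
};

// Walks the grid in storage order (inner loop over columns).
double sumAll(const FlatGrid& g) {
    double total = 0.0;
    for (std::size_t r = 0; r < g.rows(); ++r) {
        for (std::size_t c = 0; c < g.cols(); ++c) {
            total += g.at(r, c);
        }
    }
    return total;
}
```

Keeping the column index in the inner loop matches the storage order, which is the access pattern the answer above describes as cache friendly.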
I am trying to insert data into a leaf node (an array) of a B-Tree. Here is the code I have so far:
void LeafNode::insertCorrectPosLeaf(int num)
{
for (int pos=count; pos>=0; pos--) // goes through values in leaf node
{
if (num < values[pos-1]) // if inserting num < previous value in leaf node
{continue;} // continue searching for correct place
else // if inserting num >= previous value in leaf node
{
values[pos] = num; // inserts in position
break;
}
}
count++;
} // insertCorrectPos()
Before the line values[pos] = num, I think I need to write some code that shifts the existing data instead of overwriting it. I am trying to use memmove but have a question about it. Its third parameter is the number of bytes to copy. If I am moving a single int on a 64 bit machine, does this mean I would put a "4" here? If I am going about this completely wrong, any help would be greatly appreciated. Thanks
The easiest way (and probably the most efficient) would be to use one of the standard library's predefined structures to implement "values". I suggest either list or vector, because both have an insert function that does the shifting for you. I suggest the vector class specifically because it has the same kind of interface that an array has. However, if you want to optimize for the speed of this action specifically, then I would suggest the list class because of the way it is implemented.
If you would rather do it the hard way, then here goes...
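A minimal sketch of the vector route, assuming the values are kept in ascending order. The function name insertKeepingOrder is invented for this example; std::upper_bound and std::vector::insert are the standard library calls doing the work.

```cpp
#include <algorithm>
#include <vector>

// Inserts `num` into an already sorted vector so that it stays sorted.
// Equal values end up after the existing ones.
void insertKeepingOrder(std::vector<int>& values, int num) {
    // Binary search for the first element greater than num.
    std::vector<int>::iterator pos =
        std::upper_bound(values.begin(), values.end(), num);
    // insert() shifts the tail one slot to the right and grows the vector.
    values.insert(pos, num);
}
```

For a fixed-capacity B-tree leaf you would still need to check the node's capacity before calling it.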
First, you need to make sure that you have the space to work in. You can either allocate dynamically:
int *values = new int[size];
or statically
int values[MAX_SIZE];
If you allocate statically, then you need to make sure that MAX_SIZE is some gigantic value that you will never ever exceed. Furthermore, you need to check the actual size of the array against the amount of allocated space every time you add an element.
if (size < MAX_SIZE-1)
{
// add an element
size++;
}
If you allocate dynamically, then you need to reallocate the whole array every time you add an element.
int *temp = new int[size+1];
for (int i = 0; i < size; i++)
temp[i] = values[i];
delete [] values;
values = temp;
temp = NULL;
// add the element
size++;
When you insert a new value, you need to shift every value over.
int temp = 0;
for (i = 0; i < size+1; i++)
{
if (values[i] > num || i == size)
{
temp = values[i];
values[i] = num;
num = temp;
}
}
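On the memmove question itself: the third argument is a byte count, so it should be the number of elements being moved times sizeof(int) rather than a hard-coded 4 (int is commonly 4 bytes on 64 bit platforms, but the standard does not guarantee it). Here is a sketch of the shift done that way; insertSorted is a hypothetical free function, and it assumes the caller guarantees room for at least count + 1 elements.

```cpp
#include <cstddef>
#include <cstring>

// Inserts `num` into the sorted range values[0 .. count) and returns the new
// element count. The array must have capacity for at least count + 1 ints.
std::size_t insertSorted(int* values, std::size_t count, int num) {
    // Find the first position whose value is greater than num.
    std::size_t pos = 0;
    while (pos < count && values[pos] <= num) {
        ++pos;
    }
    // Shift the tail [pos, count) one slot to the right. memmove is used
    // (not memcpy) because source and destination overlap.
    std::memmove(values + pos + 1, values + pos, (count - pos) * sizeof(int));
    values[pos] = num;
    return count + 1;
}
```

For example, with int v[5] = {1, 3, 5, 7, 0} and count 4, inserting 4 moves 5 and 7 one slot to the right and leaves {1, 3, 4, 5, 7}.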
Keep in mind that this is not at all optimized. A truly magical implementation would combine the two allocation strategies by dynamically allocating more space than you need, then growing the array by blocks when you run out of space. This is exactly what the vector implementation does.
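The block-growth idea can be sketched by hand as follows. IntBuffer is a made-up class for illustration, with an arbitrary starting capacity of 4 and a doubling policy; real std::vector implementations choose their own growth factor.

```cpp
#include <algorithm>
#include <cstddef>

// A minimal growable int array that over-allocates and doubles its capacity
// when full, so most appends do not reallocate.
class IntBuffer {
public:
    IntBuffer() : data_(nullptr), size_(0), capacity_(0) {}
    ~IntBuffer() { delete[] data_; }

    // Copying is disabled to keep the sketch short and avoid double deletes.
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    void append(int value) {
        if (size_ == capacity_) {
            grow();
        }
        data_[size_++] = value;
    }

    int operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocates a larger block, copies the existing elements, frees the old one.
    void grow() {
        const std::size_t newCapacity = (capacity_ == 0) ? 4 : capacity_ * 2;
        int* bigger = new int[newCapacity];
        std::copy(data_, data_ + size_, bigger);
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
    }

    int* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

With this policy, appending five values triggers two allocations (capacity 4, then 8) rather than five.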
The list implementation uses a linked list, which has O(1) time for inserting a value because of its structure. However, it is much less space efficient and has O(n) time for accessing the element at location n.
Also, this code was written on the fly... be careful when using it. There might be a weird edge case that I am missing in the last code segment.
Cheers!
Ned
I'm making a C++ game which requires me to initialize 36 numbers into a vector. You can't initialize a vector with an initializer list, so I've created a while loop to initialize it faster. I want to make it push back 4 of each number from 2 to 10, so I'm using an int named fourth to check if the number of the loop is a multiple of 4. If it is, it changes the number pushed back to the next number up. When I run it, though, I get SIGABRT. It must be a problem with fourth, though, because when I took it out, it didn't give the signal.
Here's the program:
for (int i; i < 36;) {
int fourth = 0;
fourth++;
fourth%=4;
vec.push_back(i);
if (fourth == 0) {
i++;
}
}
Please help!
You do not initialize i. Use for (int i = 0; i<36;). Also, a new variable fourth is created on each iteration of the loop body, so the test fourth==0 will always yield false.
I want to make it push back 4 of each number from 2 to 10
I would use the most straightforward approach:
for (int value = 2; value <= 10; ++value)
{
for (int count = 0; count < 4; ++count)
{
vec.push_back(value);
}
}
The only optimization I would do is making sure that the capacity of the vector is sufficient before entering the loop. I would leave other optimizations to the compiler. My guess is, what you gain by omitting the inner loop, you lose by frequent modulo division.
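The capacity step mentioned above could look like this. The function name buildDeck is invented for the example; the count 36 comes from the question (9 values times 4 copies).

```cpp
#include <vector>

// Builds a vector holding four copies of each number from 2 to 10.
std::vector<int> buildDeck() {
    std::vector<int> deck;
    deck.reserve(36);  // 9 values * 4 copies, so push_back never reallocates
    for (int value = 2; value <= 10; ++value) {
        for (int count = 0; count < 4; ++count) {
            deck.push_back(value);
        }
    }
    return deck;
}
```

The result has 36 elements, starting with four 2s and ending with four 10s.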
You did not initialize i, and you are resetting fourth in every iteration. Also, with your for loop condition, I do not think it will do what you want.
I think this should work:
int fourth = 0;
for (int i = 2; i<=10;) {
fourth++;
fourth%=4;
vec.push_back(i);
if (fourth==0) {
i++;
}
}
I've been able to create a static array declaration and pass that array into the vector at initialization without issue. Pretty clean too:
const int initialValues[36] = {0,1,2...,35};
std::vector<int> foo(initialValues, initialValues + 36); // iterator-range constructor: begin and one-past-the-end
Works with constants, but haven't tried it with non const arrays.
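For reference, here is a small self-contained version of the array approach, next to the brace form that works if your compiler supports C++11 or later. The function names and the eight sample values are placeholders chosen for this sketch, not the 36 values from the question.

```cpp
#include <cstddef>
#include <vector>

// Pre-C++11 style: copy the contents of a plain array through the
// iterator-range constructor of std::vector.
std::vector<int> fromArray() {
    static const int initialValues[] = {2, 2, 2, 2, 3, 3, 3, 3};
    const std::size_t n = sizeof(initialValues) / sizeof(initialValues[0]);
    return std::vector<int>(initialValues, initialValues + n);
}

// C++11 and later: a vector can be initialized directly from a braced list.
std::vector<int> fromInitializerList() {
    return {2, 2, 2, 2, 3, 3, 3, 3};
}
```

Both functions return the same eight elements; computing n with sizeof avoids repeating the array length by hand.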