I've got code which needs to find the points in time at which a laser has been fired. The laser is indicated by a DC value above 500 in the dataset, and comes in bursts of 3 pulses at a time, with rather short, but not entirely deterministic, time gaps.
The code I am using right now:
// std::vector<short>& laserData; — this one comes from the function call, this is the definition
int count = 0;
for (unsigned int index = 0; index < laserData.size(); ++index) {
    if (laserData.at(index) > 500) {
        times.push_back(index);
        count = (count + 1) % 3;
        if (count == 0) {
            int dif1 = times.at(times.size()-1) - times.at(times.size()-2);
            int dif2 = times.at(times.size()-1) - times.at(times.size()-3);
            if (dif1 > 60000 || dif2 > 60000) {
                times.erase(times.begin() + times.size() - 2);
                times.erase(times.begin() + times.size() - 2);
                count = 1;
            }
        }
        switch(count) {
            case 0: index += 90000;
            default: index += 2000;
        }
    }
}
I can't be entirely sure that all 3 laser impulses always happen, and if they don't, the complete set of those 1 or 2 laser impulses needs to be removed.
The dataset is 130,000,000 samples long, and I get about 3300 laser impulses in total, so that part works fine; it's just darned slow. It takes about 45 seconds to parse that vector, and I wonder if there is a faster way to do it.
First: unless you intended the switch statement to fall through, add a break:
switch (count)
{
    case 0:
        index += 90000;
        break;
    default:
        index += 2000;
}
OK, now that we have a potential error out of the way, we can look at speeding up your code.
The first thing to do is to eliminate as much of the vector resizing as possible.
You said there were about 3300 laser pulses in total. Let's add a ~10% margin of error to that and reserve space in the vector of pulses in advance:
times.reserve(3600);
Now the vector should not need to be resized multiple times. If there are more pulses than expected, the vector should only have to reallocate once.
Next, we want to get rid of the times.erase() function calls.
To do this, we cache the three most recent values separately, and only push them into the vector once they have been validated:
const int pulse_interval = 2000;
const int burst_interval = 90000;
int cache[3];
int count = 0;
times.reserve(3600);
for (unsigned int index = 0; index < laserData.size(); ++index)
{
    if (laserData[index] > 500)
    {
        // this next if block acts as our guard clause
        if (count > 0)
        {
            int diff = index - cache[count - 1];
            if (diff > 60000)
            {
                // if the gap between pulses is too big, reset and start again,
                // making the most recent value the first of the next sequence
                count = 1;
                cache[0] = index;
                index += pulse_interval;
                continue;
            }
        }
        // now we actually store data in the cache and, if needed, add to the
        // vector. No invalid data (not part of a three-pulse burst) should
        // reach here, thanks to the guard.
        cache[count] = index;
        if (count == 2)
        {
            for (int i = 0; i < 3; i++)
            {
                times.push_back(cache[i]);
            }
            count = 0;
            index += burst_interval;
        }
        else
        {
            count++;
            index += pulse_interval;
        }
        // updating count at the end is easier to follow;
        // it goes back to 0 after the 3rd pulse is detected
    }
}
This keeps invalid data out of the vector and should speed up the code about as much as a quick answer here can.
Edit: added in your index-skipping parameters. If I got the logic wrong, let me know. With this approach the switch isn't needed, as its logic can be folded into the existing branches of the algorithm.
If you can't turn optimisation on, then you can try unrolling the push_back loop. The cache array can be reduced to two cells, and the storing of the index can be moved into the else branch (for the third value, just push_back(index)).
This removes the loop overhead and one assignment each time you find a full burst. Your compiler would normally handle this for you.
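For illustration, a minimal sketch of that unrolled variant wrapped in a free function. detect_bursts and its signature are made up for the example; the thresholds and skip distances are the values from the question.

```cpp
#include <vector>

// Sketch only: the cache shrinks to two cells and the third index is
// pushed directly, as described above.
std::vector<int> detect_bursts(const std::vector<short>& laserData)
{
    const int pulse_interval = 2000;
    const int burst_interval = 90000;
    std::vector<int> times;
    times.reserve(3600);
    int cache[2];       // two cells are enough now
    int count = 0;
    for (unsigned int index = 0; index < laserData.size(); ++index)
    {
        if (laserData[index] > 500)
        {
            // guard clause: a gap that is too big restarts the sequence
            if (count > 0 && int(index) - cache[count - 1] > 60000)
            {
                count = 1;
                cache[0] = index;
                index += pulse_interval;
                continue;
            }
            if (count == 2)
            {
                times.push_back(cache[0]);
                times.push_back(cache[1]);
                times.push_back(index);  // third value goes straight in
                count = 0;
                index += burst_interval;
            }
            else
            {
                cache[count++] = index;
                index += pulse_interval;
            }
        }
    }
    return times;
}
```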
If it's still slow, then you need to profile. Make sure that your index skips are the right size (too small means you search too much, but too large and you risk losing valid data).
You could also do this in parallel, as a commenter suggested, by splitting the search space into a number of sections and spawning a thread for each section.
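One way that split could look, as a sketch only: find_candidates is a made-up name, each thread just records raw threshold crossings, and the burst-grouping pass (including any burst that straddles a section boundary) still has to run over the merged result.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Sketch: scan fixed-size sections of the sample vector in parallel and
// merge the per-thread hit lists in index order afterwards.
std::vector<std::size_t> find_candidates(const std::vector<short>& data,
                                         unsigned n_threads = 4)
{
    std::vector<std::vector<std::size_t>> partial(n_threads);
    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / n_threads + 1;

    for (unsigned t = 0; t < n_threads; ++t)
    {
        workers.emplace_back([&, t] {
            const std::size_t begin = t * chunk;
            const std::size_t end = std::min(data.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i)
                if (data[i] > 500)
                    partial[t].push_back(i);   // threshold crossings only
        });
    }
    for (auto& w : workers)
        w.join();

    std::vector<std::size_t> result;           // merge, already in index order
    for (const auto& p : partial)
        result.insert(result.end(), p.begin(), p.end());
    return result;
}
```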
I'm trying to compare two decks of cards, yet every time I try another method of doing it, I get the same result... Everything before the code outputs, and it just freezes as soon as it hits the comparison code, as if it's stuck in an infinite loop.
I've tried for loops, static variables, do-while loops, etc. This is my first time leaving the loop at the client code.
The code that supposedly throws the program into an infinite loop.
while (repeatLoop == false)
{
    deck1.shuffleDeck();
    counter++;
    repeatLoop = deck1.compareDecks();
}
compareDecks function.
bool deck::compareDecks()
{
    int deckCount = 0;
    suitType tempOriginalSuit;
    suitType tempShuffleSuit;
    rankType tempOriginalRank;
    rankType tempShuffleRank;
    while (index < 52)
    {
        tempOriginalSuit = originalCardDeck[index].getSuit();
        tempShuffleSuit = shuffledCardDeck[index].getSuit();
        if (int(tempOriginalSuit) == int(tempShuffleSuit))
        {
            tempOriginalRank = originalCardDeck[index].getRank();
            tempShuffleRank = shuffledCardDeck[index].getRank();
            if (int(tempOriginalRank) == int(tempShuffleRank))
            {
                deckCount++;
                if (deckCount == 52)
                    return true;
            }
        }
        else
        {
            return false;
            index++;
        }
    }
}
The shuffleDeck function
(This function pushes back the first card from the first half of the deck and the first card from the second half of the deck towards the end, until all 52 cards have been pushed in this pattern. This leaves the deck with 52 × 2 cards (with the second half of the deck being the perfect shuffle), so I delete the first half of the cards using .erase, as it is no longer needed.)
void deck::shuffleDeck()
{
    for (int a = 0, b = 2; a < 2 && b < 4; a++, b++)
    {
        for (int i = 2; i < 15; i++)
        {
            shuffledCardDeck.push_back(card{ static_cast<cardSpace::suitType>(a),
                                             static_cast<cardSpace::rankType>(i) });
            shuffledCardDeck.push_back(card{ static_cast<cardSpace::suitType>(b),
                                             static_cast<cardSpace::rankType>(i) });
        }
    }
    shuffledCardDeck.erase(shuffledCardDeck.begin(),
                           shuffledCardDeck.begin() + (shuffledCardDeck.size() / 2));
}
The two decks initialized by this constructor.
deck::deck()
{
    for (int i = 0; i < 4; i++)
    {
        for (int j = 2; j < 15; j++)
        {
            originalCardDeck.push_back(card{ static_cast<cardSpace::suitType>(i),
                                             static_cast<cardSpace::rankType>(j) });
            shuffledCardDeck.push_back(card{ static_cast<cardSpace::suitType>(i),
                                             static_cast<cardSpace::rankType>(j) });
        }
    }
}
Also note that I've done a perfect shuffle on the shuffledCardDeck vector in another function. I'm trying to repeat the perfectShuffle function until the deck reaches its original state, and to output how many times it took to do this.
I get an infinite loop.
EDIT: I've decided to add the return false; statement in the compareDecks function into the if-else. Also, I think what's causing the problem is that my index is reset to zero every time the function is called again. Are there any solutions you could propose for this? I've tried using static variables, but they just would not increment in the loop.
EDIT 2: I enclosed my if statements within the curly braces, per users' request, as it's a flaw in my code.
EDIT 3: After commenting out
deck1.shuffleDeck()
the compareDecks function returned true, stating that the decks are equal, which isn't supposed to happen... This caused the loop to end after only one iteration.
I was expecting you to actually shuffle the deck.
Your code was pushing a specific, newly synthesized card onto the end of the deck:
shuffledCardDeck.push_back(card{ static_cast<cardSpace::suitType>(a),
static_cast<cardSpace::rankType>(i) });
For example, the first card it will push is always the 2 of 0's (whatever the 0th suit is). That's not what you want. You actually want to push a copy of the card that is at a specific position index in the deck. For example, loop index from 0 to 25 and then push shuffledCardDeck[index] and shuffledCardDeck[26 + index].
Then you can still wrap up by using your technique of erasing the first half of the deck.
void deck::shuffleDeck()
{
for (int index = 0; index < 26; ++index) {
shuffledCardDeck.push_back(shuffledCardDeck[index]);
shuffledCardDeck.push_back(shuffledCardDeck[26 + index]);
}
shuffledCardDeck.erase(shuffledCardDeck.begin(),
shuffledCardDeck.begin() + 52);
}
You are not modifying the value in the loop, you're using a double equals sign:
repeatLoop == deck1.compareDecks();
That would explain your observed behavior.
This portion of my code takes too long to run, and I was looking for a way to optimize it. I think a lookup table would be the fastest way, but I could be wrong. My program has a main for loop, and for each of its iterations a nested loop runs 1,233,487 times, executing the if statements when their conditions are met. The main loop runs 898,281 times, so the program goes through 898,281 * 1,233,487 inner iterations. How would I go about creating a lookup table to optimize these calculations, or is there a better way to optimize my code?
for (int i = 0; i < all_neutrinos.size(); i++)
{   // all_neutrinos.size() = 898281
    int MC_count = 0; // counts per window in the Monte Carlo simulation
    int count = 0;    // counts per window for real data
    if (cosmic_ray_events.size() == MC_cosmic_ray_events.size())
    {
        for (int j = 0; j < cosmic_ray_events.size(); j++)
        {   // cosmic_ray_events.size() = 1233487
            if ((MC_cosmic_ray_events[j][1] >= (all_neutrinos[i][3] - band_width))
                && (MC_cosmic_ray_events[j][1] <= (all_neutrinos[i][3] + band_width)))
            {
                if ((earth_radius * fabs(all_neutrinos[i][2] - MC_cosmic_ray_events[j][0]))
                    <= test_arc_length)
                {
                    MC_count++;
                }
            }
            if ((cosmic_ray_events[j][7] >= (all_neutrinos[i][3] - band_width))
                && (cosmic_ray_events[j][7] <= (all_neutrinos[i][3] + band_width)))
            {
                if (earth_radius * fabs(all_neutrinos[i][2] - cosmic_ray_events[j][6])
                    <= test_arc_length)
                {
                    count++;
                }
            }
        }
        MCcount_out << i << " " << MC_count << endl;
        count_out << i << " " << count << endl;
    }
}
First, cosmic_ray_events and MC_cosmic_ray_events are utterly unrelated. Make it two loops.
Sort MC_cosmic_ray_events by [1]. Sort cosmic_ray_events by [7]. Sort all_neutrinos by [3].
This doesn't have to be in-place sorting -- you can sort an array of pointers or indexes into them if you want.
Start with a highwater and lowwater index into your cosmic ray events set to 0.
Now, walk over all_neutrinos. For each one, advance highwater until
MC_cosmic_ray_events[highwater][1] > all_neutrinos[i][3] + band_width. Then advance lowwater until MC_cosmic_ray_events[lowwater][1] >= all_neutrinos[i][3] - band_width.
On the half-open range j = lowwater up to but not including highwater, run:
if ((earth_radius * fabs(all_neutrinos[i][2] - MC_cosmic_ray_events[j][0]))
    <= test_arc_length)
{
    MC_count++;
}
Now repeat until i reaches the end of all_neutrinos.
Then repeat this process, using cosmic_ray_events and [7].
Your code takes O(N*M) time. This code takes O(N lg N + M lg M + N * (average band-width intersect rate)) time. If relatively few events pass the band-width test, you are going to be insanely faster.
Assuming you get an average of 0.5 intersects per all_neutrinos entry, this will be on the order of 100000x faster.
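A sketch of that sweep for one event array. count_in_bands is illustrative: events and neutrinos are reduced here to {band coordinate, angle} pairs, both pre-sorted by the band coordinate, which stands in for columns [1]/[7] and [3] respectively.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// events and neutrinos: {band_coordinate, angle}, each sorted by .first.
std::vector<int> count_in_bands(const std::vector<std::pair<double, double>>& events,
                                const std::vector<std::pair<double, double>>& neutrinos,
                                double band_width,
                                double earth_radius,
                                double test_arc_length)
{
    std::vector<int> counts;
    counts.reserve(neutrinos.size());
    std::size_t lowwater = 0, highwater = 0;
    for (const auto& n : neutrinos)
    {
        // advance highwater just past the last event inside the band...
        while (highwater < events.size() &&
               events[highwater].first <= n.first + band_width)
            ++highwater;
        // ...and lowwater to the first event inside it
        while (lowwater < events.size() &&
               events[lowwater].first < n.first - band_width)
            ++lowwater;
        int c = 0;
        for (std::size_t j = lowwater; j < highwater; ++j)
            if (earth_radius * std::fabs(n.second - events[j].second) <= test_arc_length)
                ++c;
        counts.push_back(c);
    }
    return counts;
}
```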
There is not much to optimize. The counts are really high, and there is not much hard computation going on. There are some obvious optimizations you could do, such as storing (all_neutrinos[i][3] +/- band_width) in local variables before entering the j-loop. Your compiler probably already does this, though, but it would certainly improve performance in debug mode.
Have you tried separating the two halves of the j-loop and having two j-loops? As in:
auto all_neutrinos_2 = all_neutrinos[i][2];
// ... precompute the band-width limits
for (int j = 0; j < cosmic_ray_events.size(); j++)
{   // cosmic_ray_events.size() = 1233487
    auto MC_event = MC_cosmic_ray_events[j][1];
    if ((all_neutrinos_lower <= MC_event) && (MC_event <= all_neutrinos_higher))
    {
        if ((earth_radius * fabs(all_neutrinos_2 - MC_cosmic_ray_events[j][0]))
            <= test_arc_length)
        {
            MC_count++;
        }
    }
}
for (int j = 0; j < cosmic_ray_events.size(); j++)
{   // cosmic_ray_events.size() = 1233487
    auto event = cosmic_ray_events[j][7];
    if ((all_neutrinos_lower <= event) && (event <= all_neutrinos_higher))
    {
        if (earth_radius * fabs(all_neutrinos_2 - cosmic_ray_events[j][6])
            <= test_arc_length)
        {
            count++;
        }
    }
}
I have the feeling you could get some improvement from improved memory cache hits this way.
Any improvement beyond that would involve packing the input data to reduce memory cache misses, and would mean modifying the structure and the code generating the MC_cosmic_ray_events and cosmic_ray_events arrays.
Slicing the counts into several smaller tasks running on different threads is also a route I would look at seriously at this point. Data access is read-only, and each thread can have its own counters, which can all be summed at the end.
I'm positioning sprites in a sliding puzzle game, but I have trouble randomising the tile positions.
How can I check whether a random move (arc4random) has already been made, and ignore the previous move in the randomisation process?
The tiles do randomise/reshuffle, but sometimes they repeat the random move just made,
e.g. tile 23 slides to tile 24's position and back several times, each counting as a random move
(which means the board doesn't shuffle properly).
int count = 0;
int moveArray[5];
int GameTile;
int EmptySq;

// loop through the board and find the empty square
for (GameTile = 0; GameTile < 25; ++GameTile) {
    if (boardOcc[GameTile] == kEMPTY) {
        EmptySq = GameTile;
        break;
    }
}
int RowEmpty = RowNumber[GameTile];
int colEmpty = ColHeight[GameTile];

// work out the currently possible moves, to avoid unsolvable puzzles
if (RowEmpty < 4) moveArray[count++] = (GameTile + 5);
if (RowEmpty > 0) moveArray[count++] = (GameTile - 5);
if (colEmpty < 4) moveArray[count++] = (GameTile + 1);
if (colEmpty > 0) moveArray[count++] = (GameTile - 1);

int RandomIndex = arc4random() % count;
int RandomFrom = moveArray[RandomIndex];
boardOcc[EmptySq] = boardOcc[RandomFrom];
boardOcc[RandomFrom] = kEMPTY;
There are a few, if not many, possibilities.
One possibility would be to create a stack-like buffer array which would contain, let's say, the last 10 steps (values go in at one end and out at the other).
For example:
NSMutableArray *_buffer = [NSMutableArray new];
So - game starts, buffer array is empty. You generate first random move, and also insert it into the buffer array:
[_buffer insertObject:[NSNumber numberWithInt:RandomIndex] atIndex:0];
Then run a check whether our array contains more than 10 elements, and remove the last one if so:
if ([_buffer count] > 10)
{
    [_buffer removeObjectAtIndex:10];
}
We need to remove only one item, as we only add one object each time.
And then we add the check so that the next 'RandomIndex' is something other than the previous 10 indexes. We set 'RandomIndex' to some neutral value (-1) and then launch a while loop that sets 'RandomIndex' to a random value and checks whether '_buffer' contains that value. If it does, the loop regenerates 'RandomIndex' and checks again. In theory it could do so indefinitely, but if 'count' is a much bigger number, it will take 2-3 loop iterations, tops. No worries.
int RandomIndex = -1;
while (RandomIndex == -1 || [_buffer containsObject:[NSNumber numberWithInt:RandomIndex]])
{
    RandomIndex = arc4random() % count;
}
But you could add some safety, to allow it to break out of the loop after, say, 5 cycles (but then it will keep the repeating value):
int RandomIndex = -1;
int safetyCounter = 0;
while (RandomIndex == -1 || [_buffer containsObject:[NSNumber numberWithInt:RandomIndex]])
{
    RandomIndex = arc4random() % count;
    if (safetyCounter == 5)
    {
        break;
    }
    safetyCounter++;
}
You could also decrease the buffer size to 3 or 5; then it will work perfectly in 99.9999999% of cases, or even 100%. This is just to rule out the case you described, where it randomly picks the same number every second time. Anyway, no worries.
But still, let's discuss another, slightly more advanced and safer way.
Other possibility would be to create two separate buffers. One - as in previous example - would be used to store last 10 values, and second would have all the other possible unique moves.
So:
NSMutableArray *_buffer = [NSMutableArray new];
NSMutableArray *_allValues = [NSMutableArray new];
At the beginning, '_buffer' is empty, but to '_allValues' we add all possible moves:
for (int i = 0; i < count; i++)
{
    [_allValues addObject:[NSNumber numberWithInt:i]];
}
And again, when we calculate a random value, we add it to '_buffer' AND remove it from '_allValues':
[_buffer insertObject:[NSNumber numberWithInt:RandomIndex] atIndex:0];
[_allValues removeObject:[NSNumber numberWithInt:RandomIndex]];
After that, we again check whether '_buffer' has grown larger than 10 elements. If so, we remove the last object and add it back to '_allValues':
if ([_buffer count] > 10)
{
    [_allValues addObject:[_buffer objectAtIndex:10]];
    [_buffer removeObjectAtIndex:10];
}
And most importantly, we calculate 'RandomIndex' from the count of '_allValues' and take the corresponding object's intValue:
RandomIndex = [[_allValues objectAtIndex:(arc4random()%[_allValues count])] intValue];
Thus we don't need any safety check, because this way each calculated value will be unique among the last 10 moves.
Hope it helps. Happy coding!
In my code below I have an array of objects, tArray.
I am trying to find the buyer names that have the top five total 'num shares';
the calcTotal and calcString arrays work in tandem to store each buyer and his total value.
However, I have stepped through the code while running, and it is essentially replacing the values that are smaller than the current numShares in the loop. This means that even if a buyer who was just replaced comes up again, his total starts fresh and is not added to, which is not what I want.
How would I change this code so that when a larger value is found, the smaller value is pushed further down the array instead of being replaced?
Thanks. I am bound to this 'format' of solving the problem (it's an assignment), so achieving the functionality is the goal so I can progress.
So, essentially, the second if statement is where the issue lies:
for (int i = 0; i < nTransactions; i++)
{
    // compares with arrays
    for (int j = 0; j < sSize; j++)
    {
        if (tArray[i].buyerName == calcString[j])
        {
            calcTotal[j] += tArray[i].numShares;
            break;
        }
        else
        {
            // checks if shares is greater than the current total, then replaces it
            if (tArray[i].numShares > calcTotal[j])
            {
                calcTotal[j] = tArray[i].numShares;
                calcString[j] = tArray[i].buyerName;
                break;
            }
        }
    }
}
return calcString;
}
It seems like you are trying to find the largest totals while only looking at one transaction at a time. You need to aggregate the totals for all the buyers first. Then it is a simple matter to find the 5 highest totals.
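A sketch of that two-phase approach. Transaction and topFiveBuyers are illustrative stand-ins for your tArray element type and function; the assignment's fixed-array format would need the same idea expressed with calcString/calcTotal.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Transaction { std::string buyerName; int numShares; }; // stand-in type

std::vector<std::pair<std::string, int>>
topFiveBuyers(const std::vector<Transaction>& transactions)
{
    // phase 1: aggregate the totals for every buyer
    std::map<std::string, int> totals;
    for (const auto& t : transactions)
        totals[t.buyerName] += t.numShares;

    // phase 2: pick the five largest totals
    std::vector<std::pair<std::string, int>> sorted(totals.begin(), totals.end());
    std::partial_sort(sorted.begin(),
                      sorted.begin() + std::min<std::size_t>(5, sorted.size()),
                      sorted.end(),
                      [](const auto& a, const auto& b) { return a.second > b.second; });
    if (sorted.size() > 5)
        sorted.resize(5);
    return sorted;
}
```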
I finally determined that this function is responsible for the majority of my bottleneck issues. I think it's because of the massively excessive random access that happens when most of the synapses are already active. Basically, as the title says, I need to somehow optimize the algorithm so that I'm not randomly checking a ton of active elements before landing on one of the few that are left.
Also, I included the whole function in case of other flaws that can be spotted.
void NetClass::Explore(vector<synapse>& synapses, int& n_syns) // add new synapses
{
    int size = synapses.size();
    assert(n_syns <= size);

    // Increase the age of each active synapse by 1
    Age_Increment(synapses);

    // make sure there is at least one inactive synapse left
    if (n_syns == size)
        return;

    // stochastically decide whether a new connection is added
    if ((rand_r(seedp) % 1000) < (x / (1 + (n_syns * (y / 100)))))
    {
        n_syns++; // a new synapse has been created

        // main inefficiency here
        while (1)
        {
            int syn = rand_r(seedp) % size;
            if (!synapses[syn].active)
            {
                synapses[syn].active = true;
                synapses[syn].weight = .04 + (float(rand_r(seedp) % 17) / 100);
                break;
            }
        }
    }
}
void NetClass::Age_Increment(vector<synapse>& synapses)
{
    for (size_t q = 0, size = synapses.size(); q < size; q++)
        if (synapses[q].active)
            synapses[q].age++;
}
Pass a random number, k, in the range [0, size-n_syns) to Age_Increment. Have Age_Increment return the kth empty slot.
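A sketch of that idea. Synapse and age_increment_and_find are minimal stand-ins for your types; the caller would draw k with something like rand_r(seedp) % (size - n_syns).

```cpp
#include <cstddef>
#include <vector>

struct Synapse { bool active; int age; };  // stand-in for the real synapse type

// Does the same ageing pass as Age_Increment, but also returns the index
// of the k-th inactive synapse encountered (-1 if there is no such slot).
int age_increment_and_find(std::vector<Synapse>& synapses, int k)
{
    int inactive_seen = 0;
    int found = -1;
    for (std::size_t q = 0; q < synapses.size(); ++q)
    {
        if (synapses[q].active)
            synapses[q].age++;              // the existing ageing work
        else if (inactive_seen++ == k && found < 0)
            found = static_cast<int>(q);    // k-th empty slot
    }
    return found;
}
```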
Since you're already traversing the whole list in Age_Increment, update that function to return the list of the indexes of inactive synapses.
You can then pick a random item from that list directly.
This is similar to the problem of finding free blocks in memory management, so I would take a look at algorithms used in that domain, specifically free lists, which are lists of free positions. (These are usually implemented as linked lists so that elements can be popped off one end efficiently. Random access into a linked list would still be O(n), with a smaller n, but still not the best choice for your use case.)
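A sketch of the index-list version, again with a stand-in Synapse type; the commented-out usage shows one assumed way to wire it in, with <random> replacing rand_r.

```cpp
#include <cstddef>
#include <vector>

struct Synapse { bool active; int age; };  // stand-in for the real synapse type

// Gather the inactive indices during the ageing pass; one O(size) pass
// replaces the unbounded rejection-sampling loop.
std::vector<std::size_t> age_and_collect_inactive(std::vector<Synapse>& synapses)
{
    std::vector<std::size_t> inactive;
    for (std::size_t q = 0; q < synapses.size(); ++q)
    {
        if (synapses[q].active)
            synapses[q].age++;          // same work Age_Increment already does
        else
            inactive.push_back(q);      // remember the free slots
    }
    return inactive;
}

// usage (assumption): pick a random free slot, if any
// std::mt19937 gen(seed);
// auto free_slots = age_and_collect_inactive(synapses);
// if (!free_slots.empty()) {
//     std::size_t syn = free_slots[gen() % free_slots.size()];
//     synapses[syn].active = true;
// }
```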