I am implementing a pattern search algorithm with a critical if statement whose outcome seems to be unpredictable. The files being searched are arbitrary, so branch prediction is sometimes fine and sometimes terrible, for example when the file is completely random. My goal is to eliminate the if statement. I have tried alternatives such as preallocating every vector, but they were slow: the number of possible patterns can be very large, so preallocating takes a lot of time. I therefore initialize all the dynamic vector pointers to NULL up front and use the if statement to check whether a pattern is already present. The if seems to be killing me, specifically the cmp instruction it compiles to. Mispredicted branches keep flushing the pipeline and cause huge slowdowns. Any ideas on how to eliminate the if statement marked "Problem if statement" below would be great; I am stuck in a rut.
for (PListType i = 0; i < prevLocalPListArray->size(); i++)
{
vector<vector<PListType>*> newPList(256, NULL);
vector<PListType>* pList = (*prevLocalPListArray)[i];
PListType pListLength = (*prevLocalPListArray)[i]->size();
PListType earlyApproximation = (pListLength + 255) / 256; //Integer ceiling; pListLength/256 truncates before ceil() ever sees it
for (PListType k = 0; k < pListLength; k++)
{
//If pattern is past end of string stream then stop counting this pattern
if ((*pList)[k] < file->fileStringSize)
{
uint8_t indexer = ((uint8_t)file->fileString[(*pList)[k]]);
if(newPList[indexer] != NULL) //Problem if statement!!!!!!!!!!!!!!!!!!!!!
{
newPList[indexer]->push_back(++(*pList)[k]);
}
else
{
newPList[indexer] = new vector<PListType>(1, ++(*pList)[k]);
newPList[indexer]->reserve(earlyApproximation);
}
}
}
//Deallocate or stuff patterns in global list
for (int z = 0; z < newPList.size(); z++)
{
if(newPList[z] != NULL)
{
if (newPList[z]->size() >= minOccurrence)
{
globalLocalPListArray->push_back(newPList[z]);
}
else
{
delete newPList[z];
}
}
}
delete (*prevLocalPListArray)[i];
}
Here is the code without the pointer indirection, with the proposed changes applied:
vector<vector<PListType>> newPList(256);
for (PListType i = 0; i < prevLocalPListArray.size(); i++)
{
const vector<PListType>& pList = prevLocalPListArray[i];
PListType pListLength = prevLocalPListArray[i].size();
for (PListType k = 0; k < pListLength; k++)
{
//If pattern is past end of string stream then stop counting this pattern
if (pList[k] < file->fileStringSize)
{
uint8_t indexer = ((uint8_t)file->fileString[pList[k]]);
newPList[indexer].push_back((pList[k] + 1));
}
else
{
totalTallyRemovedPatterns++;
}
}
for (int z = 0; z < 256; z++)
{
if (newPList[z].size() >= minOccurrence/* || (Forest::outlierScans && pList->size() == 1)*/)
{
globalLocalPListArray.push_back(newPList[z]);
}
else
{
totalTallyRemovedPatterns++;
}
newPList[z].clear();
}
vector<PListType> temp;
temp.swap(prevLocalPListArray[i]);
}
Here is the most up-to-date version, which manages not to use three times the memory and does not require the if statement. The only remaining bottleneck seems to be the newPList[indexIntoFile].push_back(++index); statement. I suspect it comes from poor cache locality when indexing the array, because the patterns are random. When I search a binary file containing just 1s and 0s, I see no latency on the push_back statement, which is why I believe it is cache thrashing. Do you see any room for further optimization in this code? You have been a great help so far. #bogdan #harold
vector<PListType> newPList[256];
PListType prevPListSize = prevLocalPListArray->size();
PListType indexes[256] = {0};
PListType indexesToPush[256] = {0};
for (PListType i = 0; i < prevPListSize; i++)
{
vector<PListType>* pList = (*prevLocalPListArray)[i];
PListType pListLength = (*prevLocalPListArray)[i]->size();
if(pListLength > 1)
{
for (PListType k = 0; k < pListLength; k++)
{
//If pattern is past end of string stream then stop counting this pattern
PListType index = (*pList)[k];
if (index < file->fileStringSize)
{
uint_fast8_t indexIntoFile = (uint8_t)file->fileString[index];
newPList[indexIntoFile].push_back(++index);
indexes[indexIntoFile]++;
}
else
{
totalTallyRemovedPatterns++;
}
}
int listLength = 0;
for (PListType k = 0; k < 256; k++)
{
if( indexes[k])
{
indexesToPush[listLength++] = k;
}
}
for (PListType k = 0; k < listLength; k++)
{
int insert = indexes[indexesToPush[k]];
if (insert >= minOccurrence)
{
int index = globalLocalPListArray->size();
globalLocalPListArray->push_back(new vector<PListType>());
(*globalLocalPListArray)[index]->insert((*globalLocalPListArray)[index]->end(), newPList[indexesToPush[k]].begin(), newPList[indexesToPush[k]].end());
indexes[indexesToPush[k]] = 0;
newPList[indexesToPush[k]].clear();
}
else if(insert == 1)
{
totalTallyRemovedPatterns++;
indexes[indexesToPush[k]] = 0;
newPList[indexesToPush[k]].clear();
}
}
}
else
{
totalTallyRemovedPatterns++;
}
delete (*prevLocalPListArray)[i];
}
Here are the benchmarks. I didn't think they would be readable in the comments, so I am posting them as an answer. The percentage to the left of each line shows the share of total time spent on that line of code.
The int winner should be set to 2 under certain conditions, but it's somehow being set to a variety of higher values, most commonly 6. I have no idea how this is happening, as no other function in my class affects winner, and the variable isn't even mentioned anywhere else in the program. What confuses me most is that I have an almost identical function, P2Move(), that sets the winner variable in exactly the same way as P1Move(), and that function runs perfectly.
Some info: The class this is part of is called Board, which acts as a checkerboard array made up of Square class objects.
Below is the function causing the problem. Near the bottom, the statement else if((canTake.size()==0)&&(canMove.size()==0)) {Board::winner = 2;} causes the problem. Everything else seems to work when I remove the problematic part from the function, but I need that part to work in order to submit the final project.
void Board::P1Move()
{
P1pieces = 0;
std::vector <Move> canMove;
std::vector <Move> canTake;
for(int j = 0; j < bSize; j++)
{ //Start of j loop.
for(int i = 0; i < bSize; i++)
{ //Start of i loop.
Square sq = board[i][j];
bool cTakeL = canTakeL(i,j);
bool cTakeR = canTakeR(i,j);
bool cMoveL = canMoveL(i,j);
bool cMoveR = canMoveR(i,j);
if(board[i][j].getPl() == P1)
{
P1pieces++;
if(cTakeL)
{
Move a = Move(sq.getIndex(),board[i-2][j+2].getIndex(),board[i-1][j+1].getIndex(),0);
canTake.push_back(a);
}
if(cTakeR)
{
Move b = Move(sq.getIndex(),board[i+2][j+2].getIndex(),board[i+1][j+1].getIndex(),0);
canTake.push_back(b);
}
if(cMoveL)
{
Move c = Move(sq.getIndex(),board[i-1][j+1].getIndex(),0,0);
canMove.push_back(c);
}
if(cMoveR)
{
Move d = Move(sq.getIndex(),board[i+1][j+1].getIndex(),0,0);
setWinner(d.getSpos());
canMove.push_back(d);
}
}
} //End of i loop.
} //End of j loop.
if(canTake.size()!=0)
{
time_t t;
time(&t);
srand(t);
int moveNum = rand()%canTake.size();
std::string output = "p1 ";
Move out = canTake.at(moveNum);
int i = 0;
int j = 0;
for(int y = 0; y < bSize; y++)
{
for(int x = 0; x < bSize; x++)
{
if(board[x][y].getIndex()==out.getSpos())
{
i = x;
j = y;
}
}
}
if(board[i-2][j+2].getIndex()==out.getEndPos())
{
board[i-2][j+2].setOcc(true);
board[i-2][j+2].setPl(P1);
board[i-1][j+1].setOcc(false);
board[i-1][j+1].setPl(NA);
}
else if(board[i+2][j+2].getIndex()==out.getEndPos())
{
board[i+2][j+2].setOcc(true);
board[i+2][j+2].setPl(P1);
board[i+1][j+1].setOcc(false);
board[i+1][j+1].setPl(NA);
}
output = output + out.toString();
setCmove(output);
board[i][j].setOcc(false);
board[i][j].setPl(NA);
}
else if(canMove.size()!=0)
{
time_t t;
time(&t);
srand(t);
int moveNum = rand()%canMove.size();
std::string output = "p1 ";
Move out = canMove.at(moveNum);
int i = 0;
int j = 0;
for(int y = 0; y < bSize; y++)
{
for(int x = 0; x < bSize; x++)
{
if(board[x][y].getIndex()==out.getSpos())
{
i = x;
j = y;
}
}
}
if(board[i-1][j+1].getIndex()==out.getEndPos())
{
board[i-1][j+1].setOcc(true);
board[i-1][j+1].setPl(P1);
}
else if(board[i+1][j+1].getIndex()==out.getEndPos())
{
board[i+1][j+1].setOcc(true);
board[i+1][j+1].setPl(P1);
}
output = output + out.toString();
setCmove(output);
board[i][j].setOcc(false);
board[i][j].setPl(NA);
}
else if((canTake.size()==0)&&(canMove.size()==0))
{
Board::winner = 2;
}
P1pieces = canTake.size() + canMove.size();
}
You are working with std::vector, which is a good thing. (Too much beginner "C++" code uses C arrays.) The vector class template allows for a pretty easy way to find out if and where you might have an out-of-bounds access (as suggested in the comments):
Instead of accessing vector elements using operator[], change your code to use the .at() member function. .at() is bounds-checking, and will throw an exception if you access out-of-bounds (instead of silently breaking your program).
In production code, operator[] is usually preferred as omitting the bounds check is more efficient. But while learning, .at() can help you quite a bit.
Also, getting into the habit of using checkers like valgrind, or the assert macro to check your assumptions, is a good thing, even once you are past the point of using .at() everywhere.
I have a loop that looks for 3 equal cards or 3 non-equal cards and erases them as it finds them. If it doesn't find two equal (or two non-equal) cards for the first chosen element, it deletes that first element and moves on to the next one, and so on.
I'm using goto in this code to break out of the two nested for loops and keep iterating in the while loop.
To me, it makes good sense to use goto in this specific situation. But since I'm not a very experienced programmer, I suspect there is a better, more efficient way to do it.
Is there? What would it look like without goto?
unsigned int i1 = 0;
while(gameCards.size() > 2)
{
for(unsigned int i2=1; i2<gameCards.size(); i2++)
{
if(i2 == 2) continue;
if(cannotMatch(gameCards.at(i1), gameCards.at(i2)))
{
for(unsigned int i3=2; i3<gameCards.size(); i3++)
{
if(cannotMatch3(gameCards.at(i1), gameCards.at(i2), gameCards.at(i3)))
{
SetMatches++;
gameCards.erase(gameCards.begin()+i2,gameCards.begin()+i3);
goto findAnother;
}
}
} else if(canMatch(gameCards.at(i1), gameCards.at(i2)))
{
for(unsigned int i3=2; i3<gameCards.size(); i3++)
{
if(canMatch3(gameCards.at(i1), gameCards.at(i2), gameCards.at(i3)))
{
SetMatches++;
gameCards.erase(gameCards.begin()+i2,gameCards.begin()+i3);
goto findAnother;
}
}
}
}
findAnother:
gameCards.erase(gameCards.begin()+(i1++));
}
You can just add an extra bool condition to break out of the outer for loop. You can also simplify your inner loops once you notice that they are essentially the same and only invoke different match3 functions:
unsigned int i1 = 0; // declared before the loop, as in the question
while(gameCards.size() > 2)
{
auto continue_outer_loop(true);
for(unsigned int i2=1; continue_outer_loop && (i2<gameCards.size()); i2++)
{
if(i2 == 2) continue;
auto const p_match_3_func
(
cannotMatch(gameCards.at(i1), gameCards.at(i2))
?
&cannotMatch3
:
&canMatch3
);
for(unsigned int i3=2; i3<gameCards.size(); i3++)
{
if((*p_match_3_func)(gameCards.at(i1), gameCards.at(i2), gameCards.at(i3)))
{
SetMatches++;
gameCards.erase(gameCards.begin()+i2,gameCards.begin()+i3);
continue_outer_loop = false;
break;
}
}
}
gameCards.erase(gameCards.begin()+(i1++));
}
You can add a guard variable and check it in your loop conditions. Note that this only works cleanly when the flag is set at the end of the loop body, because it is not a real break: the rest of the current iteration still runs.
while (mainLoop) {
int goMain = 0;
for (int i = 0; i < 5 && goMain == 0; i++) {
for (int j = 0; j < 5 && goMain == 0; j++) {
if (wantExit) {
goMain = 1;
}
}
}
}
(C++)
Is there any way to run two while loops in parallel without using threads? I have tried putting them one after another inside one for loop, but that doesn't work for me, because the variable used in the while condition is changed by the first loop and I need it to be the same for both loops.
Here's the code:
for (size_t j = 0; j < word.length(); j++)
{
while (word[j] != tmp->data)
{
counter1++;
tmp = tmp->next;
}
while (word[j] != tmp->data)
{
counter2++;
tmp = tmp->previous;
}
}
From the comment:
I'm getting the letter from a string and trying to find out which path is shorter to get to the same letter in alphabet, going forward or backwards. I am using cyclical doubly linked list.
Sounds like you just want one while loop with two tmp pointers:
for (size_t j = 0; j < word.length(); j++)
{
while (word[j] != tmp1->data && word[j] != tmp2->data)
{
counter++;
tmp1 = tmp1->next;
tmp2 = tmp2->previous;
}
}
No, this is not possible without threads (or separate processes, but I guess that is not what you are after).
You can avoid "manual" threading, though, by using std::future and std::async.
You can turn each search into a function like this:
int forward(std::string word)
{
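int counter = 0;
// NOTE: tmp must be a pointer local to this function (a copy of the list's
// starting node). If both tasks advanced one shared tmp, it would be a data race.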
for (size_t j = 0; j < word.length(); j++)
{
while (word[j] != tmp->data)
{
counter++;
tmp = tmp->next;
}
}
return counter;
}
Write the corresponding backward function the same way, following tmp->previous instead of tmp->next.
Then call them like this:
std::string word = //....
auto res1 = std::async(std::launch::async, forward,word);
auto res2 = std::async(std::launch::async, backward,word); //the backward counterpart, not forward a second time
//do whatever....
int counter1 = res1.get(); //get the result
int counter2 = res2.get();
Note that get() blocks until the corresponding task is done, but the two tasks run in parallel.
In your case, depending on the size of the string and alphabet and on the algorithm, I doubt you will gain much from multiple threads. The threading overhead can take longer than the whole calculation, so you should measure whether the single-threaded version is actually faster.
I've been studying this tetris tutorial and I've come across the function that deletes lines and brings the row/s down one level. I'm kind of understanding what is going on with these functions, but some parts are confusing me. I'll try and explain it best I can, but here is the link to the lesson if you need it: http://javilop.com/gamedev/tetris-tutorial-in-c-platform-independent-focused-in-game-logic-for-beginners/
This, to me, looks like a function to get the array to start at the last number of a line:
void Board::DeleteLine (int pY)
{
// Moves all the upper lines one row down
for (int j = pY; j > 0; j--)
{
for (int i = 0; i < BOARD_WIDTH; i++)
{
mBoard[i][j] = mBoard[i][j-1];
}
}
}
Then, there is the function that is causing me problems, which I will explain:
void Board::DeletePossibleLines ()
{
for (int j = 0; j < 20; j++)
{
int i = 0;
while (i < 10)
{
if (mBoard[i][j] != 1) break;
i++;
}
if (i == 10) DeleteLine (j);
}
}
In case you're not familiar, the idea here is to delete a row that consists entirely of 1s. But if (mBoard[i][j] != 1) break; would stop the loop if the first cell wasn't 1. How would the loop reach a row of 1s somewhere in the middle of the mBoard[][] array if break stops it straight away?
Am I missing something here? This is my interpretation of it. Perhaps somebody sees something I don't?
Edit:
Thanks for the replies, appreciated.
You could structure the code like this as well:
for (int j = 0; j < 20; j++)
{
int i = 0;
while (i < 10)
{
if (mBoard[i][j] != 1)
{
break; //only breaks the while loop and will continue with if (i == 10)
}
else
{
i++;
}
}
if (i == 10)
{
DeleteLine (j);
}
}
Now you can clearly see that the break; only interrupts your while loop, not your for loop.
The break will jump out of the while loop. So if you encounter a line which has a non-1 somewhere in the middle, i will be the index of that cell in the line, and the for loop will continue with the next line (j), starting with i=0 again.
break only interrupts one loop, the while loop in your case. The for loop continues happily.
On a side note, this while could easily (and should) be refactored into a for, and can be compacted according to its recognizable for-if-break pattern:
for (int j = 0; j < 20; ++j)
{
int i;
for(i = 0; i < 10 && mBoard[i][j] == 1; ++i);
if (i == 10) DeleteLine (j);
}
So, I read the problem 4.5 from Accelerated C++, and interpreted it rather wrong. I wrote a program which is supposed to display counts of a word in string. However, I have probably done something very stupid, and very wrong. I can't figure it out.
Here's the code: http://ideone.com/87zA7E.
Stack Overflow says links to ideone.com must be accompanied by code. Instead of pasting all of it, I will just paste the function which I think is most likely at fault:
vector<str_info> words(const vector<string>& s) {
vector<str_info> rex;
str_info record;
typedef vector<string>::size_type str_sz;
str_sz i = 0;
while (i != s.size()) {
record.str = s[i];
record.count = 0;
++i; //edit
for (str_sz j = 0; j != s.size(); ++j) {
if (compare(record, s[j]))
++record.count;
}
for (vector<str_info>::size_type k = 0; k != s.size(); ++k) {
if (!compare(record, rex[k].str))
rex.push_back(record);
}
}
return rex;
}
One problem is that you have this:
str_sz i = 0;
while (i != s.size()) {
but you never increment i, leading to an endless loop. Inside of that loop, you're pushing elements into vector rex. A vector cannot contain an infinite number of elements.
Also, you are trying to access:
rex[k].str
in
for (vector<str_info>::size_type k = 0; k != s.size(); ++k) {
if (!compare(record, rex[k].str)) // rex is empty in the beginning!!
rex.push_back(record);
}
But you do not know whether rex has k+1 elements in it.
EDIT: Change your code to:
while (i != s.size()) {
// read new string into a record (initial count should be one).
str_info record;
record.str = s[i];
record.count = 1;
// check if this string already exists in rex
bool found = false;
for (vector<str_info>::size_type k = 0; k < rex.size(); ++k) {
if ( record.str == rex[k].str ) {
rex[k].count++;
found = true;
break;
}
}
i++;
if ( found )
continue;
// if it is not found then push_back to rex
rex.push_back( record );
}