How can I improve multithreading efficiency? - C++

Brief recap: I have a multithreaded C++ sudoku solver whose efficiency I want to improve, and I need your help.
I'm implementing a brute-force sudoku solver with multithreading in C++. The main structure is a tree, and the logic is this: I read the initial matrix from the input, and this becomes my root node. Then I find the first empty cell, find all the numbers that could fit in that position, and for each possibility I create a sub-node of the parent node, so that each node has one child node per possible number. The tree keeps growing this way until a solution is found, that is, a node has no more free cells (so it is full) and its grid satisfies all the rules of sudoku.
I tried to parallelize this algorithm like this: I follow exactly the same logic as above sequentially, but one step at a time, storing all the children found so far (each child is a path, and so a tree) in a vector. If there are fewer children than the number of threads chosen by the user, I solve them sequentially and take one more step (so the number of children grows). When I have more children than threads, I split the children among the threads and start the threads, each with its own part (that is, a tree).
Note that the "brute force" approach and the "only std lib" requirement are mandatory, so I can't solve the problem in a different way, but I can of course change how the logic is implemented.
The question is: how can I improve the efficiency of this program? All suggestions that use only the standard library are welcome.
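#include <atomic>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <list>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>
#define UNASSIGNED 0
#define N 9
#define ERROR_PAIR std::make_pair(-1, -1)
using namespace std;
atomic<bool> solutionFound{false};
//Print the grid, one row per line (helper called below but missing from the original post)
void printGrid(const vector<vector<int>> &grid) {
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
cout << grid[row][col] << " ";
}
cout << endl;
}
}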
//Each node has a sudokuMatrix and some sub-trees
struct Node {
vector<vector<int>> grid;
vector<Node *> child;
};
Node *newNode(const vector<vector<int>> &newGrid) {
Node *temp = new Node;
temp->grid = newGrid;
return temp;
}
//Check if a number can be inserted in a given position
bool canInsert(const int &val, const int &row_, const int &col_,
const vector<vector<int>> &grid) {
// Check column
for (int row = 0; row < N; row++) {
if (grid[row][col_] == val) return false;
}
// Check row
for (int col = 0; col < N; col++) {
if (grid[row_][col] == val) return false;
}
// check box
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
if (row / 3 == row_ / 3 &&
col / 3 == col_ / 3) { // they are in the same square 3x3
if ((grid[row][col] == val)) return false;
}
}
}
return true;
}
//Check if the sudoku is solved
bool isSafe(const vector<vector<int>> &grid)
{
// Hashmap for row column and boxes
unordered_map<int, int> row_[9], column_[9], box[3][3];
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
// mark the element in row column and box
row_[row][grid[row][col]] += 1;
column_[col][grid[row][col]] += 1;
box[row / 3][col / 3][grid[row][col]] += 1;
// if an element is already
// present in the hashmap
if (box[row / 3][col / 3][grid[row][col]] > 1 ||
column_[col][grid[row][col]] > 1 ||
row_[row][grid[row][col]] > 1)
return false;
}
}
return true;
}
//Find the first empty cell
pair<int, int> findCell(const vector<vector<int>> &grid) {
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (grid[i][j] == UNASSIGNED) {
return make_pair(i, j);
}
}
}
return ERROR_PAIR;
}
//Find all the numbers that can be inserted at a given position, and build a copy of the matrix for each one. Return
//the list of all those matrices (one for each possibility).
list<vector<vector<int>>> getChoices(const int &row, const int &col,
const vector<vector<int>> &grid) {
list<vector<vector<int>>> choices;
for (int i = 1; i < 10; i++) {
if (canInsert(i, row, col, grid)) {
// cout << "can insert = " << i << endl;
vector<vector<int>> tmpGrid = grid;
tmpGrid[row][col] = i;
choices.push_back(tmpGrid);
}
}
return choices;
}
//Update the children of a node.
void addChoices(list<vector<vector<int>>> &choices, Node &node) {
while (!choices.empty()) {
node.child.push_back(newNode(choices.front()));
choices.pop_front();
}
return;
}
//Perform one step of computation for each input node, and put all the children in the task vector.
void solveOneStep(vector<Node *> &nodes, const int &nw, vector<Node *> &tasks) {
if (solutionFound) return;
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) {
pair<int, int> freeCell = findCell(n->grid);
list<vector<vector<int>>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
continue;
}
addChoices(choices, *n);
for (Node *&n : n->child) {
tasks.push_back(n);
}
continue;
} else if (isSafe(n->grid)) {
if (!solutionFound.load()) {
solutionFound.store(true);
printGrid(n->grid);
cout << "That's the first solution found !" << endl;
}
return;
}
}
}
//Advance step by step until reaching a level of the tree of nodes where
//the number of nodes is at least the number of workers (nw) chosen by the user.
vector<Node *> splitChunks(Node *root, const int &nw) {
vector<Node *> tasks;
vector<Node *> nodes;
nodes.push_back(root);
while ((int)tasks.size() < nw && !solutionFound) {
tasks.clear();
solveOneStep(nodes, nw, tasks);
nodes = tasks;
}
return tasks;
}
//Recursively solve all the sub-trees of all the input nodes, until a solution is found or no
//solution exists.
void solveSubTree(vector<Node *> &nodes, const int &nw) {
if (solutionFound) return;
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) {
pair<int, int> freeCell = findCell(n->grid);
list<vector<vector<int>>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
continue;
}
addChoices(choices, *n);
solveSubTree(n->child, nw);
} else if (isSafe(n->grid)) {
if (!solutionFound.load()) {
solutionFound.store(true);
printGrid(n->grid);
std::cout << "That's the first solution found !" << endl;
}
return;
}
}
}
int main(int argc, char *argv[]) {
if (argc != 2) {
cout << "Usage is: nw " << endl;
return (-1);
}
//A test matrix.
vector<vector<int>> grid =
{ { 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 8, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0 } };
Node *root = newNode(grid);
vector<thread> tids;
const int nw = atoi(argv[1]); //Number of worker
vector<vector<Node *>> works(nw, vector<Node *>());
vector<Node *> tasks = splitChunks(root, nw);
//Split the tasks for each thread, the main thread takes the last part of the work.
for (int i = 0; i < nw; i++) {
int limit = 0;
i == nw - 1 ? limit = tasks.size() : limit = tasks.size() / nw;
for (int j = 0; j < limit; j++) {
works[i].push_back(tasks.back());
tasks.pop_back();
}
}
//Start each thread; then the main thread starts its own share of the computation.
for (int i = 0; i < nw - 1; i++) {
tids.push_back(thread(solveSubTree, ref(works[i]), ref(nw)));
}
solveSubTree(works[nw - 1], nw); // Main thread does the last part of the work
for (thread &t : tids) {
t.join();
}
std::cout << "end" << endl;
return (0);
}

Here are several points to improve the efficiency of the reference implementation:
Using vector<vector<int>> for a 2D array is far from efficient: it is not contiguous in memory and causes many slow allocations. A single flattened array should be preferred.
unordered_map<int, int> is not needed for set-like operations, since the integers in the sets lie in a small (contiguous) range. A simple array is much faster.
Some copies are not needed and can be removed with std::move.
As the integers in the grid are small, char can be used instead of int (to reduce the memory footprint, keep data in the CPU caches, and possibly make allocations faster).
I see one new but no delete in the code...
The work seems clearly unbalanced between threads in many cases, resulting in slower parallel execution; load balancing should be performed to scale better. One way to do that is to use task scheduling.
One can use heuristics to drastically speed up the exploration. To begin with, I advise you to look into constraint satisfaction problems (CSP), because CSP solvers are known to be very good at solving this kind of puzzle. More general and theoretical information can be found in the book Artificial Intelligence: A Modern Approach.
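The remark about new without delete could be addressed by letting each node own its children. The following is a minimal sketch under that assumption, not the code benchmarked in this answer; OwnedNode, addChild and kCells are hypothetical names introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <vector>

constexpr int kCells = 81;  // 9x9 grid stored as one flat array

// Hypothetical variant of Node whose children are owned by their parent.
struct OwnedNode {
    std::array<char, kCells> grid{};
    std::vector<std::unique_ptr<OwnedNode>> child;  // freed together with the parent
};

// Appends a child holding a copy of `grid` and returns a non-owning pointer to it.
OwnedNode *addChild(OwnedNode &parent, const std::array<char, kCells> &grid) {
    assert(parent.child.size() < 9);  // a cell has at most nine candidate values
    parent.child.push_back(std::make_unique<OwnedNode>());
    parent.child.back()->grid = grid;
    return parent.child.back().get();
}
```

With this layout, destroying the root releases the whole tree, so no explicit delete is needed; worker threads would receive the non-owning pointers.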
Here is code applying the first remarks, resulting in a 5 times faster execution on my machine (note that the grid has been modified in main):
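#include <array>
#include <atomic>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <list>
#include <thread>
#include <utility>
#include <vector>
#define UNASSIGNED 0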
#define N 9
#define ERROR_PAIR std::make_pair(-1, -1)
using namespace std;
void printGrid(const array<char, N*N>& grid)
{
for (int row = 0; row < N; row++)
{
for (int col = 0; col < N; col++)
{
cout << (int)grid[row*N+col] << " ";
}
cout << endl;
}
}
atomic<bool> solutionFound{false};
//Each node has a sudokuMatrix and some sub-trees
struct Node {
array<char, N*N> grid;
vector<Node *> child;
};
Node *newNode(const array<char, N*N> &newGrid) {
Node *temp = new Node;
temp->grid = newGrid;
return temp;
}
//Check if a number can be inserted in a given position
bool canInsert(const int &val, const int &row_, const int &col_,
const array<char, N*N> &grid) {
// Check column
for (int row = 0; row < N; row++) {
if (grid[row*N+col_] == val) return false;
}
// Check row
for (int col = 0; col < N; col++) {
if (grid[row_*N+col] == val) return false;
}
// check box
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
if (row / 3 == row_ / 3 &&
col / 3 == col_ / 3) { // they are in the same square 3x3
if ((grid[row*N+col] == val)) return false;
}
}
}
return true;
}
//Check if the sudoku is solved
bool isSafe(const array<char, N*N> &grid)
{
// No need for a hashmap for row column and boxes,
// just an array since associated values are small integer
char row_[9][N+1] = {0};
char column_[9][N+1] = {0};
char box[3][3][N+1] = {0};
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
// mark the element in row column and box
row_[row][grid[row*N+col]] += 1;
column_[col][grid[row*N+col]] += 1;
box[row / 3][col / 3][grid[row*N+col]] += 1;
// if an element is already
// present in the arrays
if (box[row / 3][col / 3][grid[row*N+col]] > 1 ||
column_[col][grid[row*N+col]] > 1 ||
row_[row][grid[row*N+col]] > 1)
return false;
}
}
return true;
}
//Find the first empty cell
pair<int, int> findCell(const array<char, N*N> &grid) {
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (grid[i*N+j] == UNASSIGNED) {
return make_pair(i, j);
}
}
}
return ERROR_PAIR;
}
//Find all the numbers that can be inserted at a given position, and build a copy of the matrix for each one. Return
//the list of all those matrices (one for each possibility).
list<array<char, N*N>> getChoices(const int &row, const int &col,
const array<char, N*N> &grid) {
list<array<char, N*N>> choices;
for (int i = 1; i < 10; i++) {
if (canInsert(i, row, col, grid)) {
// cout << "can insert = " << i << endl;
array<char, N*N> tmpGrid = grid;
tmpGrid[row*N+col] = i;
choices.push_back(std::move(tmpGrid));
}
}
return choices;
}
//Update the children of a node.
void addChoices(list<array<char, N*N>> &choices, Node &node) {
while (!choices.empty()) {
node.child.push_back(newNode(choices.front()));
choices.pop_front();
}
return;
}
//Perform one step of computation for each input node, and put all the children in the task vector.
void solveOneStep(vector<Node *> &nodes, const int &nw, vector<Node *> &tasks) {
if (solutionFound) return;
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) {
pair<int, int> freeCell = findCell(n->grid);
list<array<char, N*N>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
continue;
}
addChoices(choices, *n);
for (Node *&n : n->child) {
tasks.push_back(n);
}
continue;
} else if (isSafe(n->grid)) {
if (!solutionFound.load()) {
solutionFound.store(true);
printGrid(n->grid);
cout << "That's the first solution found !" << endl;
}
return;
}
}
}
//Advance step by step until reaching a level of the tree of nodes where
//the number of nodes is at least the number of workers (nw) chosen by the user.
vector<Node *> splitChunks(Node *root, const int &nw) {
vector<Node *> tasks;
vector<Node *> nodes;
nodes.push_back(root);
while ((int)tasks.size() < nw && !solutionFound) {
tasks.clear();
solveOneStep(nodes, nw, tasks);
nodes = tasks;
}
return tasks;
}
//Recursively solve all the sub-trees of all the input nodes, until a solution is found or no
//solution exists.
void solveSubTree(vector<Node *> &nodes, const int &nw) {
if (solutionFound) return;
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) {
pair<int, int> freeCell = findCell(n->grid);
list<array<char, N*N>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
continue;
}
addChoices(choices, *n);
solveSubTree(n->child, nw);
} else if (isSafe(n->grid)) {
if (!solutionFound.load()) {
solutionFound.store(true);
printGrid(n->grid);
std::cout << "That's the first solution found !" << endl;
}
return;
}
}
}
int main(int argc, char *argv[]) {
if (argc != 2) {
cout << "Usage is: nw " << endl;
return (-1);
}
//A test matrix.
array<char, N*N> grid =
{ 0, 0, 0, 0, 0, 0, 2, 0, 0,
0, 8, 0, 0, 0, 7, 0, 9, 0,
6, 0, 2, 0, 0, 0, 5, 0, 0,
0, 7, 0, 0, 6, 0, 0, 0, 0,
0, 0, 0, 9, 0, 1, 0, 0, 0,
0, 0, 0, 0, 2, 0, 0, 4, 0,
0, 0, 5, 0, 0, 0, 6, 0, 3,
0, 9, 0, 4, 0, 0, 0, 7, 0,
0, 0, 6, 0, 0, 0, 0, 0, 0 };
Node *root = newNode(grid);
vector<thread> tids;
const int nw = atoi(argv[1]); //Number of worker
vector<vector<Node *>> works(nw, vector<Node *>());
vector<Node *> tasks = splitChunks(root, nw);
//Split the tasks for each thread, the main thread takes the last part of the work.
for (int i = 0; i < nw; i++) {
int limit = 0;
i == nw - 1 ? limit = tasks.size() : limit = tasks.size() / nw;
for (int j = 0; j < limit; j++) {
works[i].push_back(tasks.back());
tasks.pop_back();
}
}
//Start each thread; then the main thread starts its own share of the computation.
for (int i = 0; i < nw - 1; i++) {
tids.push_back(thread(solveSubTree, ref(works[i]), ref(nw)));
}
solveSubTree(works[nw - 1], nw); // Main thread does the last part of the work
for (thread &t : tids) {
t.join();
}
std::cout << "end" << endl;
return (0);
}
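The load-balancing remark can be sketched with only the standard library: split the search into many more sub-trees than threads, then let each thread claim the next unclaimed one from a shared atomic counter. This is an illustrative sketch rather than the measured code; runDynamic, process and stop are hypothetical names, and process(i) stands in for solving sub-tree number i.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs process(i) for every i in [0, taskCount) on nw threads (the caller included).
// Each thread claims the next unprocessed index from a shared atomic counter,
// so a thread that happens to get cheap tasks simply claims more of them.
void runDynamic(std::size_t taskCount, int nw,
                const std::function<void(std::size_t)> &process,
                const std::atomic<bool> &stop) {
    assert(nw >= 1);
    std::atomic<std::size_t> next{0};
    auto worker = [&]() {
        while (!stop.load()) {                  // stop early once a solution exists
            const std::size_t i = next.fetch_add(1);
            if (i >= taskCount) break;          // nothing left to claim
            process(i);
        }
    };
    std::vector<std::thread> threads;
    for (int t = 0; t < nw - 1; ++t) threads.emplace_back(worker);
    worker();                                   // the calling thread takes part too
    for (std::thread &t : threads) t.join();
}
```

In the solver above, stop would be solutionFound and process(i) would call solveSubTree on a vector holding only tasks[i]. splitChunks would then have to produce noticeably more tasks than nw (how many more is a tuning assumption) so that there is something to rebalance.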

After reading the first paragraph: your approach is very inefficient; multithreading won't save it.
Consider this: There are 4 x 81 questions you could ask: Where in column c does the number n go? Where in row r does the number n go? Where in 3x3 box b does the number n go? Which number goes into the cell at column c, row r?
If k numbers have been written down, 4k of these questions are already answered. For the rest, find the number of possible valid answers according to the Sudoku rules. If a question has no valid answers, there is no solution and you backtrack. If a question has one valid answer, you pick that answer. Otherwise you try the answers for a question with 2 possible answers in turn, or for a question with 3 possible answers if there are none with 1 or 2 answers, etc.
For most newspaper problems there will be very little backtracking.
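The selection rule above can be sketched for one of the four question types, "which number goes into this cell?". This is a simplified illustration, not the full scheme: it ignores the row, column and box questions. Grid, countCandidates and findMostConstrainedCell are hypothetical names, and the grid is assumed to be stored row by row with 0 marking an empty cell.

```cpp
#include <array>
#include <cassert>
#include <utility>

constexpr int kSize = 9;
using Grid = std::array<char, kSize * kSize>;  // row-major, 0 = empty

// Number of values 1..9 that clash with nothing in the row, column or box of (row, col).
int countCandidates(const Grid &g, int row, int col) {
    assert(row >= 0 && row < kSize && col >= 0 && col < kSize);
    bool used[kSize + 1] = {false};
    for (int k = 0; k < kSize; ++k) {
        used[static_cast<int>(g[row * kSize + k])] = true;  // same row
        used[static_cast<int>(g[k * kSize + col])] = true;  // same column
    }
    const int r0 = row - row % 3, c0 = col - col % 3;
    for (int r = r0; r < r0 + 3; ++r)
        for (int c = c0; c < c0 + 3; ++c)
            used[static_cast<int>(g[r * kSize + c])] = true;  // same 3x3 box
    int count = 0;
    for (int v = 1; v <= kSize; ++v)
        if (!used[v]) ++count;
    return count;
}

// Returns the empty cell with the fewest candidates, or {-1, -1} if the grid is full.
// If the returned cell has zero candidates, the current branch is a dead end.
std::pair<int, int> findMostConstrainedCell(const Grid &g) {
    std::pair<int, int> best{-1, -1};
    int bestCount = kSize + 1;
    for (int row = 0; row < kSize; ++row) {
        for (int col = 0; col < kSize; ++col) {
            if (g[row * kSize + col] != 0) continue;
            const int count = countCandidates(g, row, col);
            if (count < bestCount) {
                bestCount = count;
                best = {row, col};
            }
        }
    }
    return best;
}
```

Used in place of "first empty cell", this picks cells with a single candidate before any cell that requires guessing, which is where the reduction in backtracking comes from.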

Related

munmap_chunk() invalid pointer after successful merge sort

I've run into a problem with my recursive merge sort implementation. The vector I'm feeding to the function gets sorted just fine but then the program terminates with an munmap_chunk(): invalid pointer error.
#include <iostream>
#include <vector>
#include <fstream>
#include <time.h>
#include <stdlib.h>
using namespace std;
// function prototypes
void insertionSort(vector<int> &target, int first, int last);
void mergeSort(vector<int> &target);
void merge(vector<int> &target, vector<int> &left, vector<int> &right);
int medianThree(vector<int> &avector, int left, int right);
int ninther(vector<int> &avector, int left, int right);
void printVector(vector<int> &avector);
void generateRandom(vector<int> &avector, int count, char t);
void reverseVector(vector<int> &avector);
int main() {
// initialize random seed from system clock
srand(time(NULL));
vector<int> myVect = { 3, 5, 1, 9, 0, -3, -1, 44, 14, 420, 69, 305, 7 };
mergeCutoff(myVect, 5);
printVector(myVect);
// generates inversely sorted vectors of sizes 10-49 (10 times each) and uses basic quicksort with lazy
// pivot to sort it. after 10 loops at the same vector size, divide the total comparisons/swaps/memory
// costs by 10 and record this average to text file
for (int vectSize = 10; vectSize < 100; vectSize++) {
resetCounters();
for (int experiment = 1; experiment <= 4; experiment++) {
if (experiment == 1) {
for (int loop = 0; loop < 10; loop++) {
vector<int> badVect(vectSize);
generateRandom(badVect, vectSize, 'I');
quickSort(badVect, 0, badVect.size() - 1, 'l');
}
} // end experiment 1
if (experiment == 2) {
for (int loop = 0; loop < 10; loop++) {
vector<int> badVect(vectSize);
generateRandom(badVect, vectSize, 'I');
quickSort(badVect, 0, badVect.size() - 1, 'm');
}
for (int loop = 0; loop < 10; loop++) {
vector<int> badVect(vectSize);
generateRandom(badVect, vectSize, 'I');
quickSort(badVect, 0, badVect.size() - 1, 'n');
}
} // end experiment 2
if (experiment == 3) {
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'R');
quickDual(randVect, 0, randVect.size() - 1);
} // dual pivot runs
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'R');
quickSort(randVect, 0, randVect.size(), 'm');
}
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'R');
quickSort(randVect, 0, randVect.size(), 'n');
}
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'R');
mergeSort(randVect);
} // merge sort runs
} // end experiment 3
if (experiment == 4) {
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'D');
quickDual(randVect, 0, randVect.size() - 1);
} // dual pivot sort runs
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'D');
quickEntropy(randVect, 0, randVect.size() - 1, 'n');
} // three way partition sort runs
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'D');
quickSort(randVect, 0, randVect.size() - 1, 'm');
} // median of threes normal quicksort runs
for (int loop = 0; loop < 10; loop++) {
vector<int> randVect(vectSize);
generateRandom(randVect, vectSize, 'D');
quickSort(randVect, 0, randVect.size() - 1, 'n');
} // ninther pivot normal quicksort runs
}
}
}
return 0;
} // end main method
void insertionSort(vector<int> &target, int first, int last) {
addMem(8);
for (unsigned index = first + 1; index <= last; index++) {
addMem(8);
int currentValue = target[index];
int position = index;
comparisons++;
while (position > first && target[position - 1] > currentValue) {
target[position] = target[position - 1];
position--;
swaps++;
}
target[position] = currentValue;
swaps++;
memory -= 8;
}
memory -= 8;
return;
} // end insertion sort method, some quick sort implementations will switch to this when subarrays get small enough
void mergeSort(vector<int> &target) {
addMem(4);
int size = target.size();
if (size > 1) {
addMem(4);
int mid = size / 2;
addMem(4 * size);
vector<int> leftHalf(target.begin(), target.begin() + mid);
vector<int> rightHalf(target.begin() + mid, target.end());
mergeSort(leftHalf);
mergeSort(rightHalf);
merge(target, leftHalf, rightHalf);
memory -= (4 * size);
}
memory -= 4;
return;
} // end merge sort function
void merge(vector<int> &target, vector<int> &left, vector<int> &right) {
addMem(12);
unsigned targetIndex = 0;
unsigned leftIndex = 0;
unsigned rightIndex = 0;
while (leftIndex < left.size() && rightIndex < right.size()) {
if (left[leftIndex] < right[rightIndex]) {
target[targetIndex] = left[leftIndex];
leftIndex++;
}
else {
target[targetIndex] = right[rightIndex];
rightIndex++;
}
targetIndex++;
} // loop here until one of the lists is exhausted
// then loop through the non-exhausted list adding its remaining elements to the target array
while (leftIndex < left.size()) {
target[targetIndex] = left[leftIndex];
leftIndex++;
targetIndex++;
}
while (rightIndex < right.size()) {
target[targetIndex] = right[rightIndex];
rightIndex++;
targetIndex++;
}
memory -= 12;
} // end merge method, interleaves two vectors into single sorted vector (target)
int medianThree(vector<int> &avector, int left, int right) {
addMem(12);
int center = (left + right) / 2;
comparisons += 3;
if (avector[left] > avector[center]) { swap(avector[left], avector[center]); swaps++; }
if (avector[left] > avector[right]) { swap(avector[left], avector[right]); swaps++; }
if (avector[center] > avector[right]) { swap(avector[center], avector[right]); swaps++; }
swaps++;
swap(avector[center], avector[right]);
memory -= 12;
return avector[right];
} // end helper method to get the median of three of a given (sub)vector, also places first, mid and last items in correct partitions
int ninther(vector<int> &avector, int left, int right) {
addMem(16);
unsigned firstThird = left + ((right - left) / 3);
unsigned secondThird = firstThird + ((right - left) / 3);
medianThree(avector, left, firstThird);
medianThree(avector, firstThird + 1, secondThird);
medianThree(avector, secondThird + 1, right);
memory -= 16;
return medianThree(avector, left, right);
} // end helper method to get the ninther value of an array
void printVector(vector<int> &avector) {
cout << "Vector contains:";
for (int i = 0; i < avector.size(); i++) {
cout << " " << avector[i];
if (i < avector.size() - 1) {
cout << ",";
}
}
cout << " (" << avector.size() << " items)" << endl;
} // end function to print all of a vector's elements to terminal
void generateRandom(vector<int> &avector, int count, char t) {
int min = 0;
int max = 500;
for (int i = 0; i < count; i++) {
int num = (rand() % max) + min;
// if partial sort is selected and current index isn't divisible by 3
// adjust min and max values so those items will be sequentially sorted
if (t == 'I' || (t == 'P' && (i % 3 != 0))) {
min = num;
max += 300;
}
// in any case, add value of num to array at next position
avector[i] = num;
// having reached the end of the inverse-sorted list, flip the list
if (t == 'I' && i == count - 1) { reverseVector(avector); }
}
if (t == 'D') {
for (int i = 0; i < (avector.size() / 3); i++) {
int index = rand() % avector.size();
int index2 = rand() % avector.size();
avector[index] = avector[i];
avector[index2] = avector[i];
}
}
return;
} // modifies vector to populate with [count] random numbers, possibly sorted depending on t value
void reverseVector(vector<int> &avector) {
int last = avector.size() - 1;
for (int i = 0; i < avector.size() / 2; i++) {
swap(avector[i], avector[last - i]);
}
return;
} // helper method to put ascending-ordered vectors in descending order
I used my debugger to see where the issue was occurring and it's happening after the function has finished executing. The new vectors get created just fine, the recursive mergeSort calls go through just fine and the subvectors get sorted appropriately. The merge call at the end works properly and I can see either by printing the vector contents inside the function or by checking my variable tracker in the debugger that the entire list is sorted. And then just before the function returns, execution moves back to the subvectors and throws the munmap error.
I've read on similar posts here that this error has to do with freeing memory pointed to by an invalid pointer. I don't know why my subvectors would be invalid—they don't get destroyed by the merge function, and there are no issues with the subvectors at any recursive level—or what I can do to fix this. Any thoughts?
The function also goes off without an error when called at the top of main.
Apologies for the code dump. The sections I've included earlier were either incomplete or uncompilable. I'm not sure how to whittle down to only the relevant blocks when I don't know where the issue lies. Hopefully this is more useful.
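This error typically indicates that the heap was already corrupted by the time a vector frees its buffer, so one way to narrow the search is to replace operator[] with the bounds-checked at(), which throws std::out_of_range at the offending access instead of silently writing past the end. The sketch below applies that to the merge step; checkedMerge is a hypothetical name, and nothing here shows that merge itself is at fault. The same check could be applied to the quickSort variants, which are not shown in the post; note that some calls in main pass randVect.size() as the last index while others pass randVect.size() - 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Merges two sorted vectors into target. Every write goes through at(), so a
// target that is too small throws std::out_of_range instead of corrupting the heap.
void checkedMerge(std::vector<int> &target, const std::vector<int> &left,
                  const std::vector<int> &right) {
    assert(std::is_sorted(left.begin(), left.end()));
    assert(std::is_sorted(right.begin(), right.end()));
    std::size_t t = 0, l = 0, r = 0;
    while (l < left.size() && r < right.size()) {
        if (left[l] < right[r]) target.at(t++) = left[l++];
        else target.at(t++) = right[r++];
    }
    // Copy whatever remains in the list that was not exhausted.
    while (l < left.size()) target.at(t++) = left[l++];
    while (r < right.size()) target.at(t++) = right[r++];
}
```

Once the faulty access is found, the checks can be reverted to operator[] if their cost matters for the measurements.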

memory leak with work pool in multithreading

After adding a queue for a work pool, in which I put the jobs and get them with a unique_lock, I get memory leak errors, but I can't find where I am missing a delete.
Simple logic: I have a farm; it splits the work and gives it to the threads; the threads do the computation and push the results into the queue; then the emitter node splits the job if needed and sends it to the threads again.
Here is the actual code, and I also post the error from -fsanitize=address; the code is runnable, so you can try it with your favorite profiling tool.
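For reference, the kind of work pool described above can be sketched with the standard library alone (C++17 for std::optional). This is an assumption-laden illustration, not the FastFlow code that follows: WorkQueue, Batch and Item are hypothetical names, and Item stands in for Node. Batches hold std::unique_ptr, so anything still queued when the queue is destroyed is released automatically, and the emptiness check happens under the same lock as the pop.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

struct Item { int value; };                        // hypothetical stand-in for Node
using Batch = std::vector<std::unique_ptr<Item>>;  // one unit of work

class WorkQueue {
public:
    // Takes ownership of a non-empty batch.
    void push(Batch batch) {
        assert(!batch.empty());
        std::lock_guard<std::mutex> lock(mtx_);
        batches_.push_back(std::move(batch));
    }
    // Returns the most recently pushed batch, or std::nullopt if the queue is empty.
    std::optional<Batch> tryPop() {
        std::lock_guard<std::mutex> lock(mtx_);
        if (batches_.empty()) return std::nullopt;
        Batch batch = std::move(batches_.back());  // move, so ownership is transferred
        batches_.pop_back();
        return std::optional<Batch>(std::move(batch));
    }
private:
    std::mutex mtx_;
    std::vector<Batch> batches_;
};
```

Because ownership moves with each batch, every Item has exactly one owner at any time, which makes it easier to see where a node is supposed to be freed.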
#include <iostream>
#include <unistd.h>
#include <typeinfo>
#include <chrono>
#include <thread>
#include <ff/ff.hpp>
#include <ff/pipeline.hpp>
#include <ff/farm.hpp>
#include <mutex>
#include <atomic>
#include <list>
#include <array>
#include <math.h>
#include <memory>
#include <vector>
#define UNASSIGNED 0
#define N 9
#define ERROR_PAIR std::make_pair(-1, -1)
using namespace std;
using namespace ff;
atomic<bool> solutionFound{false};
mutex mtx;
// Declaration for a tree node
struct Node {
array<unsigned char, N * N> grid;
vector<Node *> child;
};
vector<vector<Node *>> queueWork(0, vector<Node *>(0));
// Utility function to create a new tree node
Node *newNode(const array<unsigned char, N * N> &newGrid) {
Node *temp = new Node;
temp->grid = newGrid;
return temp;
}
void printGrid(const array<unsigned char, N * N> &grid) {
for (int row = 0; row < N; row++) {
if (row == 3 || row == 6) {
cout << "---------------------" << endl;
}
for (int col = 0; col < N; col++) {
if (col == 3 || col == 6) {
cout << "| ";
}
cout << (int)grid[row + col * N] << " ";
}
cout << endl;
}
}
bool canInsert(const int &val, const int &row_, const int &col_,
const array<unsigned char, N * N> &grid) {
// Check column
for (int row = 0; row < N; row++) {
if (grid[row + col_ * N] == val) return false;
}
// check row
for (int col = 0; col < N; col++) {
if (grid[row_ + col * N] == val) return false;
}
// check box
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
if (row / 3 == row_ / 3 &&
col / 3 == col_ / 3) { // they are in the same square 3x3
if ((grid[row + col * N] == val)) return false;
}
}
}
return true;
}
// vector<vector<int>> gridTest(9, vector<int>(9,0)); the vector must be
// initialized like this. n = how many numbers you want to initialize the matrix
// with
void generateMatrix(const int &seed, const int &n,
array<unsigned char, N * N> &grid) {
srand(seed);
int i = 0;
while (i < n) {
int row = rand() % 9;
int col = rand() % 9;
int val = rand() % 9 + 1;
if (grid[row + col * N] == UNASSIGNED &&
canInsert(val, row, col, grid)) {
grid[row + col * N] = val;
i++;
}
}
return;
}
bool isSolution(
const array<unsigned char, N * N> &grid) // check if the sudoku is solved
{
char row_[9][N + 1] = {0};
char column_[9][N + 1] = {0};
char box[3][3][N + 1] = {0};
for (int row = 0; row < N; row++) {
for (int col = 0; col < N; col++) {
// mark the element in row column and box
row_[row][grid[row + col * N]] += 1;
column_[col][grid[row + col * N]] += 1;
box[row / 3][col / 3][grid[row + col * N]] += 1;
// if an element is already
// present in the arrays
if (box[row / 3][col / 3][grid[row + col * N]] > 1 ||
column_[col][grid[row + col * N]] > 1 ||
row_[row][grid[row + col * N]] > 1)
return false;
}
}
return true;
}
pair<int, int> findCell(const array<unsigned char, N * N> &grid) {
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (grid[i + j * N] == UNASSIGNED) {
return make_pair(i, j);
}
}
}
return ERROR_PAIR;
}
void addChoices(list<array<unsigned char, N * N>> &choices, Node &node) {
while (!choices.empty()) {
node.child.push_back(newNode(choices.front()));
choices.pop_front();
}
return;
}
list<array<unsigned char, N * N>> getChoices(
const int &row, const int &col, const array<unsigned char, N * N> &grid) {
list<array<unsigned char, N * N>> choices;
for (int i = 1; i < 10; i++) {
if (canInsert(i, row, col, grid)) {
array<unsigned char, N *N> tmpGrid = grid;
tmpGrid[row + col * N] = i;
choices.push_back(move(tmpGrid));
}
}
return choices;
}
// Perform one step of computation for each input node, and put all the
// children in the task vector.
void solveOneStep(vector<Node *> &nodes, vector<Node *> &tasks) {
// std::this_thread::sleep_for(std::chrono::milliseconds(2000));
// std::this_thread::sleep_for(std::chrono::milliseconds(100));
if (solutionFound) {
for (Node *&t : nodes) {
delete t;
}
return;
}
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) {
pair<int, int> freeCell = findCell(n->grid);
list<array<unsigned char, N *N>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
delete n;
continue;
}
addChoices(choices, *n);
for (Node *&n : n->child) { //Store all the children in tasks
tasks.push_back(n);
}
delete n;
continue;
} else if (isSolution(n->grid)) {
if (!solutionFound.load()) {
solutionFound.store(true);
printGrid(n->grid);
cout << "That's the first solution found !" << endl;
}
delete n;
return;
}
}
}
//Start the computation sequentially, until we have enough work to start all the threads together
vector<Node *> findChunks(Node *root, const int &nw) {
vector<Node *> tasks;
vector<Node *> nodes;
nodes.push_back(root);
while ((int)tasks.size() < nw && !solutionFound) {
tasks.clear();
solveOneStep(nodes, tasks);
if (tasks.empty()) {
vector<Node *> error;
cout << "error" << endl;
return error;
}
nodes = tasks;
}
return tasks;
}
//Assign each part of the work to each worker
vector<vector<Node *>> splitChunks(vector<Node *> &tasks, int nw) {
int freeWorker = nw;
vector<vector<Node *>> works(nw, vector<Node *>());
for (int i = 0; i < nw; i++) {
int limit = 0;
i == nw - 1 ? limit = tasks.size()
: limit = ceil(tasks.size() / double(freeWorker));
for (int j = 0; j < limit; j++) {
works[i].push_back(tasks.back());
tasks.pop_back();
}
freeWorker--;
}
return works;
}
vector<Node *> solveTest(vector<Node *> &nodes) {
std::this_thread::sleep_for(std::chrono::milliseconds(10));
vector<Node *> results;
if (solutionFound) {
for (Node *&t : nodes) {
delete t;
}
return results;
}
for (Node *&n : nodes) {
if (findCell(n->grid) != ERROR_PAIR) { //There is an empty cell
pair<int, int> freeCell = findCell(n->grid);
list<array<unsigned char, N *N>> choices =
getChoices(freeCell.first, freeCell.second, n->grid);
if (choices.empty()) {
delete n;
continue;
}
addChoices(choices, *n); //Update the tree
for (Node *&child : n->child) {
results.push_back(child);
};
delete n;
continue;
} else if (isSolution(n->grid) && !solutionFound.load()) { //Grid is full, check for a solution
solutionFound = true;
printGrid(n->grid);
cout << "That's the first solution found !" << endl;
delete n;
return results;
} else { //Grid full but it's not a solution
delete n;
continue;
}
}
return results;
}
//Get a work from the queue
vector<Node *> getWork() {
unique_lock<mutex> lck(mtx);
auto tmp = queueWork.back();
queueWork.pop_back();
lck.unlock();
return tmp;
}
//Put a work in the queue
void pushWork(vector<Node *> &work) {
unique_lock<mutex> lck(mtx);
queueWork.push_back(work);
lck.unlock();
return;
}
struct firstThirdStage : ff_node_t<vector<Node *>> {
firstThirdStage(Node *root, const int nw) : root(root), nw(nw) {}
vector<Node *> *svc(vector<Node *> *task) {
if (task == nullptr) {
vector<Node *> tasks = findChunks(root, nw);
if (tasks.empty() && !solutionFound) { //No more moves to do, no solution.
cout << "This sudoku is unsolvable!" << endl;
delete task;
return EOS;
}
vector<vector<Node *>> works = splitChunks(tasks, nw);
for (size_t i = 0; i < works.size(); ++i) {
ff_send_out(new vector<Node *>(works[i]));
}
delete task;
return GO_ON;
}
//cout << threadSus << endl;
if (solutionFound.load()) { //After the first iteration
delete task;
return EOS;
} else {
if (!queueWork.empty()) {
vector<Node *> tmp;
tmp = getWork();
ff_send_out(new vector<Node *>(tmp));
delete task;
return GO_ON;
} else
if (++threadSus == nw) {
cout << "This sudoku is unsolvable!" << endl;
delete task;
return EOS;
}
}
delete task;
return GO_ON;
}
void svc_end() {
cout << "Done !" << endl;
}
Node *root;
const int nw;
int threadSus = 0; //Threads suspended
};
struct secondStage : ff_node_t<vector<Node *>> {
vector<Node *> *svc(vector<Node *> *task) {
vector<Node *> &t = *task;
vector<Node *> res = solveTest(t);
if (!res.empty()) {
pushWork(res);
} else {
for (auto &t : res){
delete t;
}
}
return task;
}
};
int main(int argc, char *argv[]) {
chrono::high_resolution_clock::time_point t1 =
chrono::high_resolution_clock::now();
array<unsigned char, N *N> grid = {
3, 0, 6, 5, 0, 8, 4, 0, 0, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 8, 7,
0, 0, 0, 0, 3, 1, 0, 0, 3, 0, 1, 0, 0, 8, 0, 9, 0, 0, 8, 6, 3,
0, 0, 5, 0, 5, 0, 0, 9, 0, 6, 0, 0, 1, 3, 0, 0, 0, 0, 2, 5, 0,
0, 0, 0, 0, 0, 0, 0, 7, 4, 0, 0, 5, 2, 0, 6, 3, 0, 0};
array<unsigned char, N *N> testGrid2 = {
0, 0, 0, 5, 7, 8, 4, 9, 2, 0, 0, 0, 1, 3, 4, 7, 6, 8, 0, 0, 0,
6, 2, 9, 5, 3, 1, 2, 6, 3, 0, 1, 5, 9, 8, 7, 9, 7, 4, 8, 6, 0,
1, 2, 5, 8, 5, 1, 7, 9, 2, 6, 4, 3, 1, 3, 8, 0, 4, 7, 2, 0, 6,
6, 9, 2, 3, 5, 1, 8, 7, 4, 7, 4, 5, 0, 8, 6, 3, 1, 0};
if (argc < 2) {
std::cerr << "use: " << argv[0] << " nworkers\n";
return -1;
}
array<unsigned char, N *N> testGrid = {0};
generateMatrix(12,20, testGrid);
Node *root = newNode(testGrid);
const size_t nworkers = std::stol(argv[1]);
firstThirdStage firstthird(root, nworkers);
std::vector<std::unique_ptr<ff_node>> W;
for (size_t i = 0; i < nworkers; ++i)
W.push_back(make_unique<secondStage>());
ff_Farm<vector<Node *>> farm(std::move(W), firstthird);
farm.remove_collector(); // needed because the collector is present by
// default in the ff_Farm
farm.wrap_around(); // this call creates feedbacks from Workers to the
// Emitter
// farm.set_scheduling_ondemand(); // optional
ffTime(START_TIME);
if (farm.run_and_wait_end() < 0) {
error("running farm");
return -1;
}
ffTime(STOP_TIME);
std::cout << "Time: " << ffTime(GET_TIME) << "\n";
chrono::high_resolution_clock::time_point t2 =
chrono::high_resolution_clock::now();
chrono::duration<double> time_span =
chrono::duration_cast<chrono::duration<double>>(t2 - t1);
std::cout << "It took me " << time_span.count() << " seconds." << endl;
return (0);
}
At least one cause of a leak is found using https://github.com/vmware/chap (free open source) as follows:
Gather a live core of your program just before it returns from main (for example, by using gdb to set a breakpoint there, then using gdb's "generate-core-file" command, or its alias "gcore", to write the core).
Open the core from chap and do the following from the chap prompt:
chap> count leaked
699 allocations use 0x147a8 (83,880) bytes.
That shows you that there are 699 leaked allocations.
chap> count unreferenced
692 allocations use 0x14460 (83,040) bytes.
That shows you that of the leaked allocations, all but 7 of those allocations are not referenced by any other leaked allocations.
chap> count unreferenced /extend ~>
699 allocations use 0x147a8 (83,880) bytes.
That shows you that all the leaked allocations can be reached from those unreferenced allocations, so if we understand the unreferenced allocations we understand the whole leak.
chap> summarize unreferenced
Unrecognized allocations have 692 instances taking 0x14460(83,040) bytes.
Unrecognized allocations of size 0x78 have 692 instances taking 0x14460(83,040) bytes.
692 allocations use 0x14460 (83,040) bytes.
That shows you that the unreferenced allocations are all size 0x78.
chap> redirect on
That says that the output of any subsequent commands should be redirected to files until the next "redirect off" command.
chap> show unreferenced
Wrote results to core.21080.show_unreferenced
That command shows all the unreferenced allocations and, since redirect was on, that output went to the specified file.
If we look in that output file we see that all the allocations look like this:
Used allocation at 7f0f74009870 of size 78
0: 806070304010209 108060503040705 109020508060902 509060308030407
20: 702010501070204 907040403090608 502060508020103 108010309040607
40: 206050307080409 904050201080603 7 0
60: 0 0 80
By inspection of the code, each of these objects is a Node: the 9*9 std::array occupies the first 81 (0x51) bytes at offset 0, padded to 0x58, followed by 0x18 bytes for the vector header at offset 0x58. In the node shown here the header is all zeros because the vector is empty and so does not need a buffer. The other thing you can see in this node is that it holds a fully filled-in grid, because the first 81 bytes are all non-zero.
The above information is sufficient to determine that at least one cause of a leak is in solveOneStep, where it mis-handles the case where the grid is fully filled in but is not a solution, because in that case it simply forgets about n.
I'll leave finding any other causes of this leak to you, keeping in mind that all the leaked objects are fully filled-in grids, but not necessarily solutions.
Well, the problem starts with having to call delete by hand at all; using std::unique_ptr would have solved the problem for you.
It seems you're forgetting to delete tmp?
if (!queueWork.empty()) {
vector<Node *> tmp;
tmp = getWork(); // <--- bug part 1: tmp is never deleted
ff_send_out(new vector<Node *>(tmp));
delete task; // <--- bug part 2: what is deleted here???
return GO_ON;
} else
if (++threadSus == nw) {
cout << "This sudoku is unsolvable!" << endl;
delete task;
return EOS;
}

Finding two associated indexes where the sum of two elements equals a target value

Background:
Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
Example:
Given nums = [2, 7, 11, 15], target = 9,
Because nums[0] + nums[1] = 2 + 7 = 9,
return [0, 1].
Question:
I have a list of numbers 1,2,3,4,5. My target value is 8, so I should return indices 2 and 4. My first thought is to write a double for loop that checks whether adding two elements from the list gives my target value. However, when checking whether such a solution exists, my code reports that there is none.
Here is my code:
#include <iostream>
#include <vector>
using namespace std;
int main() {
vector<int> list;
list.push_back(1);
list.push_back(2);
list.push_back(3);
list.push_back(4);
list.push_back(5);
int target = 8;
string result;
for(int i = 0; i < list.size(); i++) {
for(int j = i+1; j < list.size(); j++) {
if(list[i] + list[j] == target) {
result = "There is a solution";
}
else {
result = "There is no solution";
}
}
}
cout << result << endl;
return 0;
}
Perhaps my approach/thinking is plain wrong. Could anyone provide any hints or suggestions to solving this problem?
Your approach is correct but you are forgetting you are in a loop that continues after finding the solution.
This will get you halfway there. I recommend putting both loops in a function, and returning once you find a match. One thing you could do is return a pair<int,int> from that function or you could simply output the results from within that point in the loop.
bool solutionFound = false;
int i,j;
for(i = 0; i < list.size(); i++)
{
for(j = i+1; j < list.size(); j++)
{
if(list[i] + list[j] == target)
{
solutionFound = true;
}
}
}
Here is what the function approach might look like:
pair<int, int> findSolution(vector<int> list, int target)
{
for (int i = 0; i < list.size(); i++)
{
for (int j = i + 1; j < list.size(); j++)
{
if (list[i] + list[j] == target)
{
return pair<int, int>(i, j);
}
}
}
return pair<int, int>(-1, -1);
}
int main() {
vector<int> list;
list.push_back(1);
list.push_back(2);
list.push_back(3);
list.push_back(4);
list.push_back(5);
int target = 8;
pair<int, int> results = findSolution(list, target);
cout << results.first << " " << results.second << "\n";
return 0;
}
Here's the C++ incorporating Dave's answer for linear execution time and a couple helpful comments:
pair<int, int> findSolution(vector<int> list, int target)
{
unordered_map<int, int> valueToIndex;
for (int i = 0; i < list.size(); i++)
{
int needed = target - list[i];
auto it = valueToIndex.find(needed);
if (it != valueToIndex.end())
{
return pair<int, int>(it->second, i);
}
valueToIndex.emplace(list[i], i);
}
return pair<int, int>(-1, -1);
}
int main()
{
vector<int> list = { 1,2,3,4,5 };
int target = 10;
pair<int, int> results = findSolution(list, target);
cout << results.first << " " << results.second << "\n";
}
You're doing this in n^2 time. Solve it in linear time by hashing each element, and checking for each element whether its complement with respect to the target total is already in the hash.
E.g., for 1,2,3,4,5, with a target of 8
indx 0, val 1: 7 isn't in the map; H[1] = 0
indx 1, val 2: 6 isn't in the map, H[2] = 1
indx 2, val 3: 5 isn't in the map, H[3] = 2
indx 3, val 4: 4 isn't in the map, H[4] = 3
indx 4, val 5: 3 is in the map. H[3] = 2. Return 2,4
Code, as requested (Ruby)
def get_indices(arr, target)
value_to_index = {}
arr.each_with_index do |val, index|
if value_to_index.has_key?(target - val)
return [value_to_index[target - val], index]
end
value_to_index[val] = index
end
nil # no matching pair found (otherwise each_with_index would return arr itself)
end
get_indices([1,2,3,4,5], 8)
Basically the same as zzxyz's most recent edit but a little quicker and dirtier.
#include <iostream>
#include <vector>
bool FindSolution(const std::vector<int> &list, // const reference. Less copying
int target)
{
for (auto i = list.begin(); i != list.end(); ++i)
{
for (auto j = i + 1; j != list.end(); ++j) // start after i so an element is never paired with itself
{
if (*i + *j == target) // *i and *j are the numbers from the vector.
// no index arithmetic needed
{
return true;
}
}
}
return false;
}
int main()
{
std::vector<int> list{1,2,3,4,5}; // Uniform initialization Added in C++11.
// No need for push-backs of fixed data
if (FindSolution(list, 8))
{
std::cout << "There is a solution\n";
}
else
{
std::cout << "There is no solution\n";
}
return 0;
}

C++ vector values keep changing?

This is a real simple problem. I'm writing a sliding block puzzle game for an exercise.
1, 1, 1, 1, 1,
1, 0, 3, 4, 1,
1, 0, 2, 2, 1,
1, 1, 1, 1, 1,
It receives input as in the form above, with '0' representing empty spaces, '1' representing walls, and all other numbers representing blocks.
Here is the class definition and constructor for the game state:
class GameState {
public:
GameState(int hght, int wdth);
GameState(const GameState &obj);
~GameState();
int getHeight();
int getWidth();
int getElem(int i, int j);
void setElem(int i, int j, int val);
void display();
void readFile(char* filename);
bool checkSolved();
map<int, vector<int*> > blockLocations;
vector<int> blockList;
void getBlockLocations();
void findBlock(int n);
private:
int **grid;
int height, width;
void allocate() {
grid = new int*[height];
for(int i = 0; i < height; i++)
{
grid[i] = new int[width];
}
}
};
GameState::GameState(int hght, int wdth) {
height = hght;
width = wdth;
allocate();
for(int i = 0; i < hght; i++) {
for (int j = 0; j < wdth; j++) {
grid[i][j] = 0;
}
}
};
Essentially, the grid is represented by a two-dimensional pointer array of integers. height and width are self-explanatory; blockLocations is a map that maps a block number to its point-wise coordinates of the form (y, x). For the time being, if a block occupies multiple spaces only the lowest rightmost space is listed. The matrix initializes as being nothing but zeros; the actual values are read in from a csv.
All of these methods are defined, but the two methods of concern are getBlockLocations() and findBlock(int n).
void GameState::getBlockLocations() {
for (int i = 0; i < height; i++) {
for (int j = 0; j < width; j++) {
blockList.push_back(grid[i][j]);
int pos[2] = {i, j};
vector<int*> v;
v.push_back(pos);
blockLocations[grid[i][j]] = v;
}
}
}
void GameState::findBlock(int n) {
vector<int>::iterator it;
it = find(blockList.begin(), blockList.end(), n);
if (it != blockList.end()) {
vector<int*> * posList = &blockLocations[n];
for (int itr = 0; itr < posList->size(); itr++) {
vector<int*> curPos = *posList;
cout << curPos[itr][0] << ", " << curPos[itr][1] << endl;
}
}
}
The problem comes up when I actually run this. As a case example, when I run getBlockLocations(), it correctly stores the coordinate for '2' as (2, 3). However, when I ask the program to display the location of that block with findBlock(2), the resulting output is something along the lines of (16515320, 0). It's different every time but never correct. I don't see the pointer mistake I'm making to get incorrect values like this.
This loop is the problem:
for (int j = 0; j < width; j++) {
blockList.push_back(grid[i][j]);
int pos[2] = {i, j};
vector<int*> v;
v.push_back(pos);
blockLocations[grid[i][j]] = v;
}
pos is a local array, and you store a pointer to it rather than a copy of its contents. Its lifetime ends at the end of each loop iteration, so every stored pointer dangles and the memory it points to can be overwritten by something else.
(Actually, as Barmar pointed out, since the address of pos is typically the same on every iteration, the stored values change at each iteration.)
You could use a std::pair<int,int> to store your values instead.
When you insert the pair in the vector, the data is copied, not only the pointer: it is safe.
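typedef std::pair<int,int> IntIntPair;
IntIntPair pos(i,j);
std::vector<IntIntPair> v;
v.push_back(pos); // copies the pair by value
blockLocations[grid[i][j]] = v; // blockLocations must be declared as map<int, vector<IntIntPair> >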

Is there any input for which selection sort outperforms bubble sort?

I mean like...partial, full or reverse sorted arrays.
I have already tried the following: random, fully sorted, almost sorted, partially sorted, and reverse sorted. The count for bubble sort is lower when the array is fully sorted; in all other cases, the counts are the same.
int selectionSort(int a[], int l, int r) {
int count = 0;
for (int i = l; i < r; i++) {
int min = i;
for (int j = i + 1; j <= r; j++) {
if (a[j] < a[min]) min = j;
count++;
}
if (i != min) swap(a[i], a[min]);
}
return count;
}
int bubbleSort(int a[], int l, int r) {
int count = 0;
bool flag = false;
for (int i = l; i < r; i++) {
for (int j = r; j > i; j--) {
if (a[j-1] > a[j]) {
if (flag == false) flag = true;
swap(a[j - 1], a[j]);
}
count++;
}
if (flag == false) break;
}
return count;
}
The count returns the number of comparisons BTW.
Among simple average-case Θ(n^2) algorithms, selection sort almost always outperforms bubble sort.
Source: Wikipedia
I hinted at this already in comments, but here's some updated code for you that counts both comparisons and exchanges/swaps, and illustrates that for some random input the number of exchanges/swaps is where selection sort outperforms bubble sort.
#include <iostream>
#include <vector>
#include <utility>
#include <cassert>
using namespace std;
struct Stats { int swaps_ = 0, compares_ = 0; };
std::ostream& operator<<(std::ostream& os, const Stats& s)
{
return os << "{ swaps " << s.swaps_
<< ", compares " << s.compares_ << " }";
}
Stats selectionSort(std::vector<int>& a, int l, int r) {
Stats stats;
for (int i = l; i < r; i++) {
int min = i;
for (int j = i + 1; j <= r; j++) {
if (a.at(j) < a.at(min)) min = j;
++stats.compares_;
}
if (i != min) {
swap(a.at(i), a.at(min));
++stats.swaps_;
}
}
return stats;
}
Stats bubbleSort(std::vector<int>& a, int l, int r) {
Stats stats;
bool flag = false;
for (int i = l; i < r; i++) {
for (int j = r; j > i; j--) {
if (a.at(j-1) > a.at(j)) {
if (flag == false) flag = true;
swap(a.at(j - 1), a.at(j));
++stats.swaps_;
}
++stats.compares_;
}
if (flag == false) break;
}
return stats;
}
int main()
{
std::vector<int> v1{ 4, 8, 3, 8, 10, -1, 3, 20, 5 };
std::vector<int> v1s = v1;
std::cout << "sel " << selectionSort(v1s, 0, v1s.size() - 1);
std::vector<int> v1b = v1;
std::cout << ", bub " << bubbleSort(v1b, 0, v1b.size() - 1) << '\n';
assert(v1s == v1b);
// always a good idea to check the code's doing what you expect...
for (int i : v1s) std::cout << i << ' ';
std::cout << '\n';
}
Output:
sel { swaps 6, compares 36 }, bub { swaps 15, compares 36 }
-1 3 3 4 5 8 8 10 20