Improving Performance of this MiniMax with AlphaBeta Pruning - C++

I have the following implementation of an alpha-beta minimax for an Othello (Reversi) game. I've fixed a few of its problems from this thread. This time I'd like to improve the performance of this function. It's taking a very long time with MAX_DEPTH = 8. What can be done to speed it up while keeping the AI somewhat decent?
mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage) {
    if (G.check_terminal_state() || depth == MAX_DEPTH) {
        return mm_out(A, G.get_utility(pn));
    }
    // add end game score total here
    set<Action> succ_temp = G.get_successors(pn);
    for (Action a : succ_temp) {
        Grid gt(G);
        a.evaluate(gt);
    }
    set<Action, action_greater> successors(succ_temp.begin(), succ_temp.end());
    // if no successor, that player passes
    if (successors.size()) {
        for (auto a = successors.begin(); a != successors.end(); ++a) {
            Grid gt(G);
            gt.do_move(pn, a->get_x(), a->get_y(), !PRINT_ERR);
            Action at = *a;
            mm_out mt = minimax(gt, alpha, beta, at, pn ^ 1, depth + 1, !stage);
            int temp = mt.val;
            // A = mt.best_move;
            if (stage == MINIMAX_MAX) {
                if (alpha < temp) {
                    alpha = temp;
                    A = *a;
                }
                if (alpha >= beta) {
                    return mm_out(A, beta);
                }
            }
            else {
                if (beta > temp) {
                    beta = temp;
                    A = *a;
                }
                if (alpha >= beta) {
                    return mm_out(A, alpha);
                }
            }
        }
        return mm_out(A, (stage == MINIMAX_MAX) ? alpha : beta);
    }
    else {
        return mm_out(A, (stage == MINIMAX_MAX) ? (std::numeric_limits<int>::max() - 1)
                                                : (std::numeric_limits<int>::min() + 1));
    }
}
Utility function:
int Grid::get_utility(uint pnum) const {
    if (pnum)
        return wcount - bcount;
    return bcount - wcount;
}

There are several ways to speed up your search function. If you implement these techniques properly, they will cause very little harm to the accuracy of the algorithm while pruning many nodes.
The first technique you can implement is a transposition table. A transposition table stores, in a hashtable, all previously visited nodes in your game search tree. Most game states, especially in a deep search, can be reached through various transpositions, i.e. different orders of moves that result in the same final state. By storing previously searched game states, when you encounter a state you have already searched, you can reuse the data stored in the table and stop deepening the search at that node. The standard technique for storing game states in a hashtable is called Zobrist hashing. Detailed information on the implementation of transposition tables is available on the web.
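For a concrete picture, here is a minimal Zobrist hashing sketch, assuming an 8x8 board with two disc colors; the TTEntry layout is illustrative and not taken from your code:
#include <cstdint>
#include <random>
#include <unordered_map>
// One random 64-bit key per (square, color). XOR-ing keys in and out as
// discs are placed or flipped keeps the hash updated incrementally.
struct Zobrist {
    uint64_t keys[64][2];
    Zobrist() {
        std::mt19937_64 rng(0xC0FFEE);
        for (auto& sq : keys)
            for (auto& k : sq) k = rng();
    }
};
struct TTEntry {
    int value;  // score found for this position
    int depth;  // search depth for which the score is valid
    // a real entry would also store a bound type and the best move
};
static Zobrist zobrist;
static std::unordered_map<uint64_t, TTEntry> transposition_table;
// Recompute the hash from scratch; inside do_move you would instead apply
// the incremental XOR update for speed. 0 = empty, 1 = black, 2 = white.
uint64_t hash_board(const int board[64]) {
    uint64_t h = 0;
    for (int sq = 0; sq < 64; ++sq)
        if (board[sq]) h ^= zobrist.keys[sq][board[sq] - 1];
    return h;
}
In minimax you would then probe the table before recursing and reuse the stored value whenever the stored depth is at least the remaining search depth.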
The second thing your program should include is move ordering. This essentially means examining moves not in the order you generate them, but in the order that seems most likely to produce an alpha-beta cutoff (i.e. good moves first). Obviously you can't know which moves are best, but most moves can be ordered using a naive heuristic. For example, in Othello a move on a corner or edge should be examined first. Ordering moves should lead to more cutoffs and an increase in search speed, at zero cost to accuracy.
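As a sketch of such a naive ordering, assuming a hypothetical Move type with board coordinates (the weights themselves are illustrative, not tuned):
#include <algorithm>
#include <vector>
// Static positional weights for an 8x8 Othello board: corners are best,
// the X- and C-squares next to the corners are usually bad.
static const int WEIGHT[8][8] = {
    { 100, -20, 10,  5,  5, 10, -20, 100 },
    { -20, -50, -2, -2, -2, -2, -50, -20 },
    {  10,  -2,  1,  1,  1,  1,  -2,  10 },
    {   5,  -2,  1,  0,  0,  1,  -2,   5 },
    {   5,  -2,  1,  0,  0,  1,  -2,   5 },
    {  10,  -2,  1,  1,  1,  1,  -2,  10 },
    { -20, -50, -2, -2, -2, -2, -50, -20 },
    { 100, -20, 10,  5,  5, 10, -20, 100 },
};
struct Move { int x, y; };
// Examine promising squares first so alpha-beta cuts off earlier.
void order_moves(std::vector<Move>& moves) {
    std::sort(moves.begin(), moves.end(), [](const Move& a, const Move& b) {
        return WEIGHT[a.x][a.y] > WEIGHT[b.x][b.y];
    });
}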
You can also add an opening book. The opening moves usually take the longest to search, as the board is full of possibilities. An opening book is a database that stores every position that can arise in the first few turns together with the best response to it. In Othello, with its low branching factor, this will be especially helpful in the opening game.
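A book can be as simple as a lookup from the move sequence played so far to a stored reply; the entries below are placeholders rather than real Othello theory:
#include <map>
#include <string>
// Key: the moves played so far in coordinate notation; value: the book reply.
static const std::map<std::string, std::string> opening_book = {
    { "",     "f5" },
    { "f5",   "d6" },
    { "f5d6", "c3" },
};
// Returns the book move, or an empty string once we are out of book.
std::string book_move(const std::string& history) {
    auto it = opening_book.find(history);
    return it != opening_book.end() ? it->second : "";
}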
Finally, ProbCut. I'm not going to go into more detail here, as this is a more advanced technique. However, it has had good results with Othello, so I figured I'd post this link: https://chessprogramming.wikispaces.com/ProbCut

Related

Minimax Algorithm: Why make rating negative?

/* Finds the best move for the current player given the state of the game.
 * The depth parameter and MAX_DEPTH are used to limit the depth of the search for games
 * that are too difficult to analyze in full detail (like chess).
 * Returns the best move by storing an int in the variable that rating points to.
 * We want to make the move that results in the lowest best move for the player after us (our opponent).
 */
moveT findBestMove(stateT state, int depth, int &rating) {
    Vector<moveT> moveList;
    generateMoveList(state, moveList);
    int nMoves = moveList.size();
    if (nMoves == 0) cout << "no move??" << endl;
    moveT bestMove;
    int minRating = WINNING_POSITION + 1; // guarantees that this will be updated in the loop
    for (int i = 0; i < nMoves && minRating != LOSING_POSITION; i++) {
        moveT move = moveList[i];
        makeMove(state, move);
        int curRating = evaluatePosition(state, depth + 1);
        if (curRating < minRating) {
            bestMove = move;
            minRating = curRating;
        }
        retractMove(state, move);
    }
    rating = -minRating;
    return bestMove;
}
/* Evaluates the position by finding the rating of the best move in that position, limited by MAX_DEPTH. */
int evaluatePosition(stateT state, int depth) {
    int rating;
    if (gameIsOver(state) || depth >= MAX_DEPTH) {
        return evaluateStaticPosition(state);
    }
    findBestMove(state, depth, rating);
    return rating;
}
This is my code for implementing a minimax algorithm to play a perfect game of tic-tac-toe against a computer. The code works, and there are many other helper functions not shown here. I understand the nature of the algorithm, but I am having a hard time fully wrapping my head around the line at the end of the findBestMove() function:
rating = -minRating;
This is what my book says: the negative sign is included because the perspective has shifted: the positions were evaluated from the point of view of your opponent, whereas the ratings express the value of a move from your own point of view. A move that leaves your opponent with a negative position is good for you and therefore has a positive value.
But when we call the function initially, it is from the computer's perspective. I guess that when we evaluate each position, the function is being called from our opponent's perspective, and that is why? Could someone give me more insight into what is going on recursively and exactly why the rating needs to be negated at the end?
As always thank you very much for your time.
Imagine two positions, A and B, where A is better for player a and B is better for player b. When player a evaluates these positions, eval(A) > eval(B); but when player b does, we want eval(A) < eval(B), which a fixed-perspective eval doesn't give us. If b instead compares -eval(A) with -eval(B), we get the desired result, for the very reasons your book states.
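To see the sign flip in one place: the same convention, applied at every level, turns minimax into negamax. Here is a sketch built from the same helpers as the code above (treat it as illustrative; it assumes makeMove, retractMove, and the rest behave exactly as in your program):
/* Every position is scored from the side-to-move's point of view, so each
 * child's score is negated when seen by the parent: the same sign flip as
 * rating = -minRating above. */
int negamax(stateT state, int depth) {
    if (gameIsOver(state) || depth >= MAX_DEPTH)
        return evaluateStaticPosition(state);
    Vector<moveT> moveList;
    generateMoveList(state, moveList);
    int best = LOSING_POSITION;
    for (int i = 0; i < moveList.size(); i++) {
        makeMove(state, moveList[i]);
        int score = -negamax(state, depth + 1); // opponent's best, negated
        retractMove(state, moveList[i]);
        if (score > best) best = score;
    }
    return best;
}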

Chess AI with alpha beta algorithm

I have implemented the alpha beta algorithm for my chess game, however it takes a lot of time (minutes for 4-ply) to finally make a rather stupid move.
I've been trying to find the mistake (I assume I made one) for 2 days now; I would very much appreciate some outside input on my code.
getMove function: called for the root node, it calls the alphaBeta function for all its child nodes (possible moves) and then chooses the move with the highest score.
Move AIPlayer::getMove(Board b, MoveGenerator& gen)
{
    // defined constants: ALPHA = -20000 and BETA = 20000
    int alpha = ALPHA;
    Board bTemp(false); // test board
    Move BestMov;
    int i = -1; int temp;
    int len = gen.moves.getLength(); // moves is a linked list holding all legal moves
    BoardCounter++; // private attribute of the AIPlayer object, counts analyzed boards
    Move mTemp; // mTemp is used to apply the next move in the list to the temporary test board
    gen.moves.Begin(); // sets the list cursor to the first element in the list
    while (++i < len && alpha < BETA) {
        mTemp = gen.moves.nextElement();
        bTemp.cloneBoard(b);
        bTemp.applyMove(mTemp);
        temp = MAX(alpha, alphaBeta(bTemp, alpha, BETA, depth, MIN_NODE));
        if (temp > alpha) {
            alpha = temp;
            BestMov = mTemp;
        }
    }
    return BestMov;
}
alphaBeta function:
int AIPlayer::alphaBeta(Board b, int alpha, int beta, char depth, bool nodeType)
{
    Move m;
    b.changeSide();
    BoardCounter++;
    MoveGenerator genMoves(b); // when the constructor is given a board, it automatically generates the possible moves
    // the Board object has a player attribute that holds the current player
    if (genMoves.checkMate(b, b.getSide(), moves)) { // if the current player is in checkmate
        return 100000;
    }
    else if (genMoves.checkMate(b, ((b.getSide() == BLACK) ? WHITE : BLACK), moves)) { // if the other player is in checkmate
        return -100000;
    }
    else if (!depth) {
        return b.evaluateBoard(nodeType);
    }
    else {
        int scoreMove = alpha;
        int best;
        genMoves.moves.Begin();
        short i = -1, len = genMoves.moves.getLength();
        Board bTemp(false);
        if (nodeType == MAX_NODE) {
            best = ALPHA;
            while (++i < len) {
                bTemp.cloneBoard(b);
                if (bTemp.applyMove(genMoves.moves.nextElement())) {
                    scoreMove = alphaBeta(bTemp, alpha, beta, depth - 1, !nodeType);
                    best = MAX(best, scoreMove);
                    alpha = MAX(alpha, best);
                    if (beta <= alpha) {
                        std::cout << "max cutoff" << std::endl;
                        break;
                    }
                }
            }
            return scoreMove;
            //return alpha;
        }
        else {
            best = BETA;
            while (++i < len) {
                bTemp.cloneBoard(b);
                if (bTemp.applyMove(genMoves.moves.nextElement())) {
                    scoreMove = alphaBeta(bTemp, alpha, beta, depth - 1, !nodeType);
                    best = MIN(best, scoreMove);
                    beta = MIN(beta, best);
                    if (beta <= alpha) {
                        std::cout << "min cutoff" << std::endl;
                        break;
                    }
                }
            }
            return scoreMove;
            //return beta;
        }
        return best; // never reached: both branches above return
    }
}
EDIT: I should note that evaluateBoard only evaluates the mobility of pieces (the number of possible moves; capture moves get a higher score depending on the piece captured).
Thank you.
I can see that you're trying to implement a mini-max algorithm. However, there is something in the code that makes me suspicious. We'll compare the code with the open-source Stockfish chess engine. Please refer to the search algorithm at https://github.com/mcostalba/Stockfish/blob/master/src/search.cpp
1. Passing Board b by value
You have this in your code:
alphaBeta(Board b, int alpha, int beta, char depth, bool nodeType)
I don't know what exactly "Board" is. But it doesn't look right to me. Let's look at Stockfish:
Value search(Position& pos, Stack* ss, Value alpha, Value beta, Depth depth, bool cutNode)
The position object is passed by reference in Stockfish. If Board is a class, the program will need to make a new copy every time the alpha-beta function is called. In chess, where we have to evaluate a huge number of nodes, this is obviously unacceptable.
2. No hashing
Hashing is done in Stockfish as:
ttValue = ttHit ? value_from_tt(tte->value(), ss->ply) : VALUE_NONE;
Without hashing, you'll need to evaluate the same position again and again and again and again. You won't get anywhere without hashing implemented.
3. Checking for checkmate
Probably not the most significant slow-down, but we should never check for checkmate in every single node. In Stockfish:
// All legal moves have been searched. A special case: If we're in check
// and no legal moves were found, it is checkmate.
if (InCheck && bestValue == -VALUE_INFINITE)
    return mated_in(ss->ply); // Plies to mate from the root
This is done AFTER all possible moves have been searched. We do it this way because non-checkmate nodes are far more common than checkmate nodes.
4. Board bTemp(false);
This looks like a major slow-down. Let's look at Stockfish:
// Step 14. Make the move
pos.do_move(move, st, ci, givesCheck);
You should not create a temporary object in every node (the bTemp object). The machine needs to allocate stack space to hold bTemp each time, and this can be a serious performance penalty, in particular if bTemp is not likely to be cached by the processor. Stockfish simply modifies its internal data structure without creating a new one.
5. bTemp.cloneBoard(b);
Similar to point 4, but even worse: here it's done for every move in the node.
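To make points 4 and 5 concrete, here is a sketch of the make/unmake pattern on a deliberately simplified, hypothetical board type (a real engine's Undo record also saves castling rights, the en passant square, the hash key, and so on):
// Mutate one board in place instead of cloning it for every move.
struct SimpleMove { int from, to; };
struct Undo {
    SimpleMove move;
    int captured; // piece code that was on the target square (0 = empty)
};
struct SimpleBoard {
    int squares[64] = {0};
    Undo doMove(const SimpleMove& m) {   // mutate in place
        Undo u{m, squares[m.to]};
        squares[m.to] = squares[m.from];
        squares[m.from] = 0;
        return u;
    }
    void undoMove(const Undo& u) {       // restore exactly
        squares[u.move.from] = squares[u.move.to];
        squares[u.move.to] = u.captured;
    }
};
Inside alphaBeta the loop then becomes doMove, recurse, undoMove on a single board passed by reference, so no board is ever copied.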
6. std::cout << "max cutoff" << std::endl;
It may be hard to believe, but printing to a terminal is much slower than the actual processing. Here you're creating a potential slow-down: the string needs to be written to an IO buffer, and the call might (I'm not 100% sure) even block your program until the text is shown on the terminal. Stockfish only prints a statistics summary, definitely not every time there is a fail-high or fail-low.
7. Not sorting the PV move
Probably not something that you want to do before addressing the other issues. In Stockfish, they have:
std::stable_sort(RootMoves.begin() + PVIdx, RootMoves.end());
This is done for every iteration in an iterative-deepening framework.
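The idea, sketched with a hypothetical RootMove type and a placeholder search call:
#include <algorithm>
#include <vector>
int searchMove(int /*move*/, int /*depth*/) { return 0; } // stand-in for the real alpha-beta search
struct RootMove {
    int move;  // encoded move
    int score; // score from the previous, shallower iteration
    bool operator<(const RootMove& o) const { return score > o.score; } // best first
};
// Each pass re-sorts the root moves by the previous pass's scores, so the
// deeper search examines the likely-best move first and gets cutoffs sooner.
void iterativeDeepening(std::vector<RootMove>& rootMoves, int maxDepth) {
    for (int depth = 1; depth <= maxDepth; ++depth) {
        for (RootMove& rm : rootMoves)
            rm.score = searchMove(rm.move, depth);
        std::stable_sort(rootMoves.begin(), rootMoves.end());
    }
}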
I am only going to address the runtime cost problem of your algorithm, because I don't know the implementation details of your board evaluation function.
In order to keep things as simple as possible, I will assume the worst case for the algorithm.
The getMove function makes len1 calls to the alphaBeta function, which in turn makes len2 calls to itself, which in turn makes len3 calls to itself, and so on, until depth reaches 0 and the recursion stops.
Because of the worst-case assumption, let's say n = max(len1, len2, ...), so you have n * n * n * ... * n calls to alphaBeta, with the number of factors depending on the depth d. This leads to n^d calls to alphaBeta, which means you have exponential runtime behavior (for example, with n = 30 legal moves and d = 4 that is already 30^4 = 810,000 calls). This is extremely slow, beaten only by factorial runtime behavior.
I think you should take a look at the Big O notation for that purpose and try to optimize your algorithm accordingly to get much faster results.
Best regards,
OPM

Optimizing the Dijkstra's algorithm

I need a graph-search algorithm that is fast enough for our application of robot navigation, and I chose Dijkstra's algorithm.
We are given a grid map which contains free, occupied, and unknown cells, where the robot is only permitted to pass through the free cells. The user inputs the starting position and the goal position. In return, I retrieve the sequence of free cells leading the robot from the starting position to the goal position, which corresponds to the path.
Since executing Dijkstra's algorithm from start to goal would give us a reversed path, coming from goal to start, I decided to execute Dijkstra's algorithm backwards, so that I retrieve the path already ordered from start to goal.
Starting from the goal cell, I have 8 neighbors, whose cost is 1 horizontally and vertically and sqrt(2) diagonally, provided the cells are reachable (i.e. not out of bounds and free).
Here are the rules observed when updating the neighboring cells; the current cell can only consider its 8 neighboring cells (at distance 1 or sqrt(2)) as reachable under the following conditions:
The neighboring cell is not out of bounds
The neighboring cell is unvisited.
The neighboring cell is a free cell which can be checked via the 2-D grid map.
Here is my implementation:
#include <opencv2/opencv.hpp>
#include <algorithm>
#include "Timer.h"

/// CONSTANTS
static const int UNKNOWN_CELL  = 197;
static const int FREE_CELL     = 255;
static const int OCCUPIED_CELL = 0;

/// STRUCTURES for easier management.
struct vertex {
    cv::Point2i id_;
    cv::Point2i from_;
    vertex(cv::Point2i id, cv::Point2i from)
    {
        id_ = id;
        from_ = from;
    }
};

/// To be used for finding an element in the std::multimap STL container.
struct CompareID
{
    CompareID(cv::Point2i val) : val_(val) {}
    bool operator()(const std::pair<double, vertex> & elem) const {
        return val_ == elem.second.id_;
    }
private:
    cv::Point2i val_;
};

/// Some helper functions for Dijkstra's algorithm.
uint8_t get_cell_at(const cv::Mat & image, int x, int y)
{
    assert(x < image.rows);
    assert(y < image.cols);
    return image.data[x * image.cols + y];
}

bool checkIfNotOutOfBounds(cv::Point2i current, int rows, int cols)
{
    return (current.x >= 0 && current.y >= 0 &&
            current.x < cols && current.y < rows);
}

/// Brief: Finds the shortest possible path from the starting position to the goal position.
/// Param gridMap: The stage where the tracing of the shortest possible path will be performed.
/// Param start: The starting position in the gridMap. It is assumed that the start cell is a free cell.
/// Param goal: The goal position in the gridMap. It is assumed that the goal cell is a free cell.
/// Param path: Returns the sequence of free cells leading to the goal, starting from the starting cell.
bool findPathViaDijkstra(const cv::Mat& gridMap, cv::Point2i start, cv::Point2i goal, std::vector<cv::Point2i>& path)
{
    // Clear the path just in case.
    path.clear();
    // Create the working and visited sets.
    std::multimap<double, vertex> working, visited;
    // Initialize the working set. We are going to perform Dijkstra's
    // backwards in order to get the actual path without reversing it.
    working.insert(std::make_pair(0, vertex(goal, goal)));
    // Conditions for continuing:
    // 1.) Working is empty implies all nodes are visited.
    // 2.) The start has still not been found in the visited set.
    // The Dijkstra's algorithm
    while (!working.empty() && std::find_if(visited.begin(), visited.end(), CompareID(start)) == visited.end())
    {
        // Get the top of the multimap.
        // It is already given that the top of the multimap has the lowest cost.
        std::pair<double, vertex> currentPair = *working.begin();
        cv::Point2i current = currentPair.second.id_;
        visited.insert(currentPair);
        working.erase(working.begin());
        // Check all arcs.
        // Only insert the cells into working under these 3 conditions:
        // 1. The cell is not in the visited set.
        // 2. The cell is not out of bounds.
        // 3. The cell is free.
        for (int x = current.x - 1; x <= current.x + 1; x++)
            for (int y = current.y - 1; y <= current.y + 1; y++)
            {
                if (checkIfNotOutOfBounds(cv::Point2i(x, y), gridMap.rows, gridMap.cols) &&
                    get_cell_at(gridMap, x, y) == FREE_CELL &&
                    std::find_if(visited.begin(), visited.end(), CompareID(cv::Point2i(x, y))) == visited.end())
                {
                    vertex newVertex = vertex(cv::Point2i(x, y), current);
                    double cost = currentPair.first + sqrt(2);
                    // Cost is 1 horizontally or vertically.
                    if (x == current.x || y == current.y)
                        cost = currentPair.first + 1;
                    std::multimap<double, vertex>::iterator it =
                        std::find_if(working.begin(), working.end(), CompareID(cv::Point2i(x, y)));
                    if (it == working.end())
                        working.insert(std::make_pair(cost, newVertex));
                    else if (cost < (*it).first)
                    {
                        working.erase(it);
                        working.insert(std::make_pair(cost, newVertex));
                    }
                }
            }
    }
    // Now, recover the path.
    // Path is valid!
    if (std::find_if(visited.begin(), visited.end(), CompareID(start)) != visited.end())
    {
        std::pair<double, vertex> currentPair = *std::find_if(visited.begin(), visited.end(), CompareID(start));
        path.push_back(currentPair.second.id_);
        do
        {
            currentPair = *std::find_if(visited.begin(), visited.end(), CompareID(currentPair.second.from_));
            path.push_back(currentPair.second.id_);
        } while (currentPair.second.id_.x != goal.x || currentPair.second.id_.y != goal.y);
        return true;
    }
    // Path is invalid!
    else
        return false;
}
int main()
{
    // cv::Mat image = cv::imread("filteredmap1.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    cv::Mat image = cv::Mat(100, 100, CV_8UC1);
    std::vector<cv::Point2i> path;
    for (int i = 0; i < image.rows; i++)
        for (int j = 0; j < image.cols; j++)
        {
            image.data[i * image.cols + j] = FREE_CELL;
            if (j == image.cols / 2 && (i > 3 && i < image.rows - 3))
                image.data[i * image.cols + j] = OCCUPIED_CELL;
            // if (image.data[i * image.cols + j] > 215)
            //     image.data[i * image.cols + j] = FREE_CELL;
            // else if (image.data[i * image.cols + j] < 100)
            //     image.data[i * image.cols + j] = OCCUPIED_CELL;
            // else
            //     image.data[i * image.cols + j] = UNKNOWN_CELL;
        }
    // Goal: top right
    cv::Point2i goal(image.cols - 1, 0);
    // Start: bottom left
    cv::Point2i start(0, image.rows - 1);
    // Time the algorithm.
    Timer timer;
    timer.start();
    findPathViaDijkstra(image, start, goal, path);
    std::cerr << "Time elapsed: " << timer.getElapsedTimeInMilliSec() << " ms";
    // Draw the path on the image for visualization purposes.
    cv::cvtColor(image, image, CV_GRAY2BGRA);
    int cn = image.channels();
    for (int i = 0; i < path.size(); i++)
    {
        image.data[path[i].x * cn * image.cols + path[i].y * cn + 0] = 0;
        image.data[path[i].x * cn * image.cols + path[i].y * cn + 1] = 255;
        image.data[path[i].x * cn * image.cols + path[i].y * cn + 2] = 0;
    }
    cv::imshow("Map with path", image);
    cv::waitKey();
    return 0;
}
For the algorithm implementation, I decided to have two sets, namely the visited and working sets, where each element contains:
The location of itself in the 2D grid map.
The accumulated cost
The cell through which it got its accumulated cost (for path recovery)
And here is the result:
The black pixels represent obstacles, the white pixels represent free space and the green line represents the path computed.
In this implementation, I only search within the current working set for the minimum value and do NOT need to scan the whole cost matrix (where initially the cost of all cells is set to infinity and the starting point to 0). Maintaining a separate working set promises better performance, I think, because all the cells whose cost is still infinity are guaranteed not to be in the working set; only the cells that have been touched are.
I also took advantage of the STL that C++ provides. I decided to use std::multimap, since it can store duplicate keys (the cost) and it sorts itself automatically. However, I was forced to use std::find_if() to find the id (the row, col of the current cell) in the visited set to check whether the current cell is in it, which has linear complexity. I really think this is the bottleneck of my Dijkstra's implementation.
I am well aware that the A* algorithm is much faster than Dijkstra's, but what I wanted to ask is: is my implementation of Dijkstra's algorithm optimal? If I implemented A* on top of my current, I believe suboptimal, Dijkstra's implementation, then the A* algorithm would consequently be suboptimal too.
What improvements can I make? Which STL container is the most appropriate for this algorithm? In particular, how do I improve the bottleneck?
You're using a std::multimap for 'working' and 'visited'. That's not great.
The first thing you should do is change visited into a per-vertex flag, so that your find_if runs in constant time instead of linear time, and so that operations on the set of visited vertices take constant instead of logarithmic time. You know what all the vertices are, and you can map them to small integers trivially, so you can use either a std::vector or a std::bitset.
The second thing you should do is turn working into a priority queue, rather than a balanced binary tree structure, so that operations are a (largish) constant factor faster. std::priority_queue is a barebones binary heap. A higher-radix heap (say quaternary, for concreteness) will probably be faster on modern computers due to its reduced depth. Andrew Goldberg suggests some bucket-based data structures; I can dig up references for you if you get to that stage. (They're not too complicated.)
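Here is a sketch combining both suggestions, assuming the question's cell encoding (FREE_CELL == 255) and mapping cell (x, y) to the integer x * cols + y; stale heap entries are simply skipped when popped, the usual lazy-deletion trick:
#include <cmath>
#include <cstdint>
#include <queue>
#include <vector>
bool dijkstraGrid(const std::vector<uint8_t>& cells, int rows, int cols,
                  int start, int goal, std::vector<int>& parent)
{
    const double INF = 1e18;
    std::vector<double> dist(rows * cols, INF);
    std::vector<char> done(rows * cols, 0); // O(1) visited test
    parent.assign(rows * cols, -1);
    using Item = std::pair<double, int>;    // (cost, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[goal] = 0;                         // search backwards, as in the question
    pq.push({0.0, goal});
    while (!pq.empty()) {
        auto [d, v] = pq.top();
        pq.pop();
        if (done[v]) continue;              // stale entry, skip it
        done[v] = 1;
        if (v == start) return true;        // path recoverable via parent[]
        int x = v / cols, y = v % cols;
        for (int dx = -1; dx <= 1; ++dx)
            for (int dy = -1; dy <= 1; ++dy) {
                int nx = x + dx, ny = y + dy;
                if ((dx == 0 && dy == 0) || nx < 0 || ny < 0 || nx >= rows || ny >= cols)
                    continue;
                int w = nx * cols + ny;
                if (done[w] || cells[w] != 255) continue; // 255 == FREE_CELL
                double nd = d + ((dx == 0 || dy == 0) ? 1.0 : std::sqrt(2.0));
                if (nd < dist[w]) {
                    dist[w] = nd;
                    parent[w] = v;
                    pq.push({nd, w});
                }
            }
    }
    return false;
}
Following parent[] from start then walks the cells in order toward the goal, matching the backwards-search trick from the question.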
Once you've taken care of these two things, you might look at A* or meet-in-the-middle tricks to speed things up even more.
Your performance is several orders of magnitude worse than it could be because you're using graph search algorithms for what looks like geometry. This geometry is much simpler and less general than the problems that graph search algorithms can solve. Also, with a vertex for every pixel your graph is huge even though it contains basically no information.
I heard you asking "how can I make this better without changing what I'm thinking" but nevertheless I'll tell you a completely different and better approach.
It looks like your robot can only go horizontally, vertically, or diagonally. Is that for real, or just a side effect of your choosing graph-search algorithms? I'll assume the latter and let it go in any direction.
The algorithm goes like this:
(0) Represent your obstacles as polygons by listing the corners. Work in real numbers so you can make them as thin as you like.
(1) Try for a straight line between the end points.
(2) Check whether that line goes through an obstacle or not. To do that for any line, show that all corners of any particular obstacle lie on the same side of the line. To do that, translate all points by (-X,-Y) of one end of the line so that that point is at the origin, then rotate until the other point is on the X axis. Now all corners should have the same sign of Y if there's no obstruction. There might be a quicker way just using gradients; one cross-product version is sketched after this list.
(3) If there's an obstruction, propose N two-segment paths going via the N corners of the obstacle.
(4) Recurse for all segments, culling any paths with segments that go out of bounds. That won't be a problem unless you have obstacles that go out of bounds.
(5) When it stops recursing, you should have a list of locally optimised paths from which you can choose the shortest.
(6) If you really want to restrict bearings to multiples of 45 degrees, then you can do this algorithm first and then replace each segment by any 45-only wiggly version that avoids obstacles. We know that such a version exists because you can stay extremely close to the original line by wiggling very often. We also know that all such wiggly paths have the same length.
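For step (2), here is one cross-product version of that same-side test, with a minimal hypothetical point type (a full segment/polygon intersection test would also check the polygon's edges against the segment, but this matches the test described above):
#include <vector>
struct Pt { double x, y; };
// Sign of the cross product (b-a) x (p-a): positive means p is left of the
// line AB, negative means right, zero means p is on the line.
double side(Pt a, Pt b, Pt p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}
// True if every corner lies on one side of the infinite line through A and B,
// i.e. the line does not separate the polygon's corners.
bool allOnOneSide(Pt a, Pt b, const std::vector<Pt>& corners) {
    bool pos = false, neg = false;
    for (const Pt& p : corners) {
        double s = side(a, b, p);
        if (s > 0) pos = true;
        if (s < 0) neg = true;
    }
    return !(pos && neg);
}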

kd-tree construction very slow

I am trying to implement a kd-tree for my C++ (DirectX) project to speed up my collision detection.
My implementation is a really primitive recursive function. The nth_element seems to be working okay (only a 1 fps difference if I comment it out). I am not quite sure where the culprit is coming from.
KDTreeNode Box::buildKDTree(std::vector<Ball> balls, int depth) {
    if (balls.size() < 3) {
        return KDTreeNode(balls[0].getPos(), KDTreeLeaf(), KDTreeLeaf());
    }
    Variables::currAxis = depth % 3;
    size_t n = (balls.size() / 2);
    std::nth_element(balls.begin(), balls.begin() + n, balls.end()); // sorts along the current axis - see Ball.cpp for implementation
    std::vector<Ball> leftSide(balls.begin(), balls.begin() + n);
    std::vector<Ball> rightSide(balls.begin() + n, balls.end());
    return KDTreeNode(balls[n].getPos(), this->buildKDTree(leftSide, depth + 1), this->buildKDTree(rightSide, depth + 1));
}
I have overwritten the bool operator in the Ball class:
bool Ball::operator < (Ball& ball)
{
    if (Variables::currAxis == 0) {
        return (XMVectorGetX(this->getPos()) < XMVectorGetX(ball.getPos()));
    } else if (Variables::currAxis == 1) {
        return (XMVectorGetY(this->getPos()) < XMVectorGetY(ball.getPos()));
    } else {
        return (XMVectorGetZ(this->getPos()) < XMVectorGetZ(ball.getPos()));
    }
}
I am pretty sure that this is not an optimal way to handle the construction in real time.
Maybe you can help me to get on the right track.
There is one other thing I am really wondering about: say I have a lot of spheres in the scene and I use a kd-tree. How do I determine which leaf a sphere belongs in? At construction I am only using the center position, not the actual diameter. How do I go about this?
Thanks
EDIT: I've implemented all the suggested changes and it runs very well now. Thanks!
Here is what I did:
KDTreeNode Box::buildKDTree(std::vector<Ball>::iterator start, std::vector<Ball>::iterator end, int depth) {
    if ((end - start) == 1) {
        return KDTreeNode(start->getPos(), KDTreeLeaf(), KDTreeLeaf());
    }
    Variables::currAxis = depth % 3;
    size_t n = (end - start) / 2;
    std::nth_element(start, start + n, end); // sorts along the current axis - see Ball.cpp for implementation
    return KDTreeNode(start[n].getPos(), this->buildKDTree(start, start + n, depth + 1),
                      this->buildKDTree(start + n, end, depth + 1));
}
As you can see, I am not copying the vectors anymore; the recursion works directly on iterator ranges of the original vector, so nothing is copied.
I see two possible problems:
Passing the vector to the function as a value (this effectively copies the whole vector)
Creating new vectors for the smaller and bigger elements, instead of some in-place processing
Basically, the function copies all balls in the initial vector twice for every level of your kd-tree. This should cause some serious slowdown, so try to avoid requesting so much memory.
One way to solve it would be to access the data of the vector directly, use nth_element etc. and only pass the indices of the subvectors to the recursive call.

Improving Minimax Algorithm

Currently I'm working on an Othello/Reversi game in C++. I have it "finished" except that the minimax algorithm I'm using for the computer player is painfully slow when I set it at a depth that produces a semi-challenging AI.
The basic setup of my game is that the board is represented by a 2-dimensional array, with each cell on the board assigned a value in the array (xMarker, oMarker, or underscore).
Here's the minimax algorithm so far:
signed int Computer::simulate(Board b, int depth, int tempMarker) {
    if (depth > MAX_DEPTH || b.gameOver()) {
        int oppMarker = (marker == xMarker) ? oMarker : xMarker;
        return b.countForMarker(marker) - b.countForMarker(oppMarker);
    }
    // if we're simulating our turn, we want to find the highest value (so we set our start at -64)
    // if we're simulating the opponent's turn, we want to find the lowest value (so we set our start at 64)
    signed int start = (tempMarker == marker) ? -64 : 64;
    for (int x = 0; x < b.size; x++) {
        for (int y = 0; y < b.size; y++) {
            if (b.markerArray[x][y] == underscore) {
                Board *c = b.duplicate();
                if (c->checkForFlips(Point(x, y), tempMarker, true) > 0) {
                    int newMarker = (tempMarker == xMarker) ? oMarker : xMarker;
                    int r = simulate(*c, depth + 1, newMarker);
                    // 'marker' is the marker assigned to our player (the computer); if it's our turn, we want the highest value
                    if (tempMarker == marker) {
                        if (r > start) start = r;
                    } else {
                        // if it's the opponent's turn, we want the lowest value
                        if (r < start) start = r;
                    }
                }
                delete c;
            }
        }
    }
    return start;
}
The function checkForFlips() returns the number of flips that would result from playing at the given cell. MAX_DEPTH is set to 6 at the moment, and it's quite slow (maybe about 10-15 seconds per play).
The only idea I've come up with so far is to store the tree each time and then pick up from where I left off, but I'm not sure how to go about implementing that, or whether it would even be effective. Any ideas or suggestions would be appreciated!
Calculating minimax is slow.
The first possible optimization is alpha-beta pruning:
http://en.wikipedia.org/wiki/Alpha-beta_pruning
You shouldn't duplicate the board; that's very inefficient. Make the move before you call yourself recursively, but save enough information to undo the same move after you return from the recursive call. That way you only need one board.
But Shiroko is right, alpha-beta pruning is the first step.
@Shiroko's suggestion is great, but there are more optimization opportunities.
You pass the state of the Board by value, and then copy it inside the loop. I'd pass the Board as a pointer or as const Board& b. If this is still expensive, you could use a pointer to a single board and undo every move after you evaluate it. In any case, don't allocate it on the heap.
You can also run this algorithm on multiple cores. You would need to write a variation of the for loop at the first level using OpenMP (or an equivalent).
The most obvious way to improve it would be through alpha-beta pruning or negascout.
However, if you want to stick with minimax, you can't make it go too fast, as it is a brute-force algorithm. One way to improve it would be to change it to negamax, which would get rid of some of the logic required in this code. Another way would be to use a one-dimensional array for the board instead of the Board class. To make calculations easier, use a length of 100, so the positions are in row-column form (e.g. index 27 is row 2, column 7).
But if you want it to go faster, try pruning.
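Putting those suggestions together, here is a negamax-with-pruning sketch over a 100-cell 1-D board. The Game callbacks are hypothetical stand-ins for your Board methods, and the Othello pass rule is glossed over:
#include <algorithm>
#include <functional>
#include <vector>
// 1-D board indexing: r * 10 + c is row r, column c (index 27 = row 2, column 7).
inline int idx(int row, int col) { return row * 10 + col; }
struct Game {
    std::function<std::vector<int>()> legalMoves; // cells playable for the side to move
    std::function<void(int)>          play;       // make the move and switch sides
    std::function<void(int)>          undo;       // unmake the move
    std::function<int()>              evaluate;   // score from the side to move's view
};
// Negamax with alpha-beta: one code path replaces the separate max/min logic.
int negamaxAB(Game& g, int depth, int alpha, int beta) {
    std::vector<int> moves = g.legalMoves();
    if (depth == 0 || moves.empty())
        return g.evaluate();
    for (int cell : moves) {
        g.play(cell);
        int score = -negamaxAB(g, depth - 1, -beta, -alpha);
        g.undo(cell);
        alpha = std::max(alpha, score);
        if (alpha >= beta) break; // cutoff: the opponent will avoid this line
    }
    return alpha;
}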