Currently I'm working on an Othello/Reversi game in C++. I have it "finished", except that the minimax algorithm I'm using for the computer player is painfully slow when I set it to a depth that produces a semi-challenging AI.
The basic setup of my game is that the board is represented by a 2-dimensional array, with each cell on the board assigned a value in the array (xMarker, oMarker, or underscore).
Here's the minimax algorithm so far:
signed int Computer::simulate(Board b, int depth, int tempMarker) {
    if (depth > MAX_DEPTH || b.gameOver()) {
        int oppMarker = (marker == xMarker) ? oMarker : xMarker;
        return b.countForMarker(marker) - b.countForMarker(oppMarker);
    }
    //if we're simulating our turn, we want to find the highest value (so we set our start at -64)
    //if we're simulating the opponent's turn, we want to find the lowest value (so we set our start at 64)
    signed int start = (tempMarker == marker) ? -64 : 64;
    for (int x = 0; x < b.size; x++) {
        for (int y = 0; y < b.size; y++) {
            if (b.markerArray[x][y] == underscore) {
                Board *c = b.duplicate();
                if(c->checkForFlips(Point(x,y), tempMarker, true) > 0) {
                    int newMarker = (tempMarker == xMarker) ? oMarker : xMarker;
                    int r = simulate(*c, depth+1, newMarker);
                    //'marker' is the marker assigned to our player (the computer), if it's our turn, we want the highest value
                    if (tempMarker == marker) {
                        if(r > start) start = r;
                    } else {
                        //if it's the opponent's turn, we want the lowest value
                        if(r < start) start = r;
                    }
                }
                delete c;
            }
        }
    }
    return start;
}
The function checkForFlips() returns the number of flips that would result from playing at the given cell. MAX_DEPTH is set to 6 at the moment, and it's quite slow (about 10-15 seconds per play).
The only idea I've come up with so far is to store the tree each time and then pick up from where I left off, but I'm not sure how to go about implementing that or whether it would be very effective. Any ideas or suggestions would be appreciated!
Calculating minimax is slow.
The first possible optimization is alpha-beta pruning:
http://en.wikipedia.org/wiki/Alpha-beta_pruning
You shouldn't duplicate the board; that's very inefficient. Make the move before you call yourself recursively, but save enough information to undo the same move after you return from the recursive call. That way you only need one board.
But Shiroko is right, alpha-beta pruning is the first step.
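For illustration, here is a minimal sketch of alpha-beta pruning on an explicit game tree. The Node type, the leaf/inner/exampleTree helpers and the leavesVisited counter are made up for this example and are not part of the asker's Board or Computer classes; in the real program the children would come from generating legal moves instead of being stored in a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical game-tree node: a leaf carries a score, an inner node carries children.
struct Node {
    int score = 0;               // only meaningful for leaves
    std::vector<Node> children;  // empty for leaves
};

int leavesVisited = 0;  // counts evaluated leaves so the effect of pruning is visible

int alphaBeta(const Node& n, int alpha, int beta, bool maximizing) {
    assert(alpha <= beta);
    if (n.children.empty()) {
        ++leavesVisited;
        return n.score;
    }
    if (maximizing) {
        int best = std::numeric_limits<int>::min();
        for (const Node& child : n.children) {
            best = std::max(best, alphaBeta(child, alpha, beta, false));
            alpha = std::max(alpha, best);
            if (alpha >= beta) break;  // the minimizer above will never allow this line
        }
        return best;
    }
    int best = std::numeric_limits<int>::max();
    for (const Node& child : n.children) {
        best = std::min(best, alphaBeta(child, alpha, beta, true));
        beta = std::min(beta, best);
        if (alpha >= beta) break;  // the maximizer above already has something better
    }
    return best;
}

Node leaf(int s) { Node n; n.score = s; return n; }
Node inner(std::vector<Node> kids) { Node n; n.children = std::move(kids); return n; }

// A small two-ply tree: a maximizing root over three minimizing nodes.
Node exampleTree() {
    return inner({inner({leaf(3), leaf(12), leaf(8)}),
                  inner({leaf(2), leaf(4), leaf(6)}),
                  inner({leaf(14), leaf(5), leaf(2)})});
}
```

Called on exampleTree() with the full window (the int minimum and maximum), the search skips the leaves 4 and 6 in the second subtree, because the first subtree already guarantees the root a score of 3.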
@Shiroko's suggestion is great, but there are more optimization opportunities.
You pass the state of the Board by value, and then copy it inside the loop. I'd pass the Board as a pointer or as const Board& b. If this is still expensive, you could use a pointer to a single board, and reverse every move after you evaluate it. In any case, don't allocate it on the heap.
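A rough sketch of the "one board, make and undo" idea follows. MiniBoard, Cell, play and undo are names invented for this example (the asker's Board class is not shown in full), and the starting layout is simply the standard Othello opening. The point is only that play returns the list of flipped cells, which is enough information for undo to restore the position exactly.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Illustrative stand-in for the asker's Board; all names here are assumptions.
enum Cell : char { EMPTY, X, O };
using Flips = std::vector<std::pair<int, int>>;

struct MiniBoard {
    Cell cells[8][8] = {};

    MiniBoard() {
        cells[3][3] = O; cells[4][4] = O;
        cells[3][4] = X; cells[4][3] = X;
    }

    static bool inside(int x, int y) { return x >= 0 && x < 8 && y >= 0 && y < 8; }

    // Plays `who` at (x, y) and returns the flipped cells.
    // An empty result means the move is illegal and the board is unchanged.
    Flips play(int x, int y, Cell who) {
        assert(who != EMPTY && inside(x, y));
        Flips flipped;
        if (cells[x][y] != EMPTY) return flipped;
        const Cell opp = (who == X) ? O : X;
        for (int dx = -1; dx <= 1; ++dx) {
            for (int dy = -1; dy <= 1; ++dy) {
                if (dx == 0 && dy == 0) continue;
                Flips run;  // opponent discs seen in this direction
                int cx = x + dx, cy = y + dy;
                while (inside(cx, cy) && cells[cx][cy] == opp) {
                    run.push_back({cx, cy});
                    cx += dx; cy += dy;
                }
                // The run only counts if it is closed by one of our own discs.
                if (!run.empty() && inside(cx, cy) && cells[cx][cy] == who)
                    flipped.insert(flipped.end(), run.begin(), run.end());
            }
        }
        if (flipped.empty()) return flipped;
        cells[x][y] = who;
        for (const auto& p : flipped) cells[p.first][p.second] = who;
        return flipped;
    }

    // Reverses a move previously made with play().
    void undo(int x, int y, Cell who, const Flips& flipped) {
        const Cell opp = (who == X) ? O : X;
        cells[x][y] = EMPTY;
        for (const auto& p : flipped) cells[p.first][p.second] = opp;
    }
};
```

Inside the search, each candidate move would then be a play, a recursive call on the same board, and an undo, with no copy and no heap allocation of a whole board.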
You can also run this algorithm on multiple cores. You will need to write a variation of the for loop at the first level using OpenMP (or an equivalent).
The most obvious way to improve it would be through alpha-beta pruning or negascout.
However, if you want to stick with minimax, you can't make it go much faster, as it is a brute-force algorithm. One way to improve it would be to change it to negamax, which would get rid of some of the logic required in this code. Another way would be to use a one-dimensional array for the board instead of a two-dimensional one. To make calculations easier, use a length of 100, so the positions are in row-column form (e.g. index 27 is row 2, column 7).
But if you want it to go faster, try pruning.
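As a sketch of the one-dimensional board suggested above: with 100 cells, index = row * 10 + col, and the outer ring can hold a border marker so that no bounds checks are needed while walking in a direction. The cell encoding, the names Board100, DIRS, startPosition and countFlips are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>

// Assumed encoding: 0 = empty, 1 and 2 = the two players, 3 = off-board border.
constexpr int EMPTY = 0, BLACK = 1, WHITE = 2, BORDER = 3;
using Board100 = std::array<int, 100>;

// Index offsets of the eight neighbours when index = row * 10 + col.
constexpr int DIRS[8] = {-11, -10, -9, -1, 1, 9, 10, 11};

Board100 startPosition() {
    Board100 b{};
    for (int i = 0; i < 100; ++i) {
        int row = i / 10, col = i % 10;
        if (row == 0 || row == 9 || col == 0 || col == 9) b[i] = BORDER;
    }
    b[44] = WHITE; b[55] = WHITE;  // rows 4-5, columns 4-5 are the centre
    b[45] = BLACK; b[54] = BLACK;
    return b;
}

// Number of discs `who` would flip by playing at `pos` (0 means illegal).
int countFlips(const Board100& b, int pos, int who) {
    assert(pos >= 11 && pos <= 88);
    assert(who == BLACK || who == WHITE);
    if (b[pos] != EMPTY) return 0;
    const int opp = (who == BLACK) ? WHITE : BLACK;
    int total = 0;
    for (int d : DIRS) {
        int p = pos + d, run = 0;
        while (b[p] == opp) { p += d; ++run; }  // the border stops the walk
        if (b[p] == who) total += run;           // run is closed by our own disc
    }
    return total;
}
```

The playable squares are rows 1 to 8 and columns 1 to 8, so index 27 is row 2, column 7, as in the answer.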
I have the following implementation of an alpha-beta minimax for an Othello (Reversi) game. I've fixed a few of its problems from this thread. This time I'd like to improve the performance of this function. It's taking a very long time with MAX_DEPTH = 8. What can be done to speed up the performance, while keeping the AI somewhat decent?
mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage) {
    if (G.check_terminal_state() || depth == MAX_DEPTH) {
        return mm_out(A, G.get_utility(pn));
    }
    // add end game score total here
    set<Action> succ_temp = G.get_successors(pn);
    for (Action a : succ_temp) {
        Grid gt(G);
        a.evaluate(gt);
    }
    set<Action, action_greater> successors(succ_temp.begin(), succ_temp.end());
    // if no successor, that player passes
    if (successors.size()) {
        for (auto a = successors.begin(); a != successors.end(); ++a) {
            Grid gt(G);
            gt.do_move(pn, a->get_x(), a->get_y(), !PRINT_ERR);
            Action at = *a;
            mm_out mt = minimax(gt, alpha, beta, at, pn ^ 1, depth + 1, !stage);
            int temp = mt.val;
            // A = mt.best_move;
            if (stage == MINIMAX_MAX) {
                if (alpha < temp) {
                    alpha = temp;
                    A = *a;
                }
                if (alpha >= beta) {
                    return mm_out(A, beta);
                }
            }
            else {
                if (beta > temp) {
                    beta = temp;
                    A = *a;
                }
                if (alpha >= beta) {
                    return mm_out(A, alpha);
                }
            }
        }
        return mm_out(A, (stage == MINIMAX_MAX) ? alpha : beta);
    }
    else {
        return mm_out(A, (stage == MINIMAX_MAX) ? (std::numeric_limits<int>::max() - 1) : (std::numeric_limits<int>::min() + 1));
    }
}
Utility function:
int Grid::get_utility(uint pnum) const {
    if (pnum)
        return wcount - bcount;
    return bcount - wcount;
}
There are several ways to speed up the performance of your search function. If you implement these techniques properly, they will cause very little harm to the accuracy of the algorithm while pruning many nodes.
The first technique you can implement is a transposition table. A transposition table stores previously visited nodes of your game search tree in a hash table. Most game states, especially in a deep search, can be reached through various transpositions, that is, different orders of moves that result in the same final state. If you reach a state that has already been searched, you can use the data stored in the table and stop deepening the search at that node. The standard technique for hashing game states is called Zobrist hashing. Detailed information on the implementation of transposition tables is available on the web.
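To make the idea concrete, here is a minimal sketch of Zobrist hashing with a table keyed by the hash. The Squares encoding, the Zobrist and TTEntry types and the fixed seed are assumptions for this example and have nothing to do with the asker's Grid class. A real engine would normally also mix the side to move into the hash and check that a stored entry was searched deeply enough before reusing it.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>
#include <unordered_map>

// Assumed board encoding: 64 squares, each 0 (empty), 1 or 2 (the two players).
using Squares = std::array<int, 64>;

struct Zobrist {
    std::uint64_t keys[64][2];  // one random key per (square, piece) pair

    explicit Zobrist(std::uint64_t seed = 12345) {
        std::mt19937_64 rng(seed);
        for (auto& square : keys)
            for (auto& key : square) key = rng();
    }

    // Full hash: XOR of the keys of all occupied squares.
    std::uint64_t hash(const Squares& s) const {
        std::uint64_t h = 0;
        for (int i = 0; i < 64; ++i)
            if (s[i] != 0) h ^= keys[i][s[i] - 1];
        return h;
    }

    // Incremental update: XOR-ing the same key adds the piece if it was
    // absent and removes it if it was present.
    std::uint64_t toggle(std::uint64_t h, int square, int piece) const {
        assert(square >= 0 && square < 64 && (piece == 1 || piece == 2));
        return h ^ keys[square][piece - 1];
    }
};

// What the search remembers about a position (fields chosen for illustration).
struct TTEntry {
    int depth;  // how deep the position was searched
    int value;  // the value found at that depth
};
using TranspositionTable = std::unordered_map<std::uint64_t, TTEntry>;
```

Because XOR is commutative, two different move orders that reach the same position produce the same hash, which is exactly what lets the table recognise transpositions.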
The second thing your program should include is move ordering. This essentially means examining moves not in the order you generate them, but in the order that seems most likely to produce an alpha-beta cutoff (i.e. good moves first). Obviously you can't know which moves are best, but most moves can be ordered using a naive technique. For example, in Othello a move that is in a corner or on an edge should be examined first. Ordering moves should lead to more cutoffs and an increase in search speed. This poses zero loss to accuracy.
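The naive ordering described above (corners first, then edges, then everything else) could look roughly like this. Move, orderingRank and orderMoves are hypothetical names for this sketch; the asker's Action class and its action_greater comparator are not used here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical move type: 0-based coordinates on an 8x8 board.
struct Move {
    int x;
    int y;
};

// Lower rank means "search earlier": corner = 0, edge = 1, anything else = 2.
int orderingRank(const Move& m) {
    assert(m.x >= 0 && m.x < 8 && m.y >= 0 && m.y < 8);
    const bool xEdge = (m.x == 0 || m.x == 7);
    const bool yEdge = (m.y == 0 || m.y == 7);
    if (xEdge && yEdge) return 0;
    if (xEdge || yEdge) return 1;
    return 2;
}

// Sorts moves so that likely-good moves come first. stable_sort keeps the
// generation order among moves with the same rank.
void orderMoves(std::vector<Move>& moves) {
    std::stable_sort(moves.begin(), moves.end(),
                     [](const Move& a, const Move& b) {
                         return orderingRank(a) < orderingRank(b);
                     });
}
```

Sorting a plain vector once per node is usually cheaper than building a std::set of successors, though that is a design choice rather than something the answer prescribes.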
You can also add opening books. Usually the opening moves take the longest to search, as the board is full of more possibilities. An opening book is a database that stores every possible move that can be made in the first few turns, and the best response to it. In Othello, with a low branching factor, this will be especially helpful in the opening game.
ProbCut. I'm not going to go into more detail here, as this is a more advanced technique. However, it has had good results with Othello, so I figured I'd post this link: https://chessprogramming.wikispaces.com/ProbCut
/* finds the best move for the current player given the state of the game.
* depth parameter and MAX_DEPTH are used to limit the depth of the search for games
* that are too difficult to analyze in full detail (like chess)
* returns best move by storing an int in variable that rating points to.
* we want to make the move that will result in the lowest best move for the position after us(our opponent)
*/
moveT findBestMove(stateT state, int depth, int &rating) {
    Vector<moveT> moveList;
    generateMoveList(state, moveList);
    int nMoves = moveList.size();
    if (nMoves == 0) cout << "no move??" << endl;
    moveT bestMove;
    int minRating = WINNING_POSITION + 1; //guarantees that this will be updated in for loop
    for (int i = 0; i < nMoves && minRating != LOSING_POSITION; i++) {
        moveT move = moveList[i];
        makeMove(state, move);
        int curRating = evaluatePosition(state, depth + 1);
        if (curRating < minRating) {
            bestMove = move;
            minRating = curRating;
        }
        retractMove(state, move);
    }
    rating = -minRating;
    return bestMove;
}
/* evaluates the position by finding the rating of the best move in that position, limited by MAX_DEPTH */
int evaluatePosition(stateT state, int depth) {
    int rating;
    if (gameIsOver(state) || depth >= MAX_DEPTH) {
        return evaluateStaticPosition(state);
    }
    findBestMove(state, depth, rating);
    return rating;
}
This is my code for implementing a minimax algorithm to play a perfect game of tic-tac-toe against a computer. The code works, and there are many other helper functions not shown here. I understand the nature of the algorithm; however, I am having a hard time fully wrapping my head around the line at the end of the findBestMove() function:
rating = -minRating;
This is what my book says: The negative sign is included because the perspective has shifted: the positions were evaluated from the point of view of your opponent, whereas the ratings express the value of a move from your own point of view. A move that leaves your opponent with a negative position is good for you and therefore has a positive value.
But when we call the function initially, it is from the computer's perspective. I guess when we evaluate each position, this function is being called from our opponent's perspective, and that is why? Could someone give me more insight into what is going on recursively and exactly why the rating needs to be negated at the end?
As always thank you very much for your time.
Imagine two positions, A and B, where A is better for player a and B is better for player b. When player a evaluates these positions, eval(A) > eval(B). When player b evaluates them, we want eval(A) < eval(B), but that is not what the evaluation gives us. If b instead compares -eval(A) with -eval(B), we get the desired result, for the very reasons your book says.
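The same sign flip can be seen in a tiny negamax sketch on an explicit tree. This is not the book's findBestMove; Node, leaf, inner and exampleTree are invented for the example, and leaf scores are stored from the first player's point of view.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical game-tree node used only for this illustration.
struct Node {
    int score = 0;               // leaf score from the first player's point of view
    std::vector<Node> children;  // empty for leaves
};

// Returns the value of `n` from the point of view of the side to move.
// `side` is +1 when the first player is to move and -1 otherwise.
int negamax(const Node& n, int side) {
    assert(side == 1 || side == -1);
    if (n.children.empty()) return side * n.score;
    int best = std::numeric_limits<int>::min();
    for (const Node& child : n.children) {
        // The child's value is expressed from the opponent's point of view,
        // so it is negated before being compared with our own options.
        best = std::max(best, -negamax(child, -side));
    }
    return best;
}

Node leaf(int s) { Node n; n.score = s; return n; }
Node inner(std::vector<Node> kids) { Node n; n.children = std::move(kids); return n; }

// Two plies: the root player chooses a subtree, the opponent chooses a leaf.
Node exampleTree() {
    return inner({inner({leaf(3), leaf(12), leaf(8)}),
                  inner({leaf(2), leaf(4), leaf(6)}),
                  inner({leaf(14), leaf(5), leaf(2)})});
}
```

Every level runs the same "take the maximum" code; the negation is what turns the opponent's best outcome into our worst one, which is the role of rating = -minRating in the question.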
I'm trying to figure out how to write a loop to check the position of a circle against a variable number of rectangles so that the apple is not placed on top of the snake, but I'm having a bit of trouble thinking it through. I tried:
do
    apple.setPosition(randX()*20+10, randY()*20+10); // apple is a CircleShape
while (apple.getPosition() == snakeBody[i].getPosition());
Although, in this case, if it detects a collision with one rectangle of the snake's body, it could end up just placing the apple at a previous position of the body. How do I make it check all positions at the same time, so it can't correct itself only to have a chance of repeating the same problem again?
There are three ways (I could think of) of generating a random number meeting a requirement:
The first way, and the simplest, is what you're trying to do: retry if it doesn't.
However, you should change the condition so that it checks all the forbidden cells at once:
bool collides_with_snake(const sf::Vector2f& pos, // not sure if it's 2i or 2f
                         const /*type of snakeBody*/& snakeBody,
                         std::size_t partsNumber) {
    bool noCollision = true;
    // stop at the first body part that sits on the candidate position
    for (std::size_t i = 0; i < partsNumber && noCollision; ++i)
        noCollision = pos != snakeBody[i].getPosition();
    return !noCollision;
}
//...
do
    apple.setPosition(randX()*20+10, randY()*20+10);
while (collides_with_snake(apple.getPosition(), snakeBody,
                           /* snakeBody.size() ? */));
The second way is to generate fewer numbers and find a function that maps them onto the set you want. For instance, if your grid has N cells, you could generate a number X between 0 (inclusive) and N - [number of parts of your snake] (exclusive), then map X to the smallest number Y such that Y doesn't refer to a cell occupied by a snake part and Y = X + S, where S is the number of cells occupied by a snake part that are referred to by a number smaller than Y.
It's more complicated though.
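One possible reading of the second approach, as a sketch: number the grid cells, draw an index among the free cells only, and walk the grid to find the matching cell. The functions kthFreeCell and randomFreeCell and the std::vector<bool> occupancy map are assumptions for this example; they are independent of SFML and of the asker's snakeBody container.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Cells are numbered 0 .. occupied.size()-1 (for example row * width + col).
// occupied[i] is true when a snake part sits on cell i. Returns the index of
// the k-th free cell (0-based); k must be smaller than the number of free cells.
std::size_t kthFreeCell(const std::vector<bool>& occupied, std::size_t k) {
    for (std::size_t i = 0; i < occupied.size(); ++i) {
        if (occupied[i]) continue;  // skip snake cells without consuming k
        if (k == 0) return i;
        --k;
    }
    assert(false && "k must be smaller than the number of free cells");
    return occupied.size();
}

// Draws one free cell uniformly at random, with a single random number
// and no retry loop.
template <class Rng>
std::size_t randomFreeCell(const std::vector<bool>& occupied, Rng& rng) {
    const std::size_t freeCount = static_cast<std::size_t>(
        std::count(occupied.begin(), occupied.end(), false));
    assert(freeCount > 0);
    std::uniform_int_distribution<std::size_t> pick(0, freeCount - 1);
    return kthFreeCell(occupied, pick(rng));
}
```

The returned cell index still has to be converted back to pixel coordinates, for instance with the same *20+10 scaling the question uses.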
The third way is to "cheat" and choose a stronger requirement which is easier to enforce. For instance, if you know that the snake's body is N cells long, then only spawn the apple on a cell which is N + 1 cells away from the snake's head (you can do that by generating the angle).
The question is very broad, but assuming that snakeBody is a vector of Rectangles (or derived from Rectangles), and that you have a checkoverlap() function:
do {
    // assuming that randX() and randY() always return different random variables
    apple.setPosition(randX()*20+10, randY()*20+10); // set the apple
} while (any_of(snakeBody.begin(), snakeBody.end(), [&](Rectangle &r)->bool { return checkoverlap(r,apple); }));
This relies on standard algorithm any_of() to check in one simple expression if any of the snake body elements overlaps the apple. If there's an overlap, we just iterate once more and get a new random position until it's fine.
If snakeBody is an array and not a standard container, just use snakeBody, snakeBody+snakesize instead of snakeBody.begin(), snakeBody.end() in the code above.
If the overlap check is as simple as comparing the positions, you can replace return checkoverlap(r,apple); in the code above with return r.getPosition()==apple.getPosition();
The "naive" approach would be generating apples and testing their positions against the whole snake until we find a free spot:
bool applePlaced = false;
while(!applePlaced) { //As long as we haven't found a valid place for the apple
    apple.setPosition(randX()*20+10, randY()*20+10);
    applePlaced = true; //We assume that we can place the apple
    //Check the apple position against all snake body parts
    //(size() assumes snakeBody is a standard container such as std::vector)
    for(std::size_t i=0; i<snakeBody.size(); i++) {
        if(apple.getPosition() == snakeBody[i].getPosition()) {
            applePlaced=false; //Our prediction was wrong, we could not place the apple
            break; //No further testing necessary
        }
    }
}
The better way would be to store all free positions in an array and then pick a position out of this array (and delete it from the array), so that no random testing is necessary. It also requires updating the array whenever the snake moves.
I have implemented the alpha beta algorithm for my chess game, however it takes a lot of time (minutes for 4-ply) to finally make a rather stupid move.
I've been trying to find the mistake (I assume I made one) for 2 days now, I would very much appreciate some outside input on my code.
getMove function: called for the root node; it calls the alphaBeta function for all its child nodes (possible moves) and then chooses the move with the highest score.
Move AIPlayer::getMove(Board b, MoveGenerator& gen)
{
    // defined constants: ALPHA=-20000 and BETA= 20000
    int alpha = ALPHA;
    Board bTemp(false); // test Board
    Move BestMov;
    int i = -1; int temp;
    int len = gen.moves.getLength(); // moves is a linked list holding all legal moves
    BoardCounter++; // private attribute of AIPlayer object, counts analyzed boards
    Move mTemp; // mTemp is used to apply the next move in the list to the temporary test Board
    gen.moves.Begin(); // sets the list counter to the first element in the list
    while (++i < len && alpha < BETA){
        mTemp = gen.moves.nextElement();
        bTemp.cloneBoard(b);
        bTemp.applyMove(mTemp);
        temp = MAX(alpha, alphaBeta(bTemp, alpha, BETA, depth, MIN_NODE));
        if (temp > alpha){
            alpha = temp;
            BestMov = mTemp;
        }
    }
    return BestMov;
}
alphaBeta function:
int AIPlayer::alphaBeta(Board b, int alpha, int beta, char depth, bool nodeType)
{
    Move m;
    b.changeSide();
    BoardCounter++;
    MoveGenerator genMoves(b); // when the constructor is given a board, it automatically generates possible moves
    // the Board object has a player attribute that holds the current player
    if (genMoves.checkMate(b, b.getSide(), moves)){ // if the current player is in checkmate
        return 100000;
    }
    else if (genMoves.checkMate(b, ((b.getSide() == BLACK) ? BLACK : WHITE), moves)){ // if the other player is in checkmate
        return -100000;
    }
    else if (!depth){
        return b.evaluateBoard(nodeType);
    }
    else{
        int scoreMove = alpha;
        int best;
        genMoves.moves.Begin();
        short i = -1, len = genMoves.moves.getLength();
        Board bTemp(false);
        if (nodeType == MAX_NODE){
            best = ALPHA;
            while (++i < len){
                bTemp.cloneBoard(b);
                if (bTemp.applyMove(genMoves.moves.nextElement())){
                    scoreMove = alphaBeta(bTemp, alpha, beta, depth - 1, !nodeType);
                    best = MAX(best, scoreMove);
                    alpha = MAX(alpha, best);
                    if (beta <= alpha){
                        std::cout << "max cutoff" << std::endl;
                        break;
                    }
                }
            }
            return scoreMove;
            //return alpha;
        }
        else{
            best = BETA;
            while (++i < len){
                bTemp.cloneBoard(b);
                if (bTemp.applyMove(genMoves.moves.nextElement())){
                    scoreMove = alphaBeta(bTemp, alpha, beta, depth - 1, !nodeType);
                    best = MIN(best, scoreMove);
                    beta = MIN(beta, best);
                    if (beta <= alpha){
                        std::cout << "min cutoff" << std::endl;
                        break;
                    }
                }
            }
            return scoreMove;
            //return beta;
        }
        return best;
    }
}
EDIT: I should note that the evaluateBoard only evaluates the mobility of pieces (number of possible moves, capture moves get a higher score depending on the piece captured)
Thank you.
I can see that you're trying to implement a mini-max algorithm. However, there is something in the code that makes me suspicious. We'll compare the code with the open-source Stockfish chess engine. Please refer to the search algorithm at https://github.com/mcostalba/Stockfish/blob/master/src/search.cpp
1. Passing Board b by value
You have this in your code:
alphaBeta(Board b, int alpha, int beta, char depth, bool nodeType)
I don't know what exactly "Board" is. But it doesn't look right to me. Let's look at Stockfish:
Value search(Position& pos, Stack* ss, Value alpha, Value beta, Depth
depth, bool cutNode)
The position object is passed by reference in Stockfish. If "Board" is a class, the program will need to make a new copy every time the alpha-beta function is called. In chess, where we have to evaluate a very large number of nodes, this is obviously unacceptable.
2. No hashing
Hashing is done in Stockfish as:
ttValue = ttHit ? value_from_tt(tte->value(), ss->ply) : VALUE_NONE;
Without hashing, you'll need to evaluate the same position again and again and again and again. You won't go anywhere without hashing implemented.
3. Checking for checkmate
Probably not the most significant slow-down, but we should never check for checkmate in every single node. In Stockfish:
// All legal moves have been searched. A special case: If we're in check
// and no legal moves were found, it is checkmate.
if (InCheck && bestValue == -VALUE_INFINITE)
return mated_in(ss->ply); // Plies to mate from the root
This is done AFTER all possible moves are searched. We do it this way because there are usually many more non-checkmate nodes than checkmate nodes.
4. Board bTemp(false);
This looks like a major slow-down. Let's look at Stockfish:
// Step 14. Make the move
pos.do_move(move, st, ci, givesCheck);
You should not create a temporary object in every node (creating an object of bTemp). The machine would need to allocate some stack space to save bTemp. This could be a serious performance penalty in particular if bTemp is not a primary variable (ie, not likely be cached by the processor). Stockfish simply modifies the internal data-structure without creating a new one.
5. bTemp.cloneBoard(b);
Similar to 4, even worse, this is done for every move in the node.
6. std::cout << "max cutoff" << std::endl;
It may be hard to believe, but printing to a terminal is much slower than processing. Here you're creating a potential slow-down, because the string needs to be saved to an IO buffer. The function might (I'm not 100% sure) even block your program until the text is shown on the terminal. Stockfish only does it for a statistics summary, definitely not every time you have a fail-high or fail-low.
7. Not sorting the PV move
Probably not something that you want to do before addressing the other issues. In Stockfish, they have:
std::stable_sort(RootMoves.begin() + PVIdx, RootMoves.end());
This is done for every iteration in an iterative-deepening framework.
I am only going to address the runtime cost problem of your algorithm, because I don't know the implementation details of your board evaluation function.
In order to keep things as simple as possible, I will assume the worst case for the algorithm.
The getMove function makes len1 calls to the alphaBeta function, which in turn makes len2 calls to itself, which in turn makes len3 calls to itself and so on until depth reaches 0 and the recursion stops.
Because of the worst case assumption, let's say n = max(len1, len2, ...), so you have
n * n * n * ... * n calls to alphaBeta with number of multiplications depending on depth d, which leads to n^d calls to alphaBeta which means that you have an exponential runtime behavior. This is ultra slow and only beaten by factorial runtime behavior.
I think you should take a look at the Big O notation for that purpose and try to optimize your algorithm accordingly to get much faster results.
Best regards,
OPM
Let T(x,y) be the number of tours over an x × y grid such that:
the tour starts in the top left square
the tour consists of moves that are up, down, left, or right one square
the tour visits each square exactly once, and
the tour ends in the bottom left square.
It’s easy to see, for example, that T(2,2) = 1, T(3,3) = 2, T(4,3) = 0, and T(3,4) = 4. Write a program to calculate T(10,4).
I have been working on this for hours ... I need a program that takes the dimensions of the grid as input and returns the number of possible tours?
I wrote this code to solve the problem ... I can't seem to figure out how to check all directions.
#include <iostream>
int grid[3][3];
int c = 0;
int main(){
    solve (0, 0, 9);
}
int solve (int posx, int posy, steps_left){
    if (grid[posx][posy] = 1){
        return 0;
    }
    if (steps_left = 1 && posx = 0 && posy = 2){
        c = c+1;
        return 0;
    }
    grid[posx][posy] = 1;
    // for all possible directions
    {
        solve (posx_next, posy_next, steps_left-1)
    }
    grid[posx][posy] = 0;
}
Algorithm by @KarolyHorvath
You need some data structure to represent the state of the cells on the grid (visited/not visited).
Your algorithm:
step(posx, posy, steps_left)
    if it is not a valid position, or already visited
        return
    if it's the last step and you are at the target cell
        you've found a solution, increment counter
        return
    mark cell as visited
    for each possible direction:
        step(posx_next, posy_next, steps_left-1)
    mark cell as not visited
and run with
step(0, 0, sizex*sizey)
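For reference, the backtracking outline above might translate into C++ along these lines. This is only a sketch: the TourCounter class and its member names are invented here, the first argument is taken to be the grid width and the second the height (which is the reading consistent with T(4,3) = 0 and T(3,4) = 4), and the early return when the target cell is reached too soon is a small addition to the pseudocode that does not change the count.

```cpp
#include <cassert>
#include <vector>

// Counts tours on a grid `width` cells wide and `height` cells tall that start
// in the top-left cell, end in the bottom-left cell and visit every cell once.
class TourCounter {
public:
    TourCounter(int width, int height)
        : w_(width), h_(height), visited_(width * height, false) {
        assert(width > 0 && height > 0);
    }

    long long count() { return step(0, 0, w_ * h_); }

private:
    long long step(int x, int y, int stepsLeft) {
        // Not a valid position, or already visited.
        if (x < 0 || x >= w_ || y < 0 || y >= h_) return 0;
        if (visited_[y * w_ + x]) return 0;

        const bool atTarget = (x == 0 && y == h_ - 1);
        if (stepsLeft == 1) return atTarget ? 1 : 0;  // last cell of the tour
        if (atTarget) return 0;  // reached the end cell with cells left over

        visited_[y * w_ + x] = true;  // mark, explore all four directions, unmark
        const long long total = step(x + 1, y, stepsLeft - 1)
                              + step(x - 1, y, stepsLeft - 1)
                              + step(x, y + 1, stepsLeft - 1)
                              + step(x, y - 1, stepsLeft - 1);
        visited_[y * w_ + x] = false;
        return total;
    }

    int w_;
    int h_;
    std::vector<bool> visited_;  // dynamic, so any grid size works
};
```

With this, TourCounter(3, 4).count() reproduces the T(3,4) = 4 from the problem statement, and the same call with (10, 4) runs the search for the requested case.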
It's not difficult, since you've been given the algorithm. In order to
solve the problem, you'll probably want some sort of dynamic data
structure (unless you're only interested in the exact case of T(10,4)).
For the rest, left is -1 on the x index, right +1, and down is -1 on the
y dimension, up +1. Add bounds checking and verification that you've
not visited, and the job is done.
But I wonder how much time such an obvious algorithm will take. There's
a four-way decision on each cell; for the forty cells of T(10,4),
that's 4^40 decisions. Which is not feasible. Things like eliminating
already visited cells and bounds checking eliminate a lot of branches,
but still... The goal of the competition might be to make you find a
better algorithm.
You really should pick a debugger and see what's going on on a small board (2x2, 3x3).
One obvious problem is that = is assignment, not comparison. Compare with ==.
There are more problems. Find them.