I am building a Tic Tac Toe solving robot. For practice, I wrote a Tic Tac Toe game using the minimax algorithm, which worked very well. When I wanted to port my code to the controller, I found out that none of the C/C++ compilers for this controller support recursive functions. Therefore, I need help converting this recursive minimax function to one that uses iteration or an internal stack:
int miniMax (char board[BOARD_DIM][BOARD_DIM], _Bool minNode, int *xBest, int *yBest)
{
int possibleMoves[NSQUARES][2];
int nPossibleMoves = generateMoves(board, possibleMoves);
char boardChild [BOARD_DIM][BOARD_DIM];
int ind, x_ind, y_ind;
int minScore, maxScore;
if (gameOver(board))
return evaluateState(board);
else if (minNode)
{
minScore = +INFINITY;
for (ind = 0 ; ind < nPossibleMoves; ind++)
{
duplicateBoard(board, boardChild);
x_ind = possibleMoves[ind][0];
y_ind = possibleMoves[ind][1];
updateboard(boardChild, x_ind, y_ind, cPlayer);
int score = miniMax(boardChild,!minNode ,&x_ind ,&y_ind);
if (minScore > score)
minScore = score;
}
return minScore;
}
else if (!minNode)
{
maxScore = -INFINITY;
for (ind = 0 ; ind < nPossibleMoves; ind++)
{
duplicateBoard(board, boardChild);
x_ind = possibleMoves[ind][0];
y_ind = possibleMoves[ind][1];
updateboard(boardChild, x_ind, y_ind, cComputer);
int score = miniMax(boardChild,!minNode ,&x_ind ,&y_ind);
if (maxScore < score)
{
maxScore = score;
*xBest = x_ind;
*yBest = y_ind;
}
}
return maxScore;
}
}
I'm totally lost on how to do this.
I appreciate any help :)
If it's for an embedded target, I would:
- encode positions in binary (bit matrices instead of two-dimensional byte arrays)
- encode the full solution map, so that everything is a lookup only (a linear lookup will do fine for this complexity)
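If you would rather keep the search than ship a lookup table, the recursion can be replaced by an explicit stack of frames, as the question asks. The following is only a sketch under assumptions, not a drop-in replacement for the posted miniMax: the names Frame, kLines, hasLine and miniMaxIterative are made up for illustration, the board is held as two 9-bit masks (the bit-matrix idea above), the computer is assumed to be the maximizing side, and the scores are +1 / 0 / -1 rather than whatever evaluateState returns.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: bit i (0..8) of a mask is square i, numbered row by row.
static const uint16_t kLines[8] = {
    0x007, 0x038, 0x1C0,  // rows
    0x049, 0x092, 0x124,  // columns
    0x111, 0x054          // diagonals
};

static bool hasLine(uint16_t m) {
    for (int i = 0; i < 8; ++i)
        if ((m & kLines[i]) == kLines[i]) return true;
    return false;
}

// One frame replaces one activation of the recursive function.
struct Frame {
    uint16_t computer;  // squares taken by the maximizing side
    uint16_t player;    // squares taken by the minimizing side
    int8_t next;        // next square to try at this node
    int8_t best;        // best score found so far at this node
    bool maxNode;       // true when the computer is to move
};

// Returns +1 (computer wins), 0 (draw) or -1 (player wins) under perfect play.
// *bestMove receives the first best root move, or -1 if the game is already over.
int miniMaxIterative(uint16_t computer, uint16_t player, bool computerToMove, int *bestMove) {
    assert((computer & player) == 0);  // no square may belong to both sides
    Frame stack[10];                   // root plus at most 9 plies
    int sp = 0;
    stack[0].computer = computer;
    stack[0].player = player;
    stack[0].next = 0;
    stack[0].best = static_cast<int8_t>(computerToMove ? -2 : 2);
    stack[0].maxNode = computerToMove;
    *bestMove = -1;

    for (;;) {
        Frame &f = stack[sp];
        const uint16_t used = static_cast<uint16_t>(f.computer | f.player);
        int value;
        if (hasLine(f.computer)) value = 1;
        else if (hasLine(f.player)) value = -1;
        else if (used == 0x1FF) value = 0;
        else {
            while (f.next < 9 && ((used >> f.next) & 1)) ++f.next;  // skip taken squares
            if (f.next < 9) {
                // "Recursive call": push a child frame and continue with it.
                const uint16_t bit = static_cast<uint16_t>(1u << f.next);
                Frame &c = stack[sp + 1];
                c.computer = f.maxNode ? static_cast<uint16_t>(f.computer | bit) : f.computer;
                c.player = f.maxNode ? f.player : static_cast<uint16_t>(f.player | bit);
                c.next = 0;
                c.maxNode = !f.maxNode;
                c.best = static_cast<int8_t>(c.maxNode ? -2 : 2);
                ++f.next;
                ++sp;
                continue;
            }
            value = f.best;  // all children done
        }
        // "Return": pop this frame and fold its value into the parent.
        if (sp == 0) return value;
        --sp;
        Frame &p = stack[sp];
        if (p.maxNode ? value > p.best : value < p.best) {
            p.best = static_cast<int8_t>(value);
            if (sp == 0) *bestMove = p.next - 1;  // the move that led to the popped child
        }
    }
}
```

Calling miniMaxIterative(0, 0, true, &move) searches from the empty board. The stack never needs more than ten frames, so its memory use is fixed and known at compile time, which is usually what matters on a small controller.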
I am trying to develop a systematic method for coming up with dynamic programming (DP) solutions: a set of steps that, if followed, leads to a valid solution to a problem.
The idea, essentially, is the following: you start from a recursion, tune it to minimize the number of parameters, which lets you define the subproblem and add memoization, and after that you can easily come up with a DP solution. It turns out that it is not that simple.
One can't simply transfer a recursive solution into DP.
E.g. see the subset sum problem statement from USACO http://train.usaco.org/usacoprob2?a=9OS8tkGfsX5&S=subset
Here is the code in which I used my approach of turning recursion into DP.
#include <cstdio>
#include <iostream>
#include <cstdint>
#include <cstring>
using namespace std;
const int MAX_SUBSETSUM = (39+1)*39/4;
const int MAX_N = 39;
int memo[MAX_SUBSETSUM+1][MAX_N+1];
const int NOT_VISITED = -1;
// resembles the 0-1 knapsack problem
// we need to select the numbers that will go to
// the first set and whose sum is totalSum/2
// so we have two sets of equal sum
int recur(int sum, int num) {
int counter = 0;
// ### 1.Base case
if(sum == 0)
return 1;
if(sum < 0 || num <= 0)
return 0;
// ### Memoization
if(memo[sum][num] != NOT_VISITED)
return memo[sum][num];
// ### 2.Recursive step
for(int nextNum = 1; nextNum < num; ++nextNum) {
if(nextNum > sum)
break;
//### 3. Make a decision
const int withNumber = recur(sum - num, nextNum);
const int withOutNumber = recur(sum, nextNum);
counter = withNumber + withOutNumber;
}
memo[sum][num] = counter;
return counter;
}
int solveDP(int subsetSum, int N) {
if(subsetSum & 1)
return 0;
int dp[MAX_SUBSETSUM+1][MAX_N+1];
memset(dp,0,sizeof(dp));
// fill in base cases
for(int i = 0; i <= N; ++i) {
// when subsetsum is 0, we have one way (we pick no numbers)
dp[0][i] = 1;
}
for(int sum =1; sum <=subsetSum; ++sum) {
for(int num = 1; num <= N; ++num) {
if(sum >= num)
dp[sum][num] = dp[sum - num][num-1]/*with num*/ + dp[sum][num-1]/*without*/;
else
dp[sum][num] = dp[sum][num-1]/*without*/;
}
}
return dp[subsetSum][N]/2; // NB!!! /2 - came up with it
}
int main() {
//freopen("preface.in", "r", stdin);
//freopen("preface.out", "w", stdout);
const int N = 7;
const int subsetSum = (N+1)*N/4;
memset(memo, NOT_VISITED, sizeof(memo));
int res = recur(subsetSum, N);
res = solveDP(subsetSum, N);
return 0;
}
In the DP version you have to halve the result (see return dp[subsetSum][N]/2;). I only found this out because I had two versions whose results differed, so adjusting the DP to match the recursion involved something like trial and error.
I was only able to understand why after I played with the DP table with pen and paper and noticed that the results were double counted.
That helped me.
But what do you do when you get a DP problem in a programming contest, where you are limited in time and obviously can't afford to play around with the DP table?
Can you advise some techniques for validating the results of my DP solution to make sure that it is correct?
My aim is to reduce the amount of time and the number of incorrect attempts.
Make a simple brute-force program and run it on small test inputs to compare the
results of the brute force with those of your DP program. "When in doubt, use brute force."
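For the subset sum task above, such a cross-check could look roughly like this. It is a sketch, not the code from the question: countPartitionsBrute, countPartitionsDP and firstMismatch are names invented here, and the DP is a one-dimensional variant rather than the two-dimensional table posted above.

```cpp
#include <cassert>
#include <vector>

// Brute force: enumerate every subset of {1..n}. Each valid partition is seen
// twice (once from each side), hence the final division by 2.
long long countPartitionsBrute(int n) {
    assert(n >= 1 && n <= 20);  // 2^n subsets, so keep n small
    const int total = n * (n + 1) / 2;
    if (total % 2 != 0) return 0;
    long long count = 0;
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        int sum = 0;
        for (int i = 0; i < n; ++i)
            if ((mask >> i) & 1u) sum += i + 1;
        if (2 * sum == total) ++count;
    }
    return count / 2;
}

// DP under test: ways[s] = number of subsets of the numbers seen so far with sum s.
long long countPartitionsDP(int n) {
    const int total = n * (n + 1) / 2;
    if (total % 2 != 0) return 0;
    const int half = total / 2;
    std::vector<long long> ways(half + 1, 0);
    ways[0] = 1;
    for (int num = 1; num <= n; ++num)
        for (int s = half; s >= num; --s)  // downwards, so each number is used at most once
            ways[s] += ways[s - num];
    return ways[half] / 2;
}

// Returns the first n in 1..maxN where the two disagree, or 0 if they always agree.
int firstMismatch(int maxN) {
    for (int n = 1; n <= maxN; ++n)
        if (countPartitionsBrute(n) != countPartitionsDP(n)) return n;
    return 0;
}
```

Running firstMismatch over a range of small n takes well under a second and points at the smallest failing input, which is usually small enough to trace by hand.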
The problem is from UVa OJ.
My solution with recursion:
#include <cstdio>
using namespace std;
#define garbaze 0
//number of ways changes can be made
int coins[] = {garbaze,50,25,10,5,1}; //order does not matter
//count_ways_of_changes returns 0 when which_coin_now is <= 0, so it does not
//matter what we have in index 0 [garbaze], but we must put something there
//to implement the code using the pseudo code or recursive relation
typedef unsigned long long ull; //simple typedef
ull dp[7490][6]; //2d table
//recursive approach
ull count_ways_of_changes(int money_now,int which_coin_now)
{
if(money_now == 0)
return 1;
if(money_now < 0 || which_coin_now <=0 )
return 0;
if(dp[money_now][which_coin_now] == -1)
dp[money_now][which_coin_now] = count_ways_of_changes(money_now,which_coin_now-1) //excluding current coin
+ count_ways_of_changes(money_now - coins[which_coin_now],which_coin_now) ; //including current coin
return dp[money_now][which_coin_now] ;
}
int main()
{
for(int loop = 0; loop< 7490 ;loop++)
for(int sec_loop = 0;sec_loop<6;sec_loop++)
dp[loop][sec_loop] = -1; //table initialization
int N = 0;
while(scanf("%d",&N)==1)
{
printf("%llu\n",count_ways_of_changes(N,5)); //llu for unsigned long long
}
return 0;
}
This one got accepted (and took 0.024 s)
And my iterative approach :
#include <cstdio>
//#include <iostream>
//using namespace std;
typedef unsigned long long ull;
ull dp[7490][6];
#define garbaze 0
int value_coins[] = {garbaze,5,1,10,25,50} ;
inline ull count_ways_change(int money,int num_of_coins)
{
for(int sum_money_now = 0; sum_money_now <= money ;sum_money_now++)
for(int recent_coin_index = 0 ; recent_coin_index <= num_of_coins ; recent_coin_index++)
//common mistakes : starting the second index at num_of_coins and decrementing till 0 ...see we are pre calculating
//we have to start bottom to up....if we start at dp[0][5] .....to dp[1][5] but to know that i need to know
//dp[1][4] and dp[..][5] before hand ..but we have not calculated dp[1][4] yet...in this case i don't go to infinite
//loop or anything as the loop is well defined but i get stupid garbaze answer
{
if(sum_money_now == 0)
dp[sum_money_now][recent_coin_index] = 1;
else if(recent_coin_index == 0)
dp[sum_money_now][recent_coin_index] = 0;
else if(sum_money_now < value_coins[recent_coin_index] && recent_coin_index != 0)
dp[sum_money_now][recent_coin_index] = dp[sum_money_now][recent_coin_index-1] ;
else
dp[sum_money_now][recent_coin_index] = dp[sum_money_now][recent_coin_index-1] + dp[sum_money_now - value_coins[recent_coin_index] ][recent_coin_index] ;
// cout<<dp[sum_money_now][recent_coin_index]<<endl;
}
return dp[money][num_of_coins] ;
}
int main()
{/*
for(int loop = 0; loop< 7490 ;loop++)
for(int sec_loop = 0;sec_loop<6;sec_loop++)
dp[loop][sec_loop] = -1; //table initialization
*/ //In the iterative version do not need to initialize the table as we are working bottom - up
int N = 0;
while(scanf("%d",&N)==1)
{
printf("%llu\n",count_ways_change(N,5)); //llu for unsigned long long
}
return 0;
}
But I got Time Limit Exceeded for this one. It gives the correct output, but I don't see why this one has to be so slow.
The difference is that your recursive solution remembers partial solutions from previous inputs (because the DP table is global and is not cleared between inputs), while the iterative one doesn't: for each new input, it recalculates the DP matrix from scratch.
This can be solved by remembering which portion of the DP table has already been calculated and not recalculating it for every query.
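The simplest form of that idea is to fill the whole table once and answer every query with a lookup. A minimal sketch, with some assumptions: MAX_MONEY is taken from the 7490-entry table in the question, the names kCoins, buildTable and countWays are made up here, and the table is one-dimensional instead of the question's dp[7490][6], which is a substitution rather than the original layout.

```cpp
#include <cassert>

typedef unsigned long long ull;

const int MAX_MONEY = 7489;  // assumption: largest input, from the 7490-entry table
const int NUM_COINS = 5;
static const int kCoins[NUM_COINS] = {1, 5, 10, 25, 50};

static ull ways[MAX_MONEY + 1];  // ways[m] = number of ways to change m (zero-initialized)
static bool tableReady = false;

// Fill the table bottom-up, one coin type at a time, exactly once.
static void buildTable() {
    ways[0] = 1;
    for (int c = 0; c < NUM_COINS; ++c)
        for (int m = kCoins[c]; m <= MAX_MONEY; ++m)
            ways[m] += ways[m - kCoins[c]];
    tableReady = true;
}

// Every query after the first is a plain array lookup.
ull countWays(int money) {
    assert(money >= 0 && money <= MAX_MONEY);
    if (!tableReady) buildTable();
    return ways[money];
}
```

In the question's main loop, the call would then be printf("%llu\n", countWays(N)); the cost of building the table is paid once, no matter how many inputs follow.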
I'm making a C++ program for the game chopsticks.
It's a really simple game with only 625 total game states (and even fewer if you account for symmetry and unreachable states). I have read up on the minimax and alpha-beta algorithms, mostly for tic tac toe, but the problem I had was that in tic tac toe it's impossible to loop back to a previous state, while that can easily happen in chopsticks. So when I ran the code, it ended in a stack overflow.
I fixed this by adding flags for previously visited states (I don't know if that's the right way to do it.) so that they can be avoided, but now the problem I have is that the output is not symmetric as expected.
For example in the start state of the game each player has one finger so it's all symmetric. The program tells me that the best move is to hit my right hand with my left but not the opposite.
My source code is -
#include <iostream>
#include <array>
#include <vector>
#include <limits>
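std::array<int, 625> t; //Visited flags / cached scores for states reached by a move of the player whose turn is true.
std::array<int, 625> f; //Visited flags / cached scores for states reached by a move of the player whose turn is false.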
int no = 0; //Unused. For debugging.
class gamestate
{
public:
gamestate(int x, bool t) : turn(t) //Constructor.
{
for (int i = 0; i < 2; i++)
for (int j = 0; j < 2; j++) {
val[i][j] = x % 5;
x /= 5;
}
init();
}
void print() //Unused. For debugging.
{
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++)
std::cout << val[i][j] << "\t";
std::cout << "\n";
}
std::cout << "\n";
}
std::array<int, 6> canmove = {{ 1, 1, 1, 1, 1, 1 }}; //List of available moves.
bool isover() //Is the game over.
{
return ended;
}
bool won() //Who won the game.
{
return winner;
}
bool isturn() //Whose turn it is.
{
return turn;
}
std::vector<int> choosemoves() //Choose the best possible moves in the current state.
{
std::vector<int> bestmoves;
if(ended)
return bestmoves;
std::array<int, 6> scores;
int bestscore;
if(turn)
bestscore = std::numeric_limits<int>::min();
else
bestscore = std::numeric_limits<int>::max();
scores.fill(bestscore);
for (int i = 0; i < 6; i++)
if (canmove[i]) {
t.fill(0);
f.fill(0);
gamestate *play = new gamestate(this->playmove(i),!turn);
scores[i] = minimax(play, 0, std::numeric_limits<int>::min(), std::numeric_limits<int>::max());
std::cout<<i<<": "<<scores[i]<<std::endl;
delete play;
if (turn) if (scores[i] > bestscore) bestscore = scores[i];
if (!turn) if (scores[i] < bestscore) bestscore = scores[i];
}
for (int i = 0; i < 6; i++)
if (scores[i] == bestscore)
bestmoves.push_back(i);
return bestmoves;
}
private:
std::array<std::array<int, 2>, 2 > val; //The values of the fingers.
bool turn; //Whose turn it is.
bool ended = false; //Has the game ended.
bool winner; //Who won the game.
void init() //Check if the game has ended and find the available moves.
{
if (!(val[turn][0]) && !(val[turn][1])) {
ended = true;
winner = !turn;
canmove.fill(0);
return;
}
if (!(val[!turn][0]) && !(val[!turn][1])) {
ended = true;
winner = turn;
canmove.fill(0);
return;
}
if (!val[turn][0]) {
canmove[0] = 0;
canmove[1] = 0;
canmove[2] = 0;
if (val[turn][1] % 2)
canmove[5] = 0;
}
if (!val[turn][1]) {
if (val[turn][0] % 2)
canmove[2] = 0;
canmove[3] = 0;
canmove[4] = 0;
canmove[5] = 0;
}
if (!val[!turn][0]) {
canmove[0] = 0;
canmove[3] = 0;
}
if (!val[!turn][1]) {
canmove[1] = 0;
canmove[4] = 0;
}
}
int playmove(int mov) //Play a move to get the next game state.
{
auto newval = val;
switch (mov) {
case 0:
newval[!turn][0] = (newval[turn][0] + newval[!turn][0]);
newval[!turn][0] = (5 > newval[!turn][0]) ? newval[!turn][0] : 0;
break;
case 1:
newval[!turn][1] = (newval[turn][0] + newval[!turn][1]);
newval[!turn][1] = (5 > newval[!turn][1]) ? newval[!turn][1] : 0;
break;
case 2:
if (newval[turn][1]) {
newval[turn][1] = (newval[turn][0] + newval[turn][1]);
newval[turn][1] = (5 > newval[turn][1]) ? newval[turn][1] : 0;
} else {
newval[turn][0] /= 2;
newval[turn][1] = newval[turn][0];
}
break;
case 3:
newval[!turn][0] = (newval[turn][1] + newval[!turn][0]);
newval[!turn][0] = (5 > newval[!turn][0]) ? newval[!turn][0] : 0;
break;
case 4:
newval[!turn][1] = (newval[turn][1] + newval[!turn][1]);
newval[!turn][1] = (5 > newval[!turn][1]) ? newval[!turn][1] : 0;
break;
case 5:
if (newval[turn][0]) {
newval[turn][0] = (newval[turn][1] + newval[turn][0]);
newval[turn][0] = (5 > newval[turn][0]) ? newval[turn][0] : 0;
} else {
newval[turn][1] /= 2;
newval[turn][0] = newval[turn][1];
}
break;
default:
std::cout << "\nInvalid move!\n";
}
int ret = 0;
for (int i = 1; i > -1; i--)
for (int j = 1; j > -1; j--) {
ret+=newval[i][j];
ret*=5;
}
ret/=5;
return ret;
}
static int minimax(gamestate *game, int depth, int alpha, int beta) //Minimax searching function with alpha beta pruning.
{
if (game->isover()) {
if (game->won())
return 1000 - depth;
else
return depth - 1000;
}
if (game->isturn()) {
for (int i = 0; i < 6; i++)
if (game->canmove[i]&&t[game->playmove(i)]!=-1) {
int score;
if(!t[game->playmove(i)]){
t[game->playmove(i)] = -1;
gamestate *play = new gamestate(game->playmove(i),!game->isturn());
score = minimax(play, depth + 1, alpha, beta);
delete play;
t[game->playmove(i)] = score;
}
else
score = t[game->playmove(i)];
if (score > alpha) alpha = score;
if (alpha >= beta) break;
}
return alpha;
} else {
for (int i = 0; i < 6; i++)
if (game->canmove[i]&&f[game->playmove(i)]!=-1) {
int score;
if(!f[game->playmove(i)]){
f[game->playmove(i)] = -1;
gamestate *play = new gamestate(game->playmove(i),!game->isturn());
score = minimax(play, depth + 1, alpha, beta);
delete play;
f[game->playmove(i)] = score;
}
else
score = f[game->playmove(i)];
if (score < beta) beta = score;
if (alpha >= beta) break;
}
return beta;
}
}
};
int main(void)
{
gamestate test(243, true);
auto movelist = test.choosemoves();
for(auto i: movelist)
std::cout<<i<<std::endl;
return 0;
}
I'm passing the states as base-5 numbers converted to decimal, since each hand can have a value from 0 to 4.
In the code I have input the state -
3 3
4 1
The output says I should hit my right hand (1) to the opponent's right (3) but it does not say I should hit it to my opponent's left (also 3)
I think the problem is because of the way I handled infinite looping.
What would be the right way to do it? Or if that is the right way, then how do I fix the problem?
Also please let me know how I can improve my code.
Thanks a lot.
Edit:
I have changed my minimax function as follows to ensure that infinite loops are scored above losing but I'm still not getting symmetry. I also made a function to add depth to the score
static float minimax(gamestate *game, int depth, float alpha, float beta) //Minimax searching function with alpha beta pruning.
{
if (game->isover()) {
if (game->won())
return 1000 - std::atan(depth) * 2000 / std::acos(-1);
else
return std::atan(depth) * 2000 / std::acos(-1) - 1000;
}
if (game->isturn()) {
for (int i = 0; i < 6; i++)
if (game->canmove[i]) {
float score;
if(!t[game->playmove(i)]) {
t[game->playmove(i)] = -1001;
gamestate *play = new gamestate(game->playmove(i), !game->isturn());
score = minimax(play, depth + 1, alpha, beta);
delete play;
t[game->playmove(i)] = score;
} else if(t[game->playmove(i)] == -1001)
score = 0;
else
score = adddepth(t[game->playmove(i)], depth);
if (score > alpha) alpha = score;
if (alpha >= beta) break;
}
return alpha;
} else {
for (int i = 0; i < 6; i++)
if (game->canmove[i]) {
float score;
if(!f[game->playmove(i)]) {
f[game->playmove(i)] = -1001;
gamestate *play = new gamestate(game->playmove(i), !game->isturn());
score = minimax(play, depth + 1, alpha, beta);
delete play;
f[game->playmove(i)] = score;
} else if(f[game->playmove(i)] == -1001)
score = 0;
else
score = adddepth(f[game->playmove(i)], depth);
if (score < beta) beta = score;
if (alpha >= beta) break;
}
return beta;
}
}
This is the function to add depth -
float adddepth(float score, int depth) //Add depth to pre-calculated score.
{
int olddepth;
float newscore;
if(score > 0) {
olddepth = std::tan((1000 - score) * std::acos(-1) / 2000);
depth += olddepth;
newscore = 1000 - std::atan(depth) * 2000 / std::acos(-1);
} else {
olddepth = std::tan((1000 + score) * std::acos(-1) / 2000);
depth += olddepth;
newscore = std::atan(depth) * 2000 / std::acos(-1) - 1000;
}
return newscore;
}
Disclaimer: I don't know C++, and I frankly haven't bothered to read the game rules. I have now read the rules, and still stand by what I said...but I still don't know C++. Still, I can present some general knowledge of the algorithm which should set you off in the right direction.
Asymmetry is not in itself a bad thing. If two moves are exactly equivalent, it should choose one of them and not stand helpless like Buridan's ass. You should, in fact, be sure that any agent you write has some method of choosing arbitrarily between policies which it cannot distinguish.
You should think more carefully about the utility scheme implied by refusing to visit previous states. Pursuing an infinite loop is a valid policy, even if your current representation of it will crash the program; maybe the bug is the overflow, not the policy that caused it. If given the choice between losing the game, and refusing to let the game end, which do you want your agent to prefer?
Playing ad infinitum
If you want your agent to avoid losing at all costs -- that is, you want it to prefer indefinite play over loss -- then I would suggest treating any repeated state as a terminal state and assigning it a value somewhere between winning and losing. After all, in a sense it is terminal -- this is the loop the game will enter forever and ever and ever, and the definite result of it is that there is no winner. However, remember that if you are using simple minimax (one utility function, not two), then this implies that your opponent also regards eternal play as a middling result.
It may sound ridiculous, but maybe playing unto infinity is actually a reasonable policy. Remember that minimax assumes the worst case -- a perfectly rational foe whose interests are the exact opposite of yours. But if, for example, you're writing an agent to play against a human, then the human will either err logically, or will eventually decide they would rather end the game by losing -- so your agent will benefit from patiently staying in this Nash equilibrium loop!
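To make the "repeated state is terminal" idea concrete, here is a small sketch on an abstract game graph rather than on the chopsticks code itself. Everything in it is an assumption made for illustration: the Node layout, the LOOP value of 0 halfway between WIN and LOSS, and the names minimaxWithLoops and exampleGraph.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

const int WIN = 1000, LOSS = -1000, LOOP = 0;  // LOOP sits between winning and losing

struct Node {
    bool terminal;          // true if the game is over in this state
    int value;              // WIN or LOSS for terminal states (maximizer's view)
    bool maxToMove;         // whose turn it is in this state
    std::vector<int> next;  // indices of the states reachable in one move
};

// Plain minimax, except that a state already on the current line of play
// is treated as a terminal state worth LOOP.
int minimaxWithLoops(const std::vector<Node> &g, int s, std::vector<char> &onPath) {
    const Node &n = g[s];
    if (n.terminal) return n.value;
    if (onPath[s]) return LOOP;  // the game would cycle forever from here
    assert(!n.next.empty());     // a non-terminal state must have a move
    onPath[s] = 1;
    int best = n.maxToMove ? LOSS - 1 : WIN + 1;
    for (int c : n.next) {
        const int v = minimaxWithLoops(g, c, onPath);
        best = n.maxToMove ? std::max(best, v) : std::min(best, v);
    }
    onPath[s] = 0;  // leaving this line of play
    return best;
}

// Tiny example: state 0 (max) can move to 1 or 2; state 1 (min) can move
// back to 0 or on to 3; state 2 is a loss and state 3 a win for the maximizer.
std::vector<Node> exampleGraph() {
    return {
        {false, 0, true, {1, 2}},
        {false, 0, false, {0, 3}},
        {true, LOSS, true, {}},
        {true, WIN, true, {}},
    };
}
```

In the example the minimizer prefers cycling back to state 0 over handing out a win, so the root is worth LOOP. Note that a value obtained through a loop cut-off depends on the path that led there, so caching it per state, as the t and f arrays do, may need extra care.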
Alright, let's end the game already
If you want your agent to prefer that the game end eventually, then I would suggest implementing a living penalty -- a modifier added to your utility which decreases as a function of time (be it asymptotic or without bound). Implemented carefully, this can guarantee that, eventually, any end is preferable to another turn. With this solution as well, you need to be careful about considering what preferences this implies for your opponent.
A third way
Another common solution is to depth-limit your search and implement an evaluation function. This takes the game state as its input and just spits out a utility value which is its best guess at the end result. Is this provably optimal? No, not unless your evaluation function is just completing the minimax, but it means your algorithm will finish within a reasonable time. By burying this rough estimate deep enough in the tree, you wind up with a pretty reasonable model. However, this produces an incomplete policy, which means that it is more useful for a replanning agent than for a standard planning agent. Minimax replanning is the usual approach for complex games (it is, if I'm not mistaken, the basic algorithm followed by Deep Blue), but since this is a very simple game you probably don't need to take this approach.
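A depth-limited search might be sketched as follows. To keep it short it uses a toy game as a stand-in (players alternately take 1 to 3 stones, and whoever takes the last stone wins), not chopsticks; evaluate is a placeholder heuristic that simply answers "unknown", and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Placeholder evaluation function: a real one would return its best guess
// at the final result. Here it just reports "no idea" as 0.
int evaluate(int /*stones*/, bool /*maxToMove*/) { return 0; }

// Minimax that stops after depthLeft plies and falls back on evaluate().
int depthLimited(int stones, bool maxToMove, int depthLeft) {
    assert(stones >= 0 && depthLeft >= 0);
    if (stones == 0)                        // the previous mover took the last stone
        return maxToMove ? -1000 : 1000;
    if (depthLeft == 0)                     // search budget used up: estimate instead
        return evaluate(stones, maxToMove);
    int best = maxToMove ? -1001 : 1001;
    for (int take = 1; take <= 3 && take <= stones; ++take) {
        const int v = depthLimited(stones - take, !maxToMove, depthLeft - 1);
        best = maxToMove ? std::max(best, v) : std::min(best, v);
    }
    return best;
}
```

With enough depth the result is exact (depthLimited(4, true, 10) finds the loss); with too little, as in depthLimited(8, true, 2), every leaf is an estimate and the answer is only as good as evaluate.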
A side note on abstraction
Note that all of these solutions are conceptualized as either numeric changes to or estimations of the utility function. This is, in general, preferable to arbitrarily throwing away possible policies. After all, that's what your utility function is for -- any time you make a policy decision on the basis of anything except the numeric value of your utility, you are breaking your abstraction and making your code less robust.
void generate_moves(int gameBoard[9], list<int> &moves)
{
for (int i = 0; i < 9; i++)
{
if (gameBoard[i] == 0){
moves.push_back(i);
}
}
}
int evaluate_position(int gameBoard[9], int playerTurn)
{
state currentGameState = checkWin(gameBoard);
if (currentGameState != PLAYING)
{
if ((playerTurn == 1 && currentGameState == XWIN) || (playerTurn == -1 && currentGameState == OWIN))
return +infinity;
else if ((playerTurn == -1 && currentGameState == XWIN) || (playerTurn == 1 && currentGameState == OWIN))
return -infinity;
else if (currentGameState == DRAW)
return 0;
}
return -1;
}
int MinMove(int gameBoard[9], int playerTurn)
{
//if (checkWin(gameBoard) != PLAYING) { return evaluate_board(gameBoard); };
int pos_val = evaluate_position(gameBoard, playerTurn);
if (pos_val != -1) return pos_val;
int bestScore = +infinity;
list<int> movesList;
generate_moves(gameBoard, movesList);
while (!movesList.empty())
{
gameBoard[movesList.front()] = playerTurn;
int score = MaxMove(gameBoard, playerTurn*-1);
if (score < bestScore)
{
bestScore = score;
}
gameBoard[movesList.front()] = 0;
movesList.pop_front();
}
return bestScore;
}
int MaxMove(int gameBoard[9], int playerTurn)
{
//if (checkWin(gameBoard) != PLAYING) { return evaluate_board(gameBoard); };
int pos_val = evaluate_position(gameBoard, playerTurn);
if (pos_val != -1) return pos_val;
int bestScore = -infinity;
list<int> movesList;
generate_moves(gameBoard, movesList);
while (!movesList.empty())
{
gameBoard[movesList.front()] = playerTurn;
int score = MinMove(gameBoard, playerTurn*-1);
if (score > bestScore)
{
bestScore = score;
}
gameBoard[movesList.front()] = 0;
movesList.pop_front();
}
return bestScore;
}
int MiniMax(int gameBoard[9], int playerTurn)
{
int bestScore = -infinity;
int index = 0;
list<int> movesList;
vector<int> bestMoves;
generate_moves(gameBoard, movesList);
while (!movesList.empty())
{
gameBoard[movesList.front()] = playerTurn;
int score = MinMove(gameBoard, playerTurn);
if (score > bestScore)
{
bestScore = score;
bestMoves.clear();
bestMoves.push_back(movesList.front());
}
else if (score == bestScore)
{
bestMoves.push_back(movesList.front());
}
gameBoard[movesList.front()] = 0;
movesList.pop_front();
}
index = bestMoves.size();
if (index > 0) {
time_t secs;
time(&secs);
srand((uint32_t)secs);
index = rand() % index;
}
return bestMoves[index];
}
I created a tic tac toe program in C++ and tried to implement a MiniMax algorithm with exhaustive search tree.
These are the functions I have written using wiki and with the help of some websites. But the AI just doesn't work right and at times doesn't play its turn at all.
Could someone have a look and please point out if there is anything wrong with the logic?
This is how I think it works:
Minimax : This function starts with a very large negative number as the best score, and the goal is to maximize that number. It calls the MinMove function. If new score > best score, then best score = new score...
MinMove : This function evaluates game board. If game over then it returns -infinity or +infinity depending on who won. If game is going on this function starts with max +infinity value as best score and goal is to minimize it as much possible. It calls MaxMove with opponent player's turn. (since players alternate turns).
If score < best score then best score = score. ...
MaxMove : This function evaluates game board. If game over then it returns -infinity or +infinity depending on who won. If game is going on this function starts with least -infinity value as best score and goal is to maximize it as much possible. It calls MinMove with opponent player's turn. (since players alternate turns).
If score > best score then best score = score. ...
MinMove and MaxMove call each other in mutual recursion, MaxMove maximizing the value and MinMove minimizing it. Finally, MiniMax returns the list of best possible moves.
If there is more than one best move, one of them is picked at random as the computer's move.
In MiniMax, MinMove(gameBoard, playerTurn) should be MinMove(gameBoard, -playerTurn), as you do in MaxMove.
Since you use MinMove and MaxMove, your evaluation function should be absolute, i.e. +infinity for XWIN
and -infinity for OWIN. MinMove can then only be used when player == -1 and MaxMove when player == 1, so the parameter becomes useless. Likewise, MiniMax can only be used by player == 1.
I have made some changes to your code and it works (https://ideone.com/Ihy1SR).
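To illustrate what an absolute evaluation means, here is a compact sketch. It is not the code behind the link above; lineWinner, boardFull, maxValue, minValue, bestMoveForX and kInf are names chosen here, with X stored as 1, O as -1 and empty squares as 0, as in the question.

```cpp
#include <algorithm>
#include <cassert>

const int kInf = 1000;

// Returns 1 if X has three in a row, -1 if O has, 0 otherwise.
int lineWinner(const int b[9]) {
    static const int L[8][3] = {{0,1,2},{3,4,5},{6,7,8},{0,3,6},
                                {1,4,7},{2,5,8},{0,4,8},{2,4,6}};
    for (const auto &l : L)
        if (b[l[0]] != 0 && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
            return b[l[0]];
    return 0;
}

bool boardFull(const int b[9]) {
    for (int i = 0; i < 9; ++i)
        if (b[i] == 0) return false;
    return true;
}

int minValue(int b[9]);

// X to move. The score is absolute: +kInf always means "X wins".
int maxValue(int b[9]) {
    const int w = lineWinner(b);
    if (w != 0) return w * kInf;
    if (boardFull(b)) return 0;
    int best = -kInf - 1;
    for (int i = 0; i < 9; ++i)
        if (b[i] == 0) {
            b[i] = 1;
            best = std::max(best, minValue(b));
            b[i] = 0;
        }
    return best;
}

// O to move. Same absolute scale, but O picks the smallest value.
int minValue(int b[9]) {
    const int w = lineWinner(b);
    if (w != 0) return w * kInf;
    if (boardFull(b)) return 0;
    int best = kInf + 1;
    for (int i = 0; i < 9; ++i)
        if (b[i] == 0) {
            b[i] = -1;
            best = std::min(best, maxValue(b));
            b[i] = 0;
        }
    return best;
}

// Root call for X: after X moves it is O's turn, so the child is a min node.
int bestMoveForX(int b[9]) {
    assert(lineWinner(b) == 0 && !boardFull(b));
    int bestScore = -kInf - 1, bestIndex = -1;
    for (int i = 0; i < 9; ++i)
        if (b[i] == 0) {
            b[i] = 1;
            const int score = minValue(b);
            b[i] = 0;
            if (score > bestScore) { bestScore = score; bestIndex = i; }
        }
    return bestIndex;
}
```

Because neither function takes a player argument, there is no way to call the min node with the wrong side, which is the mistake in the original MiniMax.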
I'm trying to implement a minimax algorithm for tic tac toe with alpha-beta pruning. Right now I have the program running, but it does not seem to be working. Whenever I run it it seems to input garbage in all the squares. I've implemented it so that my minimax function takes in a board state and modifies that state so that when it is finished, the board state contains the next best move. Then, I set 'this' to equal the modified board. Here are my functions for the minimax algorithm:
void board::getBestMove() {
board returnBoard;
miniMax(INT_MIN + 1, INT_MAX -1, returnBoard);
*this = returnBoard;
}
int board::miniMax(int alpha, int beta, board childWithMaximum) {
if (checkDone())
return boardScore();
vector<board> children = getChildren();
for (int i = 0; i < 9; ++i) {
if(children.empty()) break;
board curr = children.back();
if (curr.firstMoveMade) { // not an empty board
board dummyBoard;
int score = curr.miniMax(alpha, beta, dummyBoard);
if (computerTurn && (beta > score)) {
beta = score;
childWithMaximum = *this;
if (alpha >= beta) break;
} else if (alpha < score) {
alpha = score;
childWithMaximum = *this;
if (alpha >= beta) break;
}
}
}
return computerTurn? alpha : beta;
}
vector<board> board::getChildren() {
vector<board> children;
for (int i = 0; i < 3; ++i) {
for (int j = 0; j < 3; ++j) {
if (getPosition(i, j) == '*') { //move not made here
board moveMade(*this);
moveMade.setPosition(i, j);
children.push_back(moveMade);
}
}
}
return children;
}
And here are my full files if someone wants to try running it:
.cpp : http://pastebin.com/ydG7RFRX
.h : http://pastebin.com/94mDdy7x
There may be many issues with your code... you sure posted a lot of it. Because you are asking your question it is incumbent on you to try everything you can on your own first and then reduce your question to the smallest amount of code necessary to clarify what is going on. As it is, I don't feel that you've put much effort into asking this question.
But maybe I can still provide some help:
void board::getBestMove() {
board returnBoard;
miniMax(INT_MIN + 1, INT_MAX -1, returnBoard);
*this = returnBoard;
}
See how you are saying *this = returnBoard.
That must mean that you want to get a board back from miniMax.
But look at how miniMax is defined!
int board::miniMax(int alpha, int beta, board childWithMaximum)
It accepts childWithMaximum via pass by value so it cannot return a board in this way.
What you wanted to say was probably:
int board::miniMax(int alpha, int beta, board & childWithMaximum)
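The difference is easy to see in isolation. The following is a stripped-down illustration with a hypothetical Board struct and made-up function names, not the board class from the question:

```cpp
#include <cassert>

struct Board {
    int lastMove = -1;  // stand-in for the real board contents
};

// Pass by value: the function works on a copy, so the caller sees no change.
void pickByValue(Board out) { out.lastMove = 4; }

// Pass by reference: the function writes into the caller's object.
void pickByReference(Board &out) { out.lastMove = 4; }

// Small self-check of the two behaviours.
void checkDemo() {
    Board a;
    pickByValue(a);
    assert(a.lastMove == -1);  // unchanged
    Board b;
    pickByReference(b);
    assert(b.lastMove == 4);   // updated
}
```

With the by-value signature, returnBoard in getBestMove stays a default-constructed board, which would be consistent with the garbage you see after *this = returnBoard.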