I'm trying to implement a NegaMax AI for Connect 4. The algorithm works well some of the time, and the AI can win. However, sometimes it completely fails to block the opponent's three in a row, or doesn't take a winning shot when it has three in a row itself.
The evaluation function iterates through the grid (horizontally, vertically, diagonally up, diagonally down), and takes every set of four squares. It then checks within each of these sets and evaluates based on this.
I've based the function on the evaluation code provided here: http://blogs.skicelab.com/maurizio/connect-four.html
My function is as follows:
//All sets of four tiles are evaluated before this
//and values for the following variables are set.
if (redFoursInARow != 0)
{
redScore = INT_MAX;
}
else
{
redScore = (redThreesInARow * threeWeight) + (redTwosInARow * twoWeight);
}
int yellowScore = 0;
if (yellowFoursInARow != 0)
{
yellowScore = INT_MAX;
}
else
{
yellowScore = (yellowThreesInARow * threeWeight) + (yellowTwosInARow * twoWeight);
}
int finalScore = yellowScore - redScore;
return turn ? finalScore : -finalScore; //If this is an ai turn, return finalScore. Else return -finalScore.
My negamax function looks like this:
inline int NegaMax(char g[6][7], int depth, int &bestMove, int row, int col, bool aiTurn)
{
{
char c = CheckForWinner(g);
if ('E' != c || 0 == depth)
{
return EvaluatePosition(g, aiTurn);
}
}
int bestScore = INT_MIN;
for (int i = 0; i < 7; ++i)
{
if (CanMakeMove(g, i)) //If column i is not full...
{
{
//...then make a move in that column.
//Grid is a 2d char array.
//'E' = empty tile, 'Y' = yellow, 'R' = red.
char newPos[6][7];
memcpy(newPos, g, sizeof(char) * 6 * 7);
int newRow = GetNextEmptyInCol(g, i);
if (aiTurn)
{
UpdateGrid(newPos, i, 'Y');
}
else
{
UpdateGrid(newPos, i, 'R');
}
int newScore = 0; int newMove = 0;
newScore = NegaMax(newPos, depth - 1, newMove, newRow, i, !aiTurn);
newScore = -newScore;
if (newScore > bestScore)
{
bestMove = i;
bestScore = newScore;
}
}
}
}
return bestScore;
}
I'm aware that Connect Four has been solved and that there are definitely better ways to go about this, but any help or suggestions for fixing or improving this will be greatly appreciated. Thanks!
I found this code for a tic-tac-toe game in C++, but got confused by the side ^ 1 in the GetComputerMove function: randMove = GetWinningMove(board, side ^ 1);
If anything raised to the power of one is just the value itself, why is the ^ 1 there? The game stopped working correctly when I removed it.
Can anyone explain this to me? Thanks!
enum { NOUGHTS, CROSSES, BORDER, EMPTY };
enum { HUMANWIN, COMPWIN, DRAW };
const int directions[4] = { 1, 7, 6, 8 };
const int ConvertTo49[25] =
{
8, 9, 10,11,12,
15,16,17,18,19,
22,23,24,25,26,
29,30,31,32,33,
36,37,38,39,40
};
int GetWinningMove(int* board, const int side)
{
int ourMove = -1;
int winFound = 0;
int index = 0;
for (index = 0; index < 25; ++index)
{
if (board[ConvertTo49[index]] == EMPTY)
{
ourMove = ConvertTo49[index];
board[ourMove] = side;
if (FindFourInARow(board, ourMove, side) == 4)
{
winFound = 1;
}
board[ourMove] = EMPTY;
if (winFound == 1)
{
return ourMove;
}
ourMove = -1;
};
}
return ourMove;
}
int GetComputerMove(int* board, const int side)
{
int index;
int numFree = 0;
int availableMoves[25];
int randMove = 0;
//Set random number to randomly run a function
int randFunction = 0;
randFunction = (rand() % 2);
//Go for the winning move
randMove = GetWinningMove(board, side);
if (randMove != -1)
{
return randMove;
}
//If random function is 1, stop any winning move from the human
if (randFunction == 1)
{
randMove = GetWinningMove(board, side ^ 1);
if (randMove != -1)
{
return randMove;
}
}
randMove = 0;
//Loop through all squares and put piece in random place
for (index = 0; index < 25; ++index)
{
if (board[ConvertTo49[index]] == EMPTY)
{
availableMoves[numFree++] = ConvertTo49[index];
};
}
randMove = (rand() % numFree);
return availableMoves[randMove];
}
In C++, the ^ operator does not mean exponentiation (or raising to a given power); in fact, there is no exponentiation operator in C++ (you have to use the pow function to do that).
Rather, ^ is the exclusive or operator. In your case, given that side will have a value of either 0 or 1 (representing the machine or the player), the side ^ 1 expression will evaluate to the other value. That is, if side is 1, it will give 0 and, if side is 0, it will give 1.
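To make the toggle concrete, here is a minimal sketch. The helper name otherSide is hypothetical (it is not in the posted code); it relies only on the enum ordering from the question, where NOUGHTS is 0 and CROSSES is 1.

```cpp
#include <cassert>

// Same enum ordering as in the question: NOUGHTS == 0, CROSSES == 1.
enum { NOUGHTS, CROSSES, BORDER, EMPTY };

// Hypothetical helper (not part of the original program): returns the
// opposing side. XOR with 1 flips the lowest bit, so 0 becomes 1 and
// 1 becomes 0.
int otherSide(int side)
{
    assert(side == NOUGHTS || side == CROSSES); // only meaningful for the two players
    return side ^ 1;
}
```

So GetWinningMove(board, side ^ 1) asks for the opponent's winning move, which the computer then occupies to block it. Removing the ^ 1 makes that call look for the computer's own winning move a second time.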
If your sequence is 4 2 1, the largest jump is from 4 to 2. If your sequence is 3 10 5 16 8 4 2 1, the largest jump is from 5 to 16.
I've written an algorithm, but I'm not completely sure what I have done wrong (whether I haven't written the loop properly, haven't set my variables correctly, or something else). I'm not sure what I need to set index, BiggestDiff, or CurrentDiff to. I tried using a while loop to compare each number in my vector, but I get zero (I'm assuming because I set BiggestDiff to zero).
If anyone can point me in the right direction or show me an example, that will be greatly appreciated.
Here is my code:
int findBiggestDiff(std::vector<int> sequence)
{
int index = 0;
int BiggestDiff = 0 ;
int CurrentDiff = BiggestDiff;
CurrentDiff = std::abs(sequence[index] - sequence[index + 1]);
while (index < sequence.size())
{
if (CurrentDiff > BiggestDiff)
{
BiggestDiff = CurrentDiff;
}
return index;
}
}
Try this:
int findBiggestDiff(const std::vector<int>& sequence)
{
    int indexOfBiggestJump = 0;
    int BiggestDiff = 0;
    int CurrentDiff = BiggestDiff;
    // Stop one short of the end, because each step reads i and i + 1.
    for (int i = 0; i + 1 < static_cast<int>(sequence.size()); i++) {
        CurrentDiff = std::abs(sequence[i] - sequence[i + 1]);
        if (CurrentDiff > BiggestDiff)
        {
            BiggestDiff = CurrentDiff;
            indexOfBiggestJump = i;
        }
    }
    return indexOfBiggestJump;
}
There are several errors in your code.
Your return index sits inside the loop, so the function exits on the first iteration and always returns index, which is still 0.
You are not saving the index of the biggest jump anywhere.
If you are looking at positions i and i + 1, you must stop at sequence.size() - 1; otherwise you will read past the end of sequence.
You aren't recalculating CurrentDiff at all. Also, your return statement is in the wrong spot. You can do something like this (not tested):
#include <cstdlib>
#include <iostream>
#include <vector>

int findLargest( const std::vector<int> &sequence ) {
    if ( sequence.size() < 2 ) return -1; // if there aren't at least two elements, there's nothing valid.
    int index = 0;
    int biggestIndex = -1;
    int biggestDiff = -1;
    while (index < sequence.size() - 1) // -1 so that the +1 below doesn't go out of range
    {
        // get the current difference
        int currentDiff = std::abs(sequence[index] - sequence[index + 1]);
        if (currentDiff > biggestDiff)
        {
            // update stats
            biggestIndex = index;
            biggestDiff = currentDiff;
        }
        ++index;
    }
    return biggestIndex;
}
int main() {
    //…
    int index = findLargest( sequence );
    if ( index != -1 ) {
        std::cout << "Biggest difference was between " << sequence[index] << " and " << sequence[index+1];
    }
}
I have a minimax tree and an evaluation function. The minimax function returns only an integer (the best value). How can I store the first move on the path that leads to that best value?
here's my code :
int Brain::MiniMax(GameBoard gb, int depth,int Turn,int lastcount) //0->Max 1->Min
{
if (depth == 5)
return Evaluation(lastcount, gb);
int bestval = 0;
if (Turn == 0)
{
bestval = -100;
vector<pair<int, pair<int, int>>> possibleFences = this->PossibleFences(gb);
for (int i = 0; i < possibleFences.size(); i++)//ForFences
{
int cnt = 0;
NextShortestPathMove(cnt, gb.OpponentPawn,gb);
if (gb.CanPutFence(possibleFences[i].first, possibleFences[i].second.first, possibleFences[i].second.second) == 0)
continue;
gb.PutFence(possibleFences[i].second.first, possibleFences[i].second.second, possibleFences[i].first);
int value = MiniMax(gb, depth + 1,1, cnt);
if (value > bestval)
{
bestval = value;
move = possibleFences[i];
}
}
return bestval;
}
else if (Turn == 1)
{
bestval = +100;
int** possibleMoves = this->PossibleMoves(gb.OpponentPawn.Row, gb.OpponentPawn.Column, gb.OpponentPawn,gb);
for (int i = 0; i < 6; i++)
{
if (possibleMoves[i][0] == -1)
continue;
int cnt = 0;
NextShortestPathMove(cnt, gb.OpponentPawn,gb);
gb.MoveOpponentPlayer(possibleMoves[i][0], possibleMoves[i][1]);
int value = MiniMax(gb, depth + 1, 0,cnt);
bestval = min(value, bestval);
}
return bestval;
}
}
For example, if bestval is 10 at the end, I want the first move of the line that produced that value. Right now I store the move in the 'move' variable, but it doesn't work correctly.
In a practical minimax algorithm implementation, a hash table is used to enter evaluated scores and moves including hashkey, a value unique for each position of the players and pieces. This is also useful during implementation of "force move".
During move evaluation, hashkey, score and move position are recorded in a struct and the hash table as well. So, after successful search, the entire struct is returned to enable update of the graphics and game status.
A typical hash entry looks like so:
struct HashEntry {
int move;
int score;
int depth;
uint64_t posKey;
int flags;
};
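As a rough sketch of how such entries might be stored and looked up, the class below wraps a std::unordered_map keyed by posKey. The class name TranspositionTable, its store and probe methods, and the rule of keeping the deeper entry are all assumptions made for illustration; they are not taken from the answer or from any particular engine.

```cpp
#include <cstdint>
#include <unordered_map>

struct HashEntry {
    int move;
    int score;
    int depth;
    std::uint64_t posKey;
    int flags;
};

// Hypothetical wrapper: one entry per position key.
class TranspositionTable {
public:
    // Records an entry. If the position is already present, the new entry
    // replaces it only when it was searched at least as deeply (an assumed
    // replacement policy).
    void store(const HashEntry& entry)
    {
        auto it = table_.find(entry.posKey);
        if (it == table_.end()) {
            table_.emplace(entry.posKey, entry);
        } else if (entry.depth >= it->second.depth) {
            it->second = entry;
        }
    }

    // Returns the stored entry for posKey, or nullptr if there is none.
    const HashEntry* probe(std::uint64_t posKey) const
    {
        auto it = table_.find(posKey);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::uint64_t, HashEntry> table_;
};
```

With something like this in place, the search would store the best move for each position it finishes, and the caller could probe the key of the root position afterwards to read back the move to play.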
Pardon me if this question already exists, I've searched a lot but I haven't gotten the answer to the question I want to ask. So, basically, I'm trying to implement a Tic-Tac-Toe AI that uses the Minimax algorithm to make moves.
However, one thing I don't get is, that when Minimax is used on an empty board, the value returned is always 0 (which makes sense because the game always ends in a draw if both players play optimally).
So Minimax always chooses the first tile as the best move when AI is X (since all moves return 0 as value). Same happens for the second move and it always chooses the second tile instead. How can I fix this problem to make my AI pick the move with the higher probability of winning? Here is the evaluation and Minimax function I use (with Alpha-Beta pruning):
int evaluate(char board[3][3], char AI)
{
for (int row = 0; row<3; row++)
{
if (board[row][0] != '_' && board[row][0] == board[row][1] && board[row][1] == board[row][2])
{
if (board[row][0]==AI)
{
return +10;
}
else
{
return -10;
}
}
}
for (int col = 0; col<3; col++)
{
if (board[0][col] != '_' && board[0][col] == board[1][col] && board[1][col] == board[2][col])
{
if (board[0][col]==AI)
{
return +10;
}
else
{
return -10;
}
}
}
if (board[1][1] != '_' && ((board[0][0]==board[1][1] && board[1][1]==board[2][2]) || (board[0][2]==board[1][1] && board[1][1]==board[2][0])))
{
if (board[1][1]==AI)
{
return +10;
}
else
{
return -10;
}
}
return 0;
}
int Minimax(char board[3][3], bool AITurn, char AI, char Player, int depth, int alpha, int beta)
{
bool breakout = false;
int score = evaluate(board, AI);
if(score == 10)
{
return score - depth;
}
else if(score == -10)
{
return score + depth;
}
else if(NoTilesEmpty(board))
{
return 0;
}
if(AITurn == true)
{
int bestvalue = -1024;
for(int i = 0; i < 3; i++)
{
for(int j = 0; j<3; j++)
{
if(board[i][j] == '_')
{
board[i][j] = AI;
bestvalue = max(bestvalue, Minimax(board, false, AI, Player, depth+1, alpha, beta));
alpha = max(bestvalue, alpha);
board[i][j] = '_';
if(beta <= alpha)
{
breakout = true;
break;
}
}
}
if(breakout == true)
{
break;
}
}
return bestvalue;
}
else if(AITurn == false)
{
int bestvalue = +1024;
for(int i = 0; i < 3; i++)
{
for(int j = 0; j<3; j++)
{
if(board[i][j] == '_')
{
board[i][j] = Player;
bestvalue = min(bestvalue, Minimax(board, true, AI, Player, depth+1, alpha, beta));
beta = min(bestvalue, beta);
board[i][j] = '_';
if(beta <= alpha)
{
breakout = true;
break;
}
}
}
if(breakout == true)
{
break;
}
}
return bestvalue;
}
}
Minimax assumes optimal play, so maximizing "probability of winning" is not a meaningful notion: since the other player can force a draw but cannot force a win, they will always force a draw. If you want to play optimally against a player who is not perfectly rational (which, of course, is one of the only two ways to win*), you'll need to assume some probability distribution over the opponent's moves and use something like expectiminimax, where with some probability the opponent's move is overridden by a random mistake. Alternatively, you can deliberately restrict the ply of the minimax search, using a heuristic for the opponent's play beyond a certain depth (but still searching the game tree for your own moves).
* The other one is not to play.
Organize your code into smaller routines so that it is tidier and easier to debug. Apart from the recursive minimax function, a function that generates all valid moves and a robust evaluation routine are essential (the latter seems to be lacking here).
For example, at the beginning of the game the evaluation should return a non-zero score: every position should have a relative weight (e.g. the middle square may be weighted slightly higher than the corners).
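One possible reading of that suggestion is a small table of per-square weights. In the sketch below, the weights (3 for the centre, 2 for corners, 1 for edges) and the function name positionalScore are assumptions chosen for illustration; the board layout and the AI and Player characters follow the question's code.

```cpp
// Assumed weights: centre 3, corners 2, edges 1. These numbers are
// illustrative, not tuned values.
const int kCellWeight[3][3] = {
    {2, 1, 2},
    {1, 3, 1},
    {2, 1, 2}
};

// Hypothetical helper: sums the weights of the AI's squares and subtracts
// the weights of the opponent's squares. Empty squares ('_') contribute 0.
int positionalScore(char board[3][3], char AI, char Player)
{
    int score = 0;
    for (int row = 0; row < 3; row++)
    {
        for (int col = 0; col < 3; col++)
        {
            if (board[row][col] == AI)
            {
                score += kCellWeight[row][col];
            }
            else if (board[row][col] == Player)
            {
                score -= kCellWeight[row][col];
            }
        }
    }
    return score;
}
```

A score like this is small compared with the +10 and -10 used for wins and losses, so it could serve as a tie-breaker between moves that minimax rates equally, or as the leaf value if the search depth is capped.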
Your minimax boundary condition (return if there are no empty cells) is flawed, as it will evaluate even when a winning or losing move occurred in the preceding ply. Problems like this get worse in more complex games.
If you are new to minimax, you can find plenty of ready to compile sample codes on CodeReview
Edit: to clarify, the problem is with the second algorithm.
I have a bit of C++ code that samples cards from a 52 card deck, which works just fine:
void sample_allcards(int table[5], int holes[], int players) {
int temp[5 + 2 * players];
bool try_again;
int c, n, i;
for (i = 0; i < 5 + 2 * players; i++) {
try_again = true;
while (try_again == true) {
try_again = false;
c = fast_rand52();
// reject collisions
for (n = 0; n < i + 1; n++) {
try_again = (temp[n] == c) || try_again;
}
temp[i] = c;
}
}
copy_cards(table, temp, 5);
copy_cards(holes, temp + 5, 2 * players);
}
I am implementing code to sample the hole cards according to a known distribution (stored as a 2d table). My code for this looks like:
void sample_allcards_weighted(double weights[][HOLE_CARDS], int table[5], int holes[], int players) {
// weights are distribution over hole cards
int temp[5 + 2 * players];
int n, i;
// table cards
for (i = 0; i < 5; i++) {
bool try_again = true;
while (try_again == true) {
try_again = false;
int c = fast_rand52();
// reject collisions
for (n = 0; n < i + 1; n++) {
try_again = (temp[n] == c) || try_again;
}
temp[i] = c;
}
}
for (int player = 0; player < players; player++) {
// hole cards according to distribution
i = 5 + 2 * player;
bool try_again = true;
while (try_again == true) {
try_again = false;
// weighted-sample c1 and c2 at once
// h is a number < 1325
int h = weighted_randi(&weights[player][0], HOLE_CARDS);
// i2h uses h and sets temp[i] to the 2 cards implied by h
i2h(&temp[i], h);
// reject collisions
for (n = 0; n < i; n++) {
try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]) || try_again;
}
}
}
copy_cards(table, temp, 5);
copy_cards(holes, temp + 5, 2 * players);
}
My problem? The weighted sampling algorithm is a factor of 10 slower. Speed is very important for my application.
Is there a way to improve the speed of my algorithm to something more reasonable? Am I doing something wrong in my implementation?
Thanks.
edit: I was asked about this function, which I should have posted, since it is key
inline int weighted_randi(double *w, int num_choices) {
double r = fast_randd();
double threshold = 0;
int n;
for (n = 0; n < num_choices; n++) {
threshold += *w;
if (r <= threshold) return n;
w++;
}
// shouldn't get this far
cerr << n << "\t" << threshold << "\t" << r << endl;
assert(n < num_choices);
return -1;
}
...and i2h() is basically just an array lookup.
Your reject collisions are turning an O(n) algorithm into (I think) an O(n^2) operation.
There are two ways to select cards from a deck: shuffle and pop, or pick sets until the elements of the set are unique; you are doing the latter which requires a considerable amount of backtracking.
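A minimal sketch of the "shuffle and pop" option, for the uniform case only: a partial Fisher-Yates shuffle that draws distinct cards without ever retrying. The function name drawCards and the use of the standard <random> generators are assumptions; the question's fast_rand52() is not shown, so it is not used here.

```cpp
#include <array>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Draws `count` distinct cards (values 0..51) by shuffling only the first
// `count` positions of a fresh deck. Each card is chosen uniformly from the
// cards not yet drawn, so no collision check is needed.
template <class Rng>
std::vector<int> drawCards(int count, Rng& rng)
{
    assert(count >= 0 && count <= 52);
    std::array<int, 52> deck;
    std::iota(deck.begin(), deck.end(), 0); // deck = 0, 1, ..., 51
    for (int i = 0; i < count; ++i)
    {
        // Pick one of the remaining cards (positions i..51) and move it to slot i.
        std::uniform_int_distribution<int> pick(i, 51);
        std::swap(deck[i], deck[pick(rng)]);
    }
    return std::vector<int>(deck.begin(), deck.begin() + count);
}
```

This replaces the rejection loop for the five table cards. The weighted hole-card draw in sample_allcards_weighted is a different problem, since pairs are drawn from a non-uniform distribution, and would still need its own collision handling.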
I didn't look at the details of the code, just a quick scan.
You could gain some speed by replacing all the loops that check whether a card is taken with a bit mask. For example, for a pool of 52 cards, we can prevent collisions like so:
DWORD dwMask[2] = {0}; //64 bits
//...
int nCard;
while(true)
{
nCard = rand_52();
if(!(dwMask[nCard >> 5] & 1 << (nCard & 31)))
{
dwMask[nCard >> 5] |= 1 << (nCard & 31);
break;
}
}
//...
My guess would be the memcpy(1326*sizeof(double)) within the retry-loop. It doesn't seem to change, so should it be copied each time?
Rather than tell you what the problem is, let me suggest how you can find it. Either 1) single-step it in the IDE, or 2) randomly halt it to see what it's doing.
That said, sampling by rejection, as you are doing, can take an unreasonably long time if you are rejecting most samples.
Your inner "try_again" for loop should stop as soon as it sets try_again to true - there's no point in doing more work after you know you need to try again.
for (n = 0; n < i && !try_again; n++) {
try_again = (temp[n] == temp[i]) || (temp[n] == temp[i+1]);
}
The second question, about picking from a weighted set, also has an algorithmic replacement with lower time complexity. It rests on the principle that whatever has been pre-computed does not need to be re-computed.
In an ordinary selection, you have an integral number of bins which makes picking a bin an O(1) operation. Your weighted_randi function has bins of real length, thus selection in your current version operates in O(n) time. Since you don't say (but do imply) that the vector of weights w is constant, I'll assume that it is.
You aren't interested in the width of the bins, per se, you are interested in the locations of their edges that you re-compute on every call to weighted_randi using the variable threshold. If the constancy of w is true, pre-computing a list of edges (that is, the value of threshold for all *w) is your O(n) step which need only be done once. If you put the results in a (naturally) ordered list, a binary search on all future calls yields an O(log n) time complexity with an increase in space needed of only sizeof w / sizeof w[0].
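A minimal sketch of that idea, assuming the weights for a player stay constant between calls. The names buildEdges and weightedPick are hypothetical; r stands for the uniform random number that weighted_randi gets from fast_randd().

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// O(n), done once per weight vector: edges[k] = w[0] + ... + w[k],
// i.e. the value `threshold` would have after step k in weighted_randi.
std::vector<double> buildEdges(const double* w, int num_choices)
{
    std::vector<double> edges(num_choices);
    std::partial_sum(w, w + num_choices, edges.begin());
    return edges;
}

// O(log n) per call: returns the first index n with r <= edges[n], which is
// the same bin the linear scan in weighted_randi would return.
int weightedPick(const std::vector<double>& edges, double r)
{
    assert(!edges.empty());
    auto it = std::lower_bound(edges.begin(), edges.end(), r);
    if (it == edges.end())
    {
        --it; // r exceeds the last edge only through rounding; use the last bin
    }
    return static_cast<int>(it - edges.begin());
}
```

With 1326 hole-card combinations, each draw then takes about 11 comparisons instead of a scan of up to 1326 additions, at the cost of one extra array of doubles per player.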