Solving engine for TicTacToe - c++

I'm a junior programmer and I know the basics of Pascal and C++. I made a player-vs-computer Tic Tac Toe game, and the game itself is finished.
The problem is that the computer just picks a random place on the board for its Os, and that's not good.
I thought I should write multiple procedures that check every winning position, so the computer would either block the player's Xs or try to complete a winning line, but that would have cost a lot of time because of all the ifs.
Then I thought of a simpler version with fewer ifs, but it would still have taken a long time to write.
Then I thought further: what about a four-in-a-row game? How on earth would someone manage to check every available space, and how could anyone write a function that checks absolutely every winning line or any progress by the player or the computer? And that's not all: what if the player is pulling some trick to block the computer? How would the computer know that?! That would surely take ages to program. And I'm not even talking about something that seems truly impossible: chess.
So here I am, telling myself there SHOULD be a much simpler way for the computer to search for and solve these problems than tons of ifs.
If any of you know a way to do this, how can I write the simplest procedure that blocks and beats the player in a Tic Tac Toe game?
If someone wants to check my code or use it: http://pastebin.com/jhyUn7d1

What you're looking for is Minimax.
With this algorithm the computer will never lose a Tic Tac Toe game (it plays perfectly), or you can limit the depth to which it analyzes moves in order to get some kind of medium difficulty.
It's not hard to implement: if you're familiar with recursion you're set. The implementation of course depends on your code, but the Wikipedia page offers a pretty good starting point.
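To make the idea concrete, here is a minimal sketch of minimax for a 3x3 board, assuming the board is a 2D char array holding 'X', 'O' and ' ' (the names are illustrative, not taken from your pastebin code):

#include <algorithm>
#include <climits>

const char HUMAN = 'X', COMPUTER = 'O', EMPTY = ' ';

// Returns 'X' or 'O' if that side has three in a row, otherwise EMPTY.
char winner(char b[3][3]) {
    for (int i = 0; i < 3; ++i) {
        if (b[i][0] != EMPTY && b[i][0] == b[i][1] && b[i][1] == b[i][2]) return b[i][0];
        if (b[0][i] != EMPTY && b[0][i] == b[1][i] && b[1][i] == b[2][i]) return b[0][i];
    }
    if (b[0][0] != EMPTY && b[0][0] == b[1][1] && b[1][1] == b[2][2]) return b[0][0];
    if (b[0][2] != EMPTY && b[0][2] == b[1][1] && b[1][1] == b[2][0]) return b[0][2];
    return EMPTY;
}

// Scores a position from the computer's point of view:
// +1 = computer wins, -1 = player wins, 0 = draw.
int minimax(char b[3][3], bool computersTurn) {
    char w = winner(b);
    if (w == COMPUTER) return 1;
    if (w == HUMAN)    return -1;

    int best = computersTurn ? INT_MIN : INT_MAX;
    bool moved = false;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            if (b[r][c] == EMPTY) {
                moved = true;
                b[r][c] = computersTurn ? COMPUTER : HUMAN;   // try the move
                int score = minimax(b, !computersTurn);       // let the opponent reply
                b[r][c] = EMPTY;                              // undo it
                best = computersTurn ? std::max(best, score)
                                     : std::min(best, score);
            }
    return moved ? best : 0;   // no empty squares left means a draw
}

To pick the computer's actual move, loop over the empty squares once more at the top level and keep the one whose minimax score is highest.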

A basic Tic Tac Toe move-priority algorithm is something like this (a sketch follows the list):
Take the spot if it wins you the game
Take the spot if it blocks the opponent's win
Take a corner
Take a non-corner, non-center spot (an edge)
Take the center
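A rough sketch of that priority list, assuming a char board with 'X'/'O' marks and ' ' for empty squares; the helper and function names (wouldWin, pickMove) are just illustrative:

#include <utility>

// Returns true if placing 'mark' on the empty square (r, c) would complete
// a row, column or diagonal of three.
bool wouldWin(char b[3][3], char mark, int r, int c) {
    b[r][c] = mark;                       // tentatively place the mark
    bool win =
        (b[r][0] == mark && b[r][1] == mark && b[r][2] == mark) ||
        (b[0][c] == mark && b[1][c] == mark && b[2][c] == mark) ||
        (r == c     && b[0][0] == mark && b[1][1] == mark && b[2][2] == mark) ||
        (r + c == 2 && b[0][2] == mark && b[1][1] == mark && b[2][0] == mark);
    b[r][c] = ' ';                        // undo
    return win;
}

// Picks a square following the priority list above; returns {row, col}.
std::pair<int, int> pickMove(char b[3][3], char computer, char player) {
    const std::pair<int, int> corners[] = {{0,0}, {0,2}, {2,0}, {2,2}};
    const std::pair<int, int> edges[]   = {{0,1}, {1,0}, {1,2}, {2,1}};

    for (int r = 0; r < 3; ++r)           // 1. win if possible
        for (int c = 0; c < 3; ++c)
            if (b[r][c] == ' ' && wouldWin(b, computer, r, c)) return {r, c};

    for (int r = 0; r < 3; ++r)           // 2. block the player's win
        for (int c = 0; c < 3; ++c)
            if (b[r][c] == ' ' && wouldWin(b, player, r, c)) return {r, c};

    for (auto p : corners) if (b[p.first][p.second] == ' ') return p;   // 3. corner
    for (auto p : edges)   if (b[p.first][p.second] == ' ') return p;   // 4. edge
    return {1, 1};                        // 5. only the center can be left
}

Note that this heuristic is much weaker than minimax and can still be beaten (for example by a fork), but it is very cheap and easy to follow.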

The short answer is "try all the different moves until the game is won, and record which ones lead to the computer winning".
Long answer:
For a limited-size TTT game the number of possible moves before the game is won isn't that large, so simply try each possible move, then recursively try every possible opponent reply, and keep going until the game ends. Give each move a "score" for how well it went (e.g., how many of the resulting lines of play ended in a win for the computer and how many in a win for the opponent) and pick the move with the best result. Be aware that if you do this well, you will probably end up with something that is nearly impossible to beat.

I recently dealt with this, although my code was in C#.
I came up with a way of scoring each candidate move. The approach I took creates a score based on the number of moves that would then be required for a win (fewer moves needed results in a higher score).
My algorithm also considers the combined number of moves for multiple squares. As a result, the algorithm favors moves that would produce multiple potential wins (the only real tactic I know about for Tic Tac Toe). For example, it is possible sometimes to make a move that produces two potential wins that must be blocked. Since the opponent can only block one, it produces a win.
I posted my entire code and a description of it in the article A Tic-Tac-Toe Game Engine.
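The linked article has the full engine; purely as a rough illustration of this kind of scoring (this is not the article's code), a per-square heuristic in that spirit might look like the sketch below, assuming a char board with marks for each side and ' ' for an empty square:

// Score a candidate move for 'me' at the empty square (r, c): for every row,
// column or diagonal through it that the opponent has not blocked, award more
// points the fewer moves that line still needs.  Because the scores of the
// individual lines are summed, a move that creates two threats at once scores
// higher than one that creates a single threat.
int scoreMove(char b[3][3], char me, char opp, int r, int c) {
    int lines[4][3][2];   // up to 4 lines through (r, c); 3 squares each as (row, col)
    int n = 0;
    for (int k = 0; k < 3; ++k) { lines[n][k][0] = r; lines[n][k][1] = k; }      // the row
    ++n;
    for (int k = 0; k < 3; ++k) { lines[n][k][0] = k; lines[n][k][1] = c; }      // the column
    ++n;
    if (r == c) {                                                                // main diagonal
        for (int k = 0; k < 3; ++k) { lines[n][k][0] = k; lines[n][k][1] = k; }
        ++n;
    }
    if (r + c == 2) {                                                            // anti-diagonal
        for (int k = 0; k < 3; ++k) { lines[n][k][0] = k; lines[n][k][1] = 2 - k; }
        ++n;
    }

    int score = 0;
    for (int i = 0; i < n; ++i) {
        int mine = 1, theirs = 0;            // the candidate square itself counts as mine
        for (int k = 0; k < 3; ++k) {
            int rr = lines[i][k][0], cc = lines[i][k][1];
            if (rr == r && cc == c) continue;
            if (b[rr][cc] == me)  ++mine;
            else if (b[rr][cc] == opp) ++theirs;
        }
        if (theirs == 0) score += mine * mine;   // open line: 1, 4 or 9 points
    }
    return score;
}

The engine would then evaluate every empty square this way and play the highest-scoring one; because open lines are summed, a square that creates two threats at once naturally comes out ahead.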

I did this once, a long time ago. I don't know if I still have the code lying around...
Anyway, I created a function with return type int, which gave the square in which the computer should place its piece (with 0 being the top-left and 8 the bottom-right square). Yours uses a 2D array, so it would be a little different.
Anyway, for each row, column and diagonal, check whether any two squares on that line belong to the player. If they don't, check the same thing for the computer's pieces. On the first line where that's true for the computer, look at the remaining square: if it's empty, place your piece there for the win. If instead you found a player-dominated line, check that you don't already have a piece on it and put one there to block.
const int PlayerPiece = 1;
const int CPiece = 2;
const int Empty = 0;
int board[3][3];

if (board[0][0] == PlayerPiece && board[0][1] == PlayerPiece && board[0][2] == Empty)
{
    //Put_Your_Piece_In_[0][2]
}
You could then go on to changing it so that it could check each row, i.e.:
int numRows = 3;
for (int i = 0; i < numRows; i++)
{
    if (board[i][0] == PlayerPiece && board[i][1] == PlayerPiece && board[i][2] == Empty)
    {
        //Put_Piece_In_[i][2]
    }
}
Then do the same for the columns and the two diagonals.
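Something along these lines, reusing the board, PlayerPiece and Empty names from above (and note that each line really needs three such checks, one for each position the empty square could be in):

// Columns: two player pieces with the bottom square still empty
for (int j = 0; j < 3; j++)
{
    if (board[0][j] == PlayerPiece && board[1][j] == PlayerPiece && board[2][j] == Empty)
    {
        //Put_Piece_In_[2][j]
    }
}

// One of the two diagonals, same idea
if (board[0][0] == PlayerPiece && board[1][1] == PlayerPiece && board[2][2] == Empty)
{
    //Put_Piece_In_[2][2]
}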
You could always consider that Tic-Tac-Toe is essentially just a magic square, described fairly well here: http://www.sciforums.com/showthread.php?134281-An-isomorphism-Tic-Tac-Toe-on-Magic-Square

There is a perfect strategy for Tic-Tac-Toe available on Wikipedia. It is really simple. Due to the small size of the grid, the number of cases you need to test (e.g., testing whether there are two marks in a row) is very small.

Problems with implementing approximate(feature based) q learning

I am new to reinforcement learning. I recently learned about approximate Q-learning, or feature-based Q-learning, in which you describe states by features to save space. I have tried to implement this in a simple grid game. Here, the agent is supposed to learn not to step into a firepit (marked by an f) and instead to eat up as many dots as possible. Here is the grid used:
...A
.f.f
.f.f
...f
Here A marks the agent's starting location. When implementing this, I set up two features. One was 1/((distance to closest dot)^2), and the other was (distance to firepit) + 1. When the agent enters a firepit, the program returns a reward of -100. If it moves to a non-firepit position it has already visited (so there is no dot left to eat), the reward is -50. If it moves to an unvisited dot, the reward is +500. In the grid above, no matter what the initial weights are, the program never learns the correct weight values. Specifically, in the output, the first training session gets a score (how many dots it ate) of 3, but for all other training sessions the score is just 1, and the weights converge to an incorrect value of -125 for weight 1 (distance to firepit) and 25 for weight 2 (distance to unvisited dot). Is there something specifically wrong with my code, or is my understanding of approximate Q-learning incorrect?
I have tried playing around with the rewards the environment gives and with the initial weights. Neither has fixed the problem.
Here is the link to the entire program: https://repl.it/repls/WrongCheeryInterface
Here is what is going on in the main loop:
while(points != NUMPOINTS){
    bool playerDied = false;
    if(!start){
        if(!atFirepit()){
            r = 0;
            if(visited[player.x][player.y] == 0){
                points += 1;
                r += 500;
            }else{
                r += -50;
            }
        }else{
            playerDied = true;
            r = -100;
        }
    }

    //Update visited
    visited[player.x][player.y] = 1;

    if(!start){
        //This is based off the q learning update formula
        pairPoint qAndA = getMaxQAndAction();
        double maxQValue = qAndA.q;
        double sample = r;
        if(!playerDied && points != NUMPOINTS)
            sample = r + (gamma2 * maxQValue);
        double diff = sample - qVal;
        updateWeights(player, diff);
    }

    // checking end game condition
    if(playerDied || points == NUMPOINTS) break;

    pairPoint qAndA = getMaxQAndAction();
    qVal = qAndA.q;
    int bestAction = qAndA.a;

    //update player and q value
    player.x += dx[bestAction];
    player.y += dy[bestAction];

    start = false;
}
I would expect both weights to end up positive, but one of them is negative (the one for the distance to the firepit).
I also expected the program to learn over time that it is bad to enter a firepit, and also bad, but not as bad, to revisit a position whose dot has already been eaten.
Probably not the answer you want to hear, but:
Have you tried implementing the simpler tabular Q-learning before approximate Q-learning? In your setting, with only a few states and actions, it will work perfectly. If you are learning, I strongly recommend you start with the simpler cases in order to get a better understanding/intuition of how reinforcement learning works.
Do you know the implications of using approximators instead of learning the exact Q function? In some cases, due to the complexity of the problem (e.g., when the state space is continuous), you have to approximate the Q function (or the policy, depending on the algorithm), but this can introduce convergence problems. Additionally, in your case you are trying to hand-pick features, which usually requires deep knowledge of the problem (i.e., the environment) and of the learning algorithm.
Do you understand the meaning of the hyperparameters alpha and gamma? You cannot choose them randomly. They are sometimes critical to obtaining the expected results, depending heavily on the problem and the learning algorithm. In your case, looking at the convergence curve of your weights, it's pretty clear that you are using a value of alpha that is too high. As you pointed out, after the first training session your weights remain constant.
Therefore, practical recommendations:
Be sure you can solve your grid game with a tabular Q-learning algorithm before trying more complex things (a minimal sketch of the tabular update follows this list).
Experiment with different values of alpha, gamma and the rewards.
Read more in depth about approximate RL. A very good and accessible book (starting from zero knowledge) is the classic Sutton and Barto book, Reinforcement Learning: An Introduction, which you can obtain for free and was updated in 2018.
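For reference, the tabular update itself is only a few lines. A minimal sketch in C++ (the grid size, the action encoding and all names are illustrative; strictly speaking, for your game the tabular state would also have to encode which dots are still uneaten, not just the agent's position):

#include <algorithm>
#include <cstdlib>

const int ROWS = 4, COLS = 4, ACTIONS = 4;     // 4x4 grid, actions = up/down/left/right
double Q[ROWS][COLS][ACTIONS] = {};            // one Q value per (state, action) pair
const double alpha = 0.1, gamma_ = 0.9, epsilon = 0.1;

// Pick an action epsilon-greedily for the state (x, y).
int chooseAction(int x, int y) {
    if ((double)std::rand() / RAND_MAX < epsilon)
        return std::rand() % ACTIONS;                     // explore
    int best = 0;
    for (int a = 1; a < ACTIONS; ++a)
        if (Q[x][y][a] > Q[x][y][best]) best = a;         // exploit
    return best;
}

// Tabular Q-learning update after taking action 'a' in (x, y),
// receiving reward r and landing in (nx, ny).
void update(int x, int y, int a, double r, int nx, int ny, bool terminal) {
    double maxNext = 0.0;                                 // terminal states have no future value
    if (!terminal) {
        maxNext = Q[nx][ny][0];
        for (int b = 1; b < ACTIONS; ++b)
            maxNext = std::max(maxNext, Q[nx][ny][b]);
    }
    Q[x][y][a] += alpha * (r + gamma_ * maxNext - Q[x][y][a]);
}

Once this version visibly learns to avoid the firepit, the same environment is a good test bed for the approximate version.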

How does a program interact with another program?

Alright, I'm a little bit (honestly, way too) confused about how the heck I can make one program interact with another.
For example, take a shooter game: you run an external program and it makes your character unable to die, or makes it shoot immediately when it detects an enemy, etc.
I was reading a little about it, and they say you have to know how the "target" is composed, but I still don't get it.
For example, let's say we've got some simple code like this:
#include <iostream>
#include <windows.h>

int main() {
    for (int h = 0; ; h++) {
        std::cout << "The H's value is: " << h << std::endl;
        Sleep(1000);
    }
    return 0;
}
Then, how do I create another program that changes h's value to zero every time I press a key?
Don't get me wrong, I'm not trying to hack anyone or anything, I'm just curious about how those programs work.
(Sorry if I've got some grammar issues, English isn't my native language).
Specific to the program in your example: if we assume the program is already compiled and you are not allowed to make any changes to the source code, the solution would be to build a program that runs with high enough privileges to examine that process's memory and directly change the in-memory value of h, which should be at (or near) the top of main's stack frame.
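On Windows that route usually goes through OpenProcess / ReadProcessMemory / WriteProcessMemory. A minimal sketch; the process ID and the address of h are stand-in values here (in practice you would find the address with a debugger or a memory scanner), and OpenProcess needs sufficient privileges to succeed:

#include <windows.h>
#include <iostream>

int main() {
    DWORD pid = 1234;                       // assumed: the target program's process ID
    LPVOID hAddress = (LPVOID)0x0012FF60;   // assumed: address of 'h', found beforehand with a debugger/scanner

    HANDLE process = OpenProcess(PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_VM_OPERATION,
                                 FALSE, pid);
    if (!process) {
        std::cerr << "OpenProcess failed: " << GetLastError() << "\n";
        return 1;
    }

    int zero = 0;
    SIZE_T written = 0;
    // Overwrite the 4 bytes of the target's 'h' with 0.
    if (WriteProcessMemory(process, hAddress, &zero, sizeof(zero), &written))
        std::cout << "Wrote " << written << " bytes\n";
    else
        std::cerr << "WriteProcessMemory failed: " << GetLastError() << "\n";

    CloseHandle(process);
    return 0;
}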
Speaking of more "legal" ways to do this, you should read about inter-process communication, which can be done with several methods. Read this.
However, most "bots" and programs that help cheaters in games are in many cases graphics-based: they analyse the rendered image and help with aiming. On the other hand, some "recoil reducers" simply move your mouse in the direction opposite to the in-game gun's recoil. So there are a ton of approaches to your question, and for every particular case the answer might be different.

C++ File outputting strange number, and part of code not running

Yeah. So, I'm trying to write a guessing game. The game has a hard mode: you get 15 guesses and have to guess a number between 1 and 500. But my problem is this:
I'm trying to have hard mode save and display your wins/losses, but when it outputs the contents of wins.txt it shows something like this:
Wins: 0x7fffee26df78
Losses: 0x7fffee26e178
It's really confusing me. Here's the part of the code I have for that:
ifstream losses_var("losses.txt");
ifstream wins_var("wins.txt");
losses_var>> loss;
wins_var>> win;
wins_var.close();
losses_var.close();
Then it gets called with:
cout<<"Wins: "<< wins <<"\nLosses: "<< losses <<"\n"
If you would like to see the full source code, it's here: http://pastebin.com/gPT37uBJ
My second problem:
Hard mode won't display anything when you win. That's pretty much the whole problem. In my code, the loop that asks the user for input uses
while (guess != randNum)
So after the closing bracket I have what I want the code to display when a user wins, but it just doesn't run; it just stops. I would appreciate it if someone could help me with this. The buggy part is lines 97 through 105. Again, the source code is here: http://pastebin.com/gPT37uBJ
You've got your variable names confused
cout<<"Wins: "<< wins <<"\nLosses: "<< losses <<"\n";
should be
cout<<"Wins: "<< win <<"\nLosses: "<< loss <<"\n";
It's important to pick good variable names. One reason is so that you don't confuse yourself about what your variables mean (and if you can confuse yourself, think how it's going to be for someone else looking at your code).
Others have already answered the output problem (win vs. wins). The other problem is probably in the logic of your nested while loops. The outer loop (while (guess != randNum)) starts, but its body contains the entire inner loop (while (guesses_left != 0)). This means the outer condition is not checked again until the inner loop terminates, i.e. until you've run out of guesses. Also note that if you guess correctly, the inner loop will never terminate. You probably want something like this:
while (guesses_left > 0) {
    // input user's guess
    if (guess < randNum) {
        // process it
    } else if (guess > randNum) {
        // process it
    } else {
        // it's equal, user won
        // do what's necessary for a win
        return 0;
    }
}
// ran out of guesses
// do what's necessary for a loss
return 0;
You are not writing your variables win and loss to cout. From your pasted code, I can see that wins and losses are ofstream objects, which is why you are seeing what look like addresses there. I would advise you to choose more informative variable names to avoid hard-to-spot mistakes like this.

Break and rerun while loop c++ Windows

I'm a rookie programmer, so please be polite.
Well, I'm trying to write a simple terminal Backgammon game, just for fun, but I have a problem.
The entire game runs in a while loop that keeps running as long as nobody has moved all their pieces to the end of the board.
A simple integer controls whether it is black's or white's turn.
I wrote a function that checks for any possible moves, because I want the program to skip the turn in case absolutely no moves can be made.
I want this function to run, and in case it returns false (no possible moves) I want the rest of the loop body to be skipped and the turn to pass to the next player. For example, if the dice combination gives no possible moves for black, I want the program to skip black and go to white.
So I sort of want to break out of the rest of the while loop body, but keep the loop running.
It's a little complicated for me to explain the issue, but I hope you guys understand.
Thanks a lot
- Martin
It sounds like you want to use continue:
while (someCondition)
{
    doSomething();

    if (someOtherCondition)
        continue;

    doSomethingElse();
}
In this example, if someOtherCondition is true, the continue statement will cause the program to jump back to the top of the loop rather than continuing to execute the following statements. If someOtherCondition is false, doSomethingElse() will get run as normal.
I think this is roughly what you want to know.
Hope it helps.
while( keepRunning )
{
    // Your function returns false when there are no moves, so invert it here.
    bool noPossibleMoves = !checkForPossibleMoves();

    // Set up each loop iteration.
    // Do the things here that are always necessary.

    if( noPossibleMoves )
    {
        continue; // This jumps back to the top of the while loop
    }

    // wait for user input etc...
    // ...
    // ...
}

WxTextCtrl unable to load large texts

I've read about the solution described in a post here a year ago:
wx.TextCtrl.LoadFile()
Now I have a Windows application that generates color frequency statistics, which are saved in 3D arrays. Here is part of my code; as you can see below, the printing of the statistics depends on a slider that specifies the threshold.
void Project1Frm::WxButton2Click(wxCommandEvent& event)
{
    char stat[32] = "";
    int ***report = pGLCanvas->GetPixel();
    float max = pGLCanvas->GetMaxval();
    float dist = WxSlider5->GetValue();

    WxRichTextCtrl1->Clear();
    WxRichTextCtrl1->SetMaxLength(100);

    if(dist>0)
    {
        WxRichTextCtrl1->AppendText(wxT("Statistics\nR\tG\tB\t\n"));
        for(int m=0; m<256; m++){
            for(int n=0; n<256; n++){
                for(int o=0; o<256; o++){
                    if((report[m][n][o]/max)>=(dist/100.0))
                    {
                        sprintf(stat,"%d\t%d\t%d\t%3.6f%%\n",m,n,o,report[m][n][o]/max*100.0);
                        WxRichTextCtrl1->AppendText(wxT(stat));
                    }
                }
            }
        }
    }
    else if(dist==0) WxRichTextCtrl1->LoadFile("histodata.txt");
}
The workaround I've tried so far is that, when I have to print all of the statistics, I read them from a text file rather than going through the 3D array. I would like to ask whether the Python implementation of the segmented loading can be ported to C++, or whether there are better ways to deal with this problem. Thank you.
EDIT:
Another reason I used a text file instead is that I observed that whenever I do the sprintf alone [with the line WxRichTextCtrl1->AppendText(wxT(stat)); commented out] the computer starts to slow down.
-Ric
Disclaimer: My answer is more of an alternative than a solution.
I don't believe there's any situation in which a user of this application will find it useful to have a scrolled text window containing ~16 million lines of numbers. It would be practically impossible for them to scroll to any specific location in the list they might need to see. This all assumes that every single number you output here has some significance to the user, of course (you are showing them on screen for a reason). Providing the user with controls to look up specific, fixed (reasonable) ranges of those numbers would be a better solution, not only in terms of user experience but also in helping to resolve your issue here.
On the other hand, if you still insist on one single window containing all 64 million numbers, you seem to have a very rigid data structure here, which means you can (and should) take advantage of using a virtual grid control (wxGrid), which is intended to work smoothly even with incredibly large data sets like this. The user will likely find this control easier to read and find the section of data they are looking for.
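If you go the wxGrid route, the key piece is a virtual table derived from wxGridTableBase, so the control only asks for the cells it actually needs to draw instead of receiving millions of appended strings. A rough sketch, assuming you keep only the rows that pass the threshold in a vector (the grid variable and all names here are illustrative, not from your project):

#include <wx/grid.h>
#include <vector>

// One row of the statistics table: a colour triple and its frequency.
struct ColourStat { int r, g, b; float percent; };

// Virtual table: wxGrid calls GetValue() on demand for the visible cells only.
class StatsTable : public wxGridTableBase {
public:
    explicit StatsTable(const std::vector<ColourStat>& rows) : m_rows(rows) {}

    int GetNumberRows() override { return (int)m_rows.size(); }
    int GetNumberCols() override { return 4; }                    // R, G, B, %

    wxString GetValue(int row, int col) override {
        const ColourStat& s = m_rows[row];
        switch (col) {
            case 0:  return wxString::Format("%d", s.r);
            case 1:  return wxString::Format("%d", s.g);
            case 2:  return wxString::Format("%d", s.b);
            default: return wxString::Format("%3.6f%%", s.percent);
        }
    }
    void SetValue(int, int, const wxString&) override {}          // read-only table
    bool IsEmptyCell(int, int) override { return false; }         // every cell has a value

private:
    std::vector<ColourStat> m_rows;
};

// Usage, e.g. in the button handler: fill 'rows' from report[][][] for the
// entries above the threshold, then hand them to a wxGrid:
//   grid->SetTable(new StatsTable(rows), true /* the grid owns the table */);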