I am trying to implement a minimax algorithm for an AI player within a simple card game. However, from doing research I am confused what are the key differences between state evaluation and heuristics.
From what I understand heuristics are calculated by the current information available to the player (e.g. in chess, the pieces and their relevant locations). With this infomation, they come to a conclusion based on a heuristics function which essentially provides a "rule of thumb".
A state evaluation is the exact value of the current state.
However I am unsure why both things co-exist as I cannot see how they are much different from one another. Please can someone ellaborate, and clear up my confusion. Thanks.
Assuming a zero-sum game, you can implement a state-evaluation for end-states (game ended with win, draw, loss from perspective of player X) which results 1,0,-1. A full tree-search will then get you perfect-play.
But in practice the tree is huge and can't be searched completely. Therefore you have to stop the search at some point, which is not an end-state. There is no determined winner or loser. Now it's hard to mark this state with 1,0,-1 as the game might be too complex to easily evaluate the winner from some state far away from the end-state. But you still need to evaluate this positions and can use some assumptions about the game, which equals to heuristic-information. One example is piece-mass in chess (queen is more valuable then a pawn). This is heuristic information incorporated into the non-perfect evaluation-function (approximation of the real one). The better your assumptions / heuristics, the better the approximation of the real evaluation!
But there are other parts where heuristic information can be incorporated. One very imporant area is controlling the tree-search. Which first-move will be evaluated first, which last. Selecting good moves first allows algorithms like alpha-beta to prune huge parts of the tree. But of course you need to state some assumptions/ heuristic-information to order your moves (e.g. queen-move more powerful than pawn-move; this is a made-up example, i'm not sure about the effect of this heuristic in chess-AIs here)
Related
For my recent project I'm right now looking for an efficient way to structure and store the board information with consideration of the usage for patternmatching.
I'm having a square board, and for pattern matching, I'm using bitfields with 2 bits representing one field of the board. The patterns to match have a diamond shape, that could be centered around any possible field on the board. (so the center is not static, I need to be able to do it for any center)
Example of diamond area around O:
..X..
.XXX.
XXOXX
.XXX.
..X..
If parts of the diamond are outside the the playing area, the bits will be set to 11. The diamonds can have differing radiuses, aboves example would have a radius of 2.
Another important thing for the efficiency of the system is, that I have to be able to quickly rotate/mirror the pattern into all 8 possible symmetries.
For this, it may be beneficial to actually NOT store the information of the central point in the pattern, and as this is not required for my algorithm anyway, this may be a valuable timesaver. Because now some bitshifting magic is possible to quickly rotate/mirror the patterns.
As this kind of patternmatching has to be done at a high frequency, it can prove to be a severe bottleneck of my overall project, when implemented badly.
When trying to get a nice model for doing all this work, I figured, there are 3 important keyareas that require thinking about, but are of course tightly connected.
A. How is the data stored in the board implementation.
Currently this is done in a rather difficult manner, which would be too difficult to read from with such high frequency. But it would be no problem or timeloss to actually store and update the 2 bit data in any possible way for the entire board.
Easiest would be to just store the entire board in an bitset with the size of twice the board, and then each two bit represent the value of a single field. But there is no necessarity for doing it in a special sequence or in only one bitset, even though at first it may look natural to do so.
Anyway, this is the part I'm most felxible about, as this can be done without performance issues in any way it seves the other 2 critical parts of the problem the best.
B. How is the data stored in the pattern.
This is already more difficult. As said, my intention is to store them in a bitset of the appropriate size, but there is he question in what order.
There seem to be two ways, that quickly come to mind:
a) (this could be done with or without the central point C)
...0...
..123..
.45678.
9ABCDEF
.GHIJK.
..LMN..
...O...
b)
...0...
..N14..
.ML235.
KJI.678
.HFC9A.
..GDB..
...E...
If we are just talking about the patterns, b) seems clearly superior. A rotation of the pattern is done by a simple rotateshift (3 bitops total per rotation) and even mirroring the pattern can be done with about a dozen bitops. This kind of operations are much more time consuming with a).
But b) has also some severe drawback... And this leads to:
C. How is the data read from the board implementation to the pattern.
Looking at aboves 2 potential ways to order the pattern bits, now a) is clearly superior. a) can be read by a bunch of bitops from a potential array, as discussed in A. you bitshift each line (getting the line by AND with a bitset nulling all other bits) to the appropriate place and put them together with some OR-operations. Even near the board edges this is done very quick.
Problem of course is, that this would still only get me one possible symmetry of the pattern, but rotations/mirrors are not that easily done. This could be circumvented by saving each pattern to match agaisnt 8 times, but this would look very crude, and may cause troubles elsewhere.
With b) this is much more difficult... Honestly, I don't see a way how it can be done quick, without checking every single bit individually. But when increasing the pattern size (like radius 15) this takes forever, when done very often, especially as the [] operator of bitsets is rather slow.
One possible solution I thought of writing it in CUDA, with each thread generating a pattern around one field, and each block of the thread checking one fixed position around this center. But as I haven't used CUDA before, I don't know how reasonable this is, but if done parallel, this sounds more reasonable than iterating over all positions serially.
As I still didn't find a satisfying solution for the problem, I wanted to ask here, if someone probably knows how it can be done better:
- either rotate/mirror patterns of type a)
- or quickly read pattern of type b) (possibly by arranging the data in a better way in step A., I'm flexible here)
- or if the CUDA idea may actually solve that problem
- or maybe some completely different way, I didn't think of, as I'm sure this has been done before by smarter people
If it matters: I'm coding with VS Pro 2013 and don't mind using boost. If CUDA could solve this effectively, I would also use it.
EDIT:
Okay... So I continued thinking about the whole thing. Maybe there are some other ways to make the whole thing more efficient, by doing some work in more efficient batches.
First of all, what I usually need: On a given board position (and we are talking about 10k positions per second) I need for a large set of positions (every empty field of the board, so most fields) all patterns from size 15 down to size 3. I only need the biggest pattern matched by my database, but in any case, I may often need most of them. So there are 2 things, that could make some time savings possible:
1) some efficient way to use the larger pattern, to generate the pattern one size smaller. This should actually possible, when using the bitordering from b), if it is done the proper way... Then it would only need a few bit ops to cut out the outer ring...
2) As often neighboring fields need their specific pattern, if there would be some way to create their patterns in some sort of batch operation... But I admit, I don't see how this could be done very well... But there may be some time savings.
Oh, and another additional comment, as I had the discussion earlier today with some friend: No it is not an option, t instead of matching the board position against the pattern database, to reverse it and do it the other way around (check if DB pattern matches some board position) I have way too many patterns for that. When doing it the first way, i can just look, if the bitstring exists in my database and be done.
Edit2:
Another Update... First I looked into CUDA, and as it seems incompatible with VS2013, this is a severe blow to that idea. Second I thought about the process how patterns are matched. In fact, it may seem possible, instead of going from the large patterns down to the small ones, doing it in reverse. Now suddenly my pattern library is less of a dictionary but more of a searchtree, as larger patterns certainly have their inner core saved as pattern as well. This should speed up any lookups, but still does not solve my problem of the patterngeneration, sadly.
Edit3:
As I felt, it is more worth of an answer then an edit, I just posted my own new idea (which is different from what I had in mind when posting this question) below.
Okay, as I was thinking about this more and more, I now think, that the following solution may be the best to tackle the problems. This is certainly not final, but my currently best idea. So any criticism is welcome and improvements can surely be made.
As the discussion in the comments led me to the believe, that the approach imagined in the question is not practical for the problem at hand, I now drastically changed my idea. Instead of trying to read the pattern around each empty intersection after each move, I will now update the surrounding pattern of each empty intersection after each move made.
This can be made in an efficient way, as we can use 2 very important features of our patterns:
1) each larger patterns core (so the pattern reduced to a lower radius) will guaranteed to be in the database
2) most patterns will have a rather low radius, and in most cases, not many positions on the board are changed with each move, resulting in not too many positions needing a recheck of their patterns.
My idea is, to store the currently largest pattern, it's radius and it's evaluation with each empty intersection. Now, while a move is made, I generate a list of all positions changed during that move. (usually one) Once the move is finished, I iterate over all empty positions on the board and look at their distance to the closest change. Now we are having 3 possible cases:
a) the distance is smaller or equal the radius of currently matched pattern. Now we have to recheck the pattern.
b) the distance is one bigger then the radius of the currently matched pattern. Now we have to check, if actually a (r+1) size pattern exists, matching the surrounding. If it does, we have to check r+2 etc, until we found the largest.
c) the distance is even bigger: We can keep everything as it is.
As we are having now basically a tree of patterns, with each pattern having lots of child pattern with an incremented radius, it is actually practical to store the pattern information in a series of bitets, each representing a ring of a certain radius around the center.
I hope that this system maximizes the reusability of all the information at hand and is fast enough for my needs. As mentioned before, I welcome criticism and opinions for improvement and if there is not better solution found, will probably implement it in the near future. Once done, I can probably report back on the results.
I am interested in knowing why triangle law is so important for a better data mining.As far as I know the triangle law helps us to define patterns and form clusters based on the distances between different objects.Does anyone have any other inputs for triangle law?
It is actually not that important. In data mining, we cannot generally assume to have a proper "mathematical" distance function. As soon as we allow duplicates, we already lose one of the key axioms - we can have two different objects with the distance 0. (And in classification, they may even have different classes in the worst case).
However, the triangle inequality can allow us to prune the search space. If we have a distance function that satisfies triangle inequality and use an appropriate index, we can skip a lot of computations, thus making the algorithm faster.
Note that a lot of research and implementations do not so much care about this kind of optimization. Many data miners working with R like building a distance matrix (which is in O(n^2)!) and then try to do as much as possible with matrix operations, because that is simple to program and R is quite fast at this kind of operations (using a highly optimized C code, instead of interpreted R code). But if you need to go beyond this, a key ingredient for performance is to exploit triangle inequality where possible.
In our program we use a genetic algorithm since years to sole problems for n variables, each having a fixed set of m possible values. This typically works well for ~1,000 variables and 10 possibilities.
Now i have a new task where only two possibilities (on/off) exist for each variable, but i'll probably need to solve systems with 10,000 or more variables. The existing GA does work but the solution improves only very slowly.
All the EA i find are designed rather for continuous or integer/float problems. Which one is best suited for binary problems?
Well, the Genetic Algorithm in its canonical form is among the best suited metaheuristics for binary decision problems. The default configuration that I would try is such a genetic algorithm that uses 1-elitism and that is configured with roulette-wheel selection, single point crossover (100% crossover rate) and bit flip mutation (e.g. 5% mutation probability). I would suggest you try this combination with a modest population size (100-200). If this does not work well, I would suggest to increase the population size, but also change the selection scheme to a tournament selection scheme (start with binary tournament selction and increase the tournament group size if you need even more selection pressure). The reason is that with a higher population size, the fitness-proportional selection scheme might not excert the necessary amount of selection pressure to drive the search towards the optimal region.
As an alternative, we have developed an advanced version of the GA and termed it Offspring Selection Genetic Algorithm. You can also consider trying to solve this problem with a trajectory-based algorithm like Tabu Search or Simulated Annealing that just uses mutation to move from one solution to another by just making small changes.
We have a GUI-driven software (HeuristicLab) that allows you to experiment with a number of metaheuristics on several problems. Your problem is unfortunately not included, but it's GPL licensed and you can implement your own problem there (through just the GUI even, there's a howto for that).
Like DonAndre said, canonical GA was pretty much designed for binary problems.
However...
No evolutionary algorithm is in itself a magic bullet (unless it has billions of years runtime). What matters most is your representation, and how that interacts with your mutation and crossover operators: together, these define the 'intelligence' of what is essentially a heuristic search in disguise. The aim is for each operator to have a fair chance of producing offspring with similar fitness to the parents, so if you have domain-specific knowledge that allows you to do better than randomly flipping bits or splicing bitstrings, then use this.
Roulette and tournament selection and elitism are good ideas (maybe preserving more than 1, it's a black art, who can say...). You may also benefit from adaptive mutation. The old rule of thumb is that 1/5 of offspring should be better than the parents - keep track of this quantity and vary the mutation rate appropriately. If offspring are coming out worse then mutate less; if offspring are consistently better then mutate more. But the mutation rate needs an inertia component so it doesn't adapt too rapidly, and as with any metaparameter, setting this is something of a black art. Good luck!
Why not try a linear/integer program?
I'm coding an AI engine for a simple board game. My simple implementation for now is to iterate over all optional board states, weight each one according to the game rules and my simple algorithm, and selecting the best move according to that score.
As the scoring algorithm is totally stateless, I want to save computation time by creating a hash table of some (all?) board configurations and get the score from there instead of calculating it on the fly.
My questions are:
1. Is my approach logical? (and if not, can you give me some tips to better it? :))
2. What is the most suitable thread-safe STL container for my needs? I'm thinking to use the char array (board configuration) as key and the score as value.
3. Can you give some tips for making my AI a killer one? :)
edit: more info:
The board is 10x10 and there are two players, each with 10 pawns. The rules are much like checkers.
Yes it's common to store evaluated boards into a hash table it's called transposition table. A STL container could be std::vector. In general you have to create a hash function (e.g. zobrist hashing). The hash function calculates a hash value of a particulare board. The result of hash_value modulo HASH_TABLE_SIZE would be the index to the std::vector.
A transposition table entry can hold more information than only board-score and best-move, you can also store to which depth the board is evaluated and if the evaluated-score (in case you are doing alpha-beta search) is
exact
a Upper Bound
or Lower Bound
I can recommend the chessprogramming site, where I have learned a lot. Look for the terms alpha-beta, transposition table, zobrist hashing, iterative deepening. There are also good papers for further reading:
TA Marsland - Computer Chess and Search
TA Marsland - A Review of Game Tree Pruning
AL Zobrist - A New Hashing Method with Application for Game Playing
J Schaeffer - The games Computers (and People) Play
Your logical approach is ok, you should read and maybe try to use Minimax algorithm :
http://en.wikipedia.org/wiki/Minimax
I think that except from tic tac toe game the number of states would be much too big, you should work on making the counting fast.
Chess and checkers can be done with this approach, but it's not one I'd recommend.
If you go this route then I would use some form of tree. If you think about it, every move reduces the total possibilities that existed before the move was made. Plus, this allows levels of difficulty. Don't pick the best all the time, sometimes pick second best.
The reason I wouldn't go this route is that it's not generally fun. People pick up on this intuitively and they feel it's unfair. I wrote a connect 4 game that was unbeatable, but was rule based rather than game board state based. It was dull. Every move was met with the same response. I think this is what happens in this approach as well. Also, it depends on why you are doing this. If it's to learn AI, very little AI is done like this. If it's to have a fun game, it usually isn't. If it's for the reasons Deep Blue was made, to stretch the limits of what a computer can do, then sure.
I would either use a piece based individual AI and then select the one with the most compelling argument or I would use a variation of hill climbing and put a kind of strategy height into the board. It depends on how much support pieces give one another. For the individual AI I would use neural nets.
A strategy height system would be good for an FPS where soldiers want to know what path has the most cover. Neural nets give each entity more personality. You can even use cascading neural nets where one is the strategy and the second is the personality.
I'm developing a Mahjong-solitaire solver and so far, I'm doing pretty good. However,
it is not so fast as I would like it to be so I'm asking for any additional optimization
techniques you guys might know of.
All the tiles are known from the layouts, but the solution isn't. At the moment, I have few
rules which guarantee safe removal of certain pairs of same tiles (which cannot be an obstacle to possible solution).
For clarity, a tile is free when it can be picked any time and tile is loose, when it doesn't bound any other tiles at all.
If there's four free free tiles available, remove them immediately.
If there's three tiles that can be picked up and at least one of them is a loose tile, remove the non-loose ones.
If there's three tiles that can be picked up and only one free tile (two looses), remove the free and one random loose.
If there's three loose tiles available, remove two of them (doesn't matter which ones).
Since there is four times the exact same tile, if two of them are left, remove them since they're the only ones left.
My algorithm searches solution in multiple threads recursively. Once a branch is finished (to a position where there is no more moves) and it didn't lead to a solution, it puts the position in a vector containing bad ones. Now, every time a new branch is launched it'll iterate via the bad positions to check, if that particular position has been already checked.
This process continues until solution is found or all possible positions are being checked.
This works nicely on a layout which contains, say, 36 or 72 tiles. But when there's more,
this algorithm becomes pretty much useless due to huge amount of positions to search from.
So, I ask you once more, if any of you have good ideas how to implement more rules for safe tile-removal or any other particular speed-up regarding the algorithm.
Very best regards,
nhaa123
I don't completely understand how your solver works. When you have a choice of moves, how do you decide which possibility to explore first?
If you pick an arbitrary one, it's not good enough - it's just brute search, basically. You might need to explore the "better branches" first. To determine which branches are "better", you need a heuristic function that evaluates a position. Then, you can use one of popular heuristic search algorithms. Check these:
A* search
beam search
(Google is your friend)
Some years ago, I wrote a program that solves Solitaire Mahjongg boards by peeking. I used it to examine one million turtles (took a day or something on half a computer: it had two cores) and it appeared that about 2.96 percent of them cannot be solved.
http://www.math.ru.nl/~debondt/mjsolver.html
Ok, that was not what you asked, but you might have a look at the code to find some pruning heuristics in it that did not cross your mind thus far. The program does not use more than a few megabytes of memory.
Instead of a vector containing the "bad" positions, use a set which has a constant lookup time instead of a linear one.
If 4 Tiles are visible but can not be picked up, the other Tiles around have to be removed first. The Path should use your Rules to remove a minimum of Tiles, towards these Tiles, to open them up.
If Tiles are hidden by other Tiles, the Problem has no full Information to find a Path and a Probability of remaining Tiles needs to be calculated.
Very nice Problem!