Which Evolutionary Algorithm for optimization of binary problems? - c++

In our program we have used a genetic algorithm for years to solve problems with n variables, each having a fixed set of m possible values. This typically works well for ~1,000 variables and 10 possibilities.
Now I have a new task where only two possibilities (on/off) exist for each variable, but I'll probably need to solve systems with 10,000 or more variables. The existing GA does work, but the solution improves only very slowly.
All the EAs I find are designed for continuous or integer/float problems. Which one is best suited for binary problems?

Well, the genetic algorithm in its canonical form is among the best-suited metaheuristics for binary decision problems. The default configuration that I would try is a genetic algorithm that uses 1-elitism and is configured with roulette-wheel selection, single-point crossover (100% crossover rate), and bit-flip mutation (e.g. 5% mutation probability). I would suggest you try this combination with a modest population size (100-200). If this does not work well, I would suggest increasing the population size, but also changing the selection scheme to tournament selection (start with binary tournament selection and increase the tournament group size if you need even more selection pressure). The reason is that with a higher population size, the fitness-proportional selection scheme might not exert the necessary amount of selection pressure to drive the search towards the optimal region.
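To make that concrete, here is a minimal sketch of that configuration in C++. The `evaluate` function is a placeholder for your objective (assumed non-negative, higher = better), and the mutation rate is a parameter you will have to tune:

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Placeholder: plug in your own objective here (non-negative, higher = better).
double evaluate(const std::vector<char>& bits);

std::mt19937 rng{std::random_device{}()};

// Roulette-wheel (fitness-proportional) selection.
size_t rouletteSelect(const std::vector<double>& fitness, double total) {
    std::uniform_real_distribution<double> dist(0.0, total);
    double r = dist(rng), acc = 0.0;
    for (size_t i = 0; i < fitness.size(); ++i) {
        acc += fitness[i];
        if (acc >= r) return i;
    }
    return fitness.size() - 1;
}

// One generation: 1-elitism, roulette-wheel selection, single-point
// crossover (100% rate), per-bit flip mutation.
void gaGeneration(std::vector<std::vector<char>>& pop,
                  std::vector<double>& fitness, double mutationRate) {
    const size_t n = pop[0].size();
    double total = 0.0;
    for (double f : fitness) total += f;

    std::vector<std::vector<char>> next;
    // 1-elitism: carry the best individual over unchanged.
    auto best = std::max_element(fitness.begin(), fitness.end())
                - fitness.begin();
    next.push_back(pop[best]);

    std::uniform_int_distribution<size_t> cut(1, n - 1);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    while (next.size() < pop.size()) {
        const auto& p1 = pop[rouletteSelect(fitness, total)];
        const auto& p2 = pop[rouletteSelect(fitness, total)];
        // Single-point crossover: head of p1, tail of p2.
        std::vector<char> child(p1.begin(), p1.begin() + cut(rng));
        child.insert(child.end(), p2.begin() + child.size(), p2.end());
        // Bit-flip mutation; for 10,000-bit strings a rate near 1/n is
        // usually saner than a flat 5% per bit.
        for (auto& b : child)
            if (coin(rng) < mutationRate) b ^= 1;
        next.push_back(std::move(child));
    }
    pop = std::move(next);
    for (size_t i = 0; i < pop.size(); ++i) fitness[i] = evaluate(pop[i]);
}
```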
As an alternative, we have developed an advanced version of the GA and termed it the Offspring Selection Genetic Algorithm. You can also consider solving this problem with a trajectory-based algorithm like Tabu Search or Simulated Annealing that uses only mutation to move from one solution to another by making small changes.
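For illustration, a minimal simulated annealing sketch that moves by flipping a single bit per step; `evaluate`, the initial temperature, and the cooling factor are assumptions you would tune for your problem:

```cpp
#include <cmath>
#include <random>
#include <vector>

double evaluate(const std::vector<char>& bits);  // higher = better (placeholder)

std::vector<char> simulatedAnnealing(std::vector<char> current, int iterations) {
    std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<size_t> pick(0, current.size() - 1);
    std::uniform_real_distribution<double> coin(0.0, 1.0);

    double f = evaluate(current);
    std::vector<char> bestSoFar = current;
    double fBest = f;
    double temp = 1.0;                 // initial temperature (problem-dependent)
    const double cooling = 0.9995;     // geometric cooling schedule (assumption)

    for (int it = 0; it < iterations; ++it) {
        size_t i = pick(rng);
        current[i] ^= 1;               // move: flip one random bit
        double fNew = evaluate(current);
        // Accept improvements always, worsenings with Boltzmann probability.
        if (fNew >= f || coin(rng) < std::exp((fNew - f) / temp)) {
            f = fNew;
            if (f > fBest) { fBest = f; bestSoFar = current; }
        } else {
            current[i] ^= 1;           // undo the rejected move
        }
        temp *= cooling;
    }
    return bestSoFar;
}
```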
We have GUI-driven software (HeuristicLab) that allows you to experiment with a number of metaheuristics on several problems. Your problem is unfortunately not included, but it's GPL-licensed and you can implement your own problem there (even through just the GUI; there's a howto for that).

Like DonAndre said, canonical GA was pretty much designed for binary problems.
However...
No evolutionary algorithm is in itself a magic bullet (unless it has billions of years runtime). What matters most is your representation, and how that interacts with your mutation and crossover operators: together, these define the 'intelligence' of what is essentially a heuristic search in disguise. The aim is for each operator to have a fair chance of producing offspring with similar fitness to the parents, so if you have domain-specific knowledge that allows you to do better than randomly flipping bits or splicing bitstrings, then use this.
Roulette and tournament selection and elitism are good ideas (maybe preserving more than 1; it's a black art, who can say...). You may also benefit from adaptive mutation. The old rule of thumb is that 1/5 of offspring should be better than the parents: keep track of this quantity and vary the mutation rate appropriately. If offspring are coming out worse, then mutate less; if offspring are consistently better, then mutate more. But the mutation rate needs an inertia component so it doesn't adapt too rapidly, and as with any metaparameter, setting this is something of a black art. Good luck!
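As a sketch of that 1/5 rule with an inertia term (the smoothing factor and rate bounds here are made-up values to tune, not anything canonical):

```cpp
#include <algorithm>

// Adapt the per-bit mutation rate toward the 1/5 success rule.
// successRatio = fraction of offspring that beat their parents this generation.
// 'smoothing' is the inertia: closer to 1.0 adapts more slowly (an assumption).
double adaptMutationRate(double rate, double successRatio,
                         double smoothing = 0.9) {
    double target = (successRatio > 0.2) ? rate * 1.5   // doing well: mutate more
                                         : rate / 1.5;  // doing badly: mutate less
    double next = smoothing * rate + (1.0 - smoothing) * target;
    return std::clamp(next, 1e-5, 0.5);  // keep the rate in a sane range
}
```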

Why not try a linear/integer program?

Related

What is the difference between state evaluation and heuristics in game-AI?

I am trying to implement a minimax algorithm for an AI player within a simple card game. However, from doing research I am confused about the key differences between state evaluation and heuristics.
From what I understand, heuristics are calculated from the information currently available to the player (e.g. in chess, the pieces and their locations). With this information, a heuristic function comes to a conclusion that essentially provides a "rule of thumb".
A state evaluation is the exact value of the current state.
However, I am unsure why both things co-exist, as I cannot see how they are much different from one another. Can someone please elaborate and clear up my confusion? Thanks.
Assuming a zero-sum game, you can implement a state evaluation for end states (game ended with win, draw, or loss from the perspective of player X) which returns 1, 0, or -1. A full tree search will then get you perfect play.
But in practice the tree is huge and can't be searched completely. Therefore you have to stop the search at some point that is not an end state, where there is no determined winner or loser. Now it's hard to mark this state with 1, 0, or -1, as the game might be too complex to easily determine the winner from some state far away from the end state. But you still need to evaluate these positions, and you can use some assumptions about the game, which amounts to heuristic information. One example is material in chess (a queen is more valuable than a pawn). This is heuristic information incorporated into the imperfect evaluation function (an approximation of the real one). The better your assumptions/heuristics, the better the approximation of the real evaluation!
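As a toy illustration of how such assumptions become an approximate evaluation function (the `Board` type and the classic 1/3/3/5/9 piece weights are assumptions for a chess-like game, not anything from your card game):

```cpp
#include <array>

// Hypothetical board: piece counts per side, indexed pawn..queen.
struct Board {
    std::array<int, 5> mine{};    // [pawn, knight, bishop, rook, queen]
    std::array<int, 5> theirs{};
};

// Heuristic evaluation: material balance with the classic 1/3/3/5/9
// piece values (an assumption; real engines use far richer features).
int evaluate(const Board& b) {
    static const std::array<int, 5> value{1, 3, 3, 5, 9};
    int score = 0;
    for (size_t i = 0; i < value.size(); ++i)
        score += value[i] * (b.mine[i] - b.theirs[i]);
    return score;  // positive favors "mine", 0 is roughly balanced
}
```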
But there are other places where heuristic information can be incorporated. One very important area is controlling the tree search: which move will be evaluated first, and which last. Selecting good moves first allows algorithms like alpha-beta to prune huge parts of the tree. But of course you need to make some assumptions / use heuristic information to order your moves (e.g. a queen move is more powerful than a pawn move; this is a made-up example, I'm not sure about the effect of this particular heuristic in chess AIs).
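A sketch of that idea: score moves with a cheap heuristic and visit the most promising first, so alpha-beta can cut off earlier. `Move` and `heuristicScore` are placeholders:

```cpp
#include <algorithm>
#include <vector>

struct Move { /* from-square, to-square, ... (placeholder) */ };

// Cheap guess at move quality, e.g. "captures before quiet moves" (placeholder).
int heuristicScore(const Move& m);

// Order moves best-guess-first before the alpha-beta loop; good ordering
// lets alpha-beta cut off large subtrees early.
void orderMoves(std::vector<Move>& moves) {
    std::sort(moves.begin(), moves.end(),
              [](const Move& a, const Move& b) {
                  return heuristicScore(a) > heuristicScore(b);
              });
}
```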

Is it necessary with binary encoding in genetic algorithms?

I'm doing a project exploring the use of genetic algorithms in architecture, where we use an evolutionary approach for creating Voronoi tessellations in 3D. This is done using ofxVoro++ for openFrameworks (C++).
Our chromosome (genome) is a vector (list) of points in 3D. We have implemented single- and two-point crossover and a mutation which randomizes these points with a certain probability. In most examples I've seen, the genome is encoded in binary, which I presume would cause mutation and crossover to act differently.
So my question is this: are there any other benefits to binary encoding (except speed), and how would you handle such an encoding/decoding in C++, i.e. going from binary to a list of 3D points?
Best regards,
Fred
I have used different GAs in logistics and finance problems. Very often I do not use a binary representation.
The first example that I can give you is the TSP problem:
https://en.wikipedia.org/wiki/Travelling_salesman_problem
Here I used the standard representation: the chromosome is an array of integers, where each value represents a city.
So it depends on the type of problem that you are trying to solve; if you can find a way to implement the GA without a binary representation, you do not need any adjustment.
Furthermore, I prefer the natural representation because it makes it simpler, while debugging the code, to understand whether your GA is working as you want.
You can also use real encoding, but in this case it is important which crossover and mutation operators you use. If your crossover is simply (p1+p2)/2 or p1*a + p2*(1-a), you will not get good results.
A good crossover operator for real encoding was proposed by K. Deb in 1995. Here is the paper: http://www.complex-systems.com/pdf/09-2-2.pdf
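That operator is the simulated binary crossover (SBX). Here is a per-gene sketch of it as usually formulated; the distribution index `eta` is a tuning parameter (larger values keep children closer to the parents), and for simplicity this applies SBX to every gene rather than with some per-gene probability:

```cpp
#include <cmath>
#include <random>
#include <tuple>
#include <utility>
#include <vector>

// Simulated Binary Crossover (SBX) for one real-valued gene, following
// Deb & Agrawal (1995). Typical eta values are roughly 2..20.
std::pair<double, double> sbx(double p1, double p2, double eta,
                              std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    double u = uni(rng);
    // Spread factor beta, drawn so children mimic single-point binary
    // crossover statistics on a continuous domain.
    double beta = (u <= 0.5)
        ? std::pow(2.0 * u, 1.0 / (eta + 1.0))
        : std::pow(1.0 / (2.0 * (1.0 - u)), 1.0 / (eta + 1.0));
    double c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2);
    double c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2);
    return {c1, c2};
}

// Apply SBX gene-by-gene to two parent chromosomes of equal length.
void sbxCrossover(std::vector<double>& a, std::vector<double>& b,
                  double eta, std::mt19937& rng) {
    for (size_t i = 0; i < a.size(); ++i)
        std::tie(a[i], b[i]) = sbx(a[i], b[i], eta, rng);
}
```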
Crossover and mutation are different operators. Crossover recombines existing genetic material; mutation introduces new genetic material into the population. Without knowing much more about your algorithm, randomizing points sounds like mutation. Mutation is typically performed a very low percentage of the time (maybe 1%), whereas the crossover rate can be rather high (50%).
So for your algorithm, I would not "modify" anything for crossover. Instead, for crossover, I would try to reposition material or simply take different portions of points from the parents.
For mutation, it might make sense to add or subtract a small number from the points, thus modifying the points (mutation).
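For example, a hypothetical mutation for your vector of 3D points that nudges a point by a small Gaussian step instead of re-randomizing it entirely; the 1% rate and `sigma` are assumptions to tune to the scale of your Voronoi cells:

```cpp
#include <random>
#include <vector>

struct Point3 { double x, y, z; };

// Mutation for a chromosome of 3D points: with low probability, nudge a
// point by a small Gaussian step rather than replacing it with a random one.
void mutate(std::vector<Point3>& genome, std::mt19937& rng,
            double rate = 0.01, double sigma = 0.1) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::normal_distribution<double> step(0.0, sigma);
    for (auto& p : genome) {
        if (coin(rng) < rate) {
            p.x += step(rng);
            p.y += step(rng);
            p.z += step(rng);
        }
    }
}
```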
It is difficult to make suggestions without knowing more about your algorithm and chromosome representation.

Evolutionary Algorithm without an objective function

I'm currently trying to find good parameters for my program (about 16 parameters, and one execution of the program takes about a minute). Evolutionary algorithms seemed like a nice idea and I wanted to see how they perform.
Unfortunately I don't have a good fitness function, because the variance of my objective function is very high (I cannot run it often enough without waiting until 2016). I can, however, compute which set of parameters is better (by testing two configurations against each other). Do you know if there are evolutionary algorithms that use only that information? Are there other optimization techniques more suitable? For this project I'm using C++ and MATLAB.
// Update: Thank you very much for the answers. Both look promising but I will need a few days to evaluate them. Sorry for the delay.
If your pairwise test gives a proper total ordering, i.e. a >= b and b >= c implies a >= c (and some other conditions hold), then maybe you can construct a ranking objective on the fly and use CMA-ES to optimize it. CMA-ES is an evolutionary algorithm that is invariant to order-preserving transformations of the function value and to angle-preserving transformations of the inputs. Furthermore, because it is a second-order method, its convergence is very fast compared to other derivative-free search heuristics, especially in higher-dimensional problems where random-search-like genetic algorithms take forever.
If you can compare solutions in a pairwise fashion, then some sort of tournament selection approach might be good. The Wikipedia article describes using it for a genetic algorithm, but it is easily applied to an evolutionary algorithm. What you do is repeatedly select a small set of solutions from the population and hold a tournament among them. For simplicity the tournament size could be a power of 2: if it were 8, pair those 8 up at random and compare them, selecting 4 winners; pair those up and select 2 winners; in a final round, select an overall tournament winner. This solution can then be mutated one or more times to provide member(s) for the next generation.
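A sketch of that bracket, where `beats(a, b)` stands in for your pairwise test and the tournament size is assumed to be a power of two no larger than the population:

```cpp
#include <algorithm>
#include <random>
#include <vector>

using Individual = std::vector<double>;  // e.g. your 16 parameters

// Your pairwise test: true if 'a' wins the head-to-head comparison (placeholder).
bool beats(const Individual& a, const Individual& b);

// Single-elimination tournament over a random subset of the population,
// using only pairwise comparisons (no numeric fitness needed).
Individual tournamentWinner(std::vector<Individual> pop, size_t size,
                            std::mt19937& rng) {
    std::shuffle(pop.begin(), pop.end(), rng);
    std::vector<Individual> round(pop.begin(), pop.begin() + size);  // e.g. size = 8
    while (round.size() > 1) {
        std::vector<Individual> winners;
        for (size_t i = 0; i + 1 < round.size(); i += 2)
            winners.push_back(beats(round[i], round[i + 1]) ? round[i]
                                                            : round[i + 1]);
        round = std::move(winners);
    }
    return round.front();  // overall winner; mutate it for the next generation
}
```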

How many samples are optimal in one class using k-nearest neighbor?

I have implemented the k-nearest neighbor algorithm in my system. It consists of 26 classes, each with 100 samples. In my case K=7, and it was completely trial and error to get the best classification result.
I know that K should be chosen wisely to reduce noise in the classification. But what about the number of samples? Is there any general rule such as "the more samples, the better the result"? Does it depend on something?
Thank you for all your responses.
You could try considering whatever underlying mechanism is generating your data, or whatever background knowledge you have on the problem, which might give you an idea of the relative size of the noise and the true underlying variation. E.g. when predicting favourite sports team from location I would expect more variation than when predicting favourite sport, so I would use a smaller k. However, I don't know of much general guidance, except to use cross-validation.
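A minimal sketch of choosing k by leave-one-out cross-validation; `knnPredict`, the `Sample` type, and the candidate list of k values are placeholders for your own implementation:

```cpp
#include <vector>

struct Sample { std::vector<double> features; int label; };

// Your classifier: predict the label of 'query' using the k nearest
// neighbors in 'train' (placeholder).
int knnPredict(const std::vector<Sample>& train, const Sample& query, int k);

// Leave-one-out cross-validation: try each candidate k and keep the one
// with the most correct predictions. The candidate list is an assumption.
int chooseK(const std::vector<Sample>& data) {
    int bestK = 1, bestCorrect = -1;
    for (int k : {1, 3, 5, 7, 9, 11, 15}) {
        int correct = 0;
        for (size_t i = 0; i < data.size(); ++i) {
            std::vector<Sample> train = data;
            train.erase(train.begin() + i);  // leave sample i out
            if (knnPredict(train, data[i], k) == data[i].label) ++correct;
        }
        if (correct > bestCorrect) { bestCorrect = correct; bestK = k; }
    }
    return bestK;
}
```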

Board game AI design: choosing STL data container

I'm coding an AI engine for a simple board game. My simple implementation for now is to iterate over all possible board states, weight each one according to the game rules and my simple algorithm, and select the best move according to that score.
As the scoring algorithm is totally stateless, I want to save computation time by creating a hash table of some (all?) board configurations and getting the score from there instead of calculating it on the fly.
My questions are:
1. Is my approach logical? (and if not, can you give me some tips to improve it? :))
2. What is the most suitable thread-safe STL container for my needs? I'm thinking of using the char array (board configuration) as the key and the score as the value.
3. Can you give some tips for making my AI a killer one? :)
edit: more info:
The board is 10x10 and there are two players, each with 10 pawns. The rules are much like checkers.
Yes, it's common to store evaluated boards in a hash table; this is called a transposition table. An STL container for it could be std::vector. In general you have to create a hash function (e.g. Zobrist hashing). The hash function calculates a hash value of a particular board, and hash_value modulo HASH_TABLE_SIZE is then the index into the std::vector.
A transposition table entry can hold more information than just the board score and best move; you can also store the depth to which the board was evaluated and whether the evaluated score (in case you are doing alpha-beta search) is
exact,
an upper bound,
or a lower bound (see the sketch below).
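A minimal sketch of Zobrist hashing feeding a std::vector-backed transposition table; the board encoding, table size, and number of piece kinds are assumptions for your 10x10 pawn game:

```cpp
#include <cstdint>
#include <random>
#include <vector>

constexpr int CELLS = 100;            // 10x10 board
constexpr int PIECE_KINDS = 2;        // one pawn type per player (assumption)
constexpr size_t TABLE_SIZE = 1 << 20;

enum Bound { EXACT, LOWER_BOUND, UPPER_BOUND };

struct Entry {
    uint64_t key = 0;    // full hash, to detect index collisions
    int score = 0;
    int depth = -1;      // search depth this score is valid for
    Bound bound = EXACT;
};

// One random 64-bit number per (cell, piece kind), fixed at startup.
std::vector<std::vector<uint64_t>> initZobrist() {
    std::mt19937_64 rng{12345};
    std::vector<std::vector<uint64_t>> z(CELLS,
                                         std::vector<uint64_t>(PIECE_KINDS));
    for (auto& cell : z)
        for (auto& v : cell) v = rng();
    return z;
}

// Hash = XOR of the numbers for every occupied (cell, piece) pair.
// board[c] is -1 for empty, otherwise the piece kind on cell c (assumption).
uint64_t zobristHash(const std::vector<int>& board,
                     const std::vector<std::vector<uint64_t>>& z) {
    uint64_t h = 0;
    for (int c = 0; c < CELLS; ++c)
        if (board[c] >= 0) h ^= z[c][board[c]];
    return h;
}

// The transposition table itself: hash modulo size indexes the vector.
std::vector<Entry> table(TABLE_SIZE);

Entry& lookup(uint64_t hash) { return table[hash % TABLE_SIZE]; }
```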
I can recommend the chessprogramming site, where I have learned a lot. Look for the terms alpha-beta, transposition table, Zobrist hashing, and iterative deepening. There are also good papers for further reading:
T. A. Marsland - Computer Chess and Search
T. A. Marsland - A Review of Game-Tree Pruning
A. L. Zobrist - A New Hashing Method with Application for Game Playing
J. Schaeffer - The Games Computers (and People) Play
Your approach is logical and OK; you should read about and maybe try to use the minimax algorithm:
http://en.wikipedia.org/wiki/Minimax
I think that, except for a game like tic-tac-toe, the number of states would be much too big, so you should work on making the scoring fast.
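For reference, a bare-bones minimax sketch with a fixed depth cutoff; `GameState`, `successors`, and `evaluate` are placeholders for your board representation, move generation, and scoring:

```cpp
#include <algorithm>
#include <climits>
#include <vector>

struct GameState { /* board, side to move, ... (placeholder) */ };

std::vector<GameState> successors(const GameState&);  // legal moves (placeholder)
int evaluate(const GameState&);                       // heuristic score (placeholder)

// Plain minimax with a fixed depth cutoff: the maximizing player picks
// the child with the highest score, the opponent the lowest.
int minimax(const GameState& s, int depth, bool maximizing) {
    std::vector<GameState> moves = successors(s);
    if (depth == 0 || moves.empty())
        return evaluate(s);  // leaf: fall back to the heuristic evaluation
    int best = maximizing ? INT_MIN : INT_MAX;
    for (const GameState& next : moves) {
        int v = minimax(next, depth - 1, !maximizing);
        best = maximizing ? std::max(best, v) : std::min(best, v);
    }
    return best;
}
```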
Chess and checkers can be done with this approach, but it's not one I'd recommend.
If you go this route, then I would use some form of tree. If you think about it, every move reduces the total possibilities that existed before the move was made. Plus, this allows levels of difficulty: don't pick the best move all the time; sometimes pick the second best.
The reason I wouldn't go this route is that it's not generally fun. People pick up on this intuitively and feel it's unfair. I wrote a Connect 4 game that was unbeatable, but it was rule-based rather than based on game board state. It was dull: every move was met with the same response. I think this is what happens with this approach as well. Also, it depends on why you are doing this. If it's to learn AI, very little AI is done like this. If it's to have a fun game, this usually isn't how to get one. If it's for the reasons Deep Blue was made, to stretch the limits of what a computer can do, then sure.
I would either use a piece-based individual AI and then select the piece with the most compelling argument, or I would use a variation of hill climbing and put a kind of strategy height onto the board. It depends on how much support pieces give one another. For the individual AI I would use neural nets.
A strategy-height system would be good for an FPS where soldiers want to know which path has the most cover. Neural nets give each entity more personality. You can even use cascading neural nets, where one is the strategy and the second is the personality.