Is binary encoding necessary in genetic algorithms? - c++

I'm doing a project exploring the use of genetic algorithms in architecture, where we use an evolutionary approach for creating Voronoi tessellations in 3D. This is done using ofxVoro++ for openFrameworks (C++).
Our chromosome (genome) is a vector (list) of points in 3D. We have implemented single- and two-point crossover and a mutation which randomises these points with a certain probability. In most examples I've seen, the genome is encoded in binary, which I presume would cause mutation and crossover to act differently.
So my question is this: are there any other benefits to binary encoding (besides speed), and how would you handle such encoding/decoding in C++, i.e. going from binary to a list of 3D points?
Best regards,
Fred

I have used different GAs in logistics and finance problems, and very often I do not use a binary representation.
The first example I can give you is the TSP problem:
https://en.wikipedia.org/wiki/Travelling_salesman_problem
Here I used the standard representation: the chromosome is an array of integers, where each value represents a city.
So it depends on the type of problem you are trying to solve; if you can find a way to implement the GA without a binary representation, you do not need any adjustment.
Furthermore, I prefer the natural representation because, while debugging the code, it is simpler to see whether your GA is working as you want.
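For illustration, a minimal sketch of this representation in C++ (my code, not part of the original answer); the swap mutation keeps the tour a valid permutation:

    // Integer-array chromosome for TSP, with a permutation-safe swap mutation.
    #include <algorithm>
    #include <numeric>
    #include <random>
    #include <vector>

    using Chromosome = std::vector<int>;  // chromosome[i] = index of the i-th city visited

    Chromosome randomTour(int nCities, std::mt19937& rng) {
        Chromosome tour(nCities);
        std::iota(tour.begin(), tour.end(), 0);   // 0, 1, ..., nCities-1
        std::shuffle(tour.begin(), tour.end(), rng);
        return tour;
    }

    void swapMutation(Chromosome& tour, std::mt19937& rng) {
        std::uniform_int_distribution<int> pick(0, static_cast<int>(tour.size()) - 1);
        std::swap(tour[pick(rng)], tour[pick(rng)]);  // swapping two cities keeps the tour valid
    }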

You can also use real encoding, but in that case the choice of crossover and mutation operators is important. If your crossover is simply (p1+p2) / 2 or p1*a + p2*(1-a), you will not get good results.
A good crossover operator for real encoding, simulated binary crossover (SBX), was proposed by K. Deb in 1995. Here is the paper: http://www.complex-systems.com/pdf/09-2-2.pdf
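A single-gene sketch of SBX might look like this (my condensed version of the operator in the paper; eta is the distribution index, and larger values keep children closer to their parents):

    // Simulated binary crossover (SBX) for one real-valued gene.
    #include <cmath>
    #include <random>
    #include <utility>

    std::pair<double, double> sbx(double p1, double p2, double eta, std::mt19937& rng) {
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        const double u = uniform(rng);
        // Spread factor beta: its distribution mimics single-point crossover on bitstrings.
        const double beta = (u <= 0.5)
            ? std::pow(2.0 * u, 1.0 / (eta + 1.0))
            : std::pow(1.0 / (2.0 * (1.0 - u)), 1.0 / (eta + 1.0));
        const double c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2);
        const double c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2);
        return {c1, c2};
    }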

Crossover and mutation are different operators. Crossover recombines existing genetic material; mutation introduces new genetic material into the population. Without knowing much more about your algorithm, randomizing points sounds like mutation. Mutation is typically applied a very low percentage of the time (maybe 1%), whereas the crossover rate can be rather high (50%).
So for your algorithm, I would not "modify" anything during crossover. Instead, for crossover, I would try to recombine material, or simply take different portions of points from each parent.
For mutation, it might make sense to add or subtract a small number to a point's coordinates, thus modifying the point (mutation).
It is difficult to make suggestions without knowing more about your algorithm and chromosome representation.
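To make this concrete, here is a hypothetical sketch along those lines; Point3 and both operators are illustrative stand-ins, not the poster's actual types or code, and the crossover assumes both parents hold the same number of points (at least two):

    #include <random>
    #include <vector>

    struct Point3 { double x, y, z; };  // stand-in for the project's real 3D point type
    using Genome = std::vector<Point3>;

    // Single-point crossover on the point list: the child takes a prefix from one
    // parent and the remainder from the other; the points themselves are untouched.
    Genome crossover(const Genome& a, const Genome& b, std::mt19937& rng) {
        std::uniform_int_distribution<std::size_t> cut(1, a.size() - 1);
        const std::size_t k = cut(rng);
        Genome child(a.begin(), a.begin() + k);
        child.insert(child.end(), b.begin() + k, b.end());
        return child;
    }

    // Mutation: with low probability, perturb a point by small Gaussian noise.
    void mutate(Genome& g, double rate, double sigma, std::mt19937& rng) {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        std::normal_distribution<double> noise(0.0, sigma);
        for (Point3& p : g) {
            if (coin(rng) < rate) {
                p.x += noise(rng);
                p.y += noise(rng);
                p.z += noise(rng);
            }
        }
    }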

Related

Inference in a Bayesian Network

I need to perform some inferences on a Bayesian network, such as the example I have created below.
I was looking at solving an inference such as P(F | A = True, B = True). My initial approach was to do something like:
For every possible output of F
    For every state of each observed variable (A, B)
        For every unobserved variable (C, D, E, G)
            // Calculate probability
But I don't think this will work, because we actually need to go over many variables at once, not one at a time.
I have heard about Pearl's algorithm for message passing but have yet to find a reasonable description that isn't extremely dense. For added information: these Bayesian networks are constrained to have no more than 15-20 nodes, we have all the conditional probability tables, and the code doesn't really have to be fast or efficient.
Basically I am looking for a way to do this, not necessarily the BEST way to do this.
Your Bayesian network (BN) does not seem to be particularly complex. I think you should easily get away with using an exact inference method, such as the junction tree algorithm. Of course, you can still just do brute-force enumeration, but that would be a waste of CPU resources given that there are so many nice libraries out there that implement smarter ways of doing both exact and approximate inference in graphical models.
Since your tag mentions C++, my recommendation would be libDAI. It is a well-written library that implements multiple exact and approximate inference algorithms on generic factor graphs. It does not have any weird dependencies and is very easy to integrate into your project. It is particularly well suited for discrete cases such as yours, for which you have the probability tables.
Now, you noticed that I mentioned factor graphs. If you are not familiar with the concept, I will refer you to the Wikipedia article on factor graphs as well as What are "Factor Graphs" and what are they useful for?. The principle is very simple: you represent your BN as a factor graph, and then libDAI will do the inference for you.
EDIT:
Since CPU resources do not seem to be a problem for you and simplicity is the key, you can always go with brute force enumeration. The idea is straightforward.
Your Bayesian Network represents a joint probability distribution, which you can write down in terms of an equation, e.g.
P(A,B,C) = P(A|B,C) * P(B|C) * P(C)
Assuming that you have tables for all your conditional probability distributions, i.e. P(A|B,C), P(B|C) and P(C), then you can simply go over all the possible values of the variables A, B, and C and calculate the output.
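For the three-variable example this might look like the following sketch (my code; the table values are made up for illustration). For your real network, the inner sum simply runs over all the unobserved variables (C, D, E, G) the same way:

    #include <cstdio>

    // Hypothetical CPTs for binary A, B, C; replace with your own tables.
    double pC[2]              = {0.7, 0.3};
    double pBgivenC[2][2]     = {{0.9, 0.4}, {0.1, 0.6}};            // pBgivenC[b][c]
    double pAgivenBC[2][2][2] = {{{0.8, 0.5}, {0.3, 0.2}},
                                 {{0.2, 0.5}, {0.7, 0.8}}};          // pAgivenBC[a][b][c]

    // Joint distribution from the factorization P(A,B,C) = P(A|B,C) * P(B|C) * P(C).
    double joint(int a, int b, int c) {
        return pAgivenBC[a][b][c] * pBgivenC[b][c] * pC[c];
    }

    // P(A = a | B = bObs): fix the observed value, sum the joint over the
    // unobserved variable C, then normalize over all values of the query A.
    double conditional(int a, int bObs) {
        double numer = 0.0, denom = 0.0;
        for (int aa = 0; aa < 2; ++aa)
            for (int c = 0; c < 2; ++c) {
                const double p = joint(aa, bObs, c);
                denom += p;
                if (aa == a) numer += p;
            }
        return numer / denom;
    }

    int main() {
        std::printf("P(A=1 | B=1) = %f\n", conditional(1, 1));
    }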

CBIR with SIFT-like features, discrete vs. continuous approach

Currently I'm implementing a CBIR system for object recognition (more specifically, object classification), and now that I have some working feature detectors and descriptors, I'm trying to find the best way of handling these features for the task of content-based image retrieval.
As far as I know, there are two main trends for this task: the discrete and the continuous approach. Discrete stands for methods like bag-of-visual-words and codebooks, which build inverted indices so that text-retrieval methods can be applied; continuous stands for methods like Best Bin First search with k-d trees and nearest-neighbor classification.
So one main difference between the two approaches is that one works with an extra representation for the features, like visual words, while the other works directly with the n-D features calculated from the descriptor.
My question is: is there any comparison between the two methods for CBIR which could help me find the best approach for my task?
The full answer to this question would be quite complex and long.
But generally, a continuous method can give you more accurate results; however, it is slower, since you cannot build an effective search index and you need to work with large descriptors.
You should consider a combination that uses discrete features (visual words) for initial results, and afterwards filters the result set using continuous methods.

Which Evolutionary Algorithm for optimization of binary problems?

In our program we have used a genetic algorithm for years to solve problems with n variables, each having a fixed set of m possible values. This typically works well for ~1,000 variables and 10 possibilities.
Now I have a new task where only two possibilities (on/off) exist for each variable, but I'll probably need to solve systems with 10,000 or more variables. The existing GA does work, but the solution improves only very slowly.
All the EAs I find are designed rather for continuous or integer/float problems. Which one is best suited for binary problems?
Well, the genetic algorithm in its canonical form is among the metaheuristics best suited to binary decision problems. The default configuration I would try is a genetic algorithm that uses 1-elitism and is configured with roulette-wheel selection, single-point crossover (100% crossover rate) and bit-flip mutation (e.g. 5% mutation probability). I would suggest trying this combination with a modest population size (100-200). If this does not work well, increase the population size, but also change the selection scheme to tournament selection (start with binary tournament selection and increase the tournament group size if you need even more selection pressure). The reason is that with a higher population size, the fitness-proportional selection scheme might not exert the necessary amount of selection pressure to drive the search towards the optimal region.
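As a sketch of two of the pieces named above (my code, not HeuristicLab's), roulette-wheel selection and bit-flip mutation over a plain bitstring:

    #include <random>
    #include <vector>

    using Bits = std::vector<char>;  // 0/1 genes; plain char avoids std::vector<bool> quirks

    // Roulette wheel: individual i is drawn with probability fitness[i] / total.
    // (Assumes non-negative fitness values.)
    std::size_t rouletteSelect(const std::vector<double>& fitness, std::mt19937& rng) {
        std::discrete_distribution<std::size_t> wheel(fitness.begin(), fitness.end());
        return wheel(rng);
    }

    // Bit-flip mutation: each gene flips independently with the given probability
    // (e.g. 0.05 for the 5% suggested above).
    void bitFlipMutation(Bits& b, double rate, std::mt19937& rng) {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        for (char& gene : b)
            if (coin(rng) < rate) gene ^= 1;
    }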
As an alternative, we have developed an advanced version of the GA and termed it the Offspring Selection Genetic Algorithm. You can also consider solving this problem with a trajectory-based algorithm like Tabu Search or Simulated Annealing, which uses just mutation to move from one solution to another by making small changes.
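For illustration, a minimal simulated-annealing loop over a bitstring might look like this sketch (mine, with a hypothetical toy objective that just counts zero bits):

    #include <cmath>
    #include <random>
    #include <vector>

    // Toy objective to minimize (hypothetical): number of zero bits, so the optimum is all ones.
    double energy(const std::vector<char>& s) {
        double z = 0;
        for (char g : s) if (!g) z += 1;
        return z;
    }

    std::vector<char> anneal(std::vector<char> s, double T, double cooling,
                             int steps, std::mt19937& rng) {
        std::uniform_int_distribution<std::size_t> pick(0, s.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        double e = energy(s);
        for (int i = 0; i < steps; ++i) {
            const std::size_t j = pick(rng);
            s[j] ^= 1;                        // small change: flip one bit
            const double eNew = energy(s);
            if (eNew <= e || coin(rng) < std::exp((e - eNew) / T))
                e = eNew;                     // accept the move (always, if better)
            else
                s[j] ^= 1;                    // reject: flip the bit back
            T *= cooling;                     // geometric cooling schedule
        }
        return s;
    }

    int main() {
        std::mt19937 rng(42);
        std::vector<char> s(100, 0);
        s = anneal(s, /*T=*/1.0, /*cooling=*/0.999, /*steps=*/5000, rng);
    }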
We have GUI-driven software (HeuristicLab) that allows you to experiment with a number of metaheuristics on several problems. Your problem is unfortunately not included, but it's GPL licensed and you can implement your own problem there (even through just the GUI; there's a howto for that).
Like DonAndre said, canonical GA was pretty much designed for binary problems.
However...
No evolutionary algorithm is in itself a magic bullet (unless it has billions of years of runtime). What matters most is your representation and how it interacts with your mutation and crossover operators: together, these define the 'intelligence' of what is essentially a heuristic search in disguise. The aim is for each operator to have a fair chance of producing offspring with fitness similar to the parents', so if you have domain-specific knowledge that allows you to do better than randomly flipping bits or splicing bitstrings, use it.
Roulette and tournament selection and elitism are good ideas (maybe preserving more than one; it's a black art, who can say...). You may also benefit from adaptive mutation. The old rule of thumb is that 1/5 of offspring should be better than the parents: keep track of this quantity and vary the mutation rate accordingly. If offspring are coming out worse, mutate less; if offspring are consistently better, mutate more. But the mutation rate needs an inertia component so it doesn't adapt too rapidly, and as with any metaparameter, setting it is something of a black art. Good luck!
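A sketch of what such an adaptive scheme could look like; the target of 0.2, the 5% step and the 0.9 inertia factor are illustrative guesses rather than tuned values:

    // Adaptive mutation via the 1/5 success rule: track the fraction of offspring
    // that beat their parents and nudge the mutation rate toward the point where
    // about one in five do. The inertia factor keeps it from adapting too fast.
    struct AdaptiveMutation {
        double rate = 0.05;        // current per-gene mutation probability
        double inertia = 0.9;      // smoothing of the success estimate (illustrative)
        double successRatio = 0.2; // running estimate, starts at the target

        void update(bool offspringBeatParent) {
            successRatio = inertia * successRatio
                         + (1.0 - inertia) * (offspringBeatParent ? 1.0 : 0.0);
            if (successRatio > 0.2) rate *= 1.05;  // doing well: explore more
            else                    rate *= 0.95;  // doing badly: mutate less
        }
    };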
Why not try a linear/integer program?

c++ numerical analysis: accurate data type?

Using the double type, I implemented a cubic spline interpolation algorithm.
The work seemed successful, but there was a relative error of around 6% when very small values were calculated.
Is the double data type accurate enough for scientific numerical analysis?
Double has plenty of precision for most applications. Of course it is finite, but it's always possible to squander any amount of precision by using a bad algorithm. In fact, that should be your first suspect. Look hard at your code and see if you're doing something that lets rounding errors accumulate quicker than necessary, or risky things like subtracting values that are very close to each other.
Scientific numerical analysis is difficult to get right, which is why I leave it to the professionals. Have you considered using a numeric library instead of writing your own? Eigen is my current favorite here: http://eigen.tuxfamily.org/index.php?title=Main_Page
I always have close at hand the latest copy of Numerical Recipes (nr.com) which does have an excellent chapter on interpolation. NR has a restrictive license but the writers know what they are doing and provide a succinct writeup on each numerical technique. Other libraries to look at include: ATLAS and GNU Scientific Library.
To answer your question: double should be more than enough for most scientific applications. I agree with the previous posters that it sounds like an algorithm problem. Have you considered posting the code for the algorithm you are using?
Whether double is enough for your needs depends on the type of numbers you are working with. As Henning suggests, it is probably best to take a look at the algorithms you are using and make sure they are numerically stable.
For starters, here's a good algorithm for addition: Kahan summation algorithm.
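For reference, the standard formulation fits in a few lines of C++ (my sketch, not code from this thread):

    // Kahan (compensated) summation: carries a running correction term so that
    // the low-order bits lost in each addition are fed back into the next one.
    #include <vector>

    double kahanSum(const std::vector<double>& values) {
        double sum = 0.0;
        double c = 0.0;                 // running compensation for lost low-order bits
        for (double v : values) {
            const double y = v - c;     // subtract the previous rounding error
            const double t = sum + y;   // low bits of y are lost in this addition...
            c = (t - sum) - y;          // ...and recovered algebraically here
            sum = t;
        }
        return sum;
    }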
Double precision will be suitable for most problems, but a cubic spline will not work well if the polynomial or function oscillates quickly, repeats, or is of quite high degree.
In this case it can be better to use Legendre polynomials, since they handle variants of exponentials.
By way of a simple example: if you use the Euler, trapezoidal or Simpson's rule to integrate a 3rd-order polynomial, you won't need a huge sample rate to get the area under the curve. However, if you apply these to an exponential function, the sample rate may need to increase greatly to avoid losing a lot of precision. Legendre polynomials can cater for this case much more readily.
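To make that concrete, here is a composite Simpson's rule sketch (my illustration, not the poster's code); integrating x^3 takes only a couple of subintervals for an exact answer, while a fast-growing exponential over the same interval needs a far finer subdivision for the same accuracy:

    #include <functional>

    // Composite Simpson's rule over [a, b] with n (even) subintervals.
    double simpson(const std::function<double(double)>& f,
                   double a, double b, int n) {
        const double h = (b - a) / n;
        double sum = f(a) + f(b);                       // endpoints get weight 1
        for (int i = 1; i < n; ++i)
            sum += f(a + i * h) * ((i % 2) ? 4.0 : 2.0); // interior weights 4, 2, 4, ...
        return sum * h / 3.0;
    }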

Least Squares Regression in C/C++

How would one go about implementing least squares regression for factor analysis in C/C++?
The gold standard for this is LAPACK. You want, in particular, xGELS.
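A hedged sketch of what that could look like through the LAPACKE C interface, where dgels is the double-precision instance of xGELS; the 4-point line-fit data is made up, and you should check the call against your LAPACKE version:

    #include <lapacke.h>
    #include <cstdio>
    #include <vector>

    int main() {
        // Overdetermined 4x2 system min ||Ax - b||_2, stored row-major.
        const lapack_int m = 4, n = 2, nrhs = 1;
        std::vector<double> A = {1, 1,
                                 1, 2,
                                 1, 3,
                                 1, 4};        // columns: intercept, slope
        std::vector<double> b = {6, 5, 7, 10}; // observations

        // On success, the first n entries of b hold the least-squares solution.
        lapack_int info = LAPACKE_dgels(LAPACK_ROW_MAJOR, 'N', m, n, nrhs,
                                        A.data(), n, b.data(), nrhs);
        if (info == 0)
            std::printf("intercept = %f, slope = %f\n", b[0], b[1]);
        return info;
    }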
When I've had to deal with large datasets and large parameter sets for non-linear parameter fitting I used a combination of RANSAC and Levenberg-Marquardt. I'm talking thousands of parameters with tens of thousands of data-points.
RANSAC is a robust algorithm for minimizing noise due to outliers by using a reduced data set. It's not strictly least squares, but it can be applied to many fitting methods.
Levenberg-Marquardt is an efficient way to solve non-linear least-squares numerically.
The convergence rate in most cases is between that of steepest-descent and Newton's method, without requiring the calculation of second derivatives. I've found it to be faster than Conjugate gradient in the cases I've examined.
The way I did this was to set up RANSAC as an outer loop around the LM method. This is very robust but slow. If you don't need the additional robustness you can just use LM.
Get ROOT and use TGraph::Fit() (or TGraphErrors::Fit())?
It's a big, heavy piece of software to install just for the fitter, though. Works for me because I already have it installed.
Or use GSL.
If you want to implement an optimization algorithm yourself, Levenberg-Marquardt seems quite difficult to implement. If really fast convergence is not needed, take a look at the Nelder-Mead simplex optimization algorithm. It can be implemented from scratch in a few hours.
http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
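To give an idea of the size, here is a compressed 2-D sketch (my code, using the standard reflect/expand/contract/shrink coefficients 1, 2, 0.5, 0.5). It only performs an inside contraction; a full implementation also distinguishes an outside contraction step:

    #include <algorithm>
    #include <array>
    #include <functional>

    using Vec2 = std::array<double, 2>;

    Vec2 nelderMead(const std::function<double(const Vec2&)>& f, Vec2 start,
                    int maxIter = 200) {
        auto byValue = [&](const Vec2& a, const Vec2& b) { return f(a) < f(b); };
        // Initial simplex: the start point plus one step along each axis.
        std::array<Vec2, 3> s = {start,
                                 Vec2{start[0] + 0.5, start[1]},
                                 Vec2{start[0], start[1] + 0.5}};
        for (int it = 0; it < maxIter; ++it) {
            std::sort(s.begin(), s.end(), byValue);  // s[0] best, s[2] worst
            // Centroid of every vertex except the worst.
            const Vec2 c = {(s[0][0] + s[1][0]) / 2, (s[0][1] + s[1][1]) / 2};
            auto along = [&](double t) {  // point c + t * (c - worst)
                return Vec2{c[0] + t * (c[0] - s[2][0]),
                            c[1] + t * (c[1] - s[2][1])};
            };
            const Vec2 r = along(1.0);               // reflection
            if (f(r) < f(s[0])) {
                const Vec2 e = along(2.0);           // expansion
                s[2] = f(e) < f(r) ? e : r;
            } else if (f(r) < f(s[1])) {
                s[2] = r;                            // accept plain reflection
            } else {
                const Vec2 k = along(-0.5);          // inside contraction
                if (f(k) < f(s[2]))
                    s[2] = k;
                else                                 // shrink everything toward the best
                    for (int i = 1; i < 3; ++i)
                        for (int d = 0; d < 2; ++d)
                            s[i][d] = s[0][d] + 0.5 * (s[i][d] - s[0][d]);
            }
        }
        std::sort(s.begin(), s.end(), byValue);
        return s[0];
    }

On a smooth convex function such as f(p) = (p[0]-1)^2 + (p[1]-2)^2, this homes in on (1, 2) well within the default iteration budget.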
Have a look at
http://www.alglib.net/optimization/
They have C++ implementations for L-BFGS and Levenberg-Marquardt.
You only need to work out the first derivative of your objective function to use these two algorithms.
I've used TNT/JAMA for linear least-squares estimation. It's not very sophisticated but is fairly quick + easy.
Let's talk first about factor analysis, since most of the discussion above is about regression. Most of my experience is with software like SAS, Minitab, or SPSS that solves the factor analysis equations, so I have limited experience solving them directly. That said, the most common implementations do not use linear regression to solve the equations. According to this, the most common methods used are principal component analysis and principal factor analysis. In a text on Applied Multivariate Analysis (Dallas Johnson), no fewer than seven methods are documented, each with its own pros and cons. I would strongly recommend finding an implementation that gives you factor scores rather than programming a solution from scratch.
The reason there are different methods is that you can choose exactly what you're trying to minimize. There's a pretty comprehensive discussion of the breadth of methods here.