Genetic programming in c++, library suggestions? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I'm looking to add some genetic algorithms to an operations research project I have been involved in. Currently we have a program that aids in optimizing some scheduling, and we want to add some heuristics in the form of genetic algorithms. Are there any good libraries for generic genetic programming/algorithms in C++? Or would you recommend I just code my own?
I should add that while I am not new to C++, I am fairly new to doing this sort of mathematical optimization work in C++, as the group I worked with previously tended to use a proprietary optimization package.
We have a fitness function that is fairly computationally intensive to evaluate, and we have a cluster to run this on, so parallelized code is highly desirable.
So is C++ a good language for this? If not, please recommend some others, as I am willing to learn another language if it makes life easier.
Thanks!

I would recommend rolling your own. 90% of the work in a GP is coding the genotype, how it gets operated on, and the fitness calculation. These are parts that change for every different problem/project. The actual evolutionary algorithm part is usually quite simple.
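To illustrate how small the evolutionary loop itself can be, here is a minimal sketch, not a definitive implementation. It assumes a toy problem (maximise the number of 1 bits in a fixed-length bit string); the names `Genome`, `fitness`, `tournament` and `evolve`, and all parameter choices, are hypothetical placeholders for your own genotype, operators, and fitness calculation.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical genotype: a fixed-length bit string.
using Genome = std::vector<int>;

// Toy fitness (an assumption for illustration): the number of 1 bits.
int fitness(const Genome& g) {
    return static_cast<int>(std::count(g.begin(), g.end(), 1));
}

// Tournament selection: pick two random individuals, keep the fitter one.
const Genome& tournament(const std::vector<Genome>& pop, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, pop.size() - 1);
    const Genome& a = pop[pick(rng)];
    const Genome& b = pop[pick(rng)];
    return fitness(a) >= fitness(b) ? a : b;
}

// Runs the GA and returns the best genome of the final population.
Genome evolve(std::size_t genome_len, std::size_t pop_size,
              int generations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> bit(0, 1);
    std::uniform_real_distribution<double> flip(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> cut(0, genome_len);
    const double mutation_rate = 1.0 / static_cast<double>(genome_len);

    // Random initial population.
    std::vector<Genome> pop(pop_size, Genome(genome_len));
    for (Genome& g : pop)
        for (int& gene : g) gene = bit(rng);

    auto less_fit = [](const Genome& a, const Genome& b) {
        return fitness(a) < fitness(b);
    };

    for (int gen = 0; gen < generations; ++gen) {
        std::vector<Genome> next;
        next.reserve(pop_size);
        // Elitism: carry the current best over unchanged.
        next.push_back(*std::max_element(pop.begin(), pop.end(), less_fit));
        while (next.size() < pop_size) {
            const Genome& mum = tournament(pop, rng);
            const Genome& dad = tournament(pop, rng);
            // One-point crossover: head from mum, tail from dad.
            Genome child = mum;
            const auto c = static_cast<std::ptrdiff_t>(cut(rng));
            std::copy(dad.begin() + c, dad.end(), child.begin() + c);
            // Bit-flip mutation.
            for (int& gene : child)
                if (flip(rng) < mutation_rate) gene = 1 - gene;
            next.push_back(std::move(child));
        }
        pop.swap(next);
    }
    return *std::max_element(pop.begin(), pop.end(), less_fit);
}
```

A call such as `evolve(32, 40, 60, 12345u)` runs 60 generations on 32-bit genomes with a population of 40. Everything problem-specific lives in `Genome`, `fitness`, and the crossover/mutation lines; the loop around them rarely changes.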
There are several GP libraries out there ( http://en.wikipedia.org/wiki/Symbolic_Regression#Implementations ). I would use these only as examples and references, though.
C++ is a good choice for GP because GP runs tend to be very computationally intensive. Usually the fitness function is the bottleneck, so it's worthwhile to at least make this part compiled/optimized.

I use GAUL.
It's a C library with everything you want:
( pthread/fork/openmp/mpi )
( various crossover / mutation functions )
( non-GA optimisation: hill climbing, N-M simplex, simulated annealing, tabu search, ... )
Why build your own library when there are such powerful tools?

I haven't used this personally yet, but the Age Layered Population Structure (ALPS) method has been used to generate human competitive results and has been shown to outperform several popular methods in finding optimal solutions in rough fitness landscapes. Additionally, the link contains source code in C++ FTW.

I have had similar problems. I had a complicated problem for which defining a solution as a fixed-length vector was not desirable; even a variable-length vector did not look attractive. Most libraries focus on cases where the cost function is cheap to calculate, which did not match my problem. Lack of parallelism is another pitfall. Expecting the user to allocate memory for the library to use adds insult to injury. My case was even more complicated because most libraries check the nonlinear constraints before evaluation, whereas I needed to check them during or after the evaluation, based on its result. It is also undesirable to evaluate a solution to calculate its cost and then have to recalculate the solution to present it. In most cases, I had to write the cost function twice: once for the GA and once for presentation.
Because of all these problems, I eventually designed my own library, openGA, which is now mature.
This library is written in C++ and distributed under the free Mozilla Public License 2.0. This guarantees that using the library does not limit your project; it can be used for commercial or non-commercial purposes for free without asking for any permission. Not all libraries are transparent in this sense.
It supports three modes: single objective, multiple objective (NSGA-III), and Interactive Genetic Algorithm (IGA).
The solution is not mandated to be a vector. It can be any structure with any customized design containing any optional values with variable length. This feature makes this library suitable for Genetic Programming (GP) applications.
C++11 is used. Templates allow flexibility in the design of the solution structure.
The standard library is enough to use this library. There is no dependency beyond that. The entire library is also a single header file for ease of use.
The library supports parallelism by default unless you turn it off. If you have an N-core CPU, the number of threads is set to N by default. You can change this setting. You can also choose whether the solution evaluations are distributed equally between threads or assigned to whichever thread has finished its job and is currently idle.
The solution evaluation is separated from the calculation of the final cost. This means that your evaluation function can simulate the system and keep a lot of information. Your cost function is called later and reports the cost based on the evaluation, while your evaluation results are kept for later use. You do not need to recalculate them.
You can reject a solution at any time during the evaluation, so no time is wasted. In fact, the evaluation and the constraint check are integrated.
The GA assist feature helps you produce the C++ code base from the information you provide.
If these features match what you need, I recommend having a look at the user manual and the examples of openGA.
The number of readers and citations of the related publication, as well as its GitHub stars, is increasing, and its usage keeps growing.
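The idea of separating evaluation from cost, with rejection possible during evaluation, can be sketched in plain C++ as follows. This is only an illustration of the design and does not use or reproduce the openGA API; `Solution`, `EvalResult`, `evaluate`, `cost`, and the toy scheduling constraint are all hypothetical.

```cpp
#include <vector>

// Hypothetical candidate solution: task durations for a toy schedule.
struct Solution {
    std::vector<double> durations;
};

// Intermediate results kept from the (potentially expensive) evaluation.
struct EvalResult {
    double total_time = 0.0;
    double peak_load = 0.0;
};

// Evaluates a solution and checks constraints as it goes. Returns false to
// reject the solution as soon as a violation is detected.
bool evaluate(const Solution& s, double max_load, EvalResult& out) {
    out = EvalResult{};
    for (double d : s.durations) {
        if (d < 0.0) return false;                   // invalid gene: reject early
        out.total_time += d;
        if (d > out.peak_load) out.peak_load = d;
        if (out.peak_load > max_load) return false;  // violation found mid-evaluation
    }
    return true;
}

// Cheap cost computed later from the stored results; nothing is re-simulated.
double cost(const EvalResult& r) {
    return r.total_time + 10.0 * r.peak_load;
}
```

Because `EvalResult` survives the call, the same data can drive both the cost used by the GA and whatever is presented to the user afterwards, so the cost logic is written once.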

I suggest you have a look at the Matlab optimization toolkit. It comes with GAs out of the box; you only have to code the fitness function (and possibly a function to generate the initial population). I believe Matlab has some C++ interoperability, so you could code your functions in C++. I am using it for my experiments, and a very nice feature is that you get all sorts of charts out of the box as well.
That said, if your aim is to learn about genetic algorithms you're better off coding it yourself, but if you just want to run experiments, Matlab and C++ (or even just Matlab) is a good option.

Related

How can I benchmark the performance of C++ code? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
I am starting to study algorithms and data structures seriously, and interested in learning how to compare the performance of the different ways I can implement A&DTs.
For simple tests, I can get the time before/after something runs, run that thing 10^5 times, and average the running times. I can parametrize input by size, or sample random input, and get a list of running times vs. input size. I can output that as a csv file, and feed it into pandas.
I am not sure there are no caveats. I am also not sure what to do about measuring space complexity.
I am learning to program in C++. Are there humane tools to achieve what I am trying to do?

Benchmarking code is not easy. What I found most useful was the Google Benchmark library. Even if you are not planning to use it, it might be good to read some of its examples. It offers many ways to parametrize tests, can output results to a file, and can even estimate the Big O complexity of your algorithm (to name just a few). If you are at all familiar with the Google Test framework, I would recommend using it. It also gives you control over compiler optimization, so you can be sure that your code wasn't optimized away.
There is also a great talk about benchmarking code from CppCon 2015: Chandler Carruth, "Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My!". It offers many insights into the mistakes you can make (it also uses Google Benchmark).

It is operating system and compiler specific (so implementation specific). You could use profiling tools, you could use timing tools, etc.
On Linux, see time(1), time(7), perf(1), gprof(1), pmap(1), mallinfo(3) and proc(5), and read the GCC documentation chapter on Invoking GCC.
See also this. In practice, make sure that your runs last long enough (e.g. at least one second of time in a process).
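The hand-rolled approach from the question (time before/after, repeat, average, sweep over input sizes) might look like the sketch below. It is a rough illustration under assumptions, not a substitute for a proper benchmarking tool: `mean_seconds`, `sum_to`, `Sample`, and `sweep` are hypothetical names, and the workload is a placeholder.

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Runs `fn` `reps` times and returns the mean wall-clock seconds per call.
template <typename Fn>
double mean_seconds(Fn&& fn, std::size_t reps) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < reps; ++i) fn();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(reps);
}

// Placeholder workload: fill a vector with 0..n-1 and sum it.
unsigned long long sum_to(std::size_t n) {
    std::vector<unsigned long long> v(n);
    std::iota(v.begin(), v.end(), 0ULL);
    return std::accumulate(v.begin(), v.end(), 0ULL);
}

// One (input size, seconds per call) pair, ready to be written out as CSV.
struct Sample {
    std::size_t n;
    double seconds;
};

// Times the workload for each input size.
std::vector<Sample> sweep(const std::vector<std::size_t>& sizes,
                          std::size_t reps) {
    // Writing results to a volatile makes it harder for the optimizer to
    // discard the work being measured.
    volatile unsigned long long sink = 0;
    std::vector<Sample> out;
    for (std::size_t n : sizes) {
        const double s = mean_seconds([&] { sink = sink + sum_to(n); }, reps);
        out.push_back({n, s});
    }
    return out;
}
```

Choose `reps` so that each measurement runs long enough to dominate timer resolution and start-up noise, and compile with the optimization level you actually ship.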
Be aware that optimizing compilers can drastically transform your program. See the CppCon 2017 talk by Matt Godbolt, “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”.

From an architecture point of view, you can also benchmark your C++ code using architectural tools such as Intel Pin and perf. You can use these tools to study the architecture dependency of your code. For example, you can compile your code at different optimization levels and check the IPC/CPI, cache accesses, and load-store accesses. You can even check whether your code is suffering a performance hit due to library functions. The tools are powerful and can give you potentially huge insights into your code.
You can also try disassembling your code to study where it spends most of its time, and then optimize that. You can look at different techniques to ensure that frequently accessed data remains in the cache, and thus ensure a high hit rate.
Say you realize that your code is heavily dominated by loops. You can run it with different loop bounds and compare the metrics in the two cases. For example, set the loop bound to 100,000 and measure the desired performance metric 'X', then set the loop bound to 200,000 and measure the metric 'Y'. Now calculate Y-X. This gives you much better insight into the behavior of the loops, because subtracting the two metrics effectively removes the static effects of the code.
Say you run your code 10 times with different user input sizes. You can find the runtime per user input size, sort this new metric in ascending order, remove the first and the last values (to remove the outliers), and then take the average. Finally, find the coefficient of variation to understand how the run times behave.
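The two statistics just described can be sketched as below. The function names are hypothetical, and the coefficient of variation here uses the sample standard deviation (dividing by n - 1), which is one common convention and an assumption on my part.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Mean after sorting and dropping the smallest and the largest value.
// Assumes at least three samples.
double trimmed_mean(std::vector<double> xs) {
    std::sort(xs.begin(), xs.end());
    const double sum = std::accumulate(xs.begin() + 1, xs.end() - 1, 0.0);
    return sum / static_cast<double>(xs.size() - 2);
}

// Coefficient of variation: sample standard deviation divided by the mean.
// Assumes at least two samples and a non-zero mean.
double coefficient_of_variation(const std::vector<double>& xs) {
    const double n = static_cast<double>(xs.size());
    const double mean = std::accumulate(xs.begin(), xs.end(), 0.0) / n;
    double squared_dev = 0.0;
    for (double x : xs) squared_dev += (x - mean) * (x - mean);
    return std::sqrt(squared_dev / (n - 1.0)) / mean;
}
```

For the samples 10, 2, 4, 6, 100 the trimmed mean drops 2 and 100 and averages the rest, giving 20/3. A small coefficient of variation indicates that the run times are tightly clustered relative to their mean.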
On a side note, more often than not we use the term 'average' or 'arithmetic mean' rashly. Look at the metric you plan to average and consider the harmonic, arithmetic, and geometric means in each case. For example, taking the arithmetic mean of rates will give you incorrect answers. Simply taking the arithmetic mean of two events that do not occur equally in time can give incorrect results. Instead, use weighted arithmetic means.

Fortran vs C++, does Fortran still hold any advantage in numerical analysis these days? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
With the rapid development of C++ compilers, especially the Intel ones, and the ability to apply SIMD functions directly in your C/C++ code, does Fortran still hold any real advantage in the world of numerical computation?
I come from an applied maths background; my job involves a lot of numerical analysis, computation, optimisation and such, with strictly defined performance requirements.
I hardly know anything about Fortran. I have some experience in C/CUDA/Matlab (if you consider the latter a computer language to begin with), and my daily tasks involve analysis of very large data (e.g. a 10 GB matrix). The program seems to spend at least 2/3 of its time on memory access (that's why I send some of its work to the GPU). Do you think it may be worth the effort for me to try the Fortran route for at least some performance-critical parts of my code to improve the performance of my program?
Because of the complexity and the amount of work involved, I will only go down that route if there is a significant performance benefit. Thanks in advance.

Fortran has stricter aliasing rules than C++ and has been aggressively tuned for numerical performance for decades. Algorithms that use the CPU to work with arrays of data often have the potential to benefit from a Fortran implementation.
The programming languages shootout should not be taken too seriously, but of the 15 benchmarks, Fortran ranks #1 for speed on four of them (for Intel Q6600 one core), more than any other single language. You can see that the benchmarks where Fortran shines are the heavily numerical ones:
spectral norm 27% faster
fasta 67% faster
mandelbrot 56% faster
pidigits 18% faster
Counterexample:
k-nucleotide 500% slower (this benchmark focuses heavily on more sophisticated data structures and string processing, which is not Fortran's strength)
You can also see a summary page "how many times slower" that shows that out of all implementations, the Fortran code is on average closest to the fastest implementation for each benchmark -- although the quantile bars are much larger than for C++, indicating Fortran is unsuited for some tasks that C++ is good at, but you should know that already.
So the questions you will need to ask yourself are:
Is the speed of this function so critical that reimplementing it in Fortran is worth my time?
Is performance so important that my investment in learning Fortran will pay off?
Is it possible to use a library like ATLAS instead of writing the code myself?
Answering these questions would require detailed knowledge of your code base and business model, so I can't answer those. But yes, Fortran implementations are often faster than C++ implementations.
Another factor in your decision is the amount of sample code and the quantity of reference implementations available. Fortran's strong history means that there is a wealth of numerical code available for download and even with a trip to the library. As always you will need to sift through it to find the good stuff.

The complete and correct answer to your question is, "yes, Fortran does hold some advantages".
C++ also holds some, different, advantages. So do Python, R, etc etc. They're different languages. It's easier and faster to do some things in one language, and some in others. All are widely used in their communities, and for very good reasons.
Anything else, in the absence of more specific questions, is just noise and language-war-bait, which is why I've voted to close the question and hope others will too.

Fortran is just naturally suited for numerical programming. You tend to have a large amount of numbers in such programs, typically arranged in arrays. Arrays are first-class citizens in Fortran, and it is often pretty straightforward to translate numerical kernels from Matlab into Fortran.
Regarding potential performance advantages, see the other answers, which cover this quite nicely. The bottom line is probably that you can create highly efficient numerical applications with most compiled languages today, but you might have to jump through some hoops to get there. Fortran was carefully designed so that its language features allow the compiler to recognize most opportunities for optimization. Of course, you can also write arbitrarily slow code in any compiled language, including Fortran.
In any case, you should pick the tools suited to the task. Fortran suits numerical applications; C suits system-related development. As a final remark, learning the basics of Fortran is not hard, and it is always worthwhile to have a look at other languages. This opens up a different view on the problems you want to solve.

Also worth mentioning is that Fortran is a lot easier to master than C++. In fact, Fortran has a shorter language spec than plain C, and its syntax is arguably simpler. You can pick it up very quickly.
Meaning that if you are only interested in learning C++ or Fortran to solve a single specific problem you have at the moment (say, to speed up the bottlenecks in something you wrote in a prototyping language), Fortran might give you a better return on investment.

Fortran code is generally better for matrix- and vector-type operations. But you can achieve similar performance with C/C++ code by passing hints to the compiler so that it produces vector instructions of similar quality. One option that gave me a good boost was telling the compiler not to assume memory aliasing among input variables that are array objects. This way, the compiler can aggressively unroll and pipeline inner loops for ILP, overlapping load and store operations across loop iterations with the right prefetches.
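One way to give such a no-aliasing hint is sketched below. Standard C++ has no `restrict` keyword; `__restrict` is a compiler extension accepted by GCC, Clang, and MSVC, so this is a sketch under that assumption rather than portable code, and `axpy` is just an illustrative kernel name.

```cpp
#include <cstddef>

// Computes y[i] += a * x[i]. The __restrict qualifiers promise the compiler
// that x and y never overlap, which is what Fortran assumes for array
// arguments by default. Breaking that promise is undefined behaviour.
void axpy(std::size_t n, double a,
          const double* __restrict x, double* __restrict y) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Without the qualifiers the compiler has to allow for a store to `y[i]` changing a later `x[j]`, which can block or complicate vectorization. Whether the hint actually helps depends on the compiler, flags, and loop, so it is worth checking the generated code or measuring.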

Which numerical library to use for porting from Matlab to C++? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I am currently prototyping some algorithms in Matlab that rely on matrix, DSP, statistics and image analysis functionality.
Some examples of what I may need:
eigenvectors
convolution in 2D and 3D
FFT
Short Time Fourier Transform
Hilbert transform
Chebyshev polynomials
low pass filter
random multivariate gaussian numbers
kmeans
Later on I will need to implement these algorithms in C++.
I also have a license for Numerical Recipes in C++, which I like because it is well documented and has a wide variety of algorithms.
I also found a class that helps with wrapping NR functions in MEX: nr3matlab.h.
So using this class I should be able to generate wrappers that allow me to call NR functions from Matlab. This is very important to me, so that I can check each step when porting from Matlab to C++.
However, Numerical Recipes in C++ has some important shortcomings:
algorithms implemented in a simple, and not necessarily very efficient, manner
not threaded
I am therefore considering using another numerical library.
The ideal library should:
be as broad in scope and functionality as possible
be well documented
(have commercial support)
have already made Matlab wrappers
very robust
very efficient
threaded
(have a GPU implementation that can be turned on instead of the CPU with a "switch")
Which numerical library (libraries) would you suggest?
Thanks in advance for any answers!

You have a pretty long list of requirements, and it may be challenging to cover them all with a single library.
For general Matlab-to-C++ transitions, I can highly recommend Armadillo, which is a templated C++ library with a focus on linear algebra and a particular focus on making it easy to write Matlab-like expressions. It has very good performance, is very well documented, and is actively maintained. You could start there and try to fill in the missing pieces for your task.

Actually, you should have a look at OpenCV.
Although its primary goal is computer vision/image processing, this library has a lot of linear algebra tools (almost everything you ask for). The library was originally implemented by Intel, with a lot of focus on performance. It can handle multithreading, IPP, ...
The syntax is rather easier to use than that of the usual C++ library.
You should have a look at this cheat sheet. The syntax has been changed since version 2.0 to mimic Matlab.
This library is widely used and actively maintained (last big update August 2011).

NAG could be one good option. Loads of financial institutions use it in their mathematical libraries. It didn't have a GPU implementation, though, when I last used it.

There is also the Eigen library: http://eigen.tuxfamily.org
It is mostly used as part of a larger framework, though. It offers basic (and somewhat more complex) algebra.

What are the functions in the standard library that can be implemented faster with programming hacks? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I recently read an article about fast sqrt calculation, so I decided to ask the SO community and its experts to help me find out which STL algorithms or mathematical calculations can be implemented faster with programming hacks.
It would be great if you can give examples or links.
Thanks in advance.

System library developers have more concerns than just performance in mind:
Correctness and standards compliance: Critical!
General use: No optimisations are introduced, unless they benefit the majority of users.
Maintainability: Good hand-written assembly code can be faster, but you don't see much of it. Why?
Portability: Decent libraries should be portable to more than just Windows/x86/32bit.
Many optimisation hacks that you see around violate one or more of the requirements above.
In addition, optimisations that will be useless or even break when the next generation CPU comes around the corner are not a welcome thing.
If you don't have profiler evidence on it being really useful, don't bother optimising the system libraries. If you do, work on your own algorithms and code first, anyway...
EDIT:
I should also mention a couple of other all-encompassing concerns:
The cost/effort to profit/result ratio: Optimisations are an investment. Some of them are seemingly-impressive bubbles. Others are deeper and more effective in the long run. Their benefits must always be considered in relation to the cost of developing and maintaining them.
The marketing people: No matter what you think, you'll end up doing whatever they want - or think they want.

Probably all of them can be made faster for a specific problem domain.
Now the real question is, which ones should you hack to make faster? None, until the profiler tells you to.

Several of the algorithms in <algorithm> can be optimized for vector<bool>::[const_]iterator. These include:
find
count
fill
fill_n
copy
copy_backward
move // C++0x
move_backward // C++0x
swap_ranges
rotate
equal
I've probably missed some. But all of the above algorithms can be optimized to work on many bits at a time instead of just one bit at a time (as would a naive implementation).
This is an optimization that I suspect is sorely missing from most STL implementations. It is not missing from this one:
http://libcxx.llvm.org/
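To show what "many bits at a time" means for something like `count`, here is a sketch on a hand-made packed bit container. It is not how any particular standard library implements `vector<bool>`; `PackedBits`, `count_bitwise`, and `count_wordwise` are hypothetical names used only for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed bit sequence: 64 bits per word, bit i lives in
// words[i / 64] at position i % 64. Bits past `size` are always zero.
struct PackedBits {
    std::vector<std::uint64_t> words;
    std::size_t size = 0;

    void push_back(bool b) {
        if (size % 64 == 0) words.push_back(0);
        if (b) words.back() |= std::uint64_t{1} << (size % 64);
        ++size;
    }

    bool test(std::size_t i) const {
        return ((words[i / 64] >> (i % 64)) & 1u) != 0;
    }
};

// Naive count: examines one bit per iteration.
std::size_t count_bitwise(const PackedBits& b) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < b.size; ++i) {
        if (b.test(i)) ++n;
    }
    return n;
}

// Word-at-a-time count: one population count per 64 bits.
std::size_t count_wordwise(const PackedBits& b) {
    std::size_t n = 0;
    for (std::uint64_t w : b.words) {
        n += std::bitset<64>(w).count();
    }
    return n;
}
```

Both functions return the same answer, but the second does roughly one sixty-fourth of the loop iterations, and compilers commonly lower `std::bitset::count` to a hardware popcount instruction where one is available.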

This is where you really need to listen to project managers and MBAs. What you're suggesting is re-implementing parts of the STL and/or the standard C library. There is an associated cost in terms of time to implement and maintenance burden of doing so, so you shouldn't do it unless you really, genuinely need to, as John points out. The rule is simple: is this calculation you're doing slowing you down (a.k.a. you are bound by the CPU)? If not, don't create your own implementation just for the sake of it.
Now, if you're really interested in fast maths, there are a few places you can start. The GNU Multiple Precision Arithmetic Library (GMP) implements many algorithms from Modern Computer Arithmetic and Seminumerical Algorithms that are all about doing maths on arbitrary-precision integers and floats insanely fast. The guys who write it optimise in assembly per build platform; it is about as fast as you can get in single-core mode. This is the most general case I can think of for optimised maths, i.e. one that isn't specific to a certain domain.
Bringing my first and second paragraphs together with what thkala has said, consider that GMP/MPIR have optimised assembly versions for each CPU architecture and OS they support. Really. It's a big job, but it is what makes those libraries so fast on the specific, small subset of programming problems they address.
Sometimes domain-specific enhancements can be made. This is about understanding the problem in question. For example, when doing finite field arithmetic in Rijndael's finite field you can, based on the knowledge that the field has characteristic 2 and its elements are polynomials with 8 terms, assume that your elements fit in a uint8_t and that addition and subtraction are equivalent to xor operations. How does this work? Well, basically, when you add or subtract two elements, each coefficient of the polynomials is either zero or one. If both coefficients are zero or both are one, the result is always zero. If they are different, the result is one. Term by term, that is equivalent to xor across an 8-bit binary string, where each bit represents a term in the polynomial. Multiplication is also relatively efficient. You can bet that Rijndael was designed to take advantage of this kind of result.
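As a sketch of that finite field arithmetic, the functions below implement addition and multiplication in GF(2^8) using the AES (Rijndael) reduction polynomial x^8 + x^4 + x^3 + x + 1. The names `gf_add` and `gf_mul` are my own, and the shift-and-xor multiplication is a simple textbook method rather than what an optimised AES implementation would use.

```cpp
#include <cstdint>

// Addition (and subtraction) in GF(2^8): coefficients are added modulo 2,
// which is exactly a bitwise XOR of the two bytes.
std::uint8_t gf_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a ^ b);
}

// Multiplication modulo x^8 + x^4 + x^3 + x + 1 (0x11B) by shift-and-xor.
std::uint8_t gf_mul(std::uint8_t a, std::uint8_t b) {
    std::uint8_t product = 0;
    for (int i = 0; i < 8; ++i) {
        if (b & 1) product ^= a;                // add a if this term of b is set
        const bool overflow = (a & 0x80) != 0;  // will x * a reach degree 8?
        a = static_cast<std::uint8_t>(a << 1);  // multiply a by x
        if (overflow) a ^= 0x1B;                // reduce by the field polynomial
        b >>= 1;
    }
    return product;
}
```

With the worked values from the AES specification, `gf_add(0x57, 0x83)` is `0xD4` and `gf_mul(0x57, 0x83)` is `0xC1`. The whole of the arithmetic is shifts and xors on a single byte, with no division or general-purpose modular reduction.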
That's a very specific result. It depends entirely on what you're doing to make things efficient. I can't imagine many STL functions are purely optimised for cpu speed, because amongst other things STL provides: collections via templates, which are about memory, file access which is about storage, exception handling etc. In short, being really fast is a narrow subset of what STL does and what it aims to achieve. Also, you should note that optimisation has different views. For example, if your app is heavy on IO, you are IO bound. Having a massively efficient square root calculation isn't really helpful since "slowness" really means waiting on the disk/OS/your file parsing routine.
In short, you as a developer of an STL library are trying to build an "all round" library for many different use cases.
But, since these things are always interesting, you might well be interested in bit twiddling hacks. I can't remember where I saw that, but I've definitely stolen that link from somebody else on here.

Almost none. The standard library is designed the way it is for a reason.
Taking sqrt, which you mention as an example, the standard library version is written to be as fast as possible, without sacrificing numerical accuracy or portability.
The article you mention is really beyond useless. There are some good articles floating around the 'net describing more efficient ways to implement square roots, but this article isn't among them (it doesn't even measure whether the described algorithms are faster!). Carmack's trick is slower than std::sqrt on a modern CPU, as well as being less accurate.
It was used in a game something like 12 years ago, when CPUs had very different performance characteristics. It was faster then, but CPUs have changed, and today it's both slower and less accurate than the CPU's built-in sqrt instruction.
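For reference, the trick in question is usually presented as an approximate inverse square root. The sketch below is the widely circulated form, rewritten with `memcpy` to avoid undefined behaviour; it assumes 32-bit IEEE 754 floats, and the helper `rsqrt_relative_error` is a hypothetical name added so the accuracy can be checked against the standard library.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Approximates 1 / sqrt(x) for positive x: a bit-level initial guess
// refined by a single Newton-Raphson step.
float fast_rsqrt(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes 32-bit floats");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // reinterpret the float's bits
    bits = 0x5f3759dfu - (bits >> 1);      // magic-constant initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson refinement
}

// Relative error of the approximation against the standard library.
float rsqrt_relative_error(float x) {
    const float exact = 1.0f / std::sqrt(x);
    return std::fabs(fast_rsqrt(x) - exact) / exact;
}
```

The result is typically off by a fraction of a percent, which is the accuracy loss referred to above. Whether it is faster or slower than `std::sqrt` on a given machine is something only a measurement on that machine can tell you.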
You can implement a square root function which is faster than std::sqrt without losing accuracy, but then you lose portability, as it'll rely on CPU features not present on older CPUs.
Speed, accuracy, portability: choose any two. The standard library tries to balance all three, which means that the speed isn't as good as it could be if you were willing to sacrifice accuracy or portability, and accuracy is good, but not as good as it could be if you were willing to sacrifice speed, and so on.
In general, forget any notion of optimizing the standard library. The question you should be asking is whether you can write more specialized code.
The standard library has to cover every case. If you don't need that, you might be able to speed up the cases that you do need. But then it is no longer a suitable replacement for the standard library.
Now, there are no doubt parts of the standard library that could be optimized. The C++ IOStreams library in particular comes to mind. It is often naively, and very inefficiently, implemented. The C++ committee's technical report on C++ performance has an entire chapter dedicated to exploring how IOStreams could be implemented to be faster.
But that's I/O, where performance is often considered to be "unimportant".
For the rest of the standard library, you're unlikely to find much room for optimization.

What are the most widely used C++ vector/matrix math/linear algebra libraries, and their cost and benefit tradeoffs? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
It seems that many projects slowly come upon a need to do matrix math, and fall into the trap of first building some vector classes and slowly adding in functionality until they get caught building a half-assed custom linear algebra library, and depending on it.
I'd like to avoid that while not building in a dependence on some tangentially related library (e.g. OpenCV, OpenSceneGraph).
What are the commonly used matrix math/linear algebra libraries out there, and why would one decide to use one over another? Are there any that would be advised against using for some reason? I am specifically using this in a geometric/time context (2, 3, 4 dimensions) but may be using higher-dimensional data in the future.
I'm looking for differences with respect to any of: API, speed, memory use, breadth/completeness, narrowness/specificness, extensibility, and/or maturity/stability.
Update
I ended up using Eigen3 which I am extremely happy with.

There are quite a few projects that have settled on the Generic Graphics Toolkit for this. The GMTL in there is nice - it's quite small, very functional, and has been used widely enough to be very reliable. OpenSG, VRJuggler, and other projects have all switched to using this instead of their own hand-rolled vector/matrix math.
I've found it quite nice - it does everything via templates, so it's very flexible, and very fast.
Edit:
After the comments discussion, and edits, I thought I'd throw out some more information about the benefits and downsides to specific implementations, and why you might choose one over the other, given your situation.
GMTL -
Benefits: Simple API, specifically designed for graphics engines. Includes many primitive types geared towards rendering (such as planes, AABB, quaternions with multiple interpolation, etc) that aren't in any other packages. Very low memory overhead, quite fast, easy to use.
Downsides: API is very focused specifically on rendering and graphics. Doesn't include general purpose (NxM) matrices, matrix decomposition and solving, etc, since these are outside the realm of traditional graphics/geometry applications.
Eigen -
Benefits: Clean API, fairly easy to use. Includes a Geometry module with quaternions and geometric transforms. Low memory overhead. Full, highly performant solving of large NxN matrices and other general purpose mathematical routines.
Downsides: May be a bit larger scope than you are wanting (?). Fewer geometric/rendering specific routines when compared to GMTL (ie: Euler angle definitions, etc).
IMSL -
Benefits: Very complete numeric library. Very, very fast (supposedly the fastest solver). By far the largest, most complete mathematical API. Commercially supported, mature, and stable.
Downsides: Cost - not inexpensive. Very few geometric/rendering specific methods, so you'll need to roll your own on top of their linear algebra classes.
NT2 -
Benefits: Provides syntax that is more familiar if you're used to MATLAB. Provides full decomposition and solving for large matrices, etc.
Downsides: Mathematical, not rendering focused. Probably not as performant as Eigen.
LAPACK -
Benefits: Very stable, proven algorithms. Been around for a long time. Complete matrix solving, etc. Many options for obscure mathematics.
Downsides: Not as highly performant in some cases. Ported from Fortran, with odd API for usage.
Personally, for me, it comes down to a single question - how are you planning to use this? If your focus is just on rendering and graphics, I like Generic Graphics Toolkit, since it performs well, and supports many useful rendering operations out of the box without having to implement your own. If you need general purpose matrix solving (ie: SVD or LU decomposition of large matrices), I'd go with Eigen, since it handles that, provides some geometric operations, and is very performant with large matrix solutions. You may need to write more of your own graphics/geometric operations (on top of their matrices/vectors), but that's not horrible.

So I'm a pretty critical person, and figure if I'm going to invest in a library, I'd better know what I'm getting myself into. I figure it's better to go heavy on the criticism and light on the flattery when scrutinizing; what's wrong with it has many more implications for the future than what's right. So I'm going to go overboard here a little bit to provide the kind of answer that would have helped me and I hope will help others who may journey down this path. Keep in mind that this is based on what little reviewing/testing I've done with these libs. Oh and I stole some of the positive description from Reed.
I'll mention up top that I went with GMTL despite its idiosyncrasies because the Eigen2 unsafeness was too big of a downside. But I've recently learned that the next release of Eigen2 will contain defines that will shut off the alignment code, and make it safe. So I may switch over.
Update: I've switched to Eigen3. Despite its idiosyncrasies, its scope and elegance are too hard to ignore, and the optimizations which make it unsafe can be turned off with a define.
Eigen2/Eigen3
Benefits: MPL2 (formerly LGPL). Clean, well designed API, fairly easy to use. Seems to be well maintained with a vibrant community. Low memory overhead. High performance. Made for general linear algebra, but good geometric functionality available as well. Header-only library, no linking required.
Idiosyncrasies/downsides: (Some/all of these can be avoided by some defines that are available in the current development branch, Eigen3)
Unsafe performance optimizations mean the rules must be followed carefully. Failure to follow them causes crashes.
you simply cannot safely pass-by-value
use of Eigen types as members requires special allocator customization (or you crash)
use with STL container types and possibly other templates requires special allocation customization (or you will crash)
certain compilers need special care to prevent crashes on function calls (GCC on Windows)
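As I understand it, these rules all come down to alignment: a vectorized fixed-size type expects 16-byte-aligned storage, and any code path that hands it memory without that guarantee breaks the assumption. The sketch below shows the requirement in standard C++ only, without Eigen. Vec4f, Particle and the two helper functions are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical SIMD-friendly type: four floats that must start on a
// 16-byte boundary so that aligned vector loads and stores are valid.
struct alignas(16) Vec4f {
    float v[4];
};

// A class holding such a member inherits the alignment requirement.
struct Particle {
    char tag;
    Vec4f position;
};

static_assert(alignof(Vec4f) == 16, "Vec4f must be 16-byte aligned");
static_assert(alignof(Particle) == 16, "the member raises the class alignment");
static_assert(offsetof(Particle, position) == 16, "padding is inserted after tag");

bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Since C++17, new and make_unique honor alignas. Older standards did not
// require plain new to honor alignments stricter than the default, which is
// why such types needed a custom allocator or a class-specific operator new.
bool heap_particle_is_aligned() {
    auto p = std::make_unique<Particle>();
    assert(p != nullptr);
    return is_aligned16(&p->position);
}
```

With a C++17 compiler heap_particle_is_aligned() should return true. The point of the sketch is only that the guarantee has to come from somewhere, whether the language, an allocator, or a define that turns the vectorization off.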
GMTL
Benefits: LGPL, fairly simple API, specifically designed for graphics engines. Includes many primitive types geared towards rendering (such as planes, AABB, quaternions with multiple interpolation, etc.) that aren't in any other packages. Very low memory overhead, quite fast, easy to use. All header based, no linking necessary.
Idiosyncrasies/downsides:
API is quirky
what might be myVec.x() in another lib is only available via myVec[0] (Readability problem)
an array or std::vector of points may cause you to do something like pointsList[0][0] to access the x component of the first point
in a naive attempt at optimization, cross(vec,vec) was removed and replaced with makeCross(vec,vec,vec), even though the compiler eliminates unnecessary temps anyway
normal math operations don't return normal types unless you shut off some optimization features, e.g. vec1 - vec2 does not return a normal vector, so length( vecA - vecB ) fails even though vecC = vecA - vecB works. You must wrap like: length( Vec( vecA - vecB ) )
operations on vectors are provided by external functions rather than members. This may require you to use the scope resolution everywhere since common symbol names may collide
you have to do
length( makeCross( vecA, vecB ) )
or
gmtl::length( gmtl::makeCross( vecA, vecB ) )
where otherwise you might try
vecA.cross( vecB ).length()
not well maintained
still claimed as "beta"
documentation missing basic info like which headers are needed to use normal functionality
Vec.h does not contain operations for Vectors, VecOps.h contains some, others are in Generate.h for example. cross(vec&,vec&,vec&) in VecOps.h, [make]cross(vec&,vec&) in Generate.h
immature/unstable API; still changing.
For example "cross" has moved from "VecOps.h" to "Generate.h", and then the name was changed to "makeCross". Documentation examples fail because they still refer to old versions of functions that no longer exist.
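The length( vecA - vecB ) failure described above can be reproduced in a few lines. This is a hypothetical stand-in and not GMTL's actual code: it assumes the subtraction returns a lazy proxy type, which is my guess at what the "optimization features" do. Template argument deduction never applies implicit conversions, so a function template written for Vec3<T> will not accept the proxy, while plain assignment still works.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

namespace sketch {  // hypothetical types, invented for this illustration

template <typename T> struct Vec3;

// Lazy result of a - b: keeps references and computes components on demand.
template <typename T>
struct Diff3 {
    const Vec3<T>& a;
    const Vec3<T>& b;
    T operator[](std::size_t i) const { return a[i] - b[i]; }
};

template <typename T>
struct Vec3 {
    T v[3];
    Vec3(T x, T y, T z) : v{x, y, z} {}
    // Implicit conversion that materialises the lazy difference.
    Vec3(const Diff3<T>& d) : v{d[0], d[1], d[2]} {}
    T operator[](std::size_t i) const {
        assert(i < 3);
        return v[i];
    }
};

template <typename T>
Diff3<T> operator-(const Vec3<T>& a, const Vec3<T>& b) {
    return {a, b};
}

// Free functions in a namespace, in the style the answer describes.
template <typename T>
T length(const Vec3<T>& x) {
    return std::sqrt(x[0] * x[0] + x[1] * x[1] + x[2] * x[2]);
}

template <typename T>
Vec3<T> makeCross(const Vec3<T>& a, const Vec3<T>& b) {
    return Vec3<T>(a[1] * b[2] - a[2] * b[1],
                   a[2] * b[0] - a[0] * b[2],
                   a[0] * b[1] - a[1] * b[0]);
}

}  // namespace sketch

// sketch::Vec3<float> c = a - b;                  // compiles: the converting constructor is used
// sketch::length(a - b);                          // does not compile: T cannot be deduced from Diff3<float>
// sketch::length(sketch::Vec3<float>(a - b));     // compiles: the explicit wrap the answer mentions
```

In this sketch the fix on the library side would be an extra length overload taking the proxy, or returning a plain vector from operator-, which is presumably what shutting off the optimization does.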
NT2
Can't tell because they seem to be more interested in the fractal image header of their web page than the content. Looks more like an academic project than a serious software project.
Latest release over 2 years ago.
Apparently no documentation in English though supposedly there is something in French somewhere.
Can't find a trace of a community around the project.
LAPACK & BLAS
Benefits: Old and mature.
Downsides:
old as dinosaurs with really crappy APIs

For what it's worth, I've tried both Eigen and Armadillo. Below is a brief evaluation.
Eigen
Advantages:
1. Completely self-contained -- no dependence on external BLAS or LAPACK.
2. Documentation decent.
3. Purportedly fast, although I haven't put it to the test.
Disadvantage:
The QR algorithm returns just a single matrix, with the R matrix embedded in the upper triangle. No idea where the rest of the matrix comes from, and no Q matrix can be accessed.
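On where "the rest of the matrix" comes from: in the compact storage used by LAPACK-style QR routines, the part below the diagonal holds the Householder vectors that define Q, so Q is stored implicitly rather than missing. As far as I know, Eigen 3's HouseholderQR also offers householderQ() to get Q itself. The plain C++ sketch below illustrates the storage scheme only. It is not Eigen's code, and qr_compact, unpack_q and unpack_r are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major, m rows x n cols

// Hypothetical helper: in-place Householder QR of an m x n matrix (m >= n).
// Afterwards the upper triangle of `a` is R, and below the diagonal column k
// holds Householder vector v_k (its leading 1 is implied). The returned
// tau[k] scales the reflector H_k = I - tau[k] * v_k * v_k^T.
std::vector<double> qr_compact(Matrix& a) {
    const std::size_t m = a.size();
    assert(m > 0);
    const std::size_t n = a[0].size();
    assert(m >= n);
    std::vector<double> tau(n, 0.0), v(m, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        double norm = 0.0;
        for (std::size_t i = k; i < m; ++i) norm += a[i][k] * a[i][k];
        norm = std::sqrt(norm);
        if (norm == 0.0) continue;  // nothing to reflect: H_k is the identity
        const double alpha = (a[k][k] >= 0.0) ? -norm : norm;  // sign avoids cancellation
        const double v0 = a[k][k] - alpha;
        double vtv = 1.0;
        v[k] = 1.0;
        for (std::size_t i = k + 1; i < m; ++i) {
            v[i] = a[i][k] / v0;
            vtv += v[i] * v[i];
        }
        tau[k] = 2.0 / vtv;
        for (std::size_t j = k + 1; j < n; ++j) {  // apply H_k to the remaining columns
            double s = 0.0;
            for (std::size_t i = k; i < m; ++i) s += v[i] * a[i][j];
            for (std::size_t i = k; i < m; ++i) a[i][j] -= tau[k] * s * v[i];
        }
        a[k][k] = alpha;                                         // diagonal entry of R
        for (std::size_t i = k + 1; i < m; ++i) a[i][k] = v[i];  // store v_k below it
    }
    return tau;
}

// Rebuilds the full m x m matrix Q = H_0 * H_1 * ... * H_{n-1}.
Matrix unpack_q(const Matrix& packed, const std::vector<double>& tau) {
    const std::size_t m = packed.size(), n = tau.size();
    Matrix q(m, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < m; ++i) q[i][i] = 1.0;
    for (std::size_t k = n; k-- > 0;) {  // apply H_{n-1} first and H_0 last
        for (std::size_t j = 0; j < m; ++j) {
            double s = q[k][j];
            for (std::size_t i = k + 1; i < m; ++i) s += packed[i][k] * q[i][j];
            q[k][j] -= tau[k] * s;
            for (std::size_t i = k + 1; i < m; ++i) q[i][j] -= tau[k] * s * packed[i][k];
        }
    }
    return q;
}

// Copies the upper triangle out as an m x n matrix R.
Matrix unpack_r(const Matrix& packed) {
    const std::size_t m = packed.size(), n = packed[0].size();
    Matrix r(m, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = i; j < n; ++j) r[i][j] = packed[i][j];
    return r;
}
```

Multiplying unpack_q(...) by unpack_r(...) should reproduce the original matrix up to rounding, which is an easy way to check that nothing was lost in the single packed result.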
Armadillo
Advantages:
1. Wide range of decompositions and other functions (including QR).
2. Reasonably fast (uses expression templates), but again, I haven't really pushed it to high dimensions.
Disadvantages:
1. Depends on external BLAS and/or LAPACK for matrix decompositions.
2. Documentation is lacking IMHO (including the specifics wrt LAPACK, other than changing a #define statement).
Would be nice if an open source library were available that is self-contained and straightforward to use. I have run into this same issue for 10 years, and it gets frustrating. At one point, I used GSL for C and wrote C++ wrappers around it, but with modern C++ -- especially using the advantages of expression templates -- we shouldn't have to mess with C in the 21st century. Just my tuppence ha'penny.

If you are looking for high performance matrix/linear algebra/optimization on Intel processors, I'd look at Intel's MKL library.
MKL is carefully optimized for fast run-time performance, much of it based on the very mature BLAS/LAPACK Fortran standards. Its performance scales with the number of cores available. Hands-free scalability with available cores is the future of computing, and I wouldn't use any math library for a new project that doesn't support multi-core processors.
Very briefly, it includes:
Basic vector-vector, vector-matrix, and matrix-matrix operations
Matrix factorization (LU decomposition, Hermitian, sparse)
Least squares fitting and eigenvalue problems
Sparse linear system solvers
Non-linear least squares solver (trust regions)
Plus signal processing routines such as FFT and convolution
Very fast random number generators (Mersenne Twister)
Much more.
A downside is that the MKL API can be quite complex depending on the routines that you need. You could also take a look at their IPP (Integrated Performance Primitives) library which is geared toward high performance image processing operations, but is nevertheless quite broad.
Paul
CenterSpace Software, .NET Math libraries, centerspace.net

What about GLM?
It's based on the OpenGL Shading Language (GLSL) specification and released under the MIT license.
Clearly aimed at graphics programmers

I've heard good things about Eigen and NT2, but haven't personally used either. There's also Boost.UBLAS, which I believe is getting a bit long in the tooth. The developers of NT2 are building the next version with the intention of getting it into Boost, so that might count for something.
My lin. alg. needs don't extend beyond the 4x4 matrix case, so I can't comment on advanced functionality; I'm just pointing out some options.

I'm new to this topic, so I can't say a whole lot, but BLAS is pretty much the standard in scientific computing. BLAS is actually an API standard, which has many implementations. I'm honestly not sure which implementations are most popular or why.
If you want to also be able to do common linear algebra operations (solving systems, least squares regression, decomposition, etc.) look into LAPACK.
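To give a feel for what the BLAS standard specifies, here is a naive reference sketch of its best-known level-3 operation, general matrix multiply: C = alpha*A*B + beta*C. The name naive_gemm and the row-major layout are assumptions for this illustration. Real implementations expose the operation as dgemm with more parameters (transposes, leading dimensions) and are heavily tuned, but they compute the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine: C = alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n, all stored row-major in flat vectors.
void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                double alpha, const std::vector<double>& a,
                const std::vector<double>& b,
                double beta, std::vector<double>& c) {
    assert(a.size() == m * k);
    assert(b.size() == k * n);
    assert(c.size() == m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

Because the operation is fixed by the standard, code written against it can swap one implementation for another at link time, which is the practical benefit of BLAS being an API rather than a single library.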

I'll add vote for Eigen: I ported a lot of code (3D geometry, linear algebra and differential equations) from different libraries to this one - improving both performance and code readability in almost all cases.
One advantage that wasn't mentioned: it's very easy to use SSE with Eigen, which significantly improves performance of 2D-3D operations (where everything can be padded to 128 bits).

Okay, I think I know what you're looking for. It appears that GGT is a pretty good solution, as Reed Copsey suggested.
Personally, we rolled our own little library, because we deal with rational points a lot - lots of rational NURBS and Beziers.
It turns out that most 3D graphics libraries do computations with projective points that have no basis in projective math, because that's what gets you the answer you want. We ended up using Grassmann points, which have a solid theoretical underpinning and decreased the number of point types. Grassmann points are basically the same computations people are using now, with the benefit of a robust theory. Most importantly, it makes things clearer in our minds, so we have fewer bugs. Ron Goldman wrote a paper on Grassmann points in computer graphics called "On the Algebraic and Geometric Foundations of Computer Graphics".
Not directly related to your question, but an interesting read.
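For readers who have not met the idea, here is a minimal sketch of how weighted points can work for rational curves. It assumes the usual mass-point representation, where a point P with weight w is stored as (w*P, w). It is my own illustration, not code from our library or from Goldman's paper, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A 2D point P carrying mass (weight) w, stored as (w * P, w).
struct MassPoint {
    double wx, wy, w;
};

struct Point2 {
    double x, y;
};

MassPoint make_mass_point(double x, double y, double w) {
    return {w * x, w * y, w};
}

// Interpolation is plain component-wise arithmetic on all three fields.
MassPoint lerp(const MassPoint& a, const MassPoint& b, double t) {
    return {(1.0 - t) * a.wx + t * b.wx,
            (1.0 - t) * a.wy + t * b.wy,
            (1.0 - t) * a.w + t * b.w};
}

// The ordinary position is recovered by dividing out the mass.
Point2 position(const MassPoint& p) {
    assert(p.w != 0.0);
    return {p.wx / p.w, p.wy / p.w};
}

// de Casteljau evaluation of a rational Bezier curve: run the usual
// polynomial algorithm on the mass-points and divide once at the end.
Point2 rational_bezier(std::vector<MassPoint> pts, double t) {
    assert(!pts.empty());
    for (std::size_t level = pts.size() - 1; level > 0; --level)
        for (std::size_t i = 0; i < level; ++i)
            pts[i] = lerp(pts[i], pts[i + 1], t);
    return position(pts[0]);
}
```

With control points (1,0), (1,1), (0,1) and weights 1, sqrt(2)/2, 1 this traces a quarter of the unit circle, so every evaluated point should satisfy x*x + y*y = 1 up to rounding.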

FLENS
http://flens.sf.net
It also implements a lot of LAPACK functions.

I found this library quite simple and functional (http://kirillsprograms.com/top_Vectors.php). These are bare-bones vectors implemented via C++ templates. No fancy stuff, just what you need to do with vectors (add, subtract, multiply, dot, etc).
