A common linear programming problem below
min c'x
s.t. Ax<=b
(A is m*n, m is smaller than n)
As I know, the pivoting procedure in simplex method lets extreme point jump to another extreme point until it finds the optimal solution.
Extreme point has at most m(the number of constraints) nonzero variables. Variables in extreme point can be divided into two parts, basic variables(nonzero terms) and nonbasic variables(zero terms).
In normal condition, Pivoting change one nonbasic variable to basic variable while one basic variable become nonbasic variable in each iteration.
My question is can a nonbasic variable which was basic before become basic again? If yes, is there an clear or special example that at least one variable does.
Related
UPDATE: I ended up not using Eigen and implementing my own GF(2) matrix representation where each row is an array of integers, and each bit of the integer represents a single entry. I then use a modified Gaussian Elimination with bit operations to obtain the desired vectors
I currently have a (large) rectangular sparse matrix that I'm storing using Eigen3 that I want to find the (right) null space over GF(2). I researched around and found some possible approaches to this:
(Modified) Gaussian Elimination
This means simply using some form of Gaussian Elimination to find a reduced form of the matrix that preserves the nullspace then extract the nullspace off of that. Though I know how I would do this by hand, I'm quite clueless as to how I would actually implement this.
SVD Decomposition
QR Decomposition
I'm not familiar with these, but I from my understanding the (orthonormal) basis vectors of the nullspace can be extracted from the decomposed form of the matrix.
Now my question is: Which approach should I use in my case (i.e. rectangular sparse matrix over GF(2)) that doesn't involve converting into a dense matrix? And if there are many approaches, what would recommended in terms of performance and ease of implementation?
I'm also open to using other libraries besides Eigen as well.
For context, I'm trying to find combine equivalence relations for factoring algorithms (e.g. as in Quadratic Sieve). Also, if possible, I would like to look into parallelising these algorithms in the future, so if there exists an approach that would allow this, that would be great!
Let's call the matrix in question M. Then (please correct me if I'm wrong):
GF(2) implies that M is a equivalent to a matrix of bits - each element can have one of two values.
Arithmetic on GF(2) is just like integer arithmetic on non-negative numbers, but done modulo 2, so addition is a bitwise XOR, and multiplication is a bitwise AND. It won't matter what exact elements the GF(2) has - they are all equivalent to bits.
Vectors in GF(2) are linearly independent as long as they are not equal, or as long as they differ by at least on bit, or v_1 + v_2 ≠ 0 (since addition in GF(2) is boolean XOR).
By definition, the (right) nullspace spans basis vectors that the matrix transforms to 0. A vector v would be in the nullspace if one multiplies each j-th column of M with the j-th bit of v, sum them, and the result is zero.
I see at least two ways of going about it.
Do dense Gaussian elimination in terms of bit operations, and organize the data and write the loops so that the compiler vectorizes everything and operates on 512-bit data types. You could use Compiler Explorer on godbolt.org to easily check that the vectorization takes place and e.g. AVX512 instructions are used. Linear gains will eventually lose out with the squared scaling of the problem, of course, but the performance increase over naive bool-based implementation will be massive and may be sufficient for your needs. The sparsity adds a possible complication: if the matrix won't comfortably fit in memory in a dense representation, then a suitable representation has to be devised that makes Gaussian elimination perform well. More is need to be known about the matrices you work on. Generally speaking, row operations will be performed at memory bandwidth if the implementation is correct, on the order of 1E10 elements/s, so a 1E3x1E3 M should process in about a second at most.
Since the problem is equivalent to a set of boolean equations, use a SAT solver (Boolean satisfiability problem solver) to incrementally generate the nullspace. The initial equation set is M × v = 0 and v ≠ 0, where v is a bit vector. Run the SAT until it finds some v, let's call it v_i. Then add a constraint v ≠ v_i, and run SAT again - adding the constraints in each iteration. That is, k-th iteration has constraints v ≠ 0, v ≠ v1, ... v ≠ v(k-1).
Since all bit vectors that are different are also linearly independent, the inequality constraints will force incremental generation of nullspace basis vectors.
Modern SAT excels at sparse problems with more boolean equations than variables, so I imagine this would work very well - the sparser the matrix, the better. The problem should be pre-processed to remove all zero columns in M to minimize the combinatorial explosion. Open source SAT solvers can easily deal with 1M variable problems - so, for a sparse problem, you could be realistically solving with 100k-1M columns in M, and about 10 "ones" in each row. So a 1Mx1M sparse matrix with 10 "ones" in each row on average would be a reasonable task for common SAT solvers, and I imagine that state of the art could deal with 10Mx10M matrices and beyond.
Furthermore, your application is ideal for incremental solvers: you find one solution, stop, add a constraint, resume, and so on. So I imagine you may get very good results, and there are several good open source solvers to choose from.
Since you use Eigen already, the problem would at least fit into the SparseMatrix representation with byte-sized elements, so it's not a very big problem as far as SAT is concerned.
I wonder whether this nullspace basis finding is a case of a cover problem, possibly relaxed. There are some nice algorithms for those, but it's always a question of whether the specialized algorithm will work better than just throwing SAT at it and waiting it out, so to speak.
Updated answer - thanks to harold: QR decomposition is not applicable in general for your case.
See for instance
https://math.stackexchange.com/questions/1346664/how-to-find-orthogonal-vectors-in-gf2
I wrongly assumed, QR is applicable here, but it's not by theory.
If you are still interested in details about QR-algorithms, please open a new thread.
I have:
- a set of points of known size (in my case, only 6 points)
- a line characterized by x = s + t * r, where x, s and r are 3D vectors
I need to find the point closest to the given line. The actual distance does not matter to me.
I had a look at several different questions that seem related (including this one) and know how to solve this on paper from my highschool math classes. But I cannot find a solution without calculating every distance, and I am sure there has to be a better/faster way. Performance is absolutely crucial in my application.
One more thing: All numbers are integers (coordinates of points and elements of s and r vectors). Again, for performance reasons I would like to keep the floating-point math to a minimum.
You have to process every point at least once to know their distance. Unless you want to repeat the process many times with different lines, simply computing the distance of every point is unavoidable. So the algorithm has to be O(n).
Since you don't care about the actual distance, we can make some simplification to the point-distance computation. The exact distance is computed by (source):
d^2 = |r⨯(p-s)|^2 / |r|^2
where ⨯ is the cross product and |r|^2 is the squared length of vector r. Since |r|^2 is constant for all points, we can omit it from the distance computation without changing result:
d^2 = |r⨯(p-s)|^2
Compare the approximated square distances and keep the minimum. The advantage of this formula is that you can do everything with integers since you mentioned that all coordinates are integers.
I'm afraid you can't get away with computing less than 6 distances (if you could, at least one point would be left out -- including the nearest one).
See if it makes sense to preprocess: Is the line fixed and the points vary? Consider rotating coordinates to make the line horizontal.
As there are few points, it is doubtful that this is your bottleneck. Measure where the hot spots are, redesign algorithms/data representation, spice up compiler optimization, compile to assembly and bum that. Strictly in that order.
Jon Bentley's "Writing Efficient Programs" (sadly long out of print) and "Programming Pearls" (2nd edition) are full of advise on practical programming.
I am designing the classical Runge-Kutta scheme (RK4) for a large number of coupled equations in Python 2.7. Since there is going to be over a hundred coupled 1st order equations the for-loops will be hell large and I am looking for some optimization hints.
1. When calculating a vector of returned variables for RK coefficients is it better to...
Prealocate a numpy array and fill it or
Use list.append for each variable and numpy.array(list) at the end?
2. The coupled equations obviously have coefficients. Is it better to...
Insert them into the function called in RK4 steps (i.e. redefine them each time the function evaluation is called) or
Label them as global variables?
If I have data (a daily stock chart is a good example but it could be anything) in which I only know the range (high - low) that X units sold within but I don't know the exact price at which any given item sold. Assume for simplicity that the price range contains enough buckets (e.g. forty one-cent increments for a 40 cent range) to make such a distribution practical. How can I go about distributing those items to form a normal bell curve stored in a vector? It doesn't have to be perfect but realistic.
My (very) naive thinking has been to assume that since random numbers should form a normal distribution I can do something like have a binary RNG. If, for example, there are forty buckets then if a '0' comes up 40 times the 0th bucket gets incremented and if a '1' comes up for times in a row then the 39th bucket gets incremented. If '1' comes up 20 times then it is in the middle of the vector. Do this for each item until X units have been accounted for. This may or may not be right and in any case seems way more inefficient than necessary. I am looking for something more sensible.
This isn't homework, just a problem that has been bugging me and my statistics is not up to snuff. Most literature seems to be about analyzing the distribution after it already exists but not much about how to artificially create one.
I want to write this in c++ so pre-packaged solutions in R or matlab or whatnot are not too useful for me.
Thanks. I hope this made sense.
Most literature seems to be about analyzing the distribution after it already exists but not much about how to artificially create one.
There's tons of literature on how to create one. The Box–Muller transform, the Marsaglia polar method (a variant of Box-Muller), and the Ziggurat algorithm are three. (Google those terms). Both Box-Muller methods are easy to implement.
Better yet, just use a random generator that already exists that implements one of these algorithms. Both boost and the new C++11 have such packages.
The algorithm that you describe relies on the Central Limit Theorem that says that a random variable defined as the sum of n random variables that belong to the same distribution tends to approach a normal distribution when n grows to infinity. Uniformly distributed pseudorandom variables that come from a computer PRNG make a special case of this general theorem.
To get a more efficient algorithm you can view probability density function as a some sort of space warp that expands the real axis in the middle and shrinks it to the ends.
Let F: R -> [0:1] be the cumulative function of the normal distribution, invF be its inverse and x be a random variable uniformly distributed on [0:1] then invF(x) will be a normally distributed random variable.
All you need to implement this is be able to compute invF(x). Unfortunately this function cannot be expressed with elementary functions. In fact, it is a solution of a nonlinear differential equation. However you can efficiently solve the equation x = F(y) using the Newton method.
What I have described is a simplified presentation of the Inverse transform method. It is a very general approach. There are specialized algorithms for sampling from the normal distribution that are more efficient. These are mentioned in the answer of David Hammen.
The background for asking this question is that I am solving a linearized equation system (Ax=b), where A is a matrix (typically of dimension less than 100x100) and x and b are vectors. I am using a direct method, meaning that I first invert A, then find the solution by x=A^(-1)b. This step is repated in an iterative process until convergence.
The way I'm doing it now, using a matrix library (MTL4):
For every iteration I copy all coeffiecients of A (values) in to the matrix object, then invert. This the easiest and safest option.
Using an array of pointers instead:
For my particular case, the coefficients of A happen to be updated between each iteration. These coefficients are stored in different variables (some are arrays, some are not). Would there be a potential for performance gain if I set up A as an array containing pointers to these coefficient variables, then inverting A in-place?
The nice thing about the last option is that once I have set up the pointers in A before the first iteration, I would not need to copy any values between successive iterations. The values which are pointed to in A would automatically be updated between iterations.
So the performance question boils down to this, as I see it:
- The matrix inversion process takes roughly the same amount of time, assuming de-referencing of pointers is non-expensive.
- The array of pointers does not need the extra memory for matrix A containing values.
- The array of pointers option does not have to copy all NxN values of A between each iteration.
- The values that are pointed to the array of pointers option are generally NOT ordered in memory. Hopefully, all values lie relatively close in memory, but *A[0][1] is generally not next to *A[0][0] etc.
Any comments to this? Will the last remark affect performance negatively, thus weighing up for the positive performance effects?
Test, test, test.
Especially in the field of Numerical Linear Algebra. There are many effects in play, which is why there is a number of optimized libraries that have solved that burden for you.
Some effects to consider:
Memory locality and cache effects
Multithreading effects (some algorithms that are optimal while running single-core, cause memory collision/cache eviction when more than one core is utilized).
There is no substitute for testing.
Here are some comments:
Is the function you use for the inversion capable of handling a matrix of pointers instead of values? If it does not realise it has to do an indirection, all kinds of strange effects could happen.
When doing an in-place matrix inversion (meaning the inverted matrix overwrites the input matrix), all input coefficients will get overwritten with new values, because matrix inversion can not be done by re-ordering the elements of the matrix.
During the inversion process, none of the input coefficients may be changed by an outside process. All such updates have to be performed between iterations.
So, you get the following set of trade-offs when you chose the pointer solution:
The coefficients making up matrix A can no longer be calculated asynchronously with the matrix inversion.
Either all coefficients must be recalculated for each iteration (when you use in-place inversion, meaning the inverted matrix uses the same memory as the input matrix), or you still have to use a matrix of N x N values to hold the result of the inversion.
You're getting good answers here. The only thing I would add is some general experience with performance.
You are thinking about performance a-priori. That's reasonable, but the real payoff is a-posteriori. In other words, you don't know for certain where the real optimization opportunities are, until the running code tells you.
You don't know if the bulk of the time will be spent in matrix inversion, multiplication, copying the matrix, dereferencing, or what. People can guess. If I had to guess, it would be matrix inversion, because it's 100x100.
However, something else I can't guess might be even bigger.
Guessing has a very poor track record, especially when you can just find out.
Here's an example of what I mean.