sympy.solve() seems to give me a wrong result. There are known problems with inequalities, e.g. http://code.google.com/p/sympy/issues/detail?id=3244 but this is simple enough that it should work:
import sympy
from sympy.abc import x, u, s
t1 = x*(1 - x)/(1 - s*x)
t2 = u*x + (1-s)*(1 - u)*x*(1 - x)/(1 - s*x)
sympy.solve(t1-t2,x)
is giving me 3 solutions. There should be only two; the first one is wrong. Is this a bug, or am I making a mistake somewhere?
This has been fixed in the development version of SymPy (and in 0.7.2, which will be released in about a week). It now gives [0, s*(u - 1)/(2*s*u - s - u)].
So, to answer your question, yes, it was a bug, and it was fixed.
Regarding what went wrong before, I used git bisect to narrow down the commit that fixed the problem. That commit just changed the way that one of the fundamental simplification algorithms in SymPy, as_numer_denom, works. So I guess what happened is that the result of some intermediate operation came out simple enough that solve() was able to recognize that the bad solution is just 1/s (assuming the square root denests). solve() does check the solutions it finds by plugging them back in, but if they are too complicated, it won't be able to tell if they are bogus.
Actually, now that I look a little closer, what probably happened is that (t1 - t2).as_numer_denom()[0], which solve uses (because the zeros of an expression are just the zeros of the numerator), only has degree 2 in x in 0.7.2, whereas in 0.7.1 it had degree 3. The bogus solution came from this "false zero" (meaning it was also a zero of the denominator), and as I noted, it was too complicated for solve to recognize as such.
That's the best I can say without really digging into the code deeper.
In some inner loop I have:
double x;
...
int i = x/h_;
double xx = x - i*h_;
Thinking there might be a better way to do this, I tried std::remquo:
double x;
...
int i;
double xx = std::remquo(x, h_, &i);
Suddenly, timings went from 2.6 seconds to 40 seconds (for many executions of the loop).
The timing test is difficult to replicate here, but I put the code online to see if someone can help me understand what is going on.
naive version: https://godbolt.org/z/PnsfR8
remquo version: https://godbolt.org/z/NSMwyW
It looks like the main difference is that remquo is not inlined while the naive code is. If that is the case, what is the purpose of remquo if it is always going to be slower than the manual code? Is it a matter of accuracy (e.g. for large arguments), or of not relying on (not well defined) casting conversions?
I just realized that the remquo version is not even doing something equivalent to the first code. So I am using it wrong. In any case, I am surprised that remquo is so slow.
It's a rubbish function that was added to C99 to entice Fortran coders to switch to C. There's little reason to actually use it, so library vendors avoid wasting time optimizing it.
See also: What does the function remquo do and what can it be used for?
BTW if you assumed that i gets the quotient stored in it, read the documentation more closely! (Or read the answers on the question linked in the previous paragraph).
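To make the difference concrete, here is a small standalone sketch (the value of h_ is made up for illustration; the names just mirror the question). The naive version truncates x/h_ toward zero, so for non-negative x the remainder lies in [0, h_), while std::remquo rounds x/h_ to the nearest integer, so its remainder can be negative, and the int output only receives the sign and a few low-order bits of that quotient, not the full quotient:

#include <cmath>
#include <cstdio>

int main() {
    const double h_ = 0.25;   // made-up value, stands in for the member h_
    const double x = 0.9;

    // Naive version: truncating division, remainder in [0, h_).
    int i = static_cast<int>(x / h_);   // 3
    double xx = x - i * h_;             // 0.15

    // remquo: rounds x/h_ to the nearest integer (4 here), so the
    // remainder is negative, and q only holds low-order quotient bits.
    int q;
    double r = std::remquo(x, h_, &q);  // about -0.1

    std::printf("naive:  i=%d  xx=%g\n", i, xx);
    std::printf("remquo: q=%d  r=%g\n", q, r);
}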
Hello stackoverflow community,
I am having trouble understanding a least-squares problem with the C++ Armadillo package.
I have a matrix A with many more rows than columns (5000 by 100, for example), so the system is overdetermined.
I want to find the x that solves A*x=b with the least squared error.
If I use Armadillo's solve function on my data, as in "x = solve(A,b)", the error "(A*x-b)^2" is sometimes way too high.
If, on the other hand, I solve for x with the analytical form "x = (A^T * A)^-1 * A^T * b", the results are always right.
The results for x in the two cases can differ by 10 orders of magnitude.
I had thought that armadillo would use this analytical form in the background if the system is overdetermined.
Now I would like to understand why these two methods give such different results.
I wanted to give a short example program, but I can't reproduce this behavior with a short program.
I thought about posting the matrix here, but at 5000 by 100 it's also very big. I can provide the values for which this happens if needed, though.
So, as a short background:
The matrix I get from my program is the numerically solved response of a nonlinear oscillator, into which I put information by wiggling a parameter of the system.
Because the influence of this parameter on the system is small, the values in my different rows are very similar but never identical; otherwise Armadillo would throw an error.
I still think this is the problem, but the solve function never threw any error.
Another thing that confuses me is that in a short example program with a random matrix, the analytical form is much slower than the solve function.
But in my program, both are nearly identical in speed.
I guess this has something to do with the numerical convergence of the pseudo-inverse and the special structure of my matrix, but I don't know enough about how Armadillo works internally to say.
I hope someone can help me with this problem; thanks a lot in advance.
Thanks for the replies. I think I figured the problem out and wanted to give some feedback for everybody who runs into the same problem.
The Armadillo solve function gives me the x that minimizes (A*x-b)^2.
I looked at the values of x and they are sometimes in the magnitude of 10^13.
This comes from the fact that the rows of my matrix change only slightly (so they are nearly linearly dependent, but not exactly).
Because of that I was at the limit of the numerical precision of my doubles, and as a result my error sometimes jumped around.
If I use the rearranged analytical formula (A^T * A)*x = A^T * b with the solve function, this problem doesn't occur anymore because the fitted values of x are in the magnitude of 10^4. The least-squares error is a little bit higher, but that is okay, as I want to avoid overfitting.
I have now additionally added Tikhonov regularization by solving (A^T * A + lambda*Identity_Matrix)*x = A^T * b with the Armadillo solve function.
Now the weight vectors are on the order of 1, and the error barely changes compared to the formula without regularization.
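For anyone who wants to compare the three variants, here is a minimal, self-contained sketch of what is described above; the matrix is random placeholder data (not the oscillator matrix from the question) and the lambda value is a made-up choice that would need tuning:

#include <armadillo>
#include <iostream>

int main() {
    // Placeholder for the real, nearly rank-deficient 5000 x 100 matrix.
    arma::mat A = arma::randu<arma::mat>(5000, 100);
    arma::vec b = arma::randu<arma::vec>(5000);

    // 1) Direct least-squares solve of the overdetermined system.
    arma::vec x1 = arma::solve(A, b);

    // 2) Normal equations: (A^T A) x = A^T b.
    arma::vec x2 = arma::solve(A.t() * A, A.t() * b);

    // 3) Tikhonov regularization: (A^T A + lambda I) x = A^T b.
    double lambda = 1e-3;   // made-up value, tune for the actual data
    arma::vec x3 = arma::solve(A.t() * A + lambda * arma::eye<arma::mat>(100, 100),
                               A.t() * b);

    std::cout << "residual norms: "
              << arma::norm(A * x1 - b) << " "
              << arma::norm(A * x2 - b) << " "
              << arma::norm(A * x3 - b) << std::endl;
    return 0;
}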
This documentation page of boost::math::tools::brent_find_minima says about its first argument:
The function to minimise: a function object (or C++ lambda) ... with no maxima occurring in that interval.
But what happens if this is not the case? (After all, this condition is rather difficult to pre-ensure, especially since the function is usually expensive to evaluate at many points.) Best would be to detect violations to this condition on the fly.
If this condition is violated, does boost throw an exception, or does it exhibit undefined behavior?
A workaround I am thinking of is to build the checking into the lambda ("function to minimize"), by capturing and maintaining a std::map<double,double> holding all the points that have been evaluated, and comparing each new evaluation with its nearest neighbor in each direction, to check whether there may be a local maximum. But I don't want to do all that if it isn't necessary.
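Roughly, the wrapper I have in mind would look something like this (just a sketch; the class name and the throwing policy are my own choices, and since brent_find_minima copies its function argument it would have to be passed via std::ref so the recorded points are shared):

#include <iterator>
#include <map>
#include <stdexcept>
#include <utility>

// Records every evaluation and throws if a newly evaluated point is higher
// than both of its nearest already-evaluated neighbours, i.e. looks like an
// interior maximum among the sampled points.
template <class F>
class checked_function {
public:
    explicit checked_function(F f) : f_(std::move(f)) {}

    double operator()(double x) {
        const double y = f_(x);
        auto result = seen_.emplace(x, y);
        auto it = result.first;
        if (result.second) {   // only check genuinely new points
            auto right = std::next(it);
            const bool above_left  = (it != seen_.begin()) && y > std::prev(it)->second;
            const bool above_right = (right != seen_.end()) && y > right->second;
            if (above_left && above_right)
                throw std::runtime_error("possible interior maximum detected");
        }
        return y;
    }

private:
    F f_;
    std::map<double, double> seen_;   // x -> f(x) for all evaluated points
};

This only flags a maximum among the points the minimiser happens to sample, so it is a heuristic rather than a guarantee; since the function itself is expensive, the extra map bookkeeping should be negligible.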
There is no way for this to be done. If you read Corless's A Graduate Introduction to Numerical Methods, you'll read a very interesting point: All numerically defined functions are discontinuous halfway between representables, and have zero derivatives between representables. Basically they can be thought of as a sum of Heaviside functions.
So none of them are differentiable in the mathematical sense. Ok, maybe you think this is a bit unfair; the scale should be zoomed out. But how much? We know that |x-1| isn't differentiable at x=1, but how could a computer tell that? How does it know that there isn't some locally smooth mollifier that makes it differentiable between x=1-eps and x=1+eps? I don't think there's a good answer to this question.
One of the most difficult problems in this class arises in quadrature. Some of these methods work fast when the complex extension of the function has poles far from the real axis. Try to numerically determine that.
Function spaces are impossible to determine numerically. Users just have to get it right.
Here's the problem:
I am currently trying to create a control system which is required to find a solution to a series of complex linear equations without a unique solution.
My problem arises because there will only ever be six equations, while there may be upwards of 20 unknowns (usually far more than six). Of course, this will not yield an exact solution through standard Gaussian elimination or by putting the system in a matrix and reducing it to reduced row echelon form.
However, I think I may be able to optimize things further and get a more accurate solution, because I know that each of the unknowns cannot have a value smaller than zero or greater than one, but is free to take on any value in between.
Of course, I am trying to create code that finds a correct solution, but in the case that there are multiple combinations that yield satisfactory results, I would want to minimize the sum of (value of unknown * efficiency constant) over all unknowns, i.e. Sigma[x_i*e_i] for i = 1 to n, though finding an accurate solution is the greater priority.
Performance is also important, due to the fact that this algorithm may need to be run several times per second.
So, does anyone have any ideas to help me on implementing this?
Edit: You might just want to stick to linear programming with equality and inequality constraints, but here's an interesting exact solution that does not incorporate the constraint that your unknowns are between 0 and 1.
Here's a powerpoint discussing your problem: http://see.stanford.edu/materials/lsoeldsee263/08-min-norm.pdf
I'll translate your problem into math to make things a bit easier to figure out:
You have a 6x20 matrix A and a vector x with 20 elements. You want to minimize (x^T)e subject to Ax=y. According to the slides, if you were just minimizing the norm of x, then the answer is x = A^T(AA^T)^(-1)y. I'll take another look at this as soon as I get the chance and see what the solution is to minimizing (x^T)e (i.e. your specific problem).
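Purely as an illustration of that minimum-norm formula (the use of Armadillo is my own choice, since it appears elsewhere on this page; the data are random placeholders, not your control system):

#include <armadillo>
#include <iostream>

int main() {
    // 6 equations, 20 unknowns, filled with placeholder random data.
    arma::mat A = arma::randu<arma::mat>(6, 20);
    arma::vec y = arma::randu<arma::vec>(6);

    // Minimum-norm solution of A x = y for a wide, full-row-rank A:
    //   x = A^T (A A^T)^{-1} y
    arma::vec x = A.t() * arma::solve(A * A.t(), y);

    std::cout << "residual norm: " << arma::norm(A * x - y) << std::endl;
    return 0;
}

Solving against A*A^T rather than forming the explicit inverse is generally the numerically safer choice.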
Edit: I looked in the powerpoint some more and near the end there's a slide entitled "General norm minimization with equality constraints". I am going to switch the notation to match the slide's:
Your problem is that you want to minimize ||Ax-b||, where b = 0 and A is your e vector and x is the 20 unknowns. This is subject to Cx=d. Apparently the answer is:
x = (A^T A)^-1 (A^T b - C^T (C (A^T A)^-1 C^T)^-1 (C (A^T A)^-1 A^T b - d))
It's not pretty, but it's not as bad as you might think. There really aren't that many calculations. For example, (A^T A)^-1 only needs to be calculated once and then you can reuse the answer. And your matrices aren't that big.
Note that I didn't incorporate the constraint that the elements of x are within [0,1].
It looks like the solution for what I am doing is linear programming. It is starting to come back to me, but if I have other problems I will post them in their own dedicated questions instead of turning this into an encyclopedia.
I made a collision equation (col and cold are lines; ->x and ->y are their start points, and h() and w() are their height and width; o and z are the unknowns):
col->x+(col->w())*o=cold->x+(cold->w())*z;
col->y+(col->h())*o=cold->y+(cold->h())*z;
and I solved it:
z=(cold->y-col->y-col->h()/col->w()* (cold->x-col->x))/(col->h()/col->w()*cold->w() - cold->h());
o=(cold->x+cold->w()*z-col->x)/col->w();
It works well (I think?), but if one of the lines is vertical or horizontal I get NaNs everywhere. Does anybody have an idea why? Is the solution correct (I derived it six times)?
You're probably dividing zero by zero in those cases.
I'd suggest breaking your assignment down into step-by-step pieces, and checking the value as you go.
You're probably doing it wrong. It's kind of hard to tell, though, because you apparently can't be bothered to tell us what on earth you're trying to do. Oh, and if you're doing it wrong, it's still wrong if you've been doing it wrong six times. I found that doing it right once is by far the superior approach.
Using Mathematica (I'm lazy) I get:
o = (coldw*(coldy - coly) - coldh*(coldx - colx)) / (coldw*colh - coldh*colw)
z = (colw*(coldy - coly) - colh*(coldx - colx)) / (coldw*colh - coldh*colw)
This should give you a division by zero if coldw*colh = coldh*colw, or coldh/coldw = colh/colw, i.e. when both slopes are equal (in other words, when both lines are parallel).
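If you instead solve the 2x2 system directly with Cramer's rule, the divisions by col->w() disappear and only genuinely parallel lines are a problem. A sketch (the struct and function names are made up; the algebra mirrors the equations above):

#include <cmath>
#include <optional>

struct Line {
    double x, y;   // start point
    double w, h;   // width and height, as returned by w() and h()
};

struct Params { double o, z; };

// Solves  a.x + a.w*o = b.x + b.w*z  and  a.y + a.h*o = b.y + b.h*z.
// Vertical or horizontal lines (w == 0 or h == 0) are fine; only parallel
// (or zero-length) lines make the determinant vanish, and that case is
// reported as "no unique intersection" instead of producing NaNs.
std::optional<Params> intersect(const Line& a, const Line& b) {
    const double det = b.w * a.h - a.w * b.h;   // coldw*colh - colw*coldh
    if (std::abs(det) < 1e-12)
        return std::nullopt;
    const double dx = b.x - a.x;                // coldx - colx
    const double dy = b.y - a.y;                // coldy - coly
    return Params{ (b.w * dy - b.h * dx) / det,   // o
                   (a.w * dy - a.h * dx) / det }; // z
}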