How does the comma in line 2 translate to Fortran

lat2: =ASIN(SIN(lat1)*COS(d/ER) + COS(lat1)*SIN(d/ER)*COS(brng))
lon2: =lon1 + ATAN2(COS(d/ER)-SIN(lat1)*SIN(lat2), SIN(brng)*SIN(d/ER)*COS(lat1))
The above lines are part of a calculation that starts at Lat1 and Long1 and travels along a given azimuth for a given distance to arrive at Lat2 and Long2.
I am trying to convert the equations to Fortran, but I do not understand what to do with the comma. My current code works for most test cases but is not correct when the path crosses the 0 or 360 deg longitude line: instead of, say, +10 deg E, I get 350 deg E for Long2. I hope your model using the above equations handles the quadrant problem better.

ATAN2 is a Fortran intrinsic function of two arguments. It is a common function that exists in several other programming languages as well, probably also in the language you are copying your lines from. You should have told us which language that is!
The function "computes the principal value of the argument function of the complex number X + i Y". The comma simply separates the first argument from the second.
Check whether the language you are translating from passes x and y to ATAN2 in the same order as Fortran, which expects ATAN2(Y, X). If not, swap the two arguments. Then simply call the Fortran function.
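For example, here is a minimal sketch in Python, whose math.atan2(y, x) uses the same argument order as Fortran's ATAN2(Y, X). The swap relative to the formulas above assumes they came from a spreadsheet such as Excel, whose ATAN2 takes the x argument first:

# Minimal sketch of the destination-point formulas with atan2(y, x),
# i.e. the Fortran argument order. All angles are in radians.
from math import asin, atan2, sin, cos, pi

def destination(lat1, lon1, brng, d, ER):
    lat2 = asin(sin(lat1)*cos(d/ER) + cos(lat1)*sin(d/ER)*cos(brng))
    lon2 = lon1 + atan2(sin(brng)*sin(d/ER)*cos(lat1),    # y: sine-like term
                        cos(d/ER) - sin(lat1)*sin(lat2))  # x: cosine-like term
    return lat2, ((lon2 + pi) % (2*pi)) - pi              # normalize to [-pi, pi)

ATAN2 resolves the quadrant automatically, so a correct argument order should also cure the 350-degree symptom; the final line just normalizes lon2 back into a conventional range.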
I do not understand the remark "hope your model using the above equations handles the quadrant problem better." Is it just a left-over from some private communication?

Taking first observation of a series as unobserved parameter in non-linear regression

I am trying to estimate the following equation in Stata using the non-linear least squares command nl:
π_t = π_t^e + α (y_t − y_t*)
where π_t^e = γ π_{t−1}^e + (1 − γ) π_{t−1}
π_t, y and y* are already given in the dataset; π_t^e is constructed from π_t using the second equation. There are 34 observations in the dataset. The first observation of π_t^e, i.e. π_1^e, is treated as an unobserved parameter, along with γ and α.
I wrote the following code but it is not working:
local s "{pi_1_e}"
local s "({gamma}*L.`s'+ (1-{gamma})*L.pi)"
nl ( pi = (`s') + {alpha}*(y-y*))
The first line of the code assigns {pi_1_e} to s, but the second line replaces s with
({gamma}*L.`s'+ (1-{gamma})*L.pi)
For _n==1, L.`s' doesn't exist, so it is replaced with a missing value, and the other 33 observations are then assigned missing values as well. I wish to apply the second line of the code only for _n>=2, but an if condition doesn't work with a local macro.
Can someone help me understand how to resolve this?
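For intuition, here is a small sketch in Python with sympy (not Stata; the symbol names are illustrative): unrolling the recursion shows that every π_t^e carries a γ^(t−1) π_1^e term, which is why the nl expression has to nest the macro t−1 times rather than use a single lag.

import sympy as sp

gamma, pi_1_e = sp.symbols('gamma pi_1_e')
pi_obs = sp.symbols('pi_1:5')          # observed pi_1 .. pi_4 (illustrative)

pi_e = [pi_1_e]                        # pi_1^e is the unobserved parameter
for t in range(1, 4):
    # pi_{t+1}^e = gamma*pi_t^e + (1 - gamma)*pi_t
    pi_e.append(gamma*pi_e[-1] + (1 - gamma)*pi_obs[t - 1])

for t, e in enumerate(pi_e, 1):
    print(f'pi_{t}^e =', sp.expand(e))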

sympy solve vs. solveset vs. nsolve

I am trying to solve the following equation for r:
from sympy import pi, S, solve, solveset, nsolve, symbols
(n_go, P_l, T, gamma_w, P_g, r, R_mol) = symbols(
    'n_go, P_l, T, gamma_w, P_g, r, R_mol', real=True)
expr = -P_g + P_l - 3*R_mol*T*n_go/(4*r**3*pi) + 2*gamma_w/r
soln = solveset(expr, r, domain=S.Reals)
soln1 = solve(expr, r)
soln is of the form Complement(Intersection(FiniteSet(...))), which I really don't know what to do with.
soln1 is a list of 3 expressions, two of which are complex. In fact, if I substitute values for the symbols and compute the solutions for soln1, all are complex:
vdict = {n_go: 1e-09, P_l: 101325, T: 300, gamma_w: 0.07168596252716256, P_g: 3534.48011713030, R_mol: 8.31451457896800}
for result in soln1:
    print(result.subs(vdict).n())
returns:
-9.17942953565355e-5 + 0.000158143657514283*I
-9.17942953565355e-5 - 0.000158143657514283*I
0.000182122477993494 + 1.23259516440783e-32*I
Interestingly, substituting values first and then using solveset() or solve() gives a real result:
solveset(expr.subs(vdict), r, domain=S.Reals).n()
{0.000182122477993494}
Conversely, nsolve fails with this equation, unless the starting point contains the first 7 significant digits of the solution(!):
nsolve(expr.subs(vdict), r, 0.000182122)
ValueError: Could not find root within given tolerance. (9562985778.9619347103 > 2.16840434497100886801e-19)
It should not be that hard, judging from a plot of the substituted expression, which shows a single real root.
My questions:
Why is nsolve so useless here?
How can I use the solution returned from solveset to compute any numerical solutions?
Why can I not obtain a real solution from solve if I solve first and then substitute values?
The answer from Maelstrom is good but I just want to add a few points.
The values you substitute are all floats, and with those values the polynomial is ill-conditioned. That means that the form of the expression you substitute into can affect the accuracy of the returned results. That is one reason why substituting values into the solution from solve does not necessarily give exactly the same value as substituting before calling solve.
Also before you substitute the symbols it isn't possible for solve to know which of the three roots is real. That's why you get three solutions from solve(expr, r) and only one solution from solve(expr.subs(vdict), r). The third solution which is real after the substitution is the same (ignoring the tiny imaginary part) as returned by solve after the substitution:
In [7]: soln1[2].subs(vdict).n()
Out[7]: 0.000182122477993494 + 1.23259516440783e-32⋅ⅈ
In [8]: solve(expr.subs(vdict), r)
Out[8]: [0.000182122477993494]
Because the polynomial is ill-conditioned and has a large gradient at the root, nsolve has a hard time finding this root. However, nsolve can find the root if given a narrow enough interval:
In [9]: nsolve(expr.subs(vdict), r, [0.0001821, 0.0001823])
Out[9]: 0.000182122477993494
Since this is essentially a polynomial, your best bet is to convert it to a polynomial and use nroots. The quickest way to do this is as_numer_denom, although in this case that introduces a spurious root at zero:
In [26]: Poly(expr.subs(vdict).as_numer_denom()[0], r).nroots()
Out[26]: [0, 0.000182122477993494, -9.17942953565356e-5 - 0.000158143657514284⋅ⅈ, -9.17942953565356e-5 + 0.000158143657514284⋅ⅈ]
Your expr is essentially a cubic equation.
Applying the subs before or after solving should not substantially change anything.
soln
soln is of the form Complement(Intersection(FiniteSet(<3 cubic solutions>), Reals), FiniteSet(0)) i.e. a cubic solution on a real domain excluding 0.
The following should give you a simple FiniteSet back but evalf does not seem to be implemented well for sets.
print(soln.subs(vdict).evalf())
Hopefully something will be done about it soon.
1
The reason nsolve is not useful here is that the graph is almost vertical near the root. According to your plot, the gradient is roughly 1.0e8. I don't think nsolve is useful for such steep graphs.
Plotting your substituted expression, and then zooming out, shows a pretty wild function, and I suspect nsolve uses an epsilon that is too large to be useful in this situation. To fix this, you could provide more reasonable numbers that are closer to 1 when substituting (consider different units of measurement, e.g. km/hour instead of meters/year).
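For instance, a sketch of that idea (reusing expr, vdict and r from the question; the scale 1e-4 is an assumption based on the size of the root):

from sympy import symbols, nsolve

s = symbols('s', positive=True)            # s = r / 1e-4, so the root is O(1)
scaled = expr.subs(vdict).subs(r, s*1e-4)
root_s = nsolve(scaled, s, 1)              # should land near 1.82122
print(root_s*1e-4)                         # back in the original units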
2
It is difficult to say how to deal with the output of solveset in general, because every type of set needs to be dealt with in a different way. It is also not really mathematically sensible: soln.args[0].args[0].args[0] gives the first cubic solution, but it forgets that this solution must be real and nonzero.
You can use args or preorder_traversal or similar tools to navigate the expression tree, and reading the documentation of the various set classes should help. solve and solveset need to be used "interactively", because there are many possible output forms and many ways to interpret them.
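For example, a sketch that assumes soln has exactly the Complement(Intersection(FiniteSet(...), Reals), FiniteSet(0)) shape described above:

candidates = soln.args[0].args[0]        # the FiniteSet of cubic solutions
numeric = [c.subs(vdict).n(chop=True) for c in candidates]
real_roots = [v for v in numeric if v.is_real and v != 0]
print(real_roots)                        # expected: [0.000182122477993494]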
3
I believe soln1 has 3 solutions, not 4; otherwise, your loop would print 4 lines instead of 3. All of them are technically complex (as is the nature of floating-point results). However, the third solution has a very small imaginary component. To remove these kinds of finicky artifacts, there is an argument called chop which should help:
for result in soln1:
    print(result.subs(vdict).n(chop=True))
One of the results is 0.000182122477993494 which looks like your root.
Here is an answer to the underlying question: how to compute the roots of the above equation efficiently?
Based on the suggestion by @OscarBenjamin, we can do even better and faster by using Poly and roots instead of nroots. Below, sympy computes the roots of the equation for 100 different values of P_g in no time, while keeping everything else constant:
from sympy import pi, Poly, roots, solve, solveset, nsolve, nroots, symbols
(n_go, P_l, T, gamma_w, P_g, r, R_mol) = symbols(
    'n_go, P_l, T, gamma_w, P_g, r, R_mol', real=True)
vdict = {pi:pi.n(), n_go:1e-09, P_l:101325, T:300, gamma_w:0.0717, R_mol: 8.31451457896800}
expr = -P_g + P_l - 3*R_mol*T*n_go/(4*r**3*pi) + 2*gamma_w/r
expr_poly = Poly(expr.as_numer_denom()[0], n_go, P_l, T, gamma_w, P_g, r, R_mol, domain='RR[pi]')
result = [roots(expr_poly.subs(vdict).subs(P_g, val)).keys() for val in range(4000,4100)]
All that remains is to check whether the solutions fulfil our conditions (real and positive). Thank you, @OscarBenjamin!
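A sketch of that final filtering step (names from the code above; each root is evaluated numerically first so the reality check is reliable):

real_positive = [
    [rt for rt in (root.evalf(chop=True) for root in rs)
     if rt.is_real and rt > 0]
    for rs in result
]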
PS: Should I expand the topic above to include nroots and roots?

root mean square (RMSD) of two datasets

I'm dragging along in Python, learning slowly but making progress. I have hit a wall, and don't even know where to start on this.
I have other scripts that get me to where I am now: two output CSV files with multiple rows containing 4 numbers each. The first number is an identifier integer, the other three are X, Y, Z coordinates.
Now the OTHER file is the same thing, with the same set of identifier integers, but a different set of X, Y, Z coordinates.
For each identifier integer, I want to calculate the RMSD between the X, Y, Z coordinates. In other words, I think I need to do (X2-X1)^2 + (Y2-Y1)^2 + (Z2-Z1)^2 and then take the square root of that. This will give me a float as an output answer, which I'd like to write into an output file of two columns: one with the identifier integer, and the second with the output from this script.
I actually have no idea where to start on this one.. I've never had to work with two files at once. Gah!
thanks so much!!
sorry I have no script to even start here!
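A minimal sketch of one way to do this, assuming hypothetical file names and comma-separated id,x,y,z rows with no header:

import csv, math

def read_points(path):
    # map each identifier to its (x, y, z) tuple
    with open(path, newline='') as f:
        return {int(row[0]): tuple(map(float, row[1:4]))
                for row in csv.reader(f)}

a = read_points('file1.csv')    # hypothetical file names
b = read_points('file2.csv')

with open('distances.csv', 'w', newline='') as f:
    out = csv.writer(f)
    for ident in sorted(a.keys() & b.keys()):
        (x1, y1, z1), (x2, y2, z2) = a[ident], b[ident]
        d = math.sqrt((x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2)
        out.writerow([ident, d])

Note that, strictly speaking, this is the Euclidean distance between the two points; divide by sqrt(3) if you want the root mean square of the three per-coordinate deviations.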

Obtaining exact positions

I have a simple program written in standard FORTRAN 77 for numerically integrating equations of motion. The integration loop is the following:
      yant = x0(2)
      DO i = 1, n-1
        ti = t0 + DBLE(i-1)*tstep
        t = ti
        CALL bstoer8(t, tstep, x, ndimf, ierr, derivs)
        IF (x(2)*yant .LT. 0d0) THEN
          WRITE(52,'(7(F16.8))') t, x
        ENDIF
        yant = x(2)
      ENDDO
The bstoer8 module contains the standard Bulirsch-Stoer integrator and can be found here.
As we can see, I want to print, to an external data file, the time and all six vector elements (x, y, z, p_x, p_y, p_z) when y = 0.
However, I do not get the exact times when y = 0; what I get is the closest time step. For example, one of the lines in the data file is the following:
-0.17000000 10.45572291 0.00264921 -0.83321521 -0.21271715 45.32160003 -1.24830046
We observe that y is very small (0.00264921) but not exactly zero. Moreover, the time t has only two meaningful decimal digits because the time step of the numerical integration is 0.01.
So, my question is the following: How can I obtain the exact times when y = 0? In other words, how can I have y equal to 0 (with eight decimal digits) and the corresponding time with eight decimal digits?
Many thanks in advance!
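One standard refinement, sketched here in Python rather than FORTRAN 77 (step is a hypothetical stand-in for a single bstoer8 call that advances a saved state by an arbitrary sub-step): once x(2)*yant changes sign, re-integrate from the previous step with bisected sub-steps until y is as small as you like, which pins down t to the same accuracy.

def refine_crossing(step, state_prev, t_prev, dt, tol=1e-12):
    # Bisect on the sub-step length until y = state[1] (Fortran x(2))
    # vanishes to within tol; assumes y changes sign over (t_prev, t_prev+dt).
    lo, hi = 0.0, dt
    y_lo = state_prev[1]
    mid, state_mid = hi, None
    for _ in range(200):
        mid = 0.5*(lo + hi)
        state_mid = step(state_prev, t_prev, mid)   # re-integrate from t_prev
        y_mid = state_mid[1]
        if abs(y_mid) < tol:
            break
        if y_lo * y_mid < 0.0:     # crossing lies in (lo, mid)
            hi = mid
        else:                      # crossing lies in (mid, hi)
            lo, y_lo = mid, y_mid
    return t_prev + mid, state_mid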

Calculating a relative Levenshtein distance - make sense?

I am using both Daitch-Mokotoff soundexing and Damerau-Levenshtein to find out if a user entry and a value in the application are "the same".
Is Levenshtein distance supposed to be used as an absolute value? If I have a 20 letter word, a distance of 4 is not so bad. If the word has 4 letters...
What I am now doing is taking the distance / length to get a distance that better reflects what percentage of the word has been changed.
Is that a valid/proven approach? Or is it plain stupid?
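Concretely, the normalization described above amounts to something like this sketch (the levenshtein function here is the textbook dynamic-programming version, not the poster's actual code):

def levenshtein(a, b):
    # classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def relative_distance(a, b):
    return levenshtein(a, b) / max(len(a), len(b), 1)

print(relative_distance('cat', 'scat'))                # 0.25
print(relative_distance('difference', 'differences'))  # ~0.09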
Is Levenshtein distance supposed to be used as an absolute value?
It seems like it would depend on your requirements. (To clarify: Levenshtein distance is an absolute value, but as the OP pointed out, the raw value may not be as useful for a given application as a measure that takes the length of the word into account. This is because we are really more interested in similarity than distance per se.)
I am using both Daitch-Mokotoff soundexing and Damerau-Levenshtein to find out if a user entry and a value in the application are "the same".
Sounds like you're trying to determine whether the user intended their entry to be the same as a given data value?
Are you doing spell-checking, or conforming invalid input to a known set of values?
What are your priorities?
Minimize false positives (try to make sure all suggested words are very "similar", and list of suggestions is short)
Minimize false negatives (try to make sure that the string the user intended is in the list of suggestions, even if it makes the list long)
Maximize average matching accuracy
You might end up using the Levenshtein distance in one way to determine whether a word should be offered in a suggestion list; and another way to determine how to order the suggestion list.
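For example (a sketch, reusing a levenshtein function like the one sketched in the question; max_rel is an arbitrary cutoff):

def suggestions(word, vocabulary, max_rel=0.34):
    cands = [(levenshtein(word, v) / max(len(word), len(v)), v)
             for v in vocabulary]
    return [v for d, v in sorted(cands) if d <= max_rel]  # filter, then order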
It seems to me, if I've inferred your purpose correctly, that the core thing you want to measure is similarity rather than difference between two strings. As such, you could use Jaro or Jaro-Winkler distance, which takes into account the length of the strings and the number of characters in common:
The Jaro distance d_j of two given strings s1 and s2 is
d_j = (m / |s1| + m / |s2| + (m − t) / m) / 3
where:
m is the number of matching characters
t is the number of transpositions
Jaro–Winkler distance uses a prefix scale p which gives more favourable ratings to strings that match from the beginning for a set prefix length l.
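A sketch of the quoted formula (similarity in [0, 1]; not a production implementation):

def jaro(s1, s2):
    if not s1 or not s2:
        return 0.0
    window = max(0, max(len(s1), len(s2)) // 2 - 1)
    match1 = [False] * len(s1)
    match2 = [False] * len(s2)
    m = 0
    for i, c in enumerate(s1):                 # find matching characters
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not match2[j] and s2[j] == c:
                match1[i] = match2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    k = t = 0                                  # count transpositions
    for i, c in enumerate(s1):
        if match1[i]:
            while not match2[k]:
                k += 1
            if c != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (m/len(s1) + m/len(s2) + (m - t)/m) / 3

print(jaro('martha', 'marhta'))   # ~0.944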
The Levenshtein distance is a relative value between two words. Comparing the LD to the length is not relevant, e.g.
cat -> scat = 1 (75% similar??)
difference -> differences = 1 (90% similar??)
Both these pairs have Levenshtein distances of 1, i.e. they differ by one character, but compared to their lengths the second pair would appear to be "more" similar.
I use soundexing to rank words that have the same Levenshtein distance, e.g.
cat and fat both have an LD of 1 relative to kat, but the word is more likely to be kat than fat when using soundex (assuming the word is incorrectly spelt, not incorrectly typed!).
So the short answer is: just use the Levenshtein distance to determine the similarity.