Two-piece normal distribution in Stata

The overall variance and skewness of the two-piece normal (TPN) distribution are given by:
V(x) = σ1 σ2 + (1 − k^2)(σ2 − σ1)^2
γ(x) = k (σ2 − σ1) [(2k^2 − 1)(σ2 − σ1)^2 + σ1 σ2]
where k = sqrt(2/π).
For given values of V(x) and γ(x), the two scale parameters (σ1 and σ2) can be solved for. How can I set up this system of nonlinear simultaneous equations in Stata? Any help?
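To make the structure of the problem concrete, here is a minimal sketch in Python (not Stata) that solves the two equations for σ1 and σ2 with a hand-written Newton iteration and a finite-difference Jacobian. It assumes k = sqrt(2/π), which is the constant in the standard two-piece normal moment formulas; the function names `tpn_moments` and `solve_tpn` are made up for this illustration. The same two residual equations could be handed to any nonlinear equation solver.

```python
import math

K = math.sqrt(2 / math.pi)  # assumption: k = sqrt(2/pi)


def tpn_moments(s1, s2):
    """Return (V, gamma) from the two formulas in the question."""
    d = s2 - s1
    variance = s1 * s2 + (1 - K**2) * d**2
    gamma = K * d * ((2 * K**2 - 1) * d**2 + s1 * s2)
    return variance, gamma


def solve_tpn(variance, gamma, tol=1e-12, max_iter=50, h=1e-7):
    """Newton iteration on the 2x2 system; starts from the symmetric case."""
    s1 = s2 = math.sqrt(variance)
    for _ in range(max_iter):
        f1, f2 = tpn_moments(s1, s2)
        r1, r2 = f1 - variance, f2 - gamma  # residuals
        if abs(r1) < tol and abs(r2) < tol:
            break
        # Central-difference approximation of the Jacobian.
        a1, a2 = tpn_moments(s1 + h, s2)
        b1, b2 = tpn_moments(s1 - h, s2)
        c1, c2 = tpn_moments(s1, s2 + h)
        d1, d2 = tpn_moments(s1, s2 - h)
        j11, j21 = (a1 - b1) / (2 * h), (a2 - b2) / (2 * h)
        j12, j22 = (c1 - d1) / (2 * h), (c2 - d2) / (2 * h)
        det = j11 * j22 - j12 * j21
        # Solve J * delta = -r by Cramer's rule and update.
        s1 -= (r1 * j22 - r2 * j12) / det
        s2 -= (j11 * r2 - j21 * r1) / det
    return s1, s2


# Round trip: compute the moments for known parameters, then recover them.
v, g = tpn_moments(1.0, 2.0)
print(solve_tpn(v, g))
```

The symmetric starting point is only a convenient guess; strongly skewed targets may need a different start or a more robust solver.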

Related

Find euler function of binomial coefficient

I've been trying to solve this problem:
Find Euler's totient function of binomial coefficient C(n, m) = n! / (m! (n - m)!) modulo 10^9 + 7, m <= n < 2 * 10^5.
One of my ideas was this: first, precalculate phi(i) for all i from 1 to n in linear time, and calculate the inverses of all numbers from 1 to n modulo 10^9 + 7 using, for example, Fermat's little theorem. In general, phi(m * n) = phi(m) * phi(n) * (d / phi(d)), where d = gcd(m, n). Since gcd((x - 1)!, x) is 1 if x is prime, 2 if x = 4, and x in all other cases, we can calculate phi(x!) modulo 10^9 + 7 in linear time. However, in the last step we need phi(n! / (m! (n - m)!)) (given that we already know the function for factorials), so with this method we would have to know gcd(C(n, m), m! (n - m)!), and I don't know how to find it.
I've also been thinking about factorizing the binomial coefficient, but there seems to be no efficient way to do this.
Any help would be appreciated.
First, factorize all numbers 1..(2*10^5) as products of prime powers.
Now, factorize n!/k! = n(n-1)(n-2)...(k+1) as a product of prime powers by multiplying together the factors of the individual parts. Factorize (n-k)! as a product of prime powers. Subtract the latter powers from the former (to account for the division).
Now you've got C(n, k) as a product of prime powers. Use the formula phi(N) = N * prod(1 - 1/p for p|N) to calculate phi(C(n, k)), which is straightforward given that you've computed a list of all the prime powers that divide C(n, k) in the second step.
For example:
C(9, 4) = 9*8*7*6*5 / (5*4*3*2*1)
9*8*7*6*5 = 3*3 * 2*2*2 * 7 * 3*2 * 5 = 7*5*3^3*2^4
5*4*3*2*1 = 5 * 2*2 * 3 * 2 * 1 = 5*3*2^3
9*8*7*6*5/(5*4*3*2*1) = 7*3^2*2
phi(C(9, 4)) = 7*3^2*2 * (1 - 1/7) * (1 - 1/3) * (1 - 1/2) = 36
I've done it in integers rather than integers mod M, but it seems like you already know how division works in the modulo ring.
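The steps above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a tuned solution: the names `smallest_prime_factors` and `phi_binomial` are made up here, the factorization uses a smallest-prime-factor sieve, and phi is evaluated as the product of p^(e-1) * (p - 1) over the prime powers p^e so that no modular division is needed.

```python
import math

MOD = 10**9 + 7


def smallest_prime_factors(limit):
    """spf[x] is the smallest prime dividing x, for 2 <= x <= limit."""
    spf = list(range(limit + 1))
    for i in range(2, math.isqrt(limit) + 1):
        if spf[i] == i:  # i is prime
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:
                    spf[j] = i
    return spf


def phi_binomial(n, k, mod=MOD):
    """Euler's totient of C(n, k), reduced modulo `mod`."""
    spf = smallest_prime_factors(max(n, 2))
    exponents = {}

    def add_factors(x, sign):
        # Add (sign=+1) or remove (sign=-1) the prime factors of x.
        while x > 1:
            p = spf[x]
            exponents[p] = exponents.get(p, 0) + sign
            x //= p

    for x in range(k + 1, n + 1):   # numerator: n!/k!
        add_factors(x, +1)
    for x in range(2, n - k + 1):   # denominator: (n-k)!
        add_factors(x, -1)

    result = 1
    for p, e in exponents.items():
        if e > 0:
            result = result * pow(p, e - 1, mod) * (p - 1) % mod
    return result


print(phi_binomial(9, 4))  # matches the worked example: 36
```

For many queries the sieve would be built once up to 2*10^5 and reused.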

How to plot lines according to some condition?

I was solving a coding problem and came across this one. It states :
We have an infinite planar Cartesian coordinate system on which N points are plotted. The Cartesian coordinates of point i are (Xi, Yi).
Now we want to draw (N-1) line segments, which may have arbitrary lengths; the points need not lie on the lines. The slope of each line must be 1 or -1.
Let's denote by Di the minimum distance we have to walk from point i to reach a line, and let a = max(D1, D2, D3, ..., DN). We want this distance to be as small as possible.
Thus we have to draw the lines in such a way that 'a' is minimized, and compute a*sqrt(2).
Constraints :
1 <= T <= 100
2 <= N <= 10^4
|Xi|, |Yi| <= 10^9 for each valid i
Here T denotes number of test cases.
Sample input 1 :
N = 3
Points : (0,0) , (0,1) , (0,-1)
Sample output 1 :
0.5
Explanation: We should draw lines described by equations y−x+0.5=0 and y−x−0.5=0
Sample input 2 :
N = 3
Points : (0,1) , (1,0) , (-1,0)
Sample output 2 :
0
Explanation: We should draw lines described by equations y−x−1=0 and y+x−1=0
Output format :
For each test case, print a single line containing one real number — the minimum distance a multiplied by sqrt(2). Your answer will be considered correct if its absolute or relative error does not exceed 10^(-6).
Time limit: 1 sec
My understanding is that, as the slopes are 1 or -1, the equations of the lines are y = x + c or y = -x + c, and we just have to find the y-intercepts c that minimize the distance 'a' in the problem. Also, the minimum distance from a point to a line is the length of the perpendicular to the line.
I am having difficulty devising an algorithm that will check all possible values of 'c' and find the optimal one.
Let us denote M[i] the point (x[i], y[i])
The first step is to compute the distance between a point M(x, y) and a line D whose slope is equal to +/-1.
Let us denote by D and D' the lines
D: y + x + c = 0
D': y - x + c = 0
Then a short calculation shows that
the distance between M and D is equal to d(M, D) = abs(y + x + c)/sqrt(2)
the distance between M and D' is equal to d(M, D') = abs(y - x + c)/sqrt(2)
Let us now consider two different points, for example M[0] and M[1], and look for the line D of parameter c and slope +/-1 that minimizes the larger of the distances from these two points to D.
Formally, we have to find the minimum, over c and the slope, of
max(d(M[0], D), d(M[1], D))
If the slope is -1, i.e. if the equation is y+x+c=0, one can easily show that the optimal c parameter is equal to
c = -(x0 + y0 + x1 + y1)/2
The corresponding distance is equal to abs(x0+y0-x1-y1)/(2*sqrt(2))
If the slope is 1, i.e. if the equation is y-x+c=0, one can show that the optimal c parameter is equal to
c = (x0 - y0 + x1 - y1)/2
The corresponding distance is equal to abs(y0 - x0 - y1 + x1)/(2*sqrt(2))
Therefore, the minimum distance from these two points to an optimal line is the minimum of the previous two distances.
This leads us to define the following quantities for each point M[i]:
a[i] = y[i] - x[i]
b[i] = y[i] + x[i]
and then to define a distance between points M[i] and M[j] as:
d(M[i], M[j]) = min(abs(b[i]-b[j]), abs(a[i]-a[j]))
The proposed algorithm consists in finding the pair (M[i], M[j]) that minimizes this distance.
The wanted result, a*sqrt(2), is then equal to half this distance.
This corresponds to drawing N-2 of the lines through N-2 of the points (each at distance 0 from its line) and using the one remaining line for the two closest points (according to the distance defined above), placed exactly halfway between them.
(EDIT)
The complexity is not O(N^2) as previously stated.
The minimum of d(M[i], M[j]) can be found in O(N log N).
This is done by sorting the a[i] and taking the minimum of the differences between adjacent values, i.e. min(a[i+1] - a[i]),
then doing the same for the b[i], and finally taking the minimum of the two values obtained.
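A minimal Python sketch of the sorting approach described above; the function name `min_scaled_distance` is made up for this illustration, and input parsing for the T test cases is left out.

```python
def min_scaled_distance(points):
    """Return a*sqrt(2) for a list of (x, y) points, with len(points) >= 2."""
    a = sorted(y - x for x, y in points)  # offsets for lines of slope +1
    b = sorted(y + x for x, y in points)  # offsets for lines of slope -1
    # Smallest gap between adjacent sorted values in each family.
    best_a = min(q - p for p, q in zip(a, a[1:]))
    best_b = min(q - p for p, q in zip(b, b[1:]))
    # The shared line sits halfway between the two closest points.
    return min(best_a, best_b) / 2


print(min_scaled_distance([(0, 0), (0, 1), (0, -1)]))  # sample 1: 0.5
print(min_scaled_distance([(0, 1), (1, 0), (-1, 0)]))  # sample 2: 0.0
```

Sorting dominates the cost, so each test case runs in O(N log N).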

How to solve an algebraic equation in formal power series?

Motivation. It is well known that the generating function for the Catalan numbers satisfies a quadratic equation. I would like to obtain the first several coefficients of a function implicitly defined by an algebraic equation (not necessarily a quadratic one!).
Example.
import sympy as sp
sp.init_printing() # math as latex
from IPython.display import display
z = sp.Symbol('z')
F = sp.Function('F')(z)
equation = 1 + z * F**2 - F
display(equation)
solution = sp.solve(equation, F)[0]
display(solution)
display(sp.series(solution))
Question. The approach of explicitly solving the equation and then expanding the solution as a power series works only for low-degree equations. How can I obtain the first coefficients of the formal power series for more complicated algebraic equations?
Related.
Since algebraic and differential framework may behave differently, I posted another question.
Sympy: how to solve differential equation in formal power series?
I don't know of a built-in way, but plugging in a polynomial for F and equating coefficients works well enough. One should not try to find all coefficients at once from a large nonlinear system, though; that gives SymPy trouble. I take an iterative approach: first equate the constant term to zero and solve for c0, then equate the coefficient of z to zero and solve for c1, and so on.
This assumes a regular algebraic equation, in which the coefficient of z**k in the equation involves the k-th Taylor coefficient of F, and does not involve higher-order coefficients.
from sympy import Poly, Symbol, series, solve, symbols
z = Symbol('z')
d = 10 # how many coefficients to find
c = list(symbols('c:{}'.format(d))) # undetermined coefficients
for k in range(d):
    F = sum([c[n]*z**n for n in range(k+1)]) # up to z**k inclusive
    equation = 1 + z * F**2 - F
    coeff_eqn = Poly(series(equation, z, n=k+1).removeO(), z).coeff_monomial(z**k)
    c[k] = solve(coeff_eqn, c[k])[0]
sol = sum([c[n]*z**n for n in range(d)]) # solution
print(series(sol + z**d, z, n=d)) # add z**d to get SymPy to print as a series
This prints
1 + z + 2*z**2 + 5*z**3 + 14*z**4 + 42*z**5 + 132*z**6 + 429*z**7 + 1430*z**8 + 4862*z**9 + O(z**10)
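As a cross-check that does not depend on SymPy, the same coefficients can be obtained by a different technique: fixed-point iteration on truncated coefficient lists. This is only a sketch and assumes the equation can be rewritten as F = G(z, F) where every occurrence of F in G is multiplied by z (here F = 1 + z*F**2), so that each pass fixes at least one more coefficient. The helper names `mul_trunc`, `solve_fixed_point` and `catalan_step` are made up for this illustration.

```python
def mul_trunc(p, q, order):
    """Multiply two coefficient lists, keeping only terms below z**order."""
    out = [0] * order
    for i, pi in enumerate(p[:order]):
        if pi:
            for j, qj in enumerate(q[:order - i]):
                out[i + j] += pi * qj
    return out


def catalan_step(f, order):
    """One pass of F -> 1 + z*F**2 on truncated coefficient lists."""
    square = mul_trunc(f, f, order)
    shifted = [0] + square[:order - 1]  # multiply by z
    shifted[0] += 1                     # add the constant term
    return shifted


def solve_fixed_point(step, order):
    """Iterate `step` from the zero series; `order` passes are enough here."""
    f = [0] * order
    for _ in range(order):
        f = step(f, order)
    return f


print(solve_fixed_point(catalan_step, 10))
```

For another equation only the step function changes, as long as the assumption about the factor of z still holds.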

Matrix Representation of Second Degree Linear Recurrence Equations

I can work out the matrix representation of first-degree linear recurrence equations, and I evaluate higher-order ones using fast matrix exponentiation. I learnt this from this tutorial:
http://fusharblog.com/solving-linear-recurrence-for-programming-contest/
But I am having trouble working out the matrix representation of second-degree recurrence equations. For example:
S(n) = a * (S(n - 1))^2 + b * S(n - 1) + c
where S(0) = d
Can you help me to figure out the matrix representation of the above equation or give me some insights? Thanks in advance.
This recurrence is a polynomial of second degree in S(n - 1), so it is not linear. The well-known recurrence
x_(n+1) = (x_n)^2 + c
often called the quadratic map, is not in general solvable in closed form. The quadratic iteration
x_(n+1) = a (x_n)^2 + b x_n + c
is the real version of the complex map that defines the Mandelbrot set.
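Since the answer above points out that such quadratic recurrences generally have no closed form, one fallback is to iterate the recurrence directly, which takes O(n) steps. The sketch below is only an illustration: the function name `iterate_quadratic` and the optional `mod` parameter (reducing modulo some number, as contest problems often require) are assumptions, not part of the question.

```python
def iterate_quadratic(a, b, c, d, n, mod=None):
    """Return S(n) for S(k) = a*S(k-1)**2 + b*S(k-1) + c with S(0) = d."""
    s = d
    for _ in range(n):
        s = a * s * s + b * s + c
        if mod is not None:
            s %= mod  # keep intermediate values small
    return s


print(iterate_quadratic(1, 0, 1, 0, 3))  # S: 0, 1, 2, 5
```

Without a modulus the values grow doubly exponentially, so exact evaluation is only practical for small n.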

Normalizing from [0.5 - 1] to [0 - 1]

I'm kind of stuck here, I guess it's a bit of a brain teaser. If I have numbers in the range from 0.5 to 1, how can I normalize them to the range from 0 to 1?
Thanks for any help, maybe I'm just a bit slow since I've been working for the past 24 hours straight O_O
Others have provided you the formula, but not the work. Here's how you approach a problem like this. You might find this far more valuable than just knowing the answer.
To map [0.5, 1] to [0, 1] we will seek a linear map of the form x -> ax + b. We will require that endpoints are mapped to endpoints and that order is preserved.
Method one: The requirement that endpoints are mapped to endpoints and that order is preserved implies that 0.5 is mapped to 0 and 1 is mapped to 1
a * (0.5) + b = 0 (1)
a * 1 + b = 1 (2)
This is a simultaneous system of linear equations and can be solved by multiplying equation (1) by -2 and adding the result to equation (2). Upon doing this we obtain b = -1, and substituting this back into equation (2) we obtain a = 2. Thus the map x -> 2x - 1 will do the trick.
Method two: The slope of a line passing through two points (x1, y1) and (x2, y2) is
(y2 - y1) / (x2 - x1).
Here we will use the points (0.5, 0) and (1, 1) to meet the requirement that endpoints are mapped to endpoints and that the map is order-preserving. Therefore the slope is
m = (1 - 0) / (1 - 0.5) = 1 / 0.5 = 2.
We have that (1, 1) is a point on the line and therefore by the point-slope form of an equation of a line we have that
y - 1 = 2 * (x - 1) = 2x - 2
so that
y = 2x - 1.
Once again we see that x -> 2x - 1 is a map that will do the trick.
Subtract 0.5 (giving you a new range of 0 - 0.5) then multiply by 2.
double normalize( double x )
{
    // I'll leave range validation up to you
    return (x - 0.5) * 2;
}
To add another generic answer.
If you want to map the linear range [A..B] to [C..D], you can apply the following steps:
Shift the range so the lower bound is 0. (subtract A from both bounds):
[A..B] -> [0..B-A]
Scale the range so it is [0..1]. (divide by the upper bound):
[0..B-A] -> [0..1]
Scale the range so it has the length of the new range which is D-C. (multiply with D-C):
[0..1] -> [0..D-C]
Shift the range so the lower bound is C. (add C to the bounds):
[0..D-C] -> [C..D]
Combining this to a single formula, we get:
X' = (D-C)*(X-A) / (B-A) + C
In your case, A=0.5, B=1, C=0, D=1 you get:
X' = (X-0.5) / 0.5 = 2X-1
Note, if you have to convert a lot of X to X', you can change the formula to:
X' = (D-C)/(B-A) * X + (C*B - A*D)/(B-A)
It is also interesting to take a look at nonlinear ranges. You can take the same steps, but you need an extra step to transform the linear range into a nonlinear one.
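The shift-scale-shift steps for the linear case can be written as one small function. This is a sketch; the name `remap` and the choice to raise an error for an empty source range are assumptions made for the illustration.

```python
def remap(x, a, b, c, d):
    """Linearly map x from the range [a, b] to the range [c, d]."""
    if a == b:
        raise ValueError("source range must not be empty")
    return (d - c) * (x - a) / (b - a) + c


print(remap(0.75, 0.5, 1, 0, 1))  # halfway through [0.5, 1] -> 0.5
print(remap(5, 0, 10, 0, 255))    # 127.5
```

No range validation is done on x, so values outside [a, b] map outside [c, d].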
Lazyweb answer: To convert a value x from [minimum..maximum] to [floor..ceil]:
General case:
normalized_x = ((ceil - floor) * (x - minimum))/(maximum - minimum) + floor
To normalize to [0..255]:
normalized_x = (255 * (x - minimum))/(maximum - minimum)
To normalize to [0..1]:
normalized_x = (x - minimum)/(maximum - minimum)
× 2 − 1 should do the trick
You could always use clamp or saturate within your math to make sure your final value is between 0-1. Some saturate at the end, but I've seen it done during a computation, too.
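A clamp applied at the end of the computation might look like the following sketch. The helper names `clamp` and `normalize_clamped` are made up here; the normalization itself is the 2x - 1 map for the range [0.5, 1].

```python
def clamp(value, low=0.0, high=1.0):
    """Restrict value to the closed range [low, high]."""
    return max(low, min(high, value))


def normalize_clamped(x):
    """Map [0.5, 1] to [0, 1], saturating inputs outside that range."""
    return clamp((x - 0.5) * 2)


print(normalize_clamped(0.75))  # 0.5
print(normalize_clamped(0.4))   # 0.0 (saturated at the lower bound)
print(normalize_clamped(1.2))   # 1.0 (saturated at the upper bound)
```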