Find the nearest match that could add up to zero in SAS

I am using SAS for research. My question is how to find the nearest match in the same column. Please see the following for a quick illustration. I am new to SAS programming, and only have a preliminary guess that proc sql might do the work. What I am doing now is manually adjusting - it is painful and especially so for over 3,000 observations.
I want to find the nearest "Value" match that could add up to zero. For example, for firm AA in 1st quarter 2000, I want to match the nearest two numbers that could add up to 100. I don't want the 50 for firm AA in 2002Q2 nor firm BB 2000Q4. In addition, I also struggle with the case for firm BB, and have no idea how to perform the matching: the two negative numbers add up to -200, the two positive numbers add up to +200, and they may be in the same or different years. To help you understand better, please find the following table for what I have in mind at the end of the day:
For the BB case, it can be 2001Q3 "-100" matched to "50" in 2000Q4, it is also fine if it matches to "100" in 2001Q1 - the order doesn't matter. Thanks in advance! Any help is really appreciated!
Regards,
Michael

At +/- 2 quarters, each row has at most 5 items that need to be checked in combination.
There are 15 combinations that include the current row (0 column) and at least one other row.
combo -2 -1 0 1 2
1 * * *
2 * *
3 * *
4 * * * *
5 * * *
6 * * *
7 * *
8 * * * * *
9 * * * *
10 * * * *
11 * * *
12 * * * *
13 * * *
14 * * *
15 * *
You could check all of these combinations for each row to find the cases that sum to zero. Each row appears at most once in a combination, so these are ordinary subsets of the 5-row window.
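The check described above can be sketched in C++ (not SAS) to show the logic. This is a minimal sketch under assumptions that are not in the original post: `values` holds one firm's numbers ordered by quarter, and `zeroSumCombos`, `radius` and `tol` are hypothetical names. Each non-zero bitmask picks a non-empty subset of the neighbouring rows, which gives the 15 combinations when all four neighbours exist.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: for the row at index `center`, return every set of row
// indices within +/- `radius` rows that contains `center`, contains at least
// one other row, and whose values sum to zero (within `tol`).
// Assumes `values` is non-empty and `center` is a valid index.
std::vector<std::vector<std::size_t>> zeroSumCombos(
    const std::vector<double>& values, std::size_t center,
    std::size_t radius = 2, double tol = 1e-9) {
    // Collect the neighbouring rows that actually exist (fewer near the ends).
    std::vector<std::size_t> others;
    const std::size_t lo = center >= radius ? center - radius : 0;
    const std::size_t hi = std::min(center + radius, values.size() - 1);
    for (std::size_t i = lo; i <= hi; ++i)
        if (i != center) others.push_back(i);

    std::vector<std::vector<std::size_t>> hits;
    // With 4 neighbours there are 2^4 - 1 = 15 non-empty subsets.
    for (unsigned mask = 1; mask < (1u << others.size()); ++mask) {
        double sum = values[center];
        std::vector<std::size_t> combo{center};
        for (std::size_t k = 0; k < others.size(); ++k) {
            if (mask & (1u << k)) {
                sum += values[others[k]];
                combo.push_back(others[k]);
            }
        }
        if (std::fabs(sum) <= tol) hits.push_back(combo);
    }
    return hits;
}
```

For the series -100, 50, 50, 10, -10, calling `zeroSumCombos(values, 0)` would report the single combination of rows 0, 1 and 2. Choosing the "nearest" match among several hits is left open here, as it is in the answer.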

Related

Trigonometric Equation only works with specific input ( 5 ) doesn't work with other inputs

I am trying to write code that calculates the angles of a triangle from the lengths of its sides. The formula is
cos(a)=b^2+c^2-a^2/2bc.
angle1 = acosf((powf(length2,2) + powf(length3,2) - powf(length1,2)) / 2 * length2 * length3)* 180 / 3.14153;
angle2 = acosf((powf(length1,2) + powf(length3,2) - powf(length2,2)) / 2 * length1 * length3)* 180 / 3.14153;
angle3 = 180 - (angle2 + angle1);
Every variable is a float. With the inputs 5, 4, 3 the output is:
angle one is 90.0018
angle two is nan
angle three is nan
Changing the order of the inputs doesn't matter; it only gives a result for the side of length 5.
You are doing:
angle1 = acosf((powf(length2,2) + powf(length3,2) - powf(length1,2)) / 2 * length2 * length3)* 180 / 3.14153;
You should be doing:
angle1 = acosf((powf(length2,2) + powf(length3,2) - powf(length1,2)) / (2 * length2 * length3))* 180 / 3.14153;
Explanation: The problem is caused by the following formula, which is in fact badly written:
cos(a)=b^2+c^2-a^2/2bc
// This, obviously, is wrong because
// you need to group the first three terms together.
// Next to that, everybody understands that the last "b" and "c" are divisors,
// yet it would be better to write it as:
cos(a)=(b^2+c^2-a^2)/(2bc)
The brackets I added in the code correspond to replacing /2bc with /(2bc).
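For reference, here is a minimal sketch of the whole calculation with the denominator parenthesised. The helper name `triangleAngles` is hypothetical. It also takes pi from `std::acos(-1.0)` rather than the constant 3.14153 used above, which is slightly off from 3.14159... and is why the right angle printed as 90.0018. The clamp is an extra safeguard against rounding error, not part of the original code.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical helper: interior angles, in degrees, of a triangle with side
// lengths a, b and c. Element 0 is the angle opposite a, element 1 the angle
// opposite b, element 2 the angle opposite c.
std::array<double, 3> triangleAngles(double a, double b, double c) {
    const double pi = std::acos(-1.0);
    auto angleOpposite = [pi](double opp, double s1, double s2) {
        // Law of cosines: the whole product 2 * s1 * s2 is the divisor.
        double cosine = (s1 * s1 + s2 * s2 - opp * opp) / (2.0 * s1 * s2);
        // Keep the argument inside [-1, 1] so acos cannot return NaN
        // because of rounding error.
        cosine = std::max(-1.0, std::min(1.0, cosine));
        return std::acos(cosine) * 180.0 / pi;
    };
    const double angleA = angleOpposite(a, b, c);
    const double angleB = angleOpposite(b, a, c);
    return {angleA, angleB, 180.0 - (angleA + angleB)};
}
```

With the sides 5, 4, 3 this gives 90 degrees opposite the side of length 5 and roughly 53.13 and 36.87 degrees for the other two.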

Difficulty with creating a new variable in Stata using the subtraction operator

I am a Stata novice.
I am having great difficulty creating this variable:
generate gap= 0.364 * (male − 0.707) − 0.0146 * (FVCpercent − 66.763) + 0.131 * (age_integer − 67.676) − 0.0814 * (age_gap) + 0.0287 * (avg_fibrosis − 22.147)
male is numeric (male=1, female=0)
FVCpercent, age_integer, age_gap and avg_fibrosis are all numeric.
I repeatedly get this error
male−0.707 invalid name
For some reason, if I switch all the "-" operators to "+" it works.
I would be grateful for any input. Many thanks.
The error comes from the character − that you are using. It is the Unicode minus sign (U+2212), which looks almost identical to the ASCII hyphen-minus - (U+002D) but is not the subtraction operator Stata expects. I replaced them, and now it works.
clear all
input male FVCpercent age_integer age_gap avg_fibrosis
10 10 10 10 10
end
generate gap = 0.364 * (male - 0.707) - 0.0146 * (FVCpercent - 66.763) + 0.131 * (age_integer - 67.676) - 0.0814 * (age_gap) + 0.0287 * (avg_fibrosis - 22.147)

C++ find where a point set lies between two others

I have three sets of 2D points. What I need to do is find out where one sits in relation to the other two.
Every set has the same points, in the same order. One is 'neutral', one is 'max', and the third is unknown. What I need is to return a single value, between 0 and 1, that illustrates the amount that the unknown set is between the other two.
For example, in the image:
I would somehow get the 'distance' or 'weight' between Set A and Set B, then find out where Set C sits between them. In this example, I would expect a value of around 75%, or 0.75.
I have looked at using point set registration algorithms that return a scale amount to match Set C to Set B, but I am not convinced that this is the best way. What approach would be suitable for this problem? What algorithms should I be searching for?
You could try to solve this with a simple linear interpolation between the two sets. This works if the transition between the sets is indeed nearly linear. If you know that it is something else, you can adapt the interpolation function.
Let us focus on a single point p. We know its coordinates in all sets p_A, p_B, and p_C. Then, we specify that p_C is more or less a linear interpolation between p_A and p_B with parameter t (where t=0 represents set A and t=1 represents set B):
p_C = (1 - t) * p_A + t * p_B
= p_A - t * p_A + t * p_B
= p_A + t * (p_B - p_A)
p_C - p_A = t * (p_B - p_A)
The question now is to find a t that approximately holds for all your points.
We can solve this by stating the problem as a linear least squares problem. I.e. we want to minimize the summed residuals (difference between left-hand sides and right-hand sides of the above equation) for all points:
arg min_t Σ_i (pi_C.x - pi_A.x - t * (pi_B.x - pi_A.x))^2
+ (pi_C.y - pi_A.y - t * (pi_B.y - pi_A.y))^2
The optimal t is then:
numX = Σ_i (pi_A.x^2 - pi_A.x * pi_B.x - pi_A.x * pi_C.x + pi_B.x * pi_C.x)
numY = Σ_i (pi_A.y^2 - pi_A.y * pi_B.y - pi_A.y * pi_C.y + pi_B.y * pi_C.y)
denX = Σ_i (pi_A.x^2 - 2 * pi_A.x * pi_B.x + pi_B.x^2)
denY = Σ_i (pi_A.y^2 - 2 * pi_A.y * pi_B.y + pi_B.y^2)
t = (numX + numY) / (denX + denY)
If your points have higher dimension, just add the new dimension with the same pattern.

R use apply function on xts zoo class

I am new to R and I am trying to use the apply function on an xts/zoo object, but it gives an error. I have a formula: ((2*Close-High-Low)/(High-Low)) * Volume
Input:
y <- getSymbols("0005.HK", auto.assign = FALSE, src = "yahoo")
Error:
y$II <- apply(y,2,function(x) (2Cl(x) - Hi(x) - Lo(x)) / ((Hi(x) - Lo(x)) * Vo(stk)))
Error: unexpected symbol in "apply(y,2,function(x) (2Cl"
and then I tried another one:
Error:
y$II <- apply(y,2,function(x) (2(x[,4]) - x[,2] - x[,3]) / (x[,2] - x[,3]) * x[,5])
Error in FUN(newX[, i], ...) : attempt to apply non-function
After that, I would like to sum y$II over 21 days, but I don't know how to apply a function that sums each 21-day window.
IIstd = Sum of 21 ((2*C-H-L)/(H-L)) * V
IInorm = (IIstd / Sum 21 day V) * 100
Can anyone help me? Please advise, thanks.
There are two problems here:
2Cl(x) is not valid R -- use 2 * Cl(x)
all operations on the right hand side are already vectorized so we do not need apply in the first place
For clarity, here we have assumed that II = (2C - H - L)/((H-L) * V) and that you want 100 times the 21-period volume-weighted moving average of that. Modify if that is not what you want.
Try this:
y$II <- (2*Cl(y) - Hi(y) - Lo(y)) / ((Hi(y) - Lo(y)) * Vo(y))
Regarding the second part of the question try this -- rollapplyr is in the zoo package.
wmean <- function(x) weighted.mean(x$II, Vo(x))
y$MeanII <- 100 * rollapplyr(y, 21, wmean, by.column = FALSE, fill = NA)
Also check out the TTR package.
UPDATE: Added answer to second part of question.

C++ How to calculate area of square,rect, and cross. User will input the co-ordinates [closed]

As mentioned in my title, how do I calculate the area of a square, a rectangle and a cross?
The user will input all the coordinates. For the square and rectangle the area is easy, but how do I do it for the cross? And if the user enters the coordinates in a criss-cross order, how do I get the length and width for all three shapes so that my area calculation is accurate?
Below is an illustration of a cross, which is quite tricky.
   ****
   *  *
****  ****
*        *
****  ****
   *  *
   ****
// This is for the square and rectangle. Note that it only works if the user
// enters the corners from bottom left to bottom right, then top right to top left.
// In C++, ^ is XOR and 1/2 is 0, so use std::sqrt from <cmath> instead.
l = std::sqrt((x1-x2)*(x1-x2) + (y1-y2)*(y1-y2));
w = std::sqrt((x1-x4)*(x1-x4) + (y1-y4)*(y1-y4));
A = l*w;
And how do I get the coordinate points on the shapes and the coordinate points inside the shapes?
Example: Coordinates for square is (1,1),(3,1),(1,3),(3,3)
so coordinate in square is (2,2)
and coordinate on square is (1,2),(2,1),(3,2),(2,3)
The cross is the superposition of two rectangles, but you have to count the overlapping area only once.
The total area is:
the area of:
****
*  *
*  *
*  *
*  *
*  *
****
plus the area of
**********
*        *
**********
minus the area of:
****
*  *
****
Get the absolute value of the result to avoid problems with coordinates being in the wrong order - areas are always positive.
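The sum above can be sketched in C++ like this, assuming the cross is axis-aligned and each bar is given by two opposite corners in any order. The names `Rect`, `area`, `overlapArea` and `crossArea` are hypothetical. Taking absolute values and min/max of the coordinates is what makes the corner order irrelevant.

```cpp
#include <algorithm>
#include <cmath>

// Axis-aligned rectangle given by any two opposite corners.
struct Rect { double x1, y1, x2, y2; };

double area(const Rect& r) {
    return std::fabs(r.x2 - r.x1) * std::fabs(r.y2 - r.y1);
}

// Area shared by two axis-aligned rectangles (0 if they do not overlap).
double overlapArea(const Rect& p, const Rect& q) {
    const double left   = std::max(std::min(p.x1, p.x2), std::min(q.x1, q.x2));
    const double right  = std::min(std::max(p.x1, p.x2), std::max(q.x1, q.x2));
    const double bottom = std::max(std::min(p.y1, p.y2), std::min(q.y1, q.y2));
    const double top    = std::min(std::max(p.y1, p.y2), std::max(q.y1, q.y2));
    return std::max(0.0, right - left) * std::max(0.0, top - bottom);
}

// Vertical bar plus horizontal bar, counting the shared centre only once.
double crossArea(const Rect& verticalBar, const Rect& horizontalBar) {
    return area(verticalBar) + area(horizontalBar)
         - overlapArea(verticalBar, horizontalBar);
}
```

For a vertical bar from (1,0) to (3,3) and a horizontal bar from (0,1) to (4,2), this gives 6 + 4 - 2 = 8.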
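      a
     |--|
   c ****
  |--*  *
 -****  ****
b|*        *
 -****  ****
     *  *
     ****
A = a*(2c+b) + b*(2c+a) - a*b = 2c*(a+b) + a*b
(vertical bar plus horizontal bar minus the overlap, assuming all four arms have length c; when a = b this equals (a+b) * (2c+a) - a*b)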
So, you really only need to identify 4 coordinates. Top left and top right of the vertical bar, and the top left and bottom left coordinates of the horizontal bar.
Top left vertical: y_tlv=y_max, x_tlv = {x_min where y=y_max}
Top right vertical: y_trv=y_max, x_trv = {x_max where y=y_max}
Top left horizontal: y_tlh={y_max where x=x_min}, x_tlh=x_min
Bottom left horizontal: y_blh={y_min where x=x_min}, x_blh=x_min
a = abs(x_trv - x_tlv)
b = abs(y_tlh - y_blh)
c = abs(x_tlv - x_tlh)
I'll leave it to you to figure out the algorithm to identify the required coordinate points.
I am assuming the user will be required to input at the very least the following 4 co-ordinates:
    C1 * * *
       *   *
       *   *
C2 * * * * * * *
   *           *
   *           *
   * * * * * * * C3
       *   *
       *   *
       * * * C4
Now from the same you can calculate the co-ordinates of 3 rectangles:
        * * * *
        *  1  *
        *     *
* * * * * * * * * * * *
*          2          *
*                     *
* * * * * * * * * * * *
        *  3  *
        *     *
        * * * *
and eventually the area of the cross.
Given a simple-ish cross shape like this:
   A---B
   |   |
C--D   E--------F
|    X  Y       |
G--H   I--------J
   |   |
   K---L
You can, as pointed out above, find the areas of the three quadrilaterals and calculate the area of the whole figure: ABLK + CFJG - DEIH (each named by walking around its perimeter). This works even for skewed crosses which don't have right angles.
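One way to get the area of each quadrilateral, including skewed ones, is the shoelace formula. The sketch below is an assumption about how you might implement it, not something given in the answer, and the names `Pt`, `polygonArea` and `crossAreaFromQuads` are hypothetical. The vertices must be listed in order around the perimeter, so the centre piece has to be passed as D, E, I, H; the order D, E, H, I crosses over itself and would give a wrong area.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Shoelace formula: area of a simple polygon whose vertices are listed in
// order around its perimeter (clockwise or counter-clockwise).
double polygonArea(const std::vector<Pt>& v) {
    double twice = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const Pt& p = v[i];
        const Pt& q = v[(i + 1) % v.size()];  // wraps back to the first vertex
        twice += p.x * q.y - q.x * p.y;
    }
    return std::fabs(twice) / 2.0;
}

// Hypothetical helper: vertical quad + horizontal quad - shared centre quad.
double crossAreaFromQuads(const std::vector<Pt>& vertical,
                          const std::vector<Pt>& horizontal,
                          const std::vector<Pt>& centre) {
    return polygonArea(vertical) + polygonArea(horizontal) - polygonArea(centre);
}
```

For example, with A(1,3), B(3,3), L(3,0), K(1,0) as the vertical quad, C(0,2), F(4,2), J(4,1), G(0,1) as the horizontal quad and D(1,2), E(3,2), I(3,1), H(1,1) as the centre, the result is 6 + 4 - 2 = 8.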
How you compute the centroid of the cross depends on what you actually want, either X or Y. To get Y you must first find the bounding quadrilateral of the cross, and then it will be simple to find the centroid of that quadrilateral. Remember that if you allow unequal arm lengths like I've drawn above, point Y need not be contained within the cross itself!
To find centroid X, you'll need to work out the midpoints of AB and KL, and the midpoints of CG and FJ. You can then find the point of intersection of those two lines, AB-KL and CG-FJ to find the crossing point X which will be inside the cross, so long as the cross has a regular shape.
If you allow an arbitrary cross shape (so, for example, there might be no right angles in the cross at all) I don't think you can guarantee that point X will lie within the shape either, but I'm too lazy to prove this one way or another.
To find an arbitrary point on the perimeter of the shape is easy enough; you just need to pick any pair of corners linked by an edge (say, EF or KH) and pick a point on the vector between the two