SAS: working with matrices using IML

So I am trying to calculate this formula but the results are strange. The elements are extremely large so I am not sure where I went wrong. I have attached a photo of the formula:
and here is my code:
*calculating mu_sum and sigma_sum;
T_hat=180;
mu_sum_first_part={0,0,0,0};
mu_sum_second_part={0,0,0,0};
mu_sum={0,0,0,0};
*calculating mu_sum;
do i = 0 to T_hat;
term=(T_hat - i)*(B0**i)*a;
mu_sum_first_part = mu_sum_first_part + term;
end;
do i=1 to T_hat;
term =B0**i;
mu_sum_second_part = mu_sum_second_part + term;
end;
mu_sum = mu_sum_first_part + mu_sum_second_part*zt;
print mu_sum;
*calculating sigma_sum;
term=I(4);
sigma_sum=sigma;
do j=1 to T_hat;
term = term + B0**j;
sigma_sum = sigma_sum + (term*sigma*(term`));
end;
print sigma_sum;
I know this is long but please help!!

The first thing that jumps out at me is that the loop for the first term of mu runs one iteration too many:
do i = 0 to T_hat;
term=(T_hat - i)*(B0**i)*a;
mu_sum_first_part = mu_sum_first_part + term;
end;
Should be:
do i = 0 to T_hat-1;
term=(T_hat - i)*(B0**i)*a;
mu_sum_first_part = mu_sum_first_part + term;
end;

There is nothing mathematically wrong with your program. When you are raising a matrix to the 180th power, you should not be surprised to see very large or very small values. For example, if you let
B0 = {
0 1 0 0,
0 0 1 0,
0 0 0 1,
0 1 1 1
};
then elements of B0**T are O( 1E47 ). If you divide B0 by 2 and raise the result to the 180th power, then the elements are O( 1E-8 ).
Presumably these formulas are intended for matrices B0 that have a special structure, such as ||B0**n|| --> 0 as n --> infinity. Otherwise the power series won't converge. I suggest you double-check that the B0 you are using satisfies the assumptions of the reference.
You didn't ask about efficiency, but you would be wise to compute the truncated power series by using Horner's method in SAS/IML, rather than explicitly forming powers of B0.
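As a rough illustration of the Horner idea mentioned above, here is a minimal sketch in Python rather than SAS/IML. The function names are hypothetical, and it only covers the plain truncated series I + B + B^2 + ... + B^n, not the weighted sums in the question. The point is that each step computes S <- I + B*S, so no explicit power of B is ever formed.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_series_horner(b, terms):
    """Return I + B + B^2 + ... + B^terms using Horner's scheme.

    Each pass computes S <- I + B*S, so one matrix product is needed
    per term and B is never raised to a power explicitly.
    """
    n = len(b)
    # Start from the identity matrix (the B^0 term).
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(terms):
        bs = mat_mul(b, s)
        s = [[bs[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    return s
```

The same recurrence should carry over to SAS/IML with ordinary matrix multiplication inside a DO loop.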

How to generate item variables from total score variable

I want to simulate the item score from total score.
For example, I have generated the total score, which takes values between 5 and 25. I would like to distribute this total score across five items, each scored on a 5-point Likert scale.
Then I used a while loop to check the condition in Stata 15. The code takes too long to finish looping and I do not know whether I have made a mistake.
Perhaps someone would like to suggest another way to simulate the item score from the total score?
My code:
set obs 200
generate id=_n
generate u_i= rnormal(0, 0.5)
generate gr = runiform()>0.5
generate sex = runiform()>0.4
generate age = round(rnormal(65, 10))
expand 5
bysort id: generate time=_n
generate e_ij = rnormal(0, 1.0)
generate run=_n
*Generate Sum score 5-25
generate y = 3.0 + 2.0*gr + 0.2*age -1.2*sex + 0.5*time + u_i + e_ij
summarize y
replace y = round(y)
*Generate each item
forvalues k = 1(1)5 {
generate item`k' = runiform(1, 5)
replace item`k' = round(item`k')
}
egen sum_item=rowtotal(item1 item2 item3 item4 item5)
generate diff = y - sum_item
*Looping check if y=sum_item
forvalues a = 1(1)`=_N' {
quietly gsort -diff
while sum_item!=y[`a'] {
replace sum_item=. if sum_item!=y[_n]
forvalues k = 1(1)5 {
replace item`k' =. if sum_item==.
replace item`k' = runiform(1, 5) if item`k'==.
replace item`k' = round(item`k')
}
replace sum_item= item1 + item2+item3+item4+item5 if sum_item==.
replace diff = y - sum_item
if (sum_item==y[`a']) continue, break
}
}
The expected data that I would like to have:
As you can see, after running the loop there are always 2-4 cases for which the program keeps regenerating the item scores (item1-item5) until the diff variable equals zero.
If I'm understanding correctly, you could loop something like the following (after setting all the items to initial values of 1, since possible values are 1 to 5):
capture generate rand_int = 0
replace rand_int = floor( 5 * runiform() + 1 ) // random int, 1 to 5
capture generate cnd = 0
forvalues k = 1(1)5 {
replace cnd = rand_int == `k' & sum_item < y & item`k' < 5
replace item`k' = item`k' + 1 if cnd
}
replace sum_item = item1+item2+item3+item4+item5
In words, that says that if sum_item < y, then randomly add 1 to one of the items (as long as that item is not already equal to 5), and you keep doing it until sum_item == y for all rows.
So that's going to converge in roughly 20 iterations if the max value of y is 25 and items are from 1 to 5. I say "roughly" because there is a little waste here when the randomly chosen item is already equal to 5 and nothing is added. You could add some extra code for that, but I wouldn't bother if this is fast enough. E.g. for high values of y it would be more efficient to start with initial values of 5 and randomly subtract 1 until it converges.
I'm not enough of a statistician to say that's the best or even an adequate way to do it, but intuitively to me it seems OK if you want a fairly uniform distribution of values. If you wanted the modal value to be 4, for example, that's a lot harder and not really a programming question any longer.
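To make the increment idea concrete for a single row, here is a minimal sketch in Python rather than Stata. The function name split_total and its parameters are hypothetical. It starts every item at the minimum and adds 1 to a randomly chosen item that is still below the maximum, until the items sum to the total.

```python
import random


def split_total(total, n_items=5, low=1, high=5, rng=random):
    """Split `total` into n_items integer scores, each in [low, high]."""
    if not n_items * low <= total <= n_items * high:
        raise ValueError("total is outside the attainable range")
    items = [low] * n_items
    while sum(items) < total:
        k = rng.randrange(n_items)   # pick one item at random
        if items[k] < high:          # skip items already at the ceiling
            items[k] += 1
    return items
```

Totals outside 5 to 25 cannot be represented by five items scored 1 to 5, so this sketch rejects them instead of looping forever.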

Not Equals Constraint in PROC OPTMODEL

I have an optimization problem that I need to solve. It's a binary linear programming problem, so all of the decision variables are equal to 0 or 1. I need certain combinations of these decision variables to add up to either 0 or 2+, they cannot sum to 1. I'm struggling with how to accomplish this in PROC OPTMODEL.
Something like this is what I need:
con sum_con: x+y+z~=1;
Unfortunately, this just throws a syntax error... Is there any way to accomplish this?
See below for a linear reformulation. However, you may not need it. In SAS 9.4m2 (SAS/OR 13.2), your expression works as written. You just need to invoke the (experimental) CLP solver:
proc optmodel;
/* In SAS/OR 13.2 you can use your code directly.
Just invoke the experimental CLP solver */
var x binary, y binary, z binary;
con sum_con: x+y+z~=1;
solve with clp / findall;
print {i in 1 .. _NSOL_} x.sol[i]
{i in 1 .. _NSOL_} y.sol[i]
{i in 1 .. _NSOL_} z.sol[i];
quit;
produces immediately:
[1] x.SOL y.SOL z.SOL
1 0 0 0
2 0 1 1
3 1 0 1
4 1 1 0
5 1 1 1
In older versions of SAS/OR, you can still call PROC CLP directly,
which is not experimental.
The syntax for your example will be very similar to PROC OPTMODEL's.
I am sure, however, that your model has other variables and constraints.
In that case, remember that no matter how you formulate this,
it is still a search space with a hole in the middle.
So it potentially can make the solver perform poorly.
How poorly is hard to predict. It depends on other features of your model.
If MILP is a better fit for the rest of your model,
you can reformulate your constraint as a valid MILP in two steps.
First, add a binary variable that is zero only when the expression is zero:
/* If solve with CLP is not available, you can linearize the disjunction: */
var IsGTZero binary; /* 1 if any variable in the expression is 1 */
con IsGTZeroBoundsExpression: 3 * IsGTZero >= x + y + z;
Then add another constraint that forces the expression to be
at least the constant you want (in this case 2) when it is nonzero.
num atLeast init 2;
con ZeroOrAtLeast: x + y + z >= atLeast * IsGTZero;
min f=0; /* Explicit objectives are unnecessary in 13.2 */
solve;
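As a sanity check on the two linear constraints above, the following Python sketch (not OPTMODEL code; the function name is hypothetical) enumerates every binary assignment of x, y, z and the indicator, and collects the sums that remain feasible.

```python
from itertools import product


def feasible_sums(at_least=2, n_vars=3):
    """Return the set of attainable sums under the two linear constraints."""
    sums = set()
    for *xs, g in product((0, 1), repeat=n_vars + 1):
        s = sum(xs)
        # n_vars * g >= s   : g must be 1 when any variable is 1
        # s >= at_least * g : the sum must reach at_least when g is 1
        if n_vars * g >= s and s >= at_least * g:
            sums.add(s)
    return sums
```

For three variables and atLeast = 2 the feasible sums are 0, 2 and 3, so a sum of exactly 1 is excluded as intended.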
The following inequality should also work, although it is quadratic rather than linear:
(x+y-z)*z + (y+z-x)*x + (x+z-y)*y > -1
It can be generalized to more than three variables, and if you have a large number of them you should be able to use indexed sums to make it easier to write.
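One way to read the generalization: each term has the form (S - 2*x_i) * x_i, where S is the sum of all the variables. For binary variables this adds up to S*(S - 2), which is -1 only when S = 1. The Python sketch below (the function name is hypothetical) checks that claim by brute force.

```python
from itertools import product


def quad_expr(xs):
    """Sum of (S - 2*x_i) * x_i over all variables, with S = sum(xs)."""
    total = sum(xs)
    return sum((total - 2 * x) * x for x in xs)


def excludes_only_sum_one(n_vars):
    """True if quad_expr(xs) > -1 holds exactly when sum(xs) != 1."""
    return all((quad_expr(xs) > -1) == (sum(xs) != 1)
               for xs in product((0, 1), repeat=n_vars))
```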

looping for a series

I have this question:
Write a program to display the sum of the series 1+1/2+2/3+3/4+...
+(n-1)/n (using for loop).
I did not understand the series well; kindly explain it for me for n = 6 (no need for code).
For n = 6, you need to calculate 1 + (1/2) + (2/3) + (3/4) + (4/5) + (5/6)
The question is asking you to fill in the details of the following program:
sum = 0;
for (int i=1; i<=n; ++i) {
sum += ???
}
return sum;
where ??? should give you the following values:
i | ???
-------
1 | 1
2 | 1/2
3 | 2/3
4 | 3/4
5 | 4/5
6 | 5/6
.
.
.
n | (n-1)/n
It is simple. The biggest hint is the nth term itself : (n-1)/n
Except the first term, every other term can be represented by an expression of the form of (i-1)/i, which means the algorithm boils down to this:
double sum = 1.0; //first term
for(int i = 2 ; i <= n ; ++i) //2nd to nth term!
sum += (i-1.0)/i;
Why did I write (i-1.0) instead of (i-1)?
You need to figure that out yourself, as I already have explained and written almost the whole code.
Write a loop that evaluates (n-1)/n for each value of n and adds the outcome to some variable.
That "some variable" is the answer.
Set n=6
The final term of the series can also be written as n / (n + 1) where n is a value that iterates.
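For checking a hand calculation, here is a small Python sketch of the same loop (the function name is hypothetical). It uses exact fractions, so the integer-division question raised above does not arise.

```python
from fractions import Fraction


def series_sum(n):
    """Return 1 + 1/2 + 2/3 + ... + (n-1)/n as an exact fraction."""
    total = Fraction(1)              # first term
    for i in range(2, n + 1):        # 2nd to nth term
        total += Fraction(i - 1, i)
    return total
```

For n = 6 this gives 91/20, i.e. 4.55.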

Lookup tables in C++

I have to implement a small multi-image graphic control, which in essence is an array of 9 images, shown one at a time. The final goal is for it to act as a mini slider.
Now, this graphic control is going to receive various integer ranges: from 5 to 25 or from 0 to 7 or from -9 to 9.
If I use proportion (the "rule of three"), I am afraid it is not technically sustainable because it can be a source of errors. My guess is to use some lookup tables, but does anyone have good advice on the approach?
Thanks
I'm not sure lookup tables are required. You can get from your input value to an image index between 0 and 8 proportionally:
int ConvertToImageArrayIndex(int inputValue)
{
int maxInputFromOtherModule = 25;
int minInputFromOtherModule = 5;
// +1 required to include both min and max input values in the possible range.
// +0.5 required so that we round to the nearest image instead of always rounding down.
// 8.0 required to get an output range of 9 possible indexes [0..8]
int imageIndex = ( (float)((inputValue-minInputFromOtherModule) * 8.0) / (float)(maxInputFromOtherModule - minInputFromOtherModule + 1) ) + 0.5;
return imageIndex;
}
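The proportional mapping can also be written for any of the ranges mentioned in the question. The following Python sketch is one possible variant, not the code above: the name to_image_index and its parameters are hypothetical, it divides by (hi - lo) so that the maximum input always lands on the last image, and it uses integer arithmetic for the rounding.

```python
def to_image_index(value, lo, hi, n_images=9):
    """Map an integer in [lo, hi] to the nearest index in [0, n_images - 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    value = max(lo, min(hi, value))      # clamp out-of-range input
    span = hi - lo
    # Integer form of floor((value - lo) * (n_images - 1) / span + 0.5)
    return (2 * (value - lo) * (n_images - 1) + span) // (2 * span)
```

For example, to_image_index(25, 5, 25), to_image_index(7, 0, 7) and to_image_index(9, -9, 9) all select the last image, index 8.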
Yes, a lookup table is a good solution:
int lookup[9] = {5, 25, ... the other values };
int id1 = floor(slider);
int id2 = id1+1;
int texId1 = lookup[id1];
int texId2 = lookup[id2];
interpolate(texId1, texId2, slider - float(id1));

Math question in regards to functions in the form (1) / ( b ^ c )

I've found that functions which follow the pattern 1 / b^c produce nice curves which can be coupled with interpolation functions really nicely.
The way I use the function is by treating 'c' as the changing value, i.e. the interpolation value between 0 and 1, while varying b for 'sharpness'. I use it to work out an interpolation value between 0 and 1, so generally the function I use is as such:
float interpolationvalue = 1 - 1/pow(100,c);
linearinterpolate( val1, val2, interpolationvalue);
Up to this point I've been using a hacked approach to make it 'work' since when interpolation value = 1 the value is very close to but not quite 0.
So I was wondering, is there a function of this form, or one which can reproduce curves similar to those produced by 1 / b^c, where at c = 0 result = 1 and at c = 1 result = 0?
Or even C = 0, result = 0 and C = 1 result = 1.
Thanks for any help!
For interpolation, the approach offering the most flexibility is splines; in your case quadratic splines would seem sufficient. The Wikipedia page is math-heavy, but you can find more accessible descriptions on Google.
1 - c ^ b with small values for b? Another option would be to use a cubic polynomial and specifying the slope at 0 and 1.
You could use a similar curve of the form A - 1 / b^(c + a), choosing values of A and a to match your constraints. So, for c = 0, result = 1:
1 = A - 1/b^a => A = 1 + 1/b^a
and for c = 1, result = 0:
0 = A - 1/b^(1+a) => A = 1/b^(1+a)
Combining these, we can find a in terms of b:
1 + 1/b^a = 1/b^(1+a)
b^(1+a) + b = 1
b * (b^a + 1) = 1
b^a = 1/b - 1
So:
a = log_b(1/b - 1) = log(1/b - 1) / log(b)
A = 1 + 1/b^a = 1 / (1-b)
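The result of this derivation can be checked numerically. The Python sketch below (the function name make_curve is hypothetical) builds the curve from the formulas for a and A. It assumes 0 < b < 1, since log(1/b - 1) is only defined in that range.

```python
import math


def make_curve(b):
    """Build f(c) = A - 1 / b**(c + a) with f(0) = 1 and f(1) = 0.

    Requires 0 < b < 1 so that log(1/b - 1) is defined.
    """
    if not 0.0 < b < 1.0:
        raise ValueError("b must lie strictly between 0 and 1")
    a = math.log(1.0 / b - 1.0) / math.log(b)
    big_a = 1.0 / (1.0 - b)
    return lambda c: big_a - 1.0 / b ** (c + a)
```

With b = 0.5 the offset a is zero and the curve reduces to 2 - 2^c; values of b closer to 0 or to 1 change how sharply it bends.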
In real numbers, the ones that mathematicians use, no function of the form you specify is ever going to return 0; division can't do that. (1/x)==0 has no real solutions. In floating point arithmetic, the poor relation of real arithmetic that computers use, you could write 1/(MAX_FP_VALUE^1), which will give you as close to 0 as you are ever going to get (actually, it might give you a NaN or one of the other odd returns that IEEE 754 allows).
And, as I'm sure you've noticed, 1/(b^0) always returns 1 since b^0 is, by definition of 0-th power, always 1.
So, no function with c = 0 will produce a result of 0.
For c = 1, result = 1, set b = 1
But I guess this is only a partial answer, I'm not terribly sure I understand what you are trying to do.
Regards
Mark