I'd like to do the following regression
proc logistic data=abc;
model y = x x*x x*x*x ....;
run;
Is there a shorthand to generate these polynomial terms? Thanks.
Edit: That will teach me to look closer at the question before I answer. The BAR operator is indeed for interaction - not polynomial effects.
As far as I know, PROC LOGISTIC does not have a shorthand for this yet, but GLIMMIX does have an experimental technique using the EFFECT statement. For example, this:
effect MyPoly = polynomial(x1-x3/degree=2);
model y = MyPoly;
is the same as
model y = x1 x2 x3 x1*x1 x1*x2 x1*x3 x2*x2 x2*x3 x3*x3;
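For the single-predictor case in the question, the same idea would generate x, x*x and x*x*x. The sketch below assumes a SAS/STAT release in which PROC LOGISTIC also accepts the EFFECT statement (as far as I know newer releases do, but check the documentation for your version). The effect name xpoly is made up; abc, y and x come from the question.

```sas
/* Hypothetical sketch: cubic polynomial in one continuous predictor.  */
/* Assumes your PROC LOGISTIC release supports the EFFECT statement.   */
proc logistic data=abc;
   effect xpoly = polynomial(x / degree=3);  /* expands to x, x*x, x*x*x */
   model y = xpoly;
run;
```

If your release rejects the EFFECT statement, writing the terms out by hand as in the question remains the fallback.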
Can someone please help with the scenario below? I am very new to SAS and am not sure how to get this to work.
Simulate 200 observations from the following linear model:
Y = alpha + beta1 * X1 + beta2 * X2 + noise
where:
• alpha=1, beta1=2, beta2=-1.5
• X1 ~ N(1, 4), X2 ~ N(3,1), noise ~ N(0,1)
I have tried this code but am not sure it's completely accurate:
DATA ONE;
alpha = 1;
beta1 = 2;
beta2 = -1.5;
RUN;
DATA CALC;
SET ONE;
DO i = 1 to 200;
Y=alpha+beta1*X1+beta2*X2+Noise;
X1=Rannor(1);
X2=rannor(3);
Noise=ranuni(0);
OUTPUT;
END;
RUN;
PROC PRINT DATA=CALC;
RUN;
You need to have a look in the SAS help for the topics
"rannor","ranuni","generating random numbers",...
rannor: generating standard normal distributed RVs.
ranuni: uniform distributed RVs.
The argument in rannor is the seed number, not the expected value.
If N(x,y) in your example means that the random variable is normally distributed with expected value x and standard deviation y (or do you mean the variance?), then the code could look like the following. Note the changed order of the statements: Y has to be computed after the random numbers it depends on. If y is the variance instead, multiply by its square root, e.g. X1=1+2*rannor(seed).
DATA ONE;
alpha = 1;
beta1 = 2;
beta2 = -1.5;
RUN;
DATA CALC;
SET ONE;
seed = 1234;
DO i = 1 to 200;
X1=1+4*Rannor(seed);
X2=3+rannor(seed);
Noise=rannor(seed);
Y=alpha+beta1*X1+beta2*X2+Noise;
OUTPUT;
END;
RUN;
PROC PRINT DATA=CALC;
RUN;
There are also variants for generating random numbers, e.g. "call rannor". There are different concepts for dealing with seed numbers in SAS; see the SAS help for these topics.
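A variant of the same simulation can be sketched with the newer RAND function and CALL STREAMINIT instead of RANNOR. This version assumes N(1, 4) gives the variance, so the standard deviation of X1 is 2; change it to 4 if the second parameter was meant as a standard deviation. The seed 1234 is arbitrary, and the PROC REG step is only an added sanity check that is not part of the original task.

```sas
/* Sketch: simulate Y = alpha + beta1*X1 + beta2*X2 + noise, 200 rows. */
/* Assumption: N(mean, variance), so sd(X1) = sqrt(4) = 2.             */
data calc;
   call streaminit(1234);              /* arbitrary seed, reproducible  */
   alpha = 1; beta1 = 2; beta2 = -1.5;
   do i = 1 to 200;
      x1    = rand("Normal", 1, 2);    /* third argument is the std dev */
      x2    = rand("Normal", 3, 1);
      noise = rand("Normal", 0, 1);
      y     = alpha + beta1*x1 + beta2*x2 + noise;
      output;
   end;
run;

/* Sanity check: estimates should land near 1, 2 and -1.5 */
proc reg data=calc;
   model y = x1 x2;
run;
quit;
```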
Here are my data. Data are structured like so: id x1 x2 x3 y.
I used PROC MIXED to analyze it, but now I want to determine the regression coefficients and I don't know how to do it. I'm only a beginner with SAS. From the results I see that x1, x2, x3 and x1*x2*x3 are the significant effects, but how do I determine the coefficients alpha, beta, gamma, delta and theta in:
y = theta + alpha*x1 + beta*x2 + gamma*x3 + delta*x1*x2*x3
This is my code:
ods graphics on;
proc mixed data=test;
class x1 x2 x3;
model y = x1 | x2 | x3 / solution residual;
random id;
run;
ods graphics off;
EDIT 1: Here is a part of the table Solutions for Fixed Effects:
Since x1 has two levels, there are two rows for it in the table. Do I get the effect of x1 by summing these two values (-109.07 for the first row and 0 for the second), or should I do something else? Note that this is a 2^k design. The effect of x1 should be computed as half the difference between the average values of y when x1 is high (20) and when it is low (10).
Based on your model, x1, x2 and x3 should be treated as continuous variables (that is, left out of the CLASS statement); then the solution table gives the coefficients of your model directly.
proc mixed data=test;
model y=x1 x2 x3 x1*x2*x3/ solution residual;
random id/s;
run;
However, based on your code and the values of x1, x2 and x3, it would be better to treat them as categorical variables, as you did. In that case each Estimate in your table is the mean difference between a level and the reference level. The link below may help you understand your results.
http://support.sas.com/kb/38/384.html (explanation of estimation of coefficients)
The SOLUTION option should generate your estimates. You need to include it on both the MODEL and RANDOM statements. You should then see two tables, Solution for Fixed Effects and Solution for Random Effects, that hold the estimates.
proc mixed data=test;
class x1 x2 x3;
model y = x1 | x2 | x3 / solution residual;
random id / s;
run;
The Random Coefficients example in the documentation is close to your question.
https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_mixed_sect034.htm
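Because the model contains interactions, a Solution row for x1 is a difference taken at the reference levels of x2 and x3, not an average over them. One way to get the averaged high-versus-low comparison asked about in the question is an LSMEANS statement, sketched below. It reuses the data set and variable names from the question; reading half of the reported difference as the 2^k "effect" is an assumption that presumes a balanced design, and the sign depends on how SAS orders the two levels.

```sas
/* Sketch: least-squares means of x1, averaged over x2 and x3. */
proc mixed data=test;
   class x1 x2 x3;
   model y = x1 | x2 | x3 / solution;
   random id;
   lsmeans x1 / diff cl;   /* difference between the two x1 levels, with CI */
run;
```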
Could anyone suggest some SAS code to solve a small non-linear dynamic model? The model endogenous variables are:
log(consumption)
log(investment)
log(price level)
short term interest rate
long term interest rate
output.
I have tried to use PROC MODEL, but I'm having convergence problems because the output identity is in level terms while the equations for the demand components are in log terms.
Thank you.
===========
Thank you everybody for the answers/comments. The code I'm using for the estimation is:
proc model outmodel=fair_model;
var py rs delta_log_e rb log_pm ys DUM1997 e rs_us pop py_us y i t c x im stat;
parms c0-c3 i0-i2 p0-p4 r0-r4 b0-b4 e0-e3;
eq.Consumption = c0 + c1*lag(log(c/pop)) + c2*rb + c3*log(y/pop) - log(c/pop);
eq.Investment = i0 + i1*lag(log(i/y)) + i2*rb - log(i/y);
eq.Price = p0 + p1*lag(log(py)) + p2*log_pm + p3*(y/ys-1) + p4*t - log(py);
eq.Interest_Rate = r0 + r1*lag(rs) + r2*((py-lag4(py))/lag4(py)) + r3*(y/ys-1) + r4*DUM1997 - rs;
eq.Int_Rate = b0 + b1*(lag1(rb)- lag2(rs)) + b2*(rs-lag2(rs)) + b3*(lag1(rs)- lag2(rs)) +b4*(lag2(rs)) - rb;
eq.Exchange_Rate = e0 + e1*lag(delta_log_e) + e2*(log(py/py_us)-log(lag(e))) + e3*(log((1+rs/100)/(1+rs_us/100))) - delta_log_e;
eq.income = c + i + g + x - im + stat - y;
run;
quit;
*Fitting the data with the model, getting the estimates;
proc model model=fair_model outmodel=fair_model;
fit Consumption Investment Price Interest_Rate Int_Rate Exchange_Rate/data=fair2 outest=outest2 n2sls;
instruments log_pm del_y del_py g x im del_log_i;
run;
quit;
I then just use the SOLVE statement to solve the model and run simulations. One problem appears to be related to the income identity, which is specified in levels while the other equations are specified in log terms. I have tried to respecify c and i in the income equation as exp(log(c)) and exp(log(i)), and even tried an approximate income identity in log terms, but neither has helped with the convergence issue. Any further thoughts would be much appreciated.
PROC MODEL from SAS/ETS handles dynamic non-linear systems. Try doing a log transform on the variable output if you are having convergence issues. In addition, consider how your equations relate to each other. Below is just a hypothetical, but be sure that you are using the right fitting method. Are they SUR models, or do they need 2SLS? Could FIML be a good method to fit it? If ITOLS is not working, you'll want to reconsider the model structure and how you go about fitting it.
proc model data=have;
endo log_consumption log_investment
log_price_level short_term_int
long_term_int log_output;
log_consumption = int1 + b1*log_investment + b2*log_price_level + b3*short_term_int;
log_investment = int2 + b4*log_consumption + b5*log_price_level;
<other models>;
log_output = int6 + b20*log_consumption + b21*log_price_level + b22*short_term_int;
fit / FIML;
solve / dynamic;
run;
Given a matrix X(n * p), I want to split X into Y1(n * k) and Y2(n * (p-k)), where Y1 consists of the first k columns of X and Y2 of the others.
In R I can get the "crossed" correlation between the columns of Y1 and Y2 by calling cor(Y1,Y2, use="pairwise.complete.obs"). How can I get the same result in SAS/IML, where the corr function accepts only one data matrix?
I tried to find an appropriate solution or algorithm to implement it, but without success.
Can anyone help with this? Pointers to literature about this kind of correlation would also be great. I don't want you to code it for me; I'm simply after some help or a hint on existing functions or algorithms to translate.
Thank you.
EDIT: don't search on the web for crossed correlation, I wrote it simply for trying to explain myself.
Looking up "crossed correlation" leads you to a series of literature on signal processing and a function much like the autocorrelation function. In fact, in R it is documented with acf https://stat.ethz.ch/R-manual/R-devel/library/stats/html/acf.html.
But that is not what your code is doing. In R:
n = 100
p = 6
k = 2
set.seed(1)
r = rnorm(n*p)
x= matrix(r,n,p)
y1 = x[,1:k]
y2 = x[,(k+1):p]
cor.ys = cor(y1,y2,use="pairwise.complete.obs")
cor.x = cor(x)
(cor.ys - cor.x[1:k,(k+1):p])
You see the result from cor(y1,y2) is just a piece of the correlation matrix from x.
You should be able to put this in IML easily.
I can think of a few ways to do this. The simplest is to compute the full matrix of Pearson correlations (using the pairwise option) and then subset the result. (What DomPazz said.) If you have hundreds of variables and you only want a few of the correlations, it will be inefficient, but it is very simple to program:
proc iml;
n = 100; p = 6; k = 2;
call randseed(1);
x = randfun(n//p, "Normal");
varNames = "x1":"x6";
corr = corr(x, "pearson", "pairwise"); /* full matrix */
idx1 = 1:k; /* specify VAR */
idx2 = (k+1):p; /* specify WITH */
withCorr = corr[idx2, idx1]; /* extract submatrix */
print withcorr[r=(varNames[idx2]) c=(varNames[idx1])];
Outside of SAS/IML you can use PROC CORR and the WITH statement to do the same computation, thereby validating your SAS/IML program:
proc corr data=test noprob nosimple;
var x1-x2;
with x3-x6;
run;
My colleague and I are running exactly the same SAS PROC LOGISTIC, but with different input files.
SAS models ooX = 1 when I do it, and ooX = 0 when he does it.
We've checked record counts and FREQ counts for the main variables. They are the same.
Type 3 analysis of effects are the same. MLE estimates are the same, except for the intercept.
Does SAS require input to be sorted a certain way?
PROC LOGISTIC data = TTTT;
class ooX Y1 Y2 Y3 Y4;
model ooX = Y1 Y2 Y3 q1 q2 q3;
RUN;
If your data are not sorted you can specify the order of your outcome variable right after calling PROC LOGISTIC.
I don't have the data, but assuming that ooX is a binary outcome variable with levels 0 and 1, the model will default to modeling ooX = 0 unless you specify that you want it in descending order.
PROC LOGISTIC data = TTTT descending; /* will model ooX = 1 */
class ooX Y1 Y2 Y3 Y4; /* Not sure if it makes sense to have your outcome in the class statement */
model ooX = Y1 Y2 Y3 q1 q2 q3;
RUN;
As explained in the SAS manual (http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_logistic_sect030.htm):
For binary response data with event and nonevent categories, if your event category has a higher Ordered Value, then by default the nonevent is modeled.
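An alternative to DESCENDING is to name the event level explicitly with the EVENT= response option, so the modeled category no longer depends on how the levels happen to sort. The sketch below assumes ooX is coded 0/1 with no format attached; if a format is applied, the quoted value must match the formatted value. Dropping ooX and Y4 from the CLASS statement is a choice made here, not something the original code did.

```sas
/* Sketch: model Pr(ooX = 1) regardless of level ordering. */
/* Assumes ooX takes the unformatted values 0 and 1.       */
proc logistic data=TTTT;
   class Y1 Y2 Y3;
   model ooX(event='1') = Y1 Y2 Y3 q1 q2 q3;
run;
```

The "Response Profile" table and the "Probability modeled is ..." note in the output show which level was actually modeled, which is a quick way to compare the two runs.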