Fixed Effects Model in SAS

I have 7 independent variables that make up the total score of a company. The 7 independent variables are financial ratios.
When I run my fixed effects regression I get estimates whose standard errors are all zero. I'm assuming a dummy variable trap or some other specification problem is causing this, but I am extremely new to SAS and want to make sure I am using the ABSORB statement in PROC GLM correctly for a fixed effects model. My code is below:
proc sort data=MYEXCEL;
   by Company;
run;

proc glm data=MYEXCEL;
   absorb Company;   /* absorb the Company fixed effects */
   model Y = X1 X2 X3 X4 X5 X6 X7 / solution;
   store sasuser.Analysis / label='PLM: Getting Started';
run;
quit;
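For comparison, here is a minimal sketch (reusing the same hypothetical dataset and variable names) of the equivalent dummy-variable specification; ABSORB should reproduce these within-company slope estimates without printing a separate intercept per company:

/* Sketch: explicit company dummies instead of ABSORB (same data assumed).
   The slope estimates for X1-X7 should match the ABSORB fit. */
proc glm data=MYEXCEL;
   class Company;
   model Y = Company X1 X2 X3 X4 X5 X6 X7 / solution;
run;
quit;

One common cause of degenerate standard errors in this setup is a ratio that is constant within each company: such a regressor is collinear with the absorbed company effects, so its coefficient cannot be estimated separately from them.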

Related

Predicting a value using certain x values in multiple linear regression in SAS

In a multiple linear regression problem I have to predict a value using x1=77, x2=20, x3=1998.
The code I currently have is
proc reg data=GPA;
   model y = x1 x2 x3 / i;
   output out=new3 student=student2 p=predict2 r=resid2;
run;
quit;
The code runs but I’m not sure how to use the input values to predict another value
Ryan,
An easy way to do this is to add a new row to your data set using the following code:
proc sql;
   insert into GPA (x1, x2, x3) values (77, 20, 1998);
quit;
Then run the model again and you should get your predicted value for y.
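Putting the two pieces together, a minimal end-to-end sketch (reusing the GPA data set and variable names from the question; the WHERE clause is just one way to pull out the new row):

/* Append the new x values; y is left missing, so the row does not affect the fit */
proc sql;
   insert into GPA (x1, x2, x3) values (77, 20, 1998);
quit;

/* Refit; the appended row gets a predicted value in PREDICT2 */
proc reg data=GPA;
   model y = x1 x2 x3;
   output out=new3 p=predict2 r=resid2;
run;
quit;

proc print data=new3;
   where y is missing;   /* shows the prediction for x1=77, x2=20, x3=1998 */
   var x1 x2 x3 predict2;
run;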

Stepwise selection method in (SAS 9.3) PROC REG

I'm running a multivariate linear regression model in SAS (v. 9.3) using the REG procedure with stepwise selection, as follows:
(1) Set the regressors list:
%let regressors = x1 x2 x3;
(2) Run the procedure:
ods output DWStatistic=DW ANOVA=F_Fisher parameterestimates=beta CollinDiag=Collinearita outputstatistics=residui fitstatistics=rsquare;
proc reg data=base_dati outest=reg_multivar edf;
model TD&eq. = &regressors. / selection=stepwise SLSTAY=&signif_amm_multivar_stay. SLENTRY=&signif_amm_multivar_entry. VIF COLLIN adjrsq DW R influence noint;
output out=diagnostic;
quit;
ods output close;
By adding one regressor to the list, let's say x4, to the macro-variable &regressors., the beta value estimates change, although the selected variables are the same ones.
In practice, in both cases the variables chosen by the selection method are x1 and x2, but the beta parameters for x1 and x2 in the second case change with respect to the first case.
Could you provide an explanation for that?
It would be nice to have a reference for such explanation.
Thanks all in advance!
I'm going to guess that you have missing data. SAS removes records listwise, so if you add candidate variables that happen to have a few missing values, those entire records are dropped, which means you're not actually using the exact same data in the two regression models.
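A quick way to check this guess (a sketch only; TD stands in for the actual TD&eq. dependent variable, and the x names are the candidate regressors):

/* Compare how many non-missing values each candidate variable has */
proc means data=base_dati n nmiss;
   var TD x1 x2 x3 x4;
run;

/* Refit both models on the same complete-case subset so they are comparable */
data base_dati_cc;
   set base_dati;
   if cmiss(TD, x1, x2, x3, x4) = 0;   /* keep rows with no missing values */
run;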

Outputting predicted values in SAS proc mixed: Prohibitive performance issues

I've noticed strange behavior with SAS proc mixed: Models with a modestly large number of rows, which take only seconds to converge, nevertheless take upwards of half an hour to finish running if I ask for output of predicted values & residuals. The thing that seems perverse is that when I run the analogous models in R using nlme::lme(), I get the predicted values & residuals as a side effect and the models complete in seconds. That makes me think this is not merely a memory limitation of my machine.
Here's some sample code. I can't provide the real data for which I'm seeing this issue, but the structure is 1-5 rows per subject, ~1500 unique subjects, ~5,000 outcome-covariate sets total.
In SAS:
proc mixed data=testdata noclprint covtest;
class subjid ed gender;
model outcome = c_age ed gender / ddfm=kr solution residual outp=testpred;
random int c_age / type=un sub=subjid;
run;
In R:
lme.test <- lme(outcome ~ c_age + ed + gender, data=testdata,
random = ~c_age|factor(subjid), na.action=na.omit)
Relevant stats: Win7, SAS 9.4 (64-bit), R 3.3, nlme 3.1-131.
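If the cost comes from the printed residual diagnostics rather than from writing the OUTP= dataset itself, one thing worth trying (an assumption about where the time goes, not something verified on this data) is to keep OUTP= but drop the RESIDUAL option and any ODS graphics:

ods graphics off;   /* skip residual/diagnostic plots */
proc mixed data=testdata noclprint covtest;
   class subjid ed gender;
   model outcome = c_age ed gender / ddfm=kr solution outp=testpred;
   random int c_age / type=un sub=subjid;
run;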

How does SAS calculate standard errors of coefficients in logistic regression?

I am doing a logistic regression of a binary dependent variable on a four-value multinomial (categorical) independent variable. Somebody suggested to me that it was better to put the independent variable in as multinomial rather than as three binary variables, even though SAS seems to treat the multinomial as if it is three binaries. Their reason was that, if given a multinomial, SAS would report std errors and confidence intervals for the three binary variables 'relative to the omitted variable', whereas if given three binaries it would report them 'relative to all cases where the variable was zero'.
When I do the regression both ways and compare, I see that nearly all results are the same, including fit statistics, Odds Ratio estimates and confidence intervals for odds ratios. But the coefficient estimates and conf intervals for those differ between the two.
From my reading of the underlying theory, as presented in Hosmer and Lemeshow's 'Applied Logistic Regression', the estimates and conf intervals reported by SAS for the coefficients are consistent with the theory for the regression using three binary independent variables, but not for the one using a 4-value multinomial.
I think the difference may have something to do with SAS's choice of 'design variables', as for the binary regression the values are 0 and 1, whereas for the multinomial they are -1 and 1. But I don't really understand what SAS is doing there.
Does anybody know how SAS's approach differs between the two regressions, and/or can explain the differences in the outputs?
Here is a link to the SAS output:
SAS output
And here is the SAS code:
proc logistic data=tab descending;
class binB binC binD / descending;
model y = binD binC binB ;
run;
proc logistic data=tab descending;
class multi / descending;
model y = multi;
run;
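For what it's worth, the -1 and 1 values described are consistent with PROC LOGISTIC's default effect coding for CLASS variables. Here is a small sketch (reference level left at the default) that switches the multinomial model to reference-cell (0/1) coding, which should make its coefficients and confidence intervals line up with the three-binary parameterization:

/* Same model, but with 0/1 (reference-cell) design variables instead of
   the default -1/1 effect coding */
proc logistic data=tab descending;
   class multi / param=ref descending;
   model y = multi;
run;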

Using margins with vce(unconditional) option after xtreg

I am using Stata 13 and I have a balanced panel dataset (t=Year and i=Individual denoted by Year and IndvID respectively) and the following econometric model
Y = b1*var1 + b2*var2 + b3*var1*var2 + b4*var3 + fe + epsilon
I am estimating the following fixed-effects regression with year dummies and a linear time trend:
xi: xtreg Y var1 var2 c.var1#c.var2 var3 i.Year i.IndvID|Year, fe vce(cluster IndvID)
(all variables are continuous except for dummies being created by i.Year and i.IndvID|Year)
I want Stata to derive/report the overall marginal effect of var1 and var2 on the outcome Y:
dY/dvar1 = b1 + b3*var2
dY/dvar2 = b2 + b3*var1
Because I estimate the fixed-effects regression using robust standard errors, I want to make sure the marginal effects are being computed taking into account the same heterogeneity that the clustered standard errors correct for. My understanding is that this can be achieved using the
vce(unconditional) option of the margins command. However, after running the above regression, when I run the command
margins, dydx(var1) vce(unconditional)
I get the following error:
xtreg is not supported by margins with the vce(unconditional) option
Am I missing something obvious here or am I not going about this correctly? How can I get clustered standard errors for the marginal effects Stata computes, rather than relying on the default delta-method standard errors, which don't correct for this?
Thanks in advance,
-Mark
The marginal effects of var1 and var2 are functions (of var2 and var1, respectively). If you want the marginal effect of var1 at the mean level of var2, for example, you can use the lincom command after the regression:
sum var2
local m1 = r(mean)
lincom var1 + `m1' * c.var1#c.var2
This calculates the point estimate of the marginal effect at the mean, along with its standard error derived from the cluster-robust covariance matrix estimated by xtreg.