How to save R-squared in SAS?

I have the following SAS code (last few lines). I need to save R-squared and adjusted R-squared, in addition to keeping CUSIP6 trandate CRSPtrandate tshares shrvol dolvol BSI intercept mktrf smb hml umd.
proc sort data=dailyFFrets;
by CUSIP6 trandate CRSPtrandate tshares dolvol BSI shrvol;
run;
options nonotes;
proc reg data=dailyFFrets outest=alpha (keep=CUSIP6 trandate CRSPtrandate
tshares shrvol dolvol BSI intercept mktrf smb hml umd) noprint;
by CUSIP6 trandate CRSPtrandate tshares dolvol BSI shrvol;
model rirf=mktrf smb hml umd;
quit;
options notes;
data alpha;
set alpha;
alpha=intercept*sign(tshares)*100;
run;

This might help: http://support.sas.com/kb/22/640.html
Use an ODS OUTPUT statement to save the table named FitStatistics to a data set. See this note for more on saving tables from procedures. For example,
ods output FitStatistics = fitstats;
proc reg data=in;
model y = x;
run;
You can also specify the RSQUARE and EDF options in the PROC REG or
MODEL statement to add R² and the error degrees of freedom,
respectively, to the OUTEST= data set. Requesting any additional
statistic results in R² being added to the OUTEST= data set.
For example, if you specify the ADJRSQ option, then adjusted R² (_ADJRSQ_)
and R² (_RSQ_) are added to the OUTEST= data set.
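Applied to the code in the question, the OUTEST= route might look like the sketch below. It assumes the data set and variable names from the question, and it assumes the two statistics appear in the OUTEST= data set as _RSQ_ and _ADJRSQ_, so both are added to the KEEP= list. The ODS OUTPUT route is not used here because the question runs PROC REG with NOPRINT.

```sas
proc sort data=dailyFFrets;
  by CUSIP6 trandate CRSPtrandate tshares dolvol BSI shrvol;
run;

/* ADJRSQ asks for adjusted R-square; R-square is added along with it. */
/* _RSQ_ and _ADJRSQ_ must be listed in KEEP= or they are dropped.     */
proc reg data=dailyFFrets
         outest=alpha (keep=CUSIP6 trandate CRSPtrandate tshares shrvol
                            dolvol BSI intercept mktrf smb hml umd
                            _RSQ_ _ADJRSQ_)
         adjrsq noprint;
  by CUSIP6 trandate CRSPtrandate tshares dolvol BSI shrvol;
  model rirf = mktrf smb hml umd;
run;
quit;
```

The later DATA step that computes alpha can stay as it is, since it only adds a column to the alpha data set.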

Related

Outputting p-values in SAS Proc Autoreg Procedure

I am able to output all sorts of statistics and values, but I cannot find a way to output the p-values for the significance of the parameter estimates.
I do get them in the Output window, but not in my output tables. Here is my code:
ods output PhilOul = philipps FitSummary = Stats;
proc autoreg data=ppnr_1.train outest=regression_13;
model mnos = ir_irs10y_yoyd ur_ap_yoy sav_yoyd_l1
/ stationarity=(PHILLIPS)
;
where date ge "&dev_start." and date le "&dev_end." ;
proc print data = regression_13;
run;
quit;
As you can see, I get the DW statistics (in the "Stats" table), the Phillips-Ouliaris test (in the "Philipps" table) and the parameter estimates (in "Regression_13"), but not the significance of these parameters.
Best regards, Niels
EDIT: I figured out how to output p-values in PROC REG: specify the TABLEOUT option. However, this option is not valid in PROC AUTOREG :-(
The p-values are in ODS output table ParameterEstimates. Change your code to be:
ods output
PhilOul = philipps
FitSummary = Stats
ParameterEstimates = Estimates
;
You can see which ODS output tables a procedure creates by using ODS TRACE; the table names are written to the log. You only need to trace once (or again if you forget :)).
ODS TRACE ON;
PROC ...;
ODS TRACE OFF;
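As a concrete illustration of the trace-then-capture pattern, here is a minimal sketch that uses PROC REG on sashelp.class instead of the PROC AUTOREG model from the question. The data set name estimates is only an illustrative choice.

```sas
/* Step 1: trace once to learn the table names (they appear in the log). */
ods trace on;
proc reg data=sashelp.class;
  model weight = height;
run;
quit;
ods trace off;

/* Step 2: capture the table of interest by name. For PROC REG the log */
/* lists NObs, ANOVA, FitStatistics and ParameterEstimates.            */
ods output ParameterEstimates = estimates;
proc reg data=sashelp.class;
  model weight = height;
run;
quit;

/* Inspect the captured table, including the p-value column. */
proc print data=estimates;
run;
```

The same two steps should carry over to PROC AUTOREG, with the table names taken from its own trace output.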

SAS - PROC SQL: How to show predicted values in a table using PROC REG?

I have run a regression on a data set. I would like to then add the predicted values into the original data set table. I would like the PredictedMS_Diff values to be added to the PROPreg_CSR_final dataset.
proc reg data=PROPreg_CSR_final outest=outest_model_1 covout plots=diagnostics(stats=(default aic
sbc));
title "CSR Final";
FinalCSR:MODEL MS_Diff_CSR=Rank_Delta_prop;
Output PREDICTED=PredictedMS_Diff
run;
title;
Your OUTPUT statement does not have an OUT= option, so SAS chooses the data set name itself. It is also missing a semicolon:
Output PREDICTED=PredictedMS_Diff
If that had worked, the result would have been a copy of the input data with PredictedMS_Diff added. For example:
proc reg data=sashelp.class;
model weight=height;
output out=pred predicted=p residual=r;
run;
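Carried over to the code in the question, the fix might look like this sketch. The output data set name PROPreg_CSR_pred is a hypothetical choice, and the PLOTS= option is left out only to keep the example short.

```sas
title "CSR Final";
proc reg data=PROPreg_CSR_final outest=outest_model_1 covout;
  FinalCSR: model MS_Diff_CSR = Rank_Delta_prop;
  /* OUT= names the data set; note the closing semicolon. */
  output out=PROPreg_CSR_pred predicted=PredictedMS_Diff;
run;
quit;
title;
```

PROPreg_CSR_pred then holds every column of PROPreg_CSR_final plus PredictedMS_Diff.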

Is there a way to find predicted R-squared using PROC REG in SAS?

I am looking to have SAS calculate the predicted R-squared value using PROC REG.
You mean the R-Square? It's very easy to get.
ods output FitStatistics = FitStatistics;
proc reg data = sashelp.class;
model height = age;
quit;
ods output close;
The R-Square is in the data set FitStatistics.
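If "predicted R-squared" instead means the PRESS-based statistic, 1 - PRESS / SST, one possible approach is to have PROC REG write PRESS to the OUTEST= data set and finish the arithmetic in a DATA step. The sketch below makes several assumptions: the model has an intercept, the EDF and PRESS options add _EDF_, _RSQ_ and _PRESS_ to the OUTEST= data set, and the data set names est and pred_rsq are illustrative.

```sas
proc reg data=sashelp.class outest=est edf press noprint;
  model height = age;
run;
quit;

data pred_rsq;
  set est;
  sse = (_RMSE_**2) * _EDF_;      /* error sum of squares from RMSE and error DF */
  sst = sse / (1 - _RSQ_);        /* corrected total SS, since R-square = 1 - SSE/SST */
  pred_rsq = 1 - _PRESS_ / sst;   /* PRESS-based predicted R-square */
  keep _RSQ_ _PRESS_ pred_rsq;
run;

proc print data=pred_rsq;
run;
```

Because PRESS is never smaller than SSE, the resulting value should not exceed the ordinary R-square.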

Output the dropped/excluded observation in Proc GLIMMIX - SAS

When I run a proc glimmix in SAS, sometimes it drops observations.
How do I get the set of dropped/excluded observations or maybe the set of included observations so that I can identify the dropped set?
My current PROC GLIMMIX code is as follows:
%LET EST=inputf.aarefestimates;
%LET MODEL_VAR3 = age Male Yearc2010 HOSPST
Hx_CTSURG Cardiogenic_Shock COPD MCANCER DIABETES;
data work.refmodel;
set inputf.readmref;
Yearc2010 = YEAR - 2010;
run;
PROC GLIMMIX DATA = work.refmodel NOCLPRINT MAXLMMUPDATE=100;
CLASS hospid HOSPST(ref="xx");
ODS OUTPUT PARAMETERESTIMATES = &est (KEEP=EFFECT ESTIMATE STDERR);
MODEL RADM30 = &MODEL_VAR3 /Dist=b LINK=LOGIT SOLUTION;
XBETA=_XBETA_;
LINP=_LINP_;
RANDOM INTERCEPT/SUBJECT= hospid SOLUTION;
OUTPUT OUT = inputf.aar
PRED(BLUP ILINK)=PREDPROB PRED(NOBLUP ILINK)=EXPPROB;
ID XBETA LINP hospst hospid Visitlink Key RADM30;
NLOPTIONS TECH=NRRIDG;
run;
Thank you in advance!
It drops records that have a missing value in any variable used by the model, that is, any variable in a CLASS, BY, MODEL, or RANDOM statement. So you can check for missing values among those variables to see which records are affected. Usually the output data set will also indicate this by not having predictions for the records that are not used.
You can run the code below.
*create fake data;
data heart; set sashelp.heart; run;
*Logistic Regression model, ageCHDdiag is missing ;
proc logistic data=heart;
class sex / param=ref;
model status(event='Dead') = ageCHDdiag height weight diastolic;
*generate output data;
output out=want p=pred;
run;
*explicitly flag records as included;
data included;
set want;
if missing(pred) then include='N'; else include='Y';
run;
*check that Y equals total obs included above;
proc freq data=included;
table include;
run;
The output will show:
The LOGISTIC Procedure
Model Information
Data Set WORK.HEART
Response Variable Status
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring
Number of Observations Read 5209
Number of Observations Used 1446
And then the PROC FREQ will show:
The FREQ Procedure
                                    Cumulative    Cumulative
include    Frequency     Percent     Frequency       Percent
------------------------------------------------------------
N               3763       72.24          3763         72.24
Y               1446       27.76          5209        100.00
And 1,446 records are included in both of the data sets.
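The other check mentioned above, looking for missing values among the model variables, can also be done at the record level without fitting anything. This is a minimal sketch for the same sashelp.heart example; the names flagged and dropped are illustrative.

```sas
data flagged;
  set sashelp.heart;
  /* CMISS counts missing values across numeric and character arguments. */
  /* A record is flagged if any variable used by the model is missing.   */
  dropped = (cmiss(status, sex, ageCHDdiag, height, weight, diastolic) > 0);
run;

proc freq data=flagged;
  tables dropped;
run;
```

The count with dropped=0 should agree with the "Number of Observations Used" reported by the procedure, and the flag identifies the individual records rather than only counting them.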
I think I answered my question.
The code line -
OUTPUT OUT = inputf.aar
gives the output of the model. This table includes all the observations used by the procedure, so I can match the data in this table to my input table and find the observations that get dropped.
@Reeza - I had already looked for missing values in all the columns of the data, but counting the records with missing values was not enough to identify which records were getting dropped. Thanks for the suggestion though.

How to put P-Values and R-squared into a SAS data set at the same time?

I'm running this code in SAS:
%let control = A;
%let test = B C D E F;
ods output ParameterEstimates = parms;
proc reg data=reg_data outest=work.model tableout;
model &control = &test / selection= rsquare adjrsq;
run;
proc sql;
create table max_r_square as
select *
from work.model
order by _ADJRSQ_ desc, _RSQ_ desc;
quit;
It effectively goes through all of the combinations of the test variable and then drops the information including R-Squared into a data set. From there I can choose the model that has the highest R-Squared.
My problem is, I can't find a way for the table to include R-Squared and P-Values at the same time while going through all combinations of the test variable.
Taking out the rsquare and adjrsq options gets the p-values in a table, but it keeps SAS from running code on all of the combinations.
I've been looking through the proc reg arguments and options and haven't found anything that works so far.
Is there a way to have SAS run a regression on all combinations of input variables and output R-Squared and P-Values into the same table?