I am building models with proc hpgenselect, but I can't set the significance level. In the docs I found this option: ALPHA= specifies a global significance level. However, SAS still uses the default value of 0.05 when building the model (see the image below).
I wanted to see which parameters enter the model at different significance levels, but right now I can't. &significance. is a macro variable. My code:
%let significance = 0.15;
proc hpgenselect data=MySet ALPHA=&significance.;
model Y = &Var./ dist=nb ALPHA=&significance.;
id id;
selection method=STEPWISE(stop=SL) DETAILS=SUMMARY;
run;
Try SLS=&significance on the SELECTION statement. I believe that controls the alpha for selection. The ALPHA= option on the MODEL statement is for the confidence intervals produced, and ALPHA= on the PROC HPGENSELECT statement also controls the confidence intervals.
proc hpgenselect data=MySet ALPHA=&significance.;
model Y = &Var./ dist=nb ALPHA=&significance.;
id id;
selection method=STEPWISE(stop=SL SLS=&significance) DETAILS=SUMMARY;
run;
That should give you what you want.
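If effects should both enter and stay at the same level, a variant worth trying sets the entry level as well. This is only a sketch, assuming the SELECTION statement in your SAS release accepts SLE= (SLENTRY=) alongside SLS= (SLSTAY=); check the procedure documentation for your version before relying on it.

```sas
%let significance = 0.15;

proc hpgenselect data=MySet;
    model Y = &Var. / dist=nb;
    id id;
    /* Assumption: SLE= is the level an effect needs to enter, */
    /* SLS= is the level it needs to stay in the model         */
    selection method=STEPWISE(stop=SL sle=&significance. sls=&significance.) DETAILS=SUMMARY;
run;
```

Rerunning this with different values of &significance. should then show which effects enter at each level.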
Related
I am using proc transreg to test different transformations in the sashelp.baseball dataset. I request all plots and sometimes I can see a curve fit graph and sometimes I can't. Is there something I am missing if I want to output the regression fit with the code below?
DATA BASEBALL;
SET SASHELP.BASEBALL;
RUN;
ODS GRAPHICS ON;
ODS OUTPUT
NObs = num_obs
FitStatistics = fitstat
Coef = params
;
PROC TRANSREG
DATA=BASEBALL
PLOTS=ALL
SOLVE
SS2
PREDICTED;
;
MODEL_1:
MODEL POWER(logsalary/parameter=1) = log(nruns);
OUTPUT OUT = fitted_model;
RUN;
For clarity, the regression fit plot is a scatter plot with the estimated regression line fitted through the points.
The fit plot is generated only when the dependent variable is not transformed. To get this graph, create the transformed variable ahead of time and model it with identity().
From documentation:
ODS Graph Name: FitPlot
Plot Description: Simple Regression and Separate Group Regressions
Statement and Option: MODEL, a dependent variable that is not transformed, one non-CLASS independent variable, and at most one CLASS variable
This code works for me:
PROC TRANSREG
DATA=sashelp.BASEBALL
PLOTS=ALL
SOLVE
SS2
PREDICTED;
;
MODEL_1:
MODEL identity(logsalary) = log(nruns);
OUTPUT OUT = fitted_model;
RUN;
And generates the desired graph.
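The same idea should carry over to other dependent-variable transformations: compute the transformed variable in a DATA step first, then pass it through identity(). A minimal sketch, where baseball_t and sqrt_salary are hypothetical names used only for illustration:

```sas
/* Build the transformed dependent variable before PROC TRANSREG */
data baseball_t;
    set sashelp.baseball;
    /* Hypothetical transformation; rows with a missing salary stay missing */
    if not missing(salary) then sqrt_salary = sqrt(salary);
run;

ods graphics on;

proc transreg data=baseball_t plots=all;
    /* identity() applies no transformation to the dependent variable, */
    /* which is the condition the documentation lists for FitPlot      */
    model identity(sqrt_salary) = log(nruns);
run;
```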
Is it possible to score a data set with a model created by PROC ARIMA in SAS?
This is the code I have that is not working:
proc arima data=work.data;
identify var=x crosscorr=(y(7) y(30));
estimate outest=work.arima;
run;
proc score data=work.data score=work.arima type=parms predict out=pred;
var x;
run;
When I run this code I get an error from the PROC SCORE portion that says "ERROR: Variable x not found." The x column is in the data set work.data.
proc score does not support autocorrelated variables. The simplest way to get an out-of-sample score is to combine both proc arima and a data step. Here's an example using sashelp.air.
Step 1: Generate historical data
We leave out the year 1960 as our score dataset.
data have;
set sashelp.air;
where year(date) < 1960;
run;
Step 2: Generate a model and forecast
The nooutall option tells proc arima to only produce the 12 future forecasts.
proc arima data=have;
identify var=air(12);
estimate p=1 q=(2) method=ml;
forecast lead=12 id=date interval=month out=forecast nooutall;
run;
Step 3: Score
Merge together your forecast and full historical dataset to see how well the model did. I personally like the update statement because it will not replace anything with missing values.
data want;
update forecast(in=fcst)
sashelp.air(in=historical);
by Date;
/* Per-observation forecast errors (missing where there is no forecast) */
Error = Forecast-Air;
PctError = Error/Air;
AbsPctError = abs(PctError);
/* Helpful for bookkeeping */
if(fcst) then Type = 'Score';
else if(historical) then Type = 'Est';
format PctError AbsPctError percent8.2;
run;
You can take this code and convert it into a generalized macro for yourself. That way in the future, if you wanted to score something, you could simply call a macro program to get what you need.
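One possible shape for such a macro is sketched below. The macro name, its parameters, and the _forecast work data set are hypothetical, and the model specification is simply copied from the steps above, so it would need to be adapted for other series.

```sas
/* Hypothetical macro: the name and all parameters are illustrative */
%macro arima_score(hist=, actual=, var=, id=date, interval=month, lead=12, out=want);

    /* Fit on the historical data and keep only the future forecasts */
    proc arima data=&hist.;
        identify var=&var.(12);
        estimate p=1 q=(2) method=ml;
        forecast lead=&lead. id=&id. interval=&interval. out=_forecast nooutall;
    run;
    quit;

    /* Overlay the actual values on the forecasts */
    data &out.;
        length Type $5;
        update _forecast(in=fcst)
               &actual.(in=historical);
        by &id.;

        Error       = Forecast - &var.;
        PctError    = Error / &var.;
        AbsPctError = abs(PctError);

        if fcst then Type = 'Score';
        else if historical then Type = 'Est';

        format PctError AbsPctError percent8.2;
    run;

%mend arima_score;

/* Reproduces the sashelp.air example from the steps above */
%arima_score(hist=have, actual=sashelp.air, var=air);
```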
Below is a sample of my dataset:
Within the variable "Country" I have countries belonging to Group A and Group B (dummy variables).
I want to do a panel regression in SAS on the returns of these countries as such:
model Returns = Event(0,1)
with the added condition that, for example,
I only want to consider countries belonging to Group A, and during a Pre-2000 period.
Is there a way to code that in SAS using this current dataset?
SAS/ETS provides proc panel, which models panel data. Note that you must have identical time periods for each cross-section. If you don't, you'll need to prepare the data with proc timeseries or proc expand beforehand.
Once you read your data in, you'll use proc panel with a where statement to construct the model. The ID statement is a bit different in proc panel. It first expects the cross-section variable, then the time ID variable.
proc panel data=have;
where GroupA = 1
AND year(date) < 2000;
id country date;
class event;
model Returns = Event;
run;
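proc panel also expects the input to be sorted by cross section and then by time within each cross section. A minimal sketch of that preparation step, reusing the data set and variable names from the example (have_sorted is a hypothetical name):

```sas
/* Sort by cross section first, then by the time ID */
proc sort data=have out=have_sorted;
    by country date;
run;

proc panel data=have_sorted;
    where GroupA = 1 and year(date) < 2000;
    id country date;
    class event;
    model Returns = Event;
run;
```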
I have a problem with SAS proc logistic.
I was using the following procedures when I had OLS regression and everything worked OK:
proc reg data = input_data outest = output_data;
model y = x1-x25 / selection = cp aic stop = 10;
run;
quit;
Here I wanted SAS to estimate all possible regressions using combinations of 25 regressors (x1-x25) including no more than 10 regressors in model.
Basically, I want to do the same thing (estimate all possible models having 25 regressors with no more than 10 included in a model and output top-models in a dataset with corresponding AIC) but with logistic regression.
I also know that I can use selection = score in proc logistic, but I'm not sure how to use outest= with it, or whether the score chi-square is really a reliable alternative to Cp and AIC in proc reg.
So far, I know how to do stepwise/backward/forward logistic regressions, but these methods do not suit me well and btw they display in the output dataset only the top-1 model, while I want at least top-100.
Any help or advice will be highly appreciated!
I need help with proc reg in sas. Currently I'm using the following code:
proc reg data=input outest=data_output;
model y = x1-x25 / selection = cp;
run;
quit;
I wonder how to set a maximum on the number of regressors that enter my model. As you can see, I want SAS to test 25 variables, but I also want it to select no more than 7 variables for my model.
One more question: does anybody know why SAS outputs only 601 model combinations when I use the procedure above? Why doesn't it show all possible models that it can create with these 25 regressors?
Any comments and help will be appreciated!
Use the STOP= option in the model statement.
proc reg data=input outest=data_output;
model y = x1-x25 / selection = cp stop=7;
run;
quit;