SAS: robust regression and output coefficients, t values and adj R squares - sas

I am running a robust regression by group in SAS.
My data looks like this:
id stock date stock_liq market_liq
1 VOD 1/5/2016 0.03 0.02
1 VOD 2/5/2016 0.04 0.025
... ... ... ... ...
2 SAB 1/5/2016 0.31 0.02
2 SAB 1/5/2016 0.31 0.02
... ... ... ... ...
It is panel data, and each stock has a unique ID. I want to run a robust regression by ID and output the coefficients, t values and adjusted R-squared for each one.
My code is:
proc robustreg data=have outest= want noprint;
model stock_liq=market_liq ;
by id;
run;
However, the code does not run properly. SAS stops and the log gives me
"Error: Too many parameters in the model".
Can anyone advise? Thank you!

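The syntax is a bit off. The requested outputs can also be added with an OUTPUT statement:
proc robustreg data=have outest=want noprint;
by id;
model stock_liq=market_liq;
/* use a new name for the predicted values so the input variable stock_liq is not reused */
output out=output_sas
p=stock_liq_pred
r=stock_liqresid;
run;
See the documentation for more on the OUTPUT statement options.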
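The OUTPUT statement only saves observation-level values such as predictions and residuals. For the coefficients and fit statistics per ID, ODS OUTPUT is one option. The following is a minimal sketch, not a tested solution: the ODS table names ParameterEstimates and GoodFit are assumptions to confirm with ODS TRACE ON on your release, and the data set names pe_by_id and fit_by_id are made up for illustration. NOPRINT is left out because it suppresses ODS tables. Also note that, as far as I know, ROBUSTREG reports chi-square tests and a robust R-square rather than t values and an adjusted R-square.

```sas
/* BY-group processing needs the data sorted by the BY variable */
proc sort data=have;
  by id;
run;

/* Hide the printed output but still capture the tables as data sets */
ods exclude all;

/* Table names are assumptions; confirm them with ODS TRACE ON */
ods output ParameterEstimates=pe_by_id GoodFit=fit_by_id;

proc robustreg data=have;
  by id;
  model stock_liq = market_liq;
run;

ods exclude none;
```

Each output data set should then contain the BY variable id, so the rows can be matched back to the stocks.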

Related

ERROR: The ID value "xxxxxxxxxxxx" occurs twice in the same BY group. when transposing a complex dataset

I have a strange data set and I am hoping you all can help me. It holds the levels of certain environmental contaminants in a group of research participants; each contaminant is measured in multiple ways and comes with its limit of detection. I need these in a wide format, but unfortunately they are currently long and the naming conventions don’t easily translate.
This is what it looks like now:
ID Class Name Weight Amount_lipids Amount_plasma LOD
1 AAA Lead 1.55 44.0 10.0 5.00
1 AAB Mercury 1.55 222.0 100.0 75.00
2 AAA Lead 1.25 25.5 12.0 5.00
I have tried various forms of Proc Transpose with no luck and this seems to be more complex than what specifying a prefix can handle.
I want it to look like this:
ID Weight Lead_lip Lead_plas Lead_LOD Mercury_lip Mercury_plas Mercury_LOD
1 1.55 44.0 10.0 5.0 222.0 100.0 75.0
2 1.25 25.5 12.0 5.0 . . .
I tried a two step transpose process but received the following error ERROR: The ID value "xxxxxxxxxxxx" occurs twice in the same BY group
by id weight name;
run;
proc transpose data=want_intermediate out=want;
by id weight;
id name _name_;
run;
You likely have a record with the same ID and weight so it's duplicated.
You can add a counter for each ID record and use that. This is a double wide transpose, and it looks like your code was cut off. So to add an enumerator for each ID:
data temp;
set have;
by id;
if first.id then count=1;
else count+1;
run;
Then modify your PROC TRANSPOSE to use ID and count in the BY statement.
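Before adding the counter, it can help to see which records actually trigger the error. A minimal sketch, assuming the data set is called have and that a repeated combination of id and name is what collides in the ID statement (adjust the grouping columns if your ID statement uses something else):

```sas
/* List every id/name combination that appears more than once */
proc sql;
  select id, name, count(*) as n_rows
  from have
  group by id, name
  having count(*) > 1;
quit;
```

If this returns no rows, the duplicates are coming from somewhere else, for example from the intermediate data set produced by the first transpose.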

Converting daily data to weekly data in SAS

I have the DAILY returns of industry portfolios in SAS.
I would like to calculate the WEEKLY returns.
The daily returns are in percentages, so I think the weekly return should just be the sum of the daily returns during each week.
One obvious problem is that weeks can contain different numbers of trading days.
The table I have in SAS is in the following format:
INDUSTRY_NUMBER DATE DAILY_RETURN
Any help would be greatly appreciated.
I have tried this:
proc expand data=Day_result
out=Week_result from=day to=week;
Industry_Number Trading_Date;
convert Value_weighted_return / method=aggregate observed=total;
run;
The daily data is in Day_Result. When I remove the Industry_Number Trading_Date line, i.e.
proc expand data=Day_result
out=Week_result from=day to=week;
convert Value_weighted_return / method=aggregate observed=total;
run;
This works in the sense that it aggregates to weekly returns, but it does so for the whole table rather than for each category.
So if I have 40 categories, I want the weekly returns for each category separately.
The second block of code instead gives one weekly return across all categories combined.
EXAMPLE DATA:
data have;
format trading_date date9.;
infile datalines dlm=',';
input trading_date:ddmmyy10. industry_number value_weighted_return;
datalines;
19/01/2000,1, -0.008
20/01/2000,1, 0.008
23/01/2000,1, 0.008
24/01/2000,1, -0.007
25/01/2000,1, -0.009
26/01/2000,1, 0.008
27/01/2000,1, -0.008
30/01/2000,1, 0.003
31/01/2000,1, -0.001
01/02/2000,1, 0.004
02/02/2000,2, -0.008
03/02/2000,2, -0.005
06/02/2000,2, -0.004
07/02/2000,2, -0.009
08/02/2000,2, 0.002
09/02/2000,2, 0.006
10/02/2000,2, 0.008
13/02/2000,2, 0.008
14/02/2000,2, 0.002
15/02/2000,2, 0.01
16/02/2000,2, -0.008
;
run;
Sort your data by INDUSTRY_NUMBER Trading_Date, use INDUSTRY_NUMBER as a by-group, identify your time variable.
proc sort data=have;
by industry_number trading_date;
run;
Next, convert your data into a time-series to remove any time gaps. Set any missing days as the previous value since it does not change on those trading days (e.g. weekends, bank holidays, etc.).
proc timeseries data=have
out=have_ts;
by industry_number;
id trading_date interval=day
setmissing=previous
accumulate=average
;
var value_weighted_return;
run;
Finally, take the time-series output and convert it from day to week. Since you are using weights, you may want to use average rather than total.
proc expand data=have_ts
out=have_ts_week
from=day
to=week
;
by industry_number;
id trading_date;
convert Value_weighted_return / method=aggregate observed=average;
run;
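If a plain sum per calendar week is all that is needed, as the question first suggests, PROC SQL with INTNX is a simpler route that needs no gap filling. This is a minimal sketch under two assumptions: weeks start on Sunday (the default for the WEEK interval), and summing daily returns is an acceptable approximation. The names week_sum, week_start and weekly_return are made up for illustration.

```sas
/* Map every trading date to the first day of its week, then sum within industry and week */
proc sql;
  create table week_sum as
  select industry_number,
         intnx('week', trading_date, 0, 'b') as week_start format=date9.,
         sum(value_weighted_return) as weekly_return
  from have
  group by industry_number, calculated week_start
  order by industry_number, week_start;
quit;
```

Weeks with fewer trading days simply contribute fewer terms to the sum, which handles the varying number of days per week.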

How to print the Somers' D in a SAS dataset?

As the title suggests, I wonder whether there is a way to write the Somers' D statistic and the p-value of the predictor x to a dataset.
You can get such statistics by simply running:
ODS TRACE ON;
PROC LOGISTIC DATA = BETTING.TRAINING_DUMMIES NOPRINT;
MODEL Z1 (EVENT = '1') = D_INT_LNGAP_1;
OPTIONS;
RUN;
ODS TRACE OFF;
ODS OUTPUT FITSTATISTICS=FITDS;
PROC LOGISTIC DATA = BETTING.TRAINING_DUMMIES NOPRINT;
MODEL Z1 (EVENT = '1') = D_INT_LNGAP_1;
OPTIONS;
RUN;
If I run a similar code to the one proposed here, I get only the AIC, the SIC and finally the LR stat and in the SAS log I find:
10 ODS TRACE ON;
11
12 PROC LOGISTIC DATA = BETTING.TRAINING_DUMMIES NOPRINT;
13 MODEL Z1 (EVENT = '1') = D_INT_LNGAP_1;
14 OPTIONS;
15 RUN;
NOTE: PROC LOGISTIC is modeling the probability that z1=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 3968 observations read from the data set BETTING.TRAINING_DUMMIES.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds
16
17 ODS TRACE OFF;
in the first piece of code, while in the second I find the following:
18 ODS OUTPUT FITSTATISTICS=FITDS;
NOTE: Writing HTML Body file: sashtml.htm
19 PROC LOGISTIC DATA = BETTING.TRAINING_DUMMIES NOPRINT;
20 MODEL Z1 (EVENT = '1') = D_INT_LNGAP_1;
21 OPTIONS;
22 RUN;
NOTE: PROC LOGISTIC is modeling the probability that z1=1.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 3968 observations read from the data set BETTING.TRAINING_DUMMIES.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
WARNING: Output 'FITSTATISTICS' was not created. Make sure that the output object name, label,
or path is spelled correctly. Also, verify that the appropriate procedure options are
used to produce the requested output object. For example, verify that the NOPRINT
option is not used.
Can anyone suggest a way to write these statistics to a new dataset?
Any help will be appreciated.
Thanks!
You're not getting ODS TRACE or ODS OUTPUT results because of the NOPRINT option: it suppresses all ODS tables, which is what the warning in your log is hinting at. Remove it.
The tables you want are called Association and ParameterEstimates. Somers' D requires the Odds Ratio statement to be created.
ods trace on;
ods output association=somers parameterestimates=pe;
proc logistic data=sashelp.heart;
model status=ageatstart;
oddsratio ageatstart;
run;
ods trace off;
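Once the two data sets exist, the single values can be pulled out with ordinary data steps. A minimal sketch, assuming the Association table stores its statistics in the columns Label2 and nValue2 and that ParameterEstimates has the columns Variable, Estimate and ProbChiSq; these column names are assumptions, so check them with PROC CONTENTS first. The output names somers_only, d_stat and pvalues are made up for illustration.

```sas
/* Inspect the real column names before relying on them */
proc contents data=somers;
run;

/* Keep only the Somers' D row (column names are assumptions) */
data somers_only;
  set somers;
  where Label2 = "Somers' D";
  d_stat = nValue2;
  keep d_stat;
run;

/* Keep the estimate and p-value of each predictor, dropping the intercept */
data pvalues;
  set pe;
  where Variable ne 'Intercept';
  keep Variable Estimate ProbChiSq;
run;
```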

Proc Transpose 2 columns together in SAS [duplicate]

In SAS, I have a data set similar to the one below.
ID TRACT meanFA sdFA medianFA
1 t01 0.56 0.14 0.56
1 t02 0.53 0.07 0.52
1 t03 0.71 0.08 0.71
2 t01 0.72 0.09 0.72
2 t02 0.83 0.10 0.86
2 t03 0.59 0.10 0.62
I am not sure if transpose is the right concept here... but I would want the data to look like the one below.
ID t01_meanFA t01_sdFA t01_medianFA t02_meanFA t02_sdFA t02_medianFA t03_meanFA t03_sdFA t03_medianFA
1 0.56 0.14 0.56 0.53 0.07 0.52 0.71 0.08 0.71
2 0.72 0.09 0.72 0.83 0.10 0.86 0.59 0.10 0.62
proc transpose data=TRACT out=newTRACT;
var meanFA sdFA medianFA;
by id;
id tract meanFA sdFA medianFA;
run;
I have been playing around with the SAS code above, but with no success. Any ideas or suggestions would be great!
Double transpose is how you get to that. Get it to a dataset that has one row per desired variable per ID, so
ID=1 variable=t01_meanFA value=0.56
ID=1 variable=t01_sdFA value=0.14
...
ID=2 variable=t01_meanFA value=0.72
...
Then transpose using ID=variable and var=value (or whatever you choose to name those columns). You create the intermediate dataset by creating an array of your values (array vars[3] meanFA sdFA medianFA;) and then iterating over that array, setting variable name to catx('_',tract,vname(vars[n])); (vname gets the variable name of the array element).
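A minimal sketch of that array approach, assuming the input data set is called tract as in the question and is already sorted by id; the names tract_long, tract_wide, variable and value are made up for illustration:

```sas
/* Step 1: one row per ID per desired output variable */
data tract_long;
  set tract;
  array vars[3] meanFA sdFA medianFA;
  length variable $32;
  do n = 1 to dim(vars);
    /* e.g. t01_meanFA: tract value plus the name of the array element */
    variable = catx('_', tract, vname(vars[n]));
    value = vars[n];
    output;
  end;
  keep id variable value;
run;

/* Step 2: spread the rows back out, naming the columns from VARIABLE */
proc transpose data=tract_long out=tract_wide(drop=_name_);
  by id;
  id variable;
  var value;
run;
```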
You need two transposes. Transpose, use a data step to update the _NAME_ variable, and then transpose again:
proc transpose data=tract out=tract2;
by id tract;
run;
data tract2;
format _name_ $32.;
set tract2;
_name_ = strip(tract) || "_" || strip(_name_);
run;
proc transpose data=tract2 out=tract3(drop=_name_);
by id;
/*With no ID statement, the _NAME_ variable is used*/
var col1;
run;
Using example data from this duplicate question.
You can also just do this with a data step.
First, put the maximum sequence number into a macro variable.
proc sql;
select
max(sequence_no) into : maxseq
from
have
;
quit;
Create arrays for your new variables, setting the dimensions with the macro variable. Then loop over each visit, putting the events and notes into their respective variables. Output 1 line per visit.
data want(drop=sequence_no--notes);
do until (last.visit_no);
set have;
by id visit_no;
array event_ (&maxseq);
array notes_ (&maxseq) $;
event_(sequence_no)=event_code;
notes_(sequence_no)=notes;
end;
output;
run;

SAS: how can I retrieve a value from a regression result and assign it to a variable

How can I assign the intercept estimate to a variable i, so that i = 906.73916? Thanks.
Parameter Estimates
Variable   Label                 DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept  Intercept              1           906.73916        28.26505    32.08    <.0001
acs_k3     avg class size k-3     1            -2.68151         1.39399    -1.92    0.0553
meals      pct free meals         1            -3.70242         0.15403   -24.04    <.0001
full       pct full credential    1             0.10861         0.09072     1.20    0.2321
ODS is very helpful for this. The names of different output components differ for different procs. Example for PROC REG below, should be about the same for most regression PROCS:
ods output ParameterEstimates=MyIntercept(where=(Variable="Intercept"));
proc reg data=sashelp.class;
model weight=age;
run;
quit;
ods output close;
proc print data=MyIntercept;
run;
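To go from the MyIntercept data set to an actual variable named i, two common options are a macro variable or a one-row merge. A minimal sketch, assuming MyIntercept has exactly one row and an Estimate column as produced by the PROC REG example above; the output data set name class_with_i is made up for illustration:

```sas
/* Option 1: store the estimate in a macro variable called i */
data _null_;
  set MyIntercept;
  call symputx('i', Estimate);
run;
%put Intercept is &i;

/* Option 2: attach the estimate as a variable i on every row of another data set */
data class_with_i;
  set sashelp.class;
  /* read the single row once; values read by SET are retained across iterations */
  if _n_ = 1 then set MyIntercept(keep=Estimate rename=(Estimate=i));
run;
```

The macro variable holds the value as text, so the second option is usually the better fit when i feeds into further numeric calculations.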