Can SAS do what Stata's esttab does? - sas

Stata has a wonderful command, esttab, that reports multiple regressions in one table. Each column is a regression and each row is a variable.
Can SAS do the same thing? So far I can only get something like the following in SAS, and the table is not as nice as esttab's.
Thanks in advance.
data error;
input Y X1 X2 X3 ;
datalines;
4 5 6 7
6 6 5 9
9 8 8 8
10 10 2 1
4 4 2 2
6 8 3 5
4 4 6 7
7 9 8 8
8 8 5 5
7 5 6 7
9 8 9 8
0 2 5 8
6 6 8 7
1 2 5 4
5 6 5 8
6 6 8 9
7 7 8 2
5 5 8 2
5 8 7 8
;
run;
PROC PRINT;RUN;
proc reg data=error outest=est tableout alpha=0.1;
M1: MODEL Y = X1 X2 / noprint;
M2: MODEL Y = X2 X3 / noprint;
M3: MODEL Y = X1 X3 / noprint;
M4: MODEL Y = X1 X2 X3 / noprint;
proc print data=est;
run;

Thanks to Praneeth Kumar for the inspiration. I found related information at http://stats.idre.ucla.edu/sas/code/ummary-table-for-multiple-regression-models/
and adapted it to fit my needs.
/* 1. Define a picture format that wraps a value in parentheses */
proc format;
picture stderrf (round)
low-high=' 9.9999)' (prefix='(')
.=' ';
run;
/* 2. Run the regressions and capture the parameter estimates in a dataset */
ods output ParameterEstimates (persist)=t;
PROC REG DATA=error;
M1: MODEL Y = X1 X2 ;
M2: MODEL Y = X2 X3 ;
M3: MODEL Y = X1 X3 ;
M4: MODEL Y = X1 X2 X3 ;
run;
ods output close;
proc print data=t;
run;
/* 3. Use the format to lay the dataset out as a table, one column per model */
proc tabulate data=t noseps;
class model variable;
var estimate Probt;
table variable=''*(estimate =' '*sum=' '
Probt=' '*sum=' '*F=stderrf.),
model=' '
/ box=[label="Parameter"] rts=15 row=float misstext=' ';
run;
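The PROC TABULATE layout above can be pushed a little closer to esttab by attaching significance stars to each coefficient. The following is only a sketch, assuming the dataset t produced by the ODS OUTPUT step above, with its Model, Variable, Estimate, and Probt columns. The names t_star, wide, stars, and cell are hypothetical, and the star thresholds are an assumption that you should adjust to your own convention.

```sas
/* Build one display string per coefficient: estimate plus stars. */
data t_star;
  set t;
  length stars $3 cell $20;
  /* Guard against missing p-values: in SAS a missing value compares
     as smaller than any number. Thresholds below are assumptions. */
  if missing(Probt) then stars = '';
  else if Probt < 0.001 then stars = '***';
  else if Probt < 0.01 then stars = '**';
  else if Probt < 0.05 then stars = '*';
  else stars = '';
  cell = cats(put(Estimate, 8.3), stars);
  keep Model Variable cell;
run;

/* PROC TRANSPOSE with BY needs the data sorted by the BY variable. */
proc sort data=t_star;
  by Variable Model;
run;

/* One row per variable, one column per model (M1-M4). */
proc transpose data=t_star out=wide(drop=_name_);
  by Variable;
  id Model;
  var cell;
run;

proc print data=wide noobs;
run;
```

A variable that does not appear in a given model is simply left blank in that model's column.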

I have not used Stata myself, but I came across it as part of a project. Unfortunately, there is no good built-in way to do this in SAS. You can try installing the latest tagsets to get the desired output; excltags.tpl should help in this case.
For example:
ods path work.tmplmst(update) ;
filename tagset url 'http://support.sas.com/rnd/base/ods/odsmarkup/excltags.tpl';
%include tagset;
The code above installs the tagsets and stores them in the Work library, so it does not disturb the tagsets already installed on the system. Because Work is temporary, this step needs to be repeated every time you open a new SAS session.
ods listing close;
ods tagsets.ExcelXP file='Excelxp.xml';
/* your code starts here */
proc reg data=error outest=est tableout alpha=0.1;
M1: MODEL Y = X1 X2 / noprint;
M2: MODEL Y = X2 X3 / noprint;
M3: MODEL Y = X1 X3 / noprint;
M4: MODEL Y = X1 X2 X3 / noprint;
proc print data=est;
run;
/* your code ends here */
ods tagsets.ExcelXP close;
I am currently on my home desktop, which doesn't have SAS installed, so I have not tested this. It should export the regression results (coefficients, significance levels, etc.) as a table into Excel.
Let me know if this works. Also, please refer to this document for more information.

Related

Geometric mean in SAS for each column of a data set

Suppose I have the following SAS data set
data have;
input id$ x1 x2 x3;
datalines;
1 10 5 15
2 15 10 7
3 12 10 9
4 14 15 10
;
run;
I would like to calculate the geometric mean and the geometric coefficient of variation of each variable x1, x2, and x3. The formula for the latter is 100*(exp(ASD^2)-1)^0.5, where ASD is the arithmetic SD of the log-transformed data. How can I perform these operations?
As I am completely new to SAS, I would appreciate your support.
*Compute log value of x1, x2, x3;
data tab1;
set have;
logx1=log(x1);
logx2=log(x2);
logx3=log(x3);
run;
*Compute SD of logx1, logx2, logx3;
proc means data=tab1 noprint;
var logx1-logx3;
output out=tab2 n=n1-n3 std=std1-std3;
run;
*Compute logcv using formula;
data tab3;
set tab2;
logcv1=100*(exp(std1**2)-1)**0.5;
logcv2=100*(exp(std2**2)-1)**0.5;
logcv3=100*(exp(std3**2)-1)**0.5;
putlog 'NOTE: ' logcv1= logcv2= logcv3=;
run;
The result is shown in the log window:
NOTE: logcv1=18.155613536 logcv2=48.09165987 logcv3=32.538955751
Calculations like this are not difficult in SAS. Just work through them step by step and you will get there.
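The steps above cover the geometric coefficient of variation but not the geometric mean itself. Since the geometric mean is the exponential of the arithmetic mean of the logs, it can be obtained from the same log-transformed data. This is a minimal sketch, assuming the dataset tab1 (with logx1-logx3) created above. The names logmeans, geomeans, mlog1-mlog3, and gm1-gm3 are hypothetical.

```sas
/* Arithmetic mean of the log-transformed variables. */
proc means data=tab1 noprint;
  var logx1-logx3;
  output out=logmeans mean=mlog1-mlog3;
run;

/* Back-transform: geometric mean = exp(mean of logs). */
data geomeans;
  set logmeans;
  array mlog{3} mlog1-mlog3;
  array gm{3} gm1-gm3;
  do i = 1 to 3;
    gm{i} = exp(mlog{i});
  end;
  drop i _type_ _freq_;
  putlog 'NOTE: ' gm1= gm2= gm3=;
run;
```

As with the CV calculation, this only makes sense when every value of x1-x3 is strictly positive, because log() returns a missing value otherwise.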

Creating pie chart from single data row - SAS analytics

I have a dataset that consists of a product variable, an area variable and then each of the years from the last 10 years as an individual variable, so 12 variables in total for the dataset.
I can't work out how to display the data from a single row in a pie chart.
The dataset looks as such just to make it easier to visualise:
Product Area year1 year2 year3
1 1 7 14 7
1 2 12 15 11
1 3 5 9 8
2 1 4 12 5
2 2 8 3 14
2 3 5 0 2
3 1 2 12 12
My goal is to be able to specify, say, product 1 and area 3 and get a pie chart showing the values for each of the years. I can't figure out a way of doing it, though. My current knowledge and research suggest that pulling from a single row isn't possible?
First, stack the year variables into one column with proc transpose.
Then make a normal pie chart with BY Product Area; to get one chart per original row (assuming Product*Area is actually a unique ID for your rows). I used proc gchart here.
*** DEFINE DATA --------------;
data have;
infile datalines dlm=' ';
input Product Area year1 year2 year3;
datalines;
1 1 7 14 7
1 2 12 15 11
1 3 5 9 8
2 1 4 12 5
2 2 8 3 14
2 3 5 0 2
3 1 2 12 12
;run;
*** STACK YEARS --------------;
proc sort data=work.have out=work.tmp0temptableinput;
by product area;
run;
proc sql;
create view work.tt1 as
select src.*, "values" as _eg_idcol_
from work.tmp0temptableinput as src;
quit;
proc transpose data=work.tt1
out=work.tt2
name=year;
by product area;
id _eg_idcol_;
var year:; * THIS GENERALISES TO MORE THAN 3 YEARxx VARIABLES;
run;
proc datasets lib=work nolist;
modify tt2;
* Clear the labels on the two transposed columns;
label values = ;
label year = ;
run;
quit;
*** PLOT --------------;
proc sort
data=work.tt2
out=work.tt3;
by product area;
run;
proc gchart data =work.tt3;
pie year /
sumvar=values
type=sum
nolegend
slice=outside
percent=none
value=outside
other=4
otherlabel="other"
coutline=black
noheading
;
by product area;
run; quit;
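To chart a single product and area, as asked in the question, rather than one chart per BY group, a WHERE statement can restrict the stacked data to the row of interest. The following is a sketch, assuming the dataset work.tt3 built above. The macro variables product and area are hypothetical names introduced here to hold the selection.

```sas
/* Hypothetical selection: change these two values as needed. */
%let product = 1;
%let area = 3;

title "Product &product, area &area";
proc gchart data=work.tt3;
  /* Keep only the stacked rows that came from the chosen original row. */
  where product = &product and area = &area;
  pie year /
    sumvar=values
    type=sum
    slice=outside
    value=outside
    noheading;
run; quit;
title;
```

Because the filter is applied before plotting, no BY statement is needed and only one chart is produced.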

SAS - Create Table

I have two models that have output:
output out=m1 pred=p1;
output out=m2 pred=p2;
They run fine and create the below sample observations:
Sample of M1
Obs p1
1 0.98057
2 0.71486
3 0.91951
4 0.93073
5 0.93505
6 0.98788
7 0.94461
8 0.99449
9 0.93282
10 0.88654
AND
Sample of M2
Obs p2
1 0.97988
2 0.70704
3 0.91731
4 0.92880
5 0.93324
6 0.98746
7 0.94386
8 0.99431
9 0.93102
10 0.88404
Next I try to combine the two into a table with the statement:
proc sql;
create table ptable as select *
from m1 as a left join m2 as b on a.cnt=b.cnt;
quit;
But I'm getting this error:
ERROR: Column cnt could not be found in the table/view identified with the correlation name A.
ERROR: Column cnt could not be found in the table/view identified with the correlation name A.
ERROR: Column cnt could not be found in the table/view identified with the correlation name B.
ERROR: Column cnt could not be found in the table/view identified with the correlation name B.
So how do I put p1 and p2 into a table in SAS?
Below is the code used to generate p1 and p2 where DATA = source
/*initial model*/
proc hplogistic data=DATA;
model value(event='1')=X1-X20;
output out=m1 pred=p1;
run;
/*new model*/
proc hplogistic data=data;
model value(event='1')=X1-X20;
output out=m2 pred=p2;
run;
This is a case for performing a merge without a by statement. The tables will be joined row-by-row.
For example
data have1(keep=p1) have2(keep=p2);
do _n_ = 1 to 10;
p1 = 1 / _n_;
p2 = _n_ ** 2;
output;
end;
run;
data want;
merge have1 have2;
*** NO BY STATEMENT ***;
run;
proc print data=want;run;
-------- OUTPUT --------
Obs p1 p2
1 1.00000 1
2 0.50000 4
3 0.33333 9
4 0.25000 16
5 0.20000 25
6 0.16667 36
7 0.14286 49
8 0.12500 64
9 0.11111 81
10 0.10000 100
A merge without BY is atypical in the data processing that I normally do, because it relies entirely on both tables being in the same row order.
Here is the answer that ran the whole way through:
/*initial model*/
proc hplogistic data=DATA;
model value(event='1')=X1-X20;
output out=m1 pred=p1 copyvars=(cnt readmit);
run;
/*new model*/
proc hplogistic data=data;
model value(event='1')=X1-X20;
output out=m2 pred=p2 copyvar=cnt;
run;
cnt had to be copied into both output datasets so that the two tables could be joined on it.
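With cnt present in both m1 and m2, the join from the question can run. A minimal sketch, assuming cnt uniquely identifies each observation; the explicit column list is my own choice, made so that cnt is not selected from both tables, which select * would otherwise warn about.

```sas
proc sql;
  create table ptable as
  select a.cnt, a.p1, b.p2
  from m1 as a
  left join m2 as b
    on a.cnt = b.cnt;
quit;
```

Unlike the merge without BY, this does not depend on m1 and m2 being in the same row order.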

Why does proc arima with NoEst throw 'There is not enough data to fit the model' error?

I am using proc arima in SAS 9.4 to produce a forecast using a previously calibrated model, but it is throwing an error as if it is trying to calibrate the model itself:
ERROR: There is not enough data to fit the model
sample data:
data inputs;
input x var1 var2 var3 var4 var5;
datalines;
20 5 2 4 5 4
25 12 56 13 44 4
20 5 2 4 5 4
25 12 56 13 44 4
20 5 2 4 5 4
25 12 56 13 44 4
. 2 5 6 5 4
;
failing version:
proc arima;
identify
data = inputs
var = x
crossCorr = ( var1 var2 var3 var4 var5 )
noPrint;
estimate
p = 1 input = ( var1 var2 var3 var4 var5 )
ar = 0.9
initVal = ( 0.1$var1 0.2$var2 0.3$var3 0.4$var4 0.4$var5 )
noint
noEst /* Using noEst so should not need to do any estimation and short data-set should not be a problem */
method=ml
noprint
;
forecast lead=1 out=outputs noOutAll noprint;
quit;
If I remove the final variable from the model, it works fine:
proc arima;
identify
data = inputs
var = x
crossCorr = ( var1 var2 var3 var4 )
noPrint;
estimate
p = 1 input = ( var1 var2 var3 var4 )
ar = 0.9
initVal = ( 0.1$var1 0.2$var2 0.3$var3 0.4$var4 )
noint
noEst /* Using noEst so should not need to do any estimation and short data-set should not be a problem */
method=ml
noprint
;
forecast lead=1 out=outputs noOutAll noprint;
quit;
I can also get it to 'work' by adding one more value to the data. However, this shouldn't be necessary when the model is already calibrated (using much more data).
I've checked the SAS documentation to see if there are any flags to prevent the unnecessary check that causes this error but none of them helped.
The answer has been provided on the SAS communities forum. It is known behaviour and so my uncommon use case is not supported. The only workaround would be to add some dummy data, but in my case with MA terms that would change the results.
Response on SAS Communities

SAS proc SQL and arrays

This is a newbie SAS question. I have a dataset with numerical variables v1-v120 and V, and a categorical variable Z (with, say, three possible values). For each possible value of Z, I would like to get another set of variables w1-w120, where w{i}=sum(v{i})/V and the sum is taken over the rows with that value of Z. Thus I am looking for a 3*120 matrix in this case. I can do this in a data step, but would like to do it with Proc SQL or Proc MEANS, as the number of categorical variables in the actual dataset is moderately large. Thanks in advance.
Here's a solution using proc sql. You could probably also do something similar with proc means using an output dataset and a 'by' statement.
data t1;
input z v1 v2 v3;
datalines;
1 2 3 4
2 3 4 5
3 4 5 6
1 7 8 9
2 4 7 9
3 2 2 2
;
run;
%macro listForSQL(varstem1, varstem2, numvars);
%local numWithCommas;
%let numWithCommas = %eval(&numvars - 1);
%local i;
%do i = 1 %to &numWithCommas;
mean(&varstem1.&i) as &varstem2.&i,
%end;
mean(&varstem1.&numvars) as &varstem2.&numvars
%mend listForSQL;
proc sql;
create table t2 as
select
z,
%listForSQL(v, z, 3)
from t1
group by z
;
quit;
It's easy to do this with proc means. Using the t1 data set from Louisa Grey's answer:
proc means data=t1 nway noprint;
class z;
var v1-v3;
output out=t3 mean=w1-w3;
run;
This creates a table of results whose values match the SQL results (the columns are named w1-w3 here rather than z1-z3).
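Both answers compute group means, while the question asks for sum(v{i})/V. One possible reading, and it is only an assumption, is that each group's sum of v{i} should be divided by that group's sum of V. A sketch under that reading follows. The dataset t1v and its V values are made up for illustration (t1 above has no V column), and the names sums, ratios, s1-s3, and sV are hypothetical.

```sas
/* Illustrative data only: V is a made-up denominator column. */
data t1v;
  input z v1 v2 v3 V;
  datalines;
1 2 3 4 10
2 3 4 5 20
3 4 5 6 30
1 7 8 9 10
2 4 7 9 20
3 2 2 2 30
;
run;

/* Group sums of v1-v3 and of V, one row per value of z. */
proc means data=t1v nway noprint;
  class z;
  var v1-v3 V;
  output out=sums(drop=_type_ _freq_) sum=s1-s3 sV;
run;

/* w{i} = sum(v{i}) / sum(V) within each group (assumed reading). */
data ratios;
  set sums;
  array s{3} s1-s3;
  array w{3} w1-w3;
  do i = 1 to 3;
    w{i} = s{i} / sV;
  end;
  drop i s1-s3 sV;
run;
```

For the full problem, the array bounds and variable lists would change from 3 to 120; no macro is needed because the numbered range lists (v1-v120, w1-w120) expand automatically.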