Difference between PROC UNIVARIATE in SAS 9.1 vs SAS 9.3

In SAS 9.1, this code works fine and includes the missing values, which I need. As soon as I ported this program to SAS 9.3, it gave me the wrong minpoint values and excluded the missing values. How do I include the missing values and also why is it giving me the wrong output?
data myData;
input value;
datalines;
-2.47
-4
-5
5
6
7
8
9
10
12
;
run;
proc univariate data = myData noprint;
histogram value /
barwidth = 0.05
endpoints = (-2.5 to 2.45 by 0.05)
outhist = histogram
nochart;
run;
This is the HISTOGRAM dataset as output from SAS 9.1, which is correct:
MinPoint Cumpercent
-2.45 10%
-2.4 0%
-2.35 0%
However, in SAS 9.3 I get these results:
MinPoint Cumpercent
-2 10%
The first problem in the SAS 9.3 output is that observations with CUMPERCENT=0 are excluded. The second problem is that the minpoints are wrong.

Related

SAS Proc Freq Not Displaying Values

I am doing some simple cross tabulations using Proc Freq, but I'm noticing that the output SAS gives me doesn't contain any frequency counts; I'm only getting percents.
Here is an example code that I ran in SAS (I am using SAS 9.4):
data test;
input year 1-5 group $6;
cards;
2018 A
2018 A
2018 B
2018 B
2019 A
2019 A
2019 A
2019 B
;
run;
proc freq data = test;
table year * group / norow nopercent;
run;
I'm expecting a table that has the frequency counts with the column percentage below, but instead, this is what SAS is giving me:
Does anyone know how I can get the frequency values to be shown?
I ran your code and got this. I reckon there is something you are not telling us.
Thank you all for your help. I found the issue: there was a problem with the cross-tab frequency template that came with SAS. I was able to restore it with the following code:
proc template;
delete base.freq.crosstabfreqs;
run;
Thank you all for your help!
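For anyone hitting the same symptom, one way to check whether a customized copy of the template is shadowing the shipped one is to list what is stored in your own template store. This is only a sketch, and it assumes the default user item store name SASUSER.TEMPLAT:

```sas
/* List any Base.Freq templates saved in the user's own item store.
   Assumption: the user store is the default SASUSER.TEMPLAT. */
proc template;
   list Base.Freq / store=sasuser.templat;
run;
```

If CrossTabFreqs shows up in that listing, a user-modified version exists, and deleting it lets SAS fall back to the shipped definition.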
@_null_, your image is NOT the output I get when running the question's code.
In my output, Frequency and Col Pct are NOT in row header cells; instead they are shown in a box offset to the left of the table.

SAS: How to perform a maximum likelihood estimation using PROC NLMIXED?

All I am trying to do is to perform a maximum likelihood estimation of the parameters of a one-side truncated normal. I think I have specified the likelihood properly but I keep getting this error:
ERROR: Invalid Operation.
ERROR: Termination due to Floating Point Exception
I don’t think there is anything wrong with my code.
data ln;
input dor 8.;
qt=quantile("normal", dor, 0, 1);
datalines;
0.10
0.20
0.15
0.22
0.15
0.10
0.08
0.09
0.12
;
run;
/* obtain number accounts */
%let dsn = ln;
%let dsnid = %sysfunc(open(&dsn));
%let nobs=%sysfunc(attrn(&dsnid,nlobs));
%let rc =%sysfunc(close(&dsnid));
proc sql noprint;
select count(*), mean(qt), std(qt) into :nobs, :mean, :std
from ln;
quit;
%put &nobs.;
%put &mean.;
%put &std.;
proc nlmixed data=LN;
parms mu &mean. sigma &std.; * initial values of parameters;
bounds 0 < sigma; * bounds on parameters;
LL = logpdf("normal", qt, mu, sigma) - &nobs.*logcdf("normal",qt, mu, sigma);
model qt ~ general(LL);
run;
Actually, it only fails to run in SAS Enterprise Guide (EG). It ran fine on Base SAS.
You should check which version of SAS you are running.
According to the documentation here:
http://support.sas.com/kb/46/318.html
This error occurs when the client system is running the 32-bit version
of SAS because the 32-bit version of SAS cannot open a table that
contains more than 2,147,483,647 observations. This number is the
largest value that can be stored in a 32-bit variable, and it is
sometimes referred to as 2G-1, where 2G-1 means 2^31-1.
It seems possible to fix this, but if the code runs well on Base SAS and you don't want to spend time on system configuration, I suggest you keep running it there.

SAS proc ttest or proc mixed

Hey, so I'm trying to do a two-tailed t test. I'll just post the question:
Solution 1: 9.9, 10.6, 9.4, 10.3, 10.0, 9.3, 10.3, 9.8
Solution 2: 10.2, 10.6, 10.0, 10.2, 10.7, 10.4, 10.5, 10.3
(a) Do the data indicate that the claim that both solutions have the same mean etch rate is valid? Use alpha = 0.05 and assume equal variances.
(b) Find a 95 percent confidence interval on the difference in mean etch rates.
Here is my code.
data one;
input one @@; /* trailing @@ reads several values from one data line */
cards;
9.9 10.6 9.4 10.3 10.0 9.3 10.3 9.8
;
run;
data two;
input two @@;
cards;
10.2 10.6 10.0 10.2 10.7 10.4 10.5 10.3
;
run;
data one_two;
set one two;
run;
I have tried using PROC TTEST, but I am having a hard time getting the two different data sets compared with one another. I was going to use PROC MIXED, but I am not getting any output from it. Any hints or tips would be very much appreciated.
Thanks.
Good day,
The issue seems to be with the layout of the data. The code
data one_two;
set one two;
run;
Creates table like
one two
1 .
1 .
. 2
. 2
. 2
What you want to do is merge the data row by row:
data one_two;
merge one two;
run;
proc ttest data=one_two;
var one two;
run;
For more about merging in SAS see documentation.
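One caveat: with VAR one two and no CLASS statement, PROC TTEST runs a separate one-sample test on each variable rather than comparing the two solutions. A minimal sketch of the two-sample comparison follows. It assumes a long layout with a grouping variable named solution and a measurement variable named rate (both names are illustrative and not from the original post):

```sas
/* Long layout: one row per measurement.
   solution and rate are illustrative names, not from the question. */
data etch;
   input solution rate @@;
   datalines;
1 9.9  1 10.6 1 9.4  1 10.3 1 10.0 1 9.3  1 10.3 1 9.8
2 10.2 2 10.6 2 10.0 2 10.2 2 10.7 2 10.4 2 10.5 2 10.3
;
run;

/* CLASS requests a two-sample comparison of rate between the two groups */
proc ttest data=etch alpha=0.05;
   class solution;
   var rate;
run;
```

The output includes a Pooled row, which corresponds to the equal-variances assumption in part (a), and confidence limits for the difference in means, which is what part (b) asks for.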

SAS PROC PRINT is really slow for me, any ideas?

Let me start by saying that I'm on a team that are all very new to SAS. We are using Enterprise Guide 5.1 in SAS 9.3, and have a set of schedule data arranged vertically (one or two rows per person per day). We have some PROC SQL statements, a PROC TRANSPOSE, and a couple other steps that together primarily make the data grouped by week and displayed horizontally. That set of code works fine. The first time the process flow runs, it takes a little extra time establishing the connection to the database, but once the connection is made, the rest of the process only takes a few seconds (about 6 seconds for a test run of 7 months of data: 58,000 rows and 26 columns of source data going to 6,000 rows, 53 columns of output).
Our problem is in the output. The end-users are looking for results in Excel, so we are using the SAS Excel add-in and opening a stored process. In order to get output, we need a PROC PRINT, or something similar. But using PROC PRINT on the results from above (6,000 rows x 53 columns) is taking 36 seconds just to generate. Then, it is taking another 10 seconds or so to render in EG, and even more time in Excel.
The code is very basic, just:
PROC PRINT DATA=WORK.Report_1
NOOBS
LABEL;
RUN;
We have also tried using a basic PROC REPORT, but we are only gaining 3 seconds: it is still taking 33 seconds to generate plus rendering time.
PROC REPORT DATA=WORK.Report_1;
RUN;
QUIT;
Any ideas why it is taking so long? Are there other print options that might be faster?
Tested on my laptop. Took about 13 seconds to output a table with 6000 records and 53 variables (I used 8 character long strings) with PROC PRINT and ODS HTML.
data test;
format vars1-vars53 $8.;
array vars[53];
do i=1 to 6000;
do j=1 to 53;
vars[j] = "aasdfjkl;";
end;
output;
end;
drop i j;
run;
ods html body="c:\temp\test.html";
proc print data=test noobs;
run;
ods html close;
File size was a little less than 11M.
If you are only using this as a stored process, you can make it a streaming process and write to _WEBOUT HTML. This will work for viewing in Excel and greatly reduces the size of the HTML generated (no CSS included).
data _null_;
set test end=last;
file _webout;
array vars[53] $;
format outstr $32.;
if _n_ = 1 then do;
put '<html><body><table>';
put '<tr>';
do i=1 to 53;
outstr = vname(vars[i]);
put '<th>' outstr '</th>';
end;
put '</tr>';
end;
put '<tr>';
do i=1 to 53;
put '<td>' vars[i] '</td>';
end;
put '</tr>';
if last then do;
put '</table></body></html>';
end;
run;
This takes .2 seconds to run and generated 6M of output. Add any HTML decorators as needed.

SAS Proc means: How to capture non default statistics in output dataset such as nmiss p1 p99 etc?

Original Question:
By default, PROC MEANS outputs N, MIN, MEAN, MAX and STD in the output dataset. How do I add NMISS, P1, P5, etc. to this list?
Additional info 1:
I want statistics on all numeric variables in my dataset. So I use _numeric_ in the var specification.
I want each statistic to be in a row, with the variables as columns.
Obs  _TYPE_  _FREQ_  _STAT_      var1  var2  var3  etc
1    0       84829   N       84826.00
2    0       84829   MIN         0.00
3    0       84829   MAX      5000.00
4    0       84829   MEAN      151.22
5    0       84829   STD      1989.47
6    0       84829   NMISS          3
7    0       84829   P1          2.00
8    0       84839   P99      4999.00
How do I do this?
Thanks!
Assuming you are using the output option in proc means (and not ODS OUTPUT), you can control what comes in that dataset like so:
proc means data=sashelp.class;
var age;
class sex;
output out=mymeans nmiss= P1= P5= /autoname;
run;
The full list of statistic names is available in the PROC MEANS documentation under "statistics keyword".
You can also achieve the same result (with a slightly different output format) with ODS OUTPUT.
ods output summary=mymeans;
ods trace on;
proc means data=sashelp.class nmiss p1 p5;
var age;
class sex;
run;
ods trace off;
ods output close;
ODS TRACE ON/OFF is only there to show the name of the table created (i.e., 'summary'). It's not needed in production. In this case you request statistics the same way you request them for the output window (in the PROC MEANS statement).
Based on your edits, you want it transposed (one row per statistic). You can't get that directly, but the transposition isn't very hard.
proc means data=sashelp.class nmiss p1 p5;
class sex;
var _numeric_;
output out=mymeans n= mean= nmiss= p1= p5= /autoname ;
run;
data mymeans_out;
set mymeans(drop=_type_ _freq_);
by sex;
array numvars _numeric_;
format var stat $32.;
do _t = 1 to dim(numvars);
var=scan(vname(numvars[_t]),1,'_');
stat=scan(vname(numvars[_t]),-1,'_');
value = numvars[_t];
output;
end;
keep sex var stat value;
run;
This has a few limitations. If your variable names have underscores in them already, the var=scan... line will need to be rewritten to use substr and find the last underscore, then var = substr(vname(...),1,position_of_last_underscore). Stat should be fine since it uses -1 (reverse direction). If your variable names might exceed ~23 characters, you may not get the exact variable name back out again as it may be truncated or modified. If that's the case, then the ODS OUTPUT solution from above will help you (as it provides in an additional column the name of the original variable), but some more work would be needed to relate that value to the truncated name.
I also drop _TYPE_ and _FREQ_, to simplify the array definition; if you need those, then you'd need to write a bit of code to exclude them from separately being output, and keep them.
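The underscore limitation mentioned above can be handled by taking the statistic from the end of the name first, then cutting it off to recover the variable name. A minimal sketch, assuming a made-up AUTONAME result of my_var_name_P99 (the name is hypothetical, chosen only to show a variable name that already contains underscores):

```sas
/* Illustrative only: fullname mimics an AUTONAME result for a variable
   whose own name contains underscores. */
data _null_;
   length fullname var stat $32;
   fullname = "my_var_name_P99";
   /* text after the last underscore is the statistic keyword */
   stat = scan(fullname, -1, '_');
   /* everything before that last underscore is the variable name */
   var = substr(fullname, 1, length(fullname) - length(stat) - 1);
   put var= stat=;
run;
```

In the transposing DATA step, the same two assignments could replace the var=scan(...) and stat=scan(...) lines, with vname(numvars[_t]) in place of fullname.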
This paper has an excellent discussion of the exact issue you describe, along with macro code to output a dataset fitting your description.
A Better Means — The ODS Data Trap
Update:
I've discovered that there is a more recent paper that "presents a revised version of the macro supporting additional features and eliminating a surprising error." This is the updated solution:
Solve the SAS® ODS Data Trap in PROC MEANS
The macro appears well designed and avoids a wide variety of possible issues. The contortions used to create the output dataset involve calls to proc means (of course), proc sql, proc contents, and proc datasets and extensive use of the macro language architecture, and a description of them would probably not be instructive in this answer. I don't claim to understand it entirely myself.
However, once you have compiled the macro you should be able to create your desired dataset with one simple statement.
%better_means(data=MyDataSet)
Now that I've found this convenient solution I may start to use it myself.