SAS: How to perform a maximum likelihood estimation using PROC NLMIXED? - sas

All I am trying to do is perform a maximum likelihood estimation of the parameters of a one-sided truncated normal. I think I have specified the likelihood properly, but I keep getting this error:
ERROR: Invalid Operation.
ERROR: Termination due to Floating Point Exception
I don’t think there is anything wrong with my code.
data ln;
input dor 8.;
qt=quantile("normal", dor, 0, 1);
datalines;
0.10
0.20
0.15
0.22
0.15
0.10
0.08
0.09
0.12
;
run;
/* obtain number of accounts */
%let dsn = ln;
%let dsnid = %sysfunc(open(&dsn));
%let nobs=%sysfunc(attrn(&dsnid,nlobs));
%let rc =%sysfunc(close(&dsnid));
proc sql noprint;
select count(*), mean(qt), std(qt) into :nobs, :mean, :std
from ln;
quit;
%put &nobs.;
%put &mean.;
%put &std.;
proc nlmixed data=LN;
parms mu &mean. sigma &std.; * initial values of parameters;
bounds 0 < sigma; * bounds on parameters;
LL = logpdf("normal", qt, mu, sigma) - &nobs.*logcdf("normal",qt, mu, sigma);
model qt ~ general(LL);
run;

Actually, it only fails to run in SAS Enterprise Guide (EG). It ran fine in Base SAS.

You should check your version of SAS.
From the documentation here:
http://support.sas.com/kb/46/318.html
This error occurs when the client system is running the 32-bit version
of SAS because the 32-bit version of SAS cannot open a table that
contains more than 2,147,483,647 observations. This number is the
largest value that can be stored in a 32-bit variable, and it is
sometimes referred to as 2G-1, where 2G-1 means 2^31-1.
It seems possible to fix, but if the code runs well in Base SAS and you don't want to spend time on system configuration, I'd suggest simply continuing to run it there.
Regards,

Related

Why do I get a stack overflow error in SAS while running stepwise regression

I am trying to run a stepwise regression in SAS on 2.9M rows with 300 columns. I am getting the error below.
ERROR: Event Stack Overflow. This is probably caused by mis-matched begin and end event calls.
My code is:
* Forward Selection;
proc reg data=work.bs_bm_final_data outest=est1;
model y = A_004 - A_300 / selection=forward slentry=0.99 ss2 sse aic;
output out=out1 p=p r=r;
run;
quit;
This may be a problem with your version of SAS or something around your machine's configuration. For a server this would be a relatively small dataset but for a consumer desktop or laptop it could be too much.
Can you try running proc glmselect instead and see if it works? The adapted code is below.
proc glmselect data=work.bs_bm_final_data;
model y = A_004 - A_300 / selection=forward(select=sl sle=0.99) showpvalues;
output out=out p=p r=r;
run;
Otherwise, SAS Tech Support would be a good option to contact.

How can I calculate the average of multiple beta coefficients from a rolling window?

I'm a beginner to SAS (and stackexchange) and would really appreciate some assistance.
I currently have a regression setup in SAS which is a basic macro that runs the same single variable OLS model each year, using the prior 20 observations. That is, the year 1952 uses data from 1933 to 1952, and yields a beta for that period, with associated errors.
My goal is to calculate an average of these coefficients for each decade, but I want to take into account the errors associated with each regression coefficient. Then, for each decade, I'd like to test whether the betas for the rolling windows over that decade are significantly different from zero.
I currently have one large dataset which simply lists the standard error, p-value, beta, and date for every period. Below is a worked example of my simple code:
%MACRO A_ROLLING (i);
proc reg data=dataset
outest=temp_dataset (drop=_model_) tableout;
model y = x / NOINT;
WHERE t >(1+&i) and t <(22+&i);
run;
data A_EQ1ROLLING;
set A_EQ1ROLLING temp_dataset;
run;
%mend A_ROLLING;
DATA A_EQ1ROLLING;
RUN;
%MACRO A_ROLL;
%do j=0 %to 64;
%A_ROLLING (&j);
%end;
%MEND;
%A_ROLL;
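To average the betas while accounting for their errors, one standard approach (an assumption on my part, not from the thread) is a fixed-effects, inverse-variance weighted average: weight each beta by 1/SE², pool within the decade, and test the pooled estimate against zero. A minimal sketch, assuming the combined dataset has columns named beta, se, and year (the thread only says it stores beta, standard error, p-value, and date):

```sas
/* Hedged sketch: fixed-effects (inverse-variance) pooling per decade.
   Column names beta, se, and year are assumptions. */
data weights;
    set A_EQ1ROLLING;
    decade = floor(year/10)*10;   /* e.g. 1950 for 1952 */
    w  = 1/(se**2);               /* inverse-variance weight */
    wb = w*beta;
run;

proc sql;
    create table pooled as
    select decade,
           sum(wb)/sum(w) as beta_bar,   /* pooled beta per decade */
           sqrt(1/sum(w)) as se_bar      /* SE of the pooled estimate */
    from weights
    group by decade;
quit;

data pooled;
    set pooled;
    z = beta_bar/se_bar;                 /* test H0: pooled beta = 0 */
    p = 2*(1 - probnorm(abs(z)));
run;
```

Note that this fixed-effects pooling ignores any variation of the true beta within the decade; if the betas genuinely drift, a random-effects weighting would be more appropriate.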

Difference between PROC UNIVARIATE and PROC SEVERITY for fitting continuous (positive-support) distributions

My goal is to fit data to a distribution with positive support (Weibull (2p), gamma (2p), Pareto (2p), lognormal (2p), exponential (1p)). As a first attempt, I used PROC UNIVARIATE. This is my code:
proc univariate data=fit plot outtable=table;
var week1;
histogram / exp gamma lognormal weibull pareto;
inset n mean(5.3) std='Standard Deviation'(5.3)
/ pos = ne header = 'Summary Statistics';
axis1 label=(a=90 r=0);
run;
The first thing I noticed is that there's no Kolmogorov statistic shown for the Weibull distribution. Then I used PROC SEVERITY instead.
proc severity data=fit print=all plots(histogram kernel)=all;
loss week1;
dist exp pareto gamma logn weibull;
run;
Now I got the KS statistic for the Weibull distribution.
Then I compared the KS statistics produced by PROC SEVERITY and PROC UNIVARIATE. They're different. Why? Which one should I use?
I do not have access to SAS/ETS so cannot confirm this with PROC SEVERITY, but I imagine that the difference you are seeing comes down to the way the distribution parameters are fitted.
With your PROC UNIVARIATE code you are not requesting estimation for several of the parameters (some are, in some cases, set to 1 or 0 by default; see sigma and theta in the user guide). For example:
data have;
do i = 1 to 1000;
x = rand("weibull", 5, 5);
output;
end;
run;
ods graphics on;
proc univariate data = have;
var x;
/* Request maximum likelihood estimates of the scale and threshold parameters */
histogram / weibull(theta = EST sigma = EST);
/* Request maximum likelihood estimate of the scale parameter, with the threshold fixed at 0 */
histogram / weibull;
run;
You will note that when an estimate of theta is requested, SAS also produces the KS statistic; this is due to the way SAS estimates the fit statistic, which requires known distribution parameters (full explanation here).
My guess is that you are seeing different fit statistics between the two procedures because either they are returning slightly different fits, or they use different calculations for the estimation of fit statistics. If you are interested you can investigate how they perform their parameter estimation in the user guide (proc severity and proc univariate). If you wanted to investigate further you could force the distribution parameters to match in both procedures and then compare the fit statistics to see how far they differ.
I would recommend that, if possible, you use only one of the procedures, and that you select the one that best fits your needs in terms of output.
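One way to do that comparison in PROC UNIVARIATE is to fix the Weibull parameters at the values PROC SEVERITY estimated, rather than requesting estimation; the EDF statistics are then computed for a fully specified distribution. The parameter values below are placeholders:

```sas
/* Sketch: fix all three Weibull parameters (shape c, scale sigma,
   threshold theta) instead of estimating them. The values 5, 5, and 0
   are placeholders; substitute the estimates from PROC SEVERITY. */
proc univariate data = have;
    var x;
    histogram / weibull(c = 5 sigma = 5 theta = 0);
run;
```

With all parameters specified, any remaining difference in the KS statistic between the two procedures should come from the calculation itself rather than from the fit.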

Difference between PROC UNIVARIATE in SAS 9.1 vs SAS 9.3

In SAS 9.1, this code works fine and includes the missing values, which I need. As soon as I ported this program to SAS 9.3, it gave me the wrong minpoint values and excluded the missing values. How do I include the missing values and also why is it giving me the wrong output?
data myData;
input value;
datalines;
-2.47
-4
-5
5
6
7
8
9
10
12
;
run;
proc univariate data = myData noprint;
histogram value /
barwidth = 0.05
endpoints = (-2.5 to 2.45 by 0.05)
outhist = histogram
nochart;
run;
This is the HISTOGRAM dataset as output from SAS 9.1, which is correct:
MinPoint Cumpercent
-2.45 10%
-2.4 0%
-2.35 0%
However, in SAS 9.3 I get these results:
MinPoint Cumpercent
-2 10%
The first problem in the SAS 9.3 output is that observations with CUMPERCENT=0 are excluded. The second problem is that the minpoints are wrong.

How to calculate a p-value using Fisher's exact test in SAS

How do I calculate a p-value using PROC FREQ (by Fisher's exact method) in SAS? I am getting a warning note when using the CHISQ method in PROC FREQ.
WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test
Please provide the syntax, and explain why I am getting this warning.
Thanks
Current Code:
ods OUTPUT Freq.Table1.ChiSq=P1_&TR1&V1(WHERE=(Statistic="Chi-Square") RENAME=(Prob=COL&TR1));
PROC FREQ DATA=P&V1;
TABLE TREATMENT*SSA/CHISQ ;
WHERE TREATMENT IN (1 &TR1);
RUN;
QUIT;
All you need to do is put / FISHER instead of / CHISQ in your TABLES statement.
You'll need to change the ODS statement as well; the table name is 'FishersExact'.
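Adapting the posted code, a sketch might look like the following. The FishersExact ODS table stores statistics as Name1/nValue1 pairs; the row name XP2_FISH for the two-sided p-value is an assumption worth verifying with ODS TRACE in your SAS version:

```sas
/* Sketch of the adapted code. "XP2_FISH" (two-sided Fisher p-value)
   is an assumption; run ODS TRACE ON to confirm the table layout. */
ods output FishersExact=P1_&TR1&V1(where=(Name1="XP2_FISH")
                                   rename=(nValue1=COL&TR1));
proc freq data=P&V1;
    tables TREATMENT*SSA / fisher;
    where TREATMENT in (1 &TR1);
run;
```

For a 2x2 table, / CHISQ alone already produces Fisher's exact test, but / FISHER requests it explicitly and also works for larger tables.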