In SAS is there a fractional response procedure?

proc glimmix data=damage;
model y = lconc / dist=beta link=logit solution;
output out=gmxout pred(ilink)=gpredy lcl(ilink)=lower ucl(ilink)=upper;
run;

Related

Missing values in VARMAX

I have a dataset with visitors and weather variables, and I'm trying to forecast visitors based on the weather variables. Since the dataset only contains visitors in season, there are missing values and gaps in every year. Running proc reg in SAS works fine, but proc VARMAX cannot run the regression because of the missing values. How can I tackle this?
proc varmax data=tivoli4 printall plots=forecast(all);
id obs interval=day;
model lvisitors = rain sunshine averagetemp
dfebruary dmarch dmay djune djuly daugust doctober dnovember ddecember
dwednesday dthursday dfriday dsaturday dsunday
d_24Dec2016 d_05Dec2013 d_24Dec2017 d_24Dec2014 d_24Dec2015 d_24Dec2019
d_24Dec2018 d_24Sep2012 d_06Jul2015
d_08feb2019 d_16oct2014 d_15oct2019 d_20oct2016 d_15oct2015 d_22sep2017 d_08jul2015
d_20Sep2019 d_08jul2016 d_16oct2013 d_01aug2012 d_18oct2012 d_23dec2012 d_30nov2013 d_20sep2014 d_17oct2012 d_17jun2014
dFrock2012 dFrock2013 dFrock2014 dFrock2015 dFrock2016 dFrock2017 dFrock2018 dFrock2019
dYear2015 dYear2016 dYear2017
/p=7 q=2 Method=ml dftest;
garch p=1 q=1 form=ccc OUTHT=CONDITIONAL;
restrict
ar(3,1,1)=0, ar(4,1,1)=0, ar(5,1,1)=0,
XL(0,1,13)=0, XL(0,1,14)=0, XL(0,1,13)=0, XL(0,1,27)=0, XL(0,1,38)=0, XL(0,1,42)=0;
output lead=10 out=forecast;
run;
As with any forecast, you first need to prepare your time series. Run your data through PROC TIMESERIES to fill in or impute missing values. The most appropriate imputation choice depends on your variables. The code below will:
Sum lvisitors by day and set missing values to 0
Set missing values of averagetemp to the average
Set missing values of rain, sunshine, and your variables starting with d to 0 (assuming these are indicators)
Code:
proc timeseries data=have out=want;
id obs interval = day
setmissing = 0
notsorted
;
var lvisitors / accumulate=total;
crossvar averagetemp / accumulate=none setmissing=average;
crossvar rain sunshine d: / accumulate=none;
run;
Important Time Interval Consideration
Depending on your data, this could bias your error rate and estimates since you always know no one will be around in the off-season. If you have many missing values for off-season data, you will want to remove those rows.
Since PROC VARMAX does not support custom time intervals, you can instead create a simple time identifier. Alternatively, you can turn this identifier into a format with PROC FORMAT and convert time_id back to a date at the end.
data want;
set have;
time_id+1;
run;
proc varmax data=want;
id time_id interval=day;
...
output lead=10 out=myforecast;
run;
data myforecast;
merge myforecast
want(keep=time_id date)
;
by time_id;
run;
Or, if you made a format:
data myforecast;
set myforecast;
date = put(time_id, timeid.);
drop time_id;
run;
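The format used in the last step is not shown above. One possible way to build it, as a minimal sketch, is to feed PROC FORMAT a CNTLIN= dataset derived from want. This assumes want has one row per time_id and a SAS date variable named date; the dataset name timeid_fmt is made up for illustration.

```sas
/* Sketch: build a numeric format that maps time_id back to a readable date. */
/* Assumes WANT holds one row per time_id with a SAS date variable named date. */
data timeid_fmt;                      /* hypothetical dataset name */
    set want(keep=time_id date);
    retain fmtname 'timeid' type 'n'; /* numeric format called TIMEID. */
    start = time_id;                  /* value to be formatted */
    label = put(date, date9.);        /* text shown for that value */
    keep fmtname type start label;
run;

proc format cntlin=timeid_fmt;
run;
```

With this format in place, put(time_id, timeid.) returns the date as a character string such as 01JAN2019; wrap it in input(..., date9.) if a numeric SAS date is needed.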

Macro not loading data set

Previous Posts :
Variable check and summary out
Macro that outputs table with testing results of SAS table
Question/Problem
From the previous posts, I thought I was able to run the macro and produce the desired results. However, after finally getting a report back that the output is not working I'm really confused as to why I'm getting the error that there were missing variables. It appears as if the data set is not being loaded after sub-setting. I'm able to process basic summary statistic tables, but when I load the macro the output is not working.
Why is the data set not loading? Does a macro require a certain type of data set?
Note : A limitation is that I do not have access to the data set, so I must send code to be run and won't get results for a few days. It's a very long and frustrating process, but I'm sure some can relate.
The code that is causing problems is the macro (in beginning of code) and the very last section which calls the macro with the data set.
Error Log :
Code :
# Filename : Census2007_Hawaii_BearingCoffee_BigIsland.sas
/******************************************************************
Clearance Test Macro
input_dataset - desired dataset which variables are located
output_dataset - an output table with test results
variable_to_consider - list of variables to compute test on
*******************************************************************/
%macro clearance_test(input_dataset= ,output_dataset=, variable_to_consider=);
%let variable_to_consider=%cmpres(&variable_to_consider);
proc sql noprint;
select count(*) into : obs_count from &input_dataset;
quit;
%let obs_count=&obs_count;
proc transpose data=&input_dataset out=&output_dataset prefix=top_;
var &variable_to_consider;
run;
data &output_dataset;
set &output_dataset end=eof;
array top(*) top_&obs_count.-top_1;
x=dim(top);
call sortn(of top[*]);
total=sum(of top[*]);
top_2_total=sum(top_1, top_2);
if sum(top_1,top_2) > 0.9 * total then Flag90=1; else Flag90=0;
if top_1 > total * 0.6 then Flag60=1; else Flag60=0;
keep total top_1 top_2 _name_ top_2_total total Flag60 Flag90;
run;
%mend mymacro;
/***********************************************************************/
*Define file path statics;
Libname def 'P:\Hawaii_Arita\John_Hawaii_Coffee\Datasets';
Libname abc "P:\Hawaii_Arita\John_Hawaii_Coffee\Datasets";
option obs=max;
/* Initialize database */
DATA def.Census2007_Hawaii_Coffee;
SET abc.census2007_hawaii_SubSet_Coffee;
**<create the variables used in the macro> **;
RUN;
/* Clearance Test Results */
%clearance_test(input_dataset=def.census2007_hawaii_SubSet_Coffee, output_dataset=test_data ,variable_to_consider= OIR OIRO ROA ROAO SProfit
LProfit SProfitAcre LProfitAcre Profitable MachineandRent UtilityandFuel LaborH LaborO FertilizerandChem MaintandCustom
Interest Tax Dep Others TFPE_cal operators workers operatorsandworkers)
A Complete/Verifiable Example :
This has been tested on the remote machine and works perfectly.
/* Create test data set*/
data business_data;
do firm = 1 to 3;
revenue = rand("uniform");
costs = rand("uniform");
profits = rand("uniform");
vcost = rand("uniform");
output;
end;
run;
/******************************************************************
Clearance Test Macro
input_dataset - desired dataset which variables are located
output_dataset - an output table with test results
variable_to_consider - list of variables to compute test on
*******************************************************************/
%macro clearance_test(input_dataset= ,output_dataset=, variable_to_consider=);
%let variable_to_consider=%cmpres(&variable_to_consider);
proc sql noprint;
select count(*) into : obs_count from &input_dataset;
quit;
%let obs_count=&obs_count;
proc transpose data=&input_dataset out=&output_dataset prefix=top_;
var &variable_to_consider;
run;
data &output_dataset;
set &output_dataset end=eof;
array top(*) top_&obs_count.-top_1;
x=dim(top);
call sortn(of top[*]);
total=sum(of top[*]);
top_2_total=sum(top_1, top_2);
if sum(top_1,top_2) > 0.9 * total then Flag90=1; else Flag90=0;
if top_1 > total * 0.6 then Flag60=1; else Flag60=0;
keep total top_1 top_2 _name_ top_2_total total Flag60 Flag90;
run;
%mend mymacro;
/* Print summary table, run macro, and print clearance test table */
PROC MEANS data = business_data n sum mean median std;
VAR revenue costs profits vcost;
RUN;
%clearance_test(input_dataset=business_data, output_dataset=test_data ,
variable_to_consider=revenue costs profits vcost)
proc print data = test_data; run;
This is where a minimal, complete verifiable example (MCVE) would be helpful for testing whether your problem is a problem with the code, or the data.
Here's the code above, but with a SASHELP dataset (those are built-in to SAS so everyone has them).
%macro clearance_test(input_dataset= ,output_dataset=, variable_to_consider=);
%let variable_to_consider=%cmpres(&variable_to_consider);
proc sql noprint;
select count(*) into : obs_count from &input_dataset;
quit;
%let obs_count=&obs_count;
proc transpose data=&input_dataset out=&output_dataset prefix=top_;
var &variable_to_consider;
run;
data &output_dataset;
set &output_dataset end=eof;
array top(*) top_&obs_count.-top_1;
x=dim(top);
call sortn(of top[*]);
total=sum(of top[*]);
top_2_total=sum(top_1, top_2);
if sum(top_1,top_2) > 0.9 * total then Flag90=1; else Flag90=0;
if top_1 > total * 0.6 then Flag60=1; else Flag60=0;
keep total top_1 top_2 _name_ top_2_total total Flag60 Flag90;
run;
%mend clearance_test;
%clearance_test(input_dataset=sashelp.cars, output_dataset=work.test, variable_to_consider=mpg_city mpg_highway);
That's the exact macro, just using a different input dataset. It works correctly on my machine (the flag variables are meaningless since the data isn't right for them, but the code works).
Run the same on your colleague's machine. If it runs, then you know the data is the problem (i.e., the dataset doesn't have the variables you think it does). If it doesn't run, then you have some other problem (perhaps an issue with how the code is being submitted, such as spurious characters ending up in it).
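To check whether a dataset really has the variables you expect, you could also send along a quick listing of its columns. The sketch below uses sashelp.cars so it runs anywhere; swapping in the real libref and dataset name is left to you.

```sas
/* List the variables stored in a dataset, in creation order. */
/* sashelp.cars is a stand-in; replace it with the dataset passed to the macro. */
proc contents data=sashelp.cars varnum;
run;

/* The same information as a table, via the SQL dictionary view. */
/* LIBNAME and MEMNAME are stored in upper case. */
proc sql;
    select name, type
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CARS';
quit;
```

Comparing this listing against the variable_to_consider list shows at once which names are missing or misspelled.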

control the number of decimal places in SAS proc means

I am trying to report my proc means output with 10 decimal places by specifying maxdec=10. However, SAS does not report more than 7 decimal places.
Here is the warning I get:
WARNING: NDec value is inappropriate, BEST format will be used.
I appreciate any suggestion.
If you look at the documentation, it states that MEANS will print out 0-8 decimal places based on the value of MAXDEC. If you want more, you will need to save the results and print them yourself.
Try this:
data test;
format x 12.11;
do i=1 to 1000;
x = rannor(0);
output;
end;
drop i;
run;
proc means data=test noprint;
var x;
output out=means_out mean=mean std=std;
run;
proc print data=means_out noobs;
var mean std;
format mean std 12.11;
run;
As already mentioned, maxdec= works for limiting the number of decimal places below 8. Proc means isn't going to let you do too much to change the format of the summary statistics. I'd suggest using proc tabulate:
If your proc means looks like:
proc means data=yourdata;
var yourvariable;
run;
Then use something like:
proc tabulate data=yourdata;
var yourvariable;
table yourvariable*
(n
mean*format=15.10
stddev*format=15.10
min*format=15.10
max*format=15.10);
run;

Sensitivity and specificity

I need to calculate the following for my dataset. I could calculate the individual PPV (95% CI) and NPV (95% CI), but I got a bit confused about how to calculate this:
PPV+NPV-1 (95% CI)
How do I do this calculation?
This page on SAS support gives code as follows:
title 'Sensitivity';
proc freq data=FatComp;
where Response=1;
weight Count;
tables Test / binomial(level="1");
exact binomial;
run;
title 'Specificity';
proc freq data=FatComp;
where Response=0;
weight Count;
tables Test / binomial(level="0");
exact binomial;
run;
title 'Positive predictive value';
proc freq data=FatComp;
where Test=1;
weight Count;
tables Response / binomial(level="1");
exact binomial;
run;
title 'Negative predictive value';
proc freq data=FatComp;
where Test=0;
weight Count;
tables Response / binomial(level="0");
exact binomial;
run;
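For the combined quantity itself, note that PPV + NPV - 1 = P(Response=1 | Test=1) - P(Response=1 | Test=0), since 1 - NPV is the proportion of true positives among those who test negative. That is a difference of two proportions, so one possible approach is the RISKDIFF option of PROC FREQ. The sketch below is an assumption-laden illustration: the dataset diag and its counts are made up, laid out like FatComp (Test, Response, Count).

```sas
/* Hypothetical 2x2 counts in the same layout as FatComp. */
data diag;
    input Test Response Count;
    datalines;
1 1 11
1 0 4
0 1 2
0 0 6
;
run;

/* Put Test=1 in the first row and Response=1 in the first column. */
proc sort data=diag;
    by descending Test descending Response;
run;

proc freq data=diag order=data;
    weight Count;
    /* Column 1 risk difference = P(Response=1|Test=1) - P(Response=1|Test=0) */
    /*                          = PPV - (1 - NPV) = PPV + NPV - 1             */
    tables Test*Response / riskdiff;
run;
```

The "Difference" row of the Column 1 risk table then holds the estimate with its confidence limits (95% by default).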
I doubt that this is a useful measure. In general you should present sensitivity, specificity, positive and negative predictive values. If you want a global measure of accuracy you should go for the proportion of correctly classified subjects.
If you go to the webpage already suggested by Peter Flom, you can scroll down to a piece of code for overall accuracy. The accuracy can be computed by creating a binary variable indicating whether test and response agree in each observation:
data acc;
set FatComp;
if (test and response) or
(not test and not response) then acc=1;
else acc=0;
run;
proc freq;
weight count;
tables acc / binomial(level="1");
exact binomial;
run;
Hope it helps

how to customize proc freq to deal with missing values

I have the following code
data work.customBins;
retain fmtname 'bins' type 'n';
do binStart=-2.5 to 2.45 by 0.05;
binEnd=binStart+0.05;
difference=cat(binStart," to ",binEnd);
output;
end;
run;
proc format library=work cntlin=work.customBins; run;
proc freq data=work.myData;
table variable /missing;
format variable bins.;
run;
This code works properly. My only issue is that if a bin, for example -1.45 to -1.40, does not contain any values, proc freq disregards it. I want the cumulative frequency of the previous bin to be displayed for bins that have no values, for example:
-1.50 to -1.45: cumulative Freq = 2%
-1.45 to -1.40 has no values, but the cumulative Freq for this bin should also be 2%
I have also tried doing this
data work.combined;
set work.myData (in=a) work.customBins (in=b);
if a then cont=1;
if b then cont=0;
run;
proc freq data=work.combined;
table variable /missing;
format variable bins.;
weight cont/zeros;
run;
But this also does not work.
myData contains a single variable called variable, which holds decimal numbers in the range -2.45 to 2.45.
Here is a working variant:
data work.customBins;
do binStart=-2.5 to 2.45 by 0.05;
binEnd=binStart+0.05;
difference=cat(binStart," to ",binEnd);
output;
end;
run;
proc sql;
create table want as
select difference, count(variable) as count
from customBins left join mydata
on binStart < variable <= binEnd
group by difference
order by binStart;
quit;
proc freq data=want order=data;
tables difference;
weight count / zeros;
run;
Regarding your first variant: are you sure that your PROC FORMAT works as expected? The dataset used in the CNTLIN= option should have variables named START, END and LABEL, not arbitrarily named ones. In any case, it wouldn't work, because PROC FREQ only uses values that actually occur in the mydata dataset, no matter how many other labels you define in your format.
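For reference, a control dataset with the names PROC FORMAT expects might look like the sketch below. The dataset name binsFmt, the integer loop counter and the rounding are my own choices for illustration; SEXCL='Y' excludes the start of each range so that neighbouring bins do not overlap, matching the binStart < variable <= binEnd rule used in the join above.

```sas
/* Sketch: CNTLIN= dataset with the variable names PROC FORMAT looks for. */
data work.binsFmt(rename=(binStart=start binEnd=end));  /* hypothetical name */
    retain fmtname 'bins' type 'n' sexcl 'Y';   /* ranges are (start, end] */
    length label $20;
    do i = 0 to 99;                             /* 100 bins from -2.5 to 2.5 */
        binStart = round(-2.5 + 0.05*i, 0.01);  /* avoid floating-point drift */
        binEnd   = round(binStart + 0.05, 0.01);
        label = catx(' to ', put(binStart, 5.2), put(binEnd, 5.2));
        output;
    end;
    drop i;
run;

proc format library=work cntlin=work.binsFmt;
run;
```

This only makes the format itself valid; empty bins are still left out of the PROC FREQ table, which is why the join with zero counts is needed.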