Getting Chi-Square statistics in proc surveylogistic

By default, proc surveylogistic displays an F test in the "Testing Global Null Hypothesis: BETA=0" output. Can I somehow change that to a chi-square test?
Usually I use proc logistic, but this time I have a cluster variable and, to my knowledge, proc logistic can't handle those.
In the documentation I read that the F and chi-square tests are equivalent, but the significance tests I get differ from those proc logistic gives for the same analysis (although the point estimates for the intercept and my independent variable are the same).
I also tried the df=infinity option, but only the name of the test changes; the value stays the same.
Regards

Related

Kruskal-Wallis test vs. ANOVA in SAS with complex survey data?

I am analyzing the temporal trend (yr) of certain chemicals (a, b and c).
I used proc sgplot with a series statement to draw a plot and found a decreasing trend.
Because the data are right-skewed, I used the median concentration of each year to draw the plot.
Now I would like to conduct a statistical test on the trend. My data come from NHANES, so I need to use the proc survey** procedures for the analysis. I know I can do an ANOVA test with proc surveyreg by using the ANOVA option in the MODEL statement.
proc surveyreg data=a;
stratum stra;
cluster clus;
weight wt;
model a=yr/anova;
run;
But since the original data are right-skewed, I think it may be better to use a Kruskal-Wallis test on the original data. However, I don't know how to write the code in SAS, and I didn't find any information in the proc survey** documentation.
My plan B is to use the log-transformed data and an ANOVA test, but I am not sure whether that is an appropriate approach. Can somebody tell me how to get a normality test of the ANOVA residuals while using proc surveyreg? I would also like to know whether I can test a, b and c in one procedure call or whether I should write multiple calls with a different MODEL statement in each.
Looking forward to your engagement. Thank you!
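A minimal sketch of the plan B described above, assuming every concentration of a is positive so the log is defined. The data set names a_log and a_resid and the variable names log_a and resid_log_a are made up for illustration; the design variables are the ones from the question.

```sas
/* Log-transform chemical a; assumes all concentrations are positive */
data a_log;
    set a;
    log_a = log(a);
run;

proc surveyreg data=a_log;
    strata stra;
    cluster clus;
    weight wt;
    model log_a = yr / anova;
    /* Save the residuals so they can be inspected afterwards */
    output out=a_resid residual=resid_log_a;
run;
```

The saved residuals could then be examined with a tool such as PROC UNIVARIATE, keeping in mind that such a check does not account for the survey design. The same steps would be repeated with b and c in place of a.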

Understanding SAS output data sets

SAS has several forms it uses to create output data sets from within a procedure. It is not always clear whether or not a particular procedure can generate a data set and, if it seems to be able to, it's not always clear how.
Off the top of my head, here are some examples of how widely the syntax can differ.
Example 1
proc sort data = sashelp.baseball out = baseball_sorted;
by
league
division
;
run;
Example 2
proc means noprint data = baseball_sorted;
by
league
division
;
var nHits;
output
out = baseball_avg_hits (drop = _TYPE_ _FREQ_)
mean = mean_hits
;
run;
Example 3
ods exclude all;
ods output
statistics = baseball_statistics
equality = baseball_ftest
;
proc ttest data = baseball_sorted;
class league;
var nHits;
run;
ods exclude none;
Example 4
The PROC ANOVA OUTSTAT= option.
It seems almost as if SAS has implemented each of these willy-nilly. Is the SAS syntax dictating how to create a data set directed by some consistent approach I am not seeing or is it truly capricious and arbitrary?

For PROC code, the syntax for outputting data is often specific to that procedure, which often feels willy-nilly (your examples 1, 2, and 4). I think PROC developers are given a lot of freedom, and remember that many of these PROCs are 30+ years old.
The great thing about the Output Delivery System (ODS, your example 3) is it provides a single syntax for outputting data, regardless of the procedure. So you can use the ODS OUTPUT statement with (almost?) any PROC. The names and structures of the output objects will of course vary between PROCs. So if you are looking for a consistent approach, I would focus on using ODS OUTPUT. ODS was added in V7 (I think).
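Because the table names differ between procedures, one way to discover them is ODS TRACE, which writes the name of each output object to the log. A minimal sketch, using sashelp.baseball and the variables from the question's examples (the output data set name ttest_stats is made up for illustration):

```sas
/* Write the name and path of each ODS output object to the log */
ods trace on;
proc ttest data = sashelp.baseball;
    class league;
    var nHits;
run;
ods trace off;

/* Capture one of the traced tables by the name shown in the log */
ods output statistics = ttest_stats;
proc ttest data = sashelp.baseball;
    class league;
    var nHits;
run;
```

The first run only lists the available tables; the second run saves the one named Statistics as a data set.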
It would be interesting to try to find an example of an output dataset which could be made by a PROC but could not be made by ODS OUTPUT. I hope there aren't any. If that is the case, you could consider the range of OUTPUT statements/options within PROCs as legacy code.

Agree with Quentin. You have to remember that there are SAS systems out there running code written in the 80s. SAS would have a huge headache if they forced every team to rewrite all the procedures and then forced their customers to change all their code. SAS has been around since the 60s and the organic growth of the syntax is to be expected.
FWIW, having an OUT= option makes sense on procedures with no printed or graphical output, e.g. PROC SORT or PROC TRANSPOSE.

The way I see it, there are four main ways to specify output data sets:
1. In the PROC statement, you may be able to specify output options such as OUT= or OUTEST=.
2. The main statement of the procedure (e.g. MODEL or TABLES) can have options that allow for output; e.g. PROC FREQ has OUT= on the TABLES statement.
3. An explicit OUTPUT statement within a procedure. These are typically from older procedures, e.g. PROC MEANS.
4. ODS tables, a relatively newer method that is used more frequently these days since the format aligns with what you'd expect to see.
Yes, there are multiple places to check, but fortunately the SAS documentation for procedures is relatively clear with the options and how to use/specify the outputs.
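The question's examples already cover the PROC statement option, the OUTPUT statement, and ODS. The remaining route, an option on the procedure's main statement, might look like this sketch with PROC FREQ (the data set name baseball_counts is made up for illustration):

```sas
/* OUT= on the TABLES statement saves the crosstab counts and percents */
proc freq data = sashelp.baseball noprint;
    tables league * division / out = baseball_counts;
run;
```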
If I've missed anything that seems different, post in the comments and I can update this.
PS. Although SAS is definitely bad, trying to navigate different packages/modules in Python to export an XLSX file isn't straightforward either. Some packages support some options, others don't. I've given up on asking why these days and just accept it as a peculiarity of the different languages at this point.

SAS Proc TTest - Difference from fixed value

How do I test whether the mean of a variable is significantly different from the value 15?
What value does the test statistic need to exceed for 95% confidence that it is different (i.e. a 5% chance that the average is actually 15)?

To specify a null value to test against, look at the H0= option. I assume the 5% is the alpha level. If your test is two-sided, the default SIDES=2 is fine; otherwise use SIDES=L or SIDES=U for a lower or upper one-sided test.
Proc ttest data=sashelp.class h0=15 alpha=0.05 sides=2;
Var age;
Run;
All of these options are detailed in the documentation. The value used for comparison against the test statistic varies based on the sample size.
https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_sect002.htm
UCLA offers a detailed walkthrough of PROC TTEST:
http://www.ats.ucla.edu/stat/sas/output/ttest.htm
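On the second part of the question: for a two-sided test at alpha = 0.05, the absolute value of the t statistic is compared against the 1 - alpha/2 quantile of a t distribution with n - 1 degrees of freedom. A sketch of that calculation, assuming the 19 observations of sashelp.class with no missing age values (the variable names are made up for illustration):

```sas
data _null_;
    n = 19;        /* assumed sample size: rows in sashelp.class */
    alpha = 0.05;  /* two-sided significance level */
    /* Critical value: upper alpha/2 quantile of t with n - 1 df */
    t_crit = quantile('T', 1 - alpha / 2, n - 1);
    put t_crit= 8.4;
run;
```

With 18 degrees of freedom the critical value is roughly 2.10. In practice it is usually enough to check whether the Pr > |t| value in the PROC TTEST output is below alpha.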

How does SAS calculate standard errors of coefficients in logistic regression?

I am doing a logistic regression of a binary dependent variable on a four-value multinomial (categorical) independent variable. Somebody suggested to me that it was better to put the independent variable in as a multinomial rather than as three binary variables, even though SAS seems to treat the multinomial as if it were three binaries. Their reason was that, if given a multinomial, SAS would report standard errors and confidence intervals for the three binary variables 'relative to the omitted variable', whereas if given three binaries it would report them 'relative to all cases where the variable was zero'.
When I do the regression both ways and compare, I see that nearly all results are the same, including the fit statistics, the odds ratio estimates and the confidence intervals for the odds ratios. But the coefficient estimates and their confidence intervals differ between the two.
From my reading of the underlying theory, as presented in Hosmer and Lemeshow's 'Applied Logistic Regression', the estimates and confidence intervals reported by SAS for the coefficients are consistent with the theory for the regression using three binary independent variables, but not for the one using a 4-value multinomial.
I think the difference may have something to do with SAS's choice of 'design variables': for the binary regression the values are 0 and 1, whereas for the multinomial they are -1 and 1. But I don't really understand what SAS is doing there.
Does anybody know how SAS's approach differs between the two regressions, and/or can explain the differences in the outputs?
Here is a link to the SAS output:
SAS output
And here is the SAS code:
proc logistic data=tab descending;
class binB binC binD / descending;
model y = binD binC binB ;
run;
proc logistic data=tab descending;
class multi / descending;
model y = multi;
run;
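The -1/1 design variables mentioned above are what PROC LOGISTIC's default effect parameterization of CLASS variables produces. One way to test that explanation is to refit the multinomial model with reference (0/1) coding via PARAM=REF. This is only a sketch, and it assumes the coding scheme is the sole difference between the two fits:

```sas
/* Same model as above, but with 0/1 reference coding for the CLASS variable */
proc logistic data=tab descending;
    class multi / param=ref descending;
    model y = multi;
run;
```

Under reference coding each coefficient is a log odds ratio against the reference level. Under effect coding each coefficient compares a level with the average over all levels, while the odds ratio table is still reported against the reference level. Which level serves as the reference depends on the DESCENDING and REF= settings, so it may need adjusting to match the omitted category of the three-binary model.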

One-way random-effects ANOVA in SAS: PROC GLM or MIXED?

I'm attempting to conduct a simple one-way random-effects ANOVA in SAS. I want to know whether the population variance is significantly different from zero.
On UCLA's idre site, they recommend using PROC MIXED as follows:
proc mixed data = in.hsb12 covtest noclprint;
class school;
model mathach = / solution;
random intercept / subject = school;
run;
This makes sense to me given my previous experience with using PROC MIXED.
However, in the text Biostatistical Design and Analysis Using R by Murray Logan, he says for a one-way ANOVA, fixed and random effects are not distinguished and conducts (in R) a "standard" one-way ANOVA even though he's testing the variance, not the means. I've found that in SAS, his R procedure is equivalent to using any of the following:
PROC ANOVA
PROC GLM (same as ANOVA, but with GLM in place of ANOVA)
PROC GLM with RANDOM statement
The p-values from the above three models are the same, but differ from the PROC MIXED model used by UCLA. For my data, it's a difference of p=0.2508 and p=0.3138. Although conclusions don't change in this instance, I'm not really comfortable with this difference.
Can anyone give advice on which one is more appropriate and also why there is this difference?

For your model, the difference between PROC ANOVA and PROC MIXED is only due to numerical noise (the REML estimator in PROC MIXED). However, the p-values mentioned in your question correspond to different tests. To get the F value from the COVTEST output of PROC MIXED, you need to recalculate MS_groups taking into account the unequal sample sizes (either manually, as explained on p.231 of http://bio.classes.ucsc.edu/bio286/MIcksBookPDFs/QK08.PDF, or just by using PROC MIXED with the same fixed model spec as in PROC ANOVA). This paper (http://isites.harvard.edu/fs/docs/icb.topic1140782.files/S98.pdf) provides some examples of the use of PROC MIXED, in addition to the SAS manual.
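The second option mentioned above, running PROC MIXED with the same fixed model spec as PROC ANOVA, could be sketched as follows. The data set and variable names are taken from the UCLA example in the question; treating school as a fixed CLASS effect here is an interpretation of "same fixed model spec", not something the answer spells out.

```sas
/* PROC MIXED with school as a fixed effect: the F test appears in the fixed-effects tests table */
proc mixed data = in.hsb12;
    class school;
    model mathach = school;
run;

/* PROC GLM counterpart for comparing the F value and p-value */
proc glm data = in.hsb12;
    class school;
    model mathach = school;
run;
quit;
```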
