I have conducted a competing-risks analysis using the Fine and Gray method in Stata, with commands similar to these:
stcrreg ifp tumsize pelnode, compete(failtype==2)
stcurve, cif at1(ifp=5 pelnode=0) at2(ifp=20 pelnode=0)
I could not get the 95% confidence intervals for the estimates. Can someone help me get the CIs?
Thank you
Related
Is there a way in SAS to generate an odds ratio and confidence interval for Fisher's exact test? I know we can get a CI and OR for Pearson's chi-square, and p-values are available for both Pearson's chi-square and Fisher using proc freq. I have tried the FISHER/EXACT statistics, but an OR and CI are not reported for Fisher. Please help!
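For the SAS question above, one thing that may help (a sketch with hypothetical dataset and variable names; I'm assuming PROC FREQ's RELRISK table option and the OR keyword of the EXACT statement cover what's needed):
proc freq data=mydata;
  tables exposure*outcome / relrisk;
  exact fisher or;
run;
RELRISK reports the asymptotic odds ratio with confidence limits, and EXACT OR requests exact confidence limits for the odds ratio alongside the Fisher p-value.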
I am trying to compute a binomial confidence interval for a dummy variable after specifying the survey design in Stata with the svyset command, but I get the following error: ci is not supported by svy with vce(linearized)
svyset [pweight=My_weight]
svy: ci Variable, binomial
I have also tried the following code:
ci Variable [pweight=My_weight], binomial
But I got the error: pweight not allowed
In Stata 14, binomial confidence intervals are obtained with ci proportions (Stata 13 used the binomial option of ci). This makes sense because the mean of a dummy variable is the proportion of 1's. Look at the help file here: http://www.stata.com/help.cgi?ci
So you likely want a command like:
ci proportions Variable [pweight=My_weight]
From the help file, it looks like only fweights may be allowed here.
Originally I thought that a better way might be to grab your CI from the means output. Here is an example modified from the svy help file.
webuse nhanes2f
svyset psuid [pweight=finalwgt]
svy: mean sex
But the OP is right: this doesn't adjust for the binomial distribution.
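Another option that might work (just a sketch, assuming svy: proportion suits the use case) is to estimate design-adjusted proportions directly:
* same survey setup as in the example above
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt]
* design-based proportion of each category, with confidence intervals
svy: proportion sex
The intervals here come from the design-based variance rather than an exact binomial calculation, which may or may not be acceptable for your application.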
Is it possible to change the confidence interval level (say from the default 95% to 90%) when using Stata's marginsplot command? It does not seem to accept the level(90) option or keep information from a preceding margins, level(90) command.
Please post exact code along with your explanation of what went wrong. It's difficult to assess what the problem is if you don't do that. This works fine (from the help file):
clear
set more off
webuse nhanes2
regress bpsystol agegrp##sex
margins agegrp
marginsplot, level(80)
I'm trying to calculate some odds ratios and significance for something that can be put into a 2x2 table. The problem is that the Fisher test in SAS is taking a long time.
I already have the cell counts. I could calculate a chi-square if not for the fact that some of the sample sizes are extremely small. And yet some are extremely large, with cell sizes in the hundreds of thousands.
When I try to compute these in R, I have no problem. However, when I try to compute them in SAS, it either takes way too long or simply errors out with the message "Fishers exact test cannot be computed with sufficient precision for this sample size."
When I create a toy example (pull one instance from the dataset, and calculate it) it does calculate, but takes a long time.
data Bob;
  input targ $ status $ wt;
  cards;
A c 4083
A d 111
B c 376494
B d 114231
;
run;

proc freq data=Bob;
  weight wt;
  tables targ*status;
  exact fisher;
run;
What is going wrong here?
That's funny. SAS calculates the Fisher's exact test p-value the exact way: by enumerating the hypergeometric probability of every single table in which the odds ratio is at least as far in favor of the alternative hypothesis as the observed one. There's probably a way to calculate how many tables that is, but knowing that it's big enough to slow SAS down is enough.
R does not do this: for a 2x2 table, fisher.test obtains the p-value directly from the (noncentral) hypergeometric distribution, which is just as fast for large sample sizes as for small ones:
tab <- matrix(c(4083, 111, 376494, 114231), 2, 2)
pc <- proc.time()
fisher.test(tab)
proc.time()-pc
gives us
> tab <- matrix(c(4083, 111, 376494, 114231), 2, 2)
> pc <- proc.time()
> fisher.test(tab)
Fisher's Exact Test for Count Data
data: tab
p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
9.240311 13.606906
sample estimates:
odds ratio
11.16046
> proc.time()-pc
user system elapsed
0.08 0.00 0.08
>
A fraction of a second.
That said, the smart statistician would realize that, in tables such as yours, the normal approximation to the log odds ratio is fairly good, and as such the Pearson chi-square test should give very similar results.
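As a quick sanity check on that claim (a sketch only, reusing the counts above; 1.96 is the usual 95% normal quantile):
# Wald (normal-approximation) 95% CI for the odds ratio, from the same 2x2 counts
tab <- matrix(c(4083, 111, 376494, 114231), 2, 2)
or  <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
se  <- sqrt(sum(1 / tab))            # standard error of the log odds ratio
exp(log(or) + c(-1.96, 1.96) * se)   # compare with the exact interval printed above
chisq.test(tab)                      # Pearson chi-square test, for comparison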
People claim two very different advantages of Fisher's exact test: some say it's good for small sample sizes, others say it's good when cell counts are very small in specific margins of the table. The way I've come to understand it is that Fisher's exact test is a nice alternative to the chi-square test when bootstrapped datasets are somewhat likely to generate tables with infinite odds ratios. Visually, you can imagine the normal approximation to the log odds ratio breaking down in that case.
Logistic regression using Accord.net (http://accord-framework.net/docs/html/T_Accord_Statistics_Analysis_LogisticRegressionAnalysis.htm) takes about 5 minutes to compute. SAS does it in a few seconds (using a single CPU core as well).
The dataset is about 40,000 rows and 30 inputs.
Why is there such a difference? Does SAS use an algorithm with much better complexity? Logistic regression is quite a simple algorithm, as far as I know.
Is there any other library that will do better (preferably free)?
The solution is to comment out this line:
https://github.com/accord-net/framework/blob/development/Sources/Accord.Statistics/Analysis/LogisticRegressionAnalysis.cs#L504
It computes some very expensive statistics that I don't need.
Here is a class that can be used with the standard Accord package: https://gist.github.com/eugenem/e1dd2ef2149e8c21c37d
I had the same experience with multinomial logistic regression. I made a comparison between Accord, R, SPSS, and Python's scikit-learn. I have 30 inputs, 10 outputs, and 1600+ training examples. Accord took 8 minutes, and the rest took 2-8 seconds. Accord looks beautiful, but for multinomial logistic regression it's way too slow. My solution was to make a small Python webservice that calculates the regression and saves the result in the database.
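For reference, a minimal sketch of the scikit-learn fitting step (not the actual webservice code; the data below are random placeholders shaped like the description above, 1600 examples with 30 inputs and 10 classes):
# Minimal sketch: multinomial logistic regression with scikit-learn.
# X and y are random stand-ins for the real data (1600 x 30 inputs, 10 classes).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1600, 30))
y = rng.integers(0, 10, size=1600)

model = LogisticRegression(max_iter=1000)   # lbfgs solver handles the multinomial case
model.fit(X, y)
print(model.predict(X[:5]))                 # predicted classes for the first few rows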