Stata user defined Program, Probit and Maximization Options

I am having a problem calling maximization options in a user-defined program. For example, the following works perfectly:
sysuse nlsw88, clear
probit collgrad age grade, tech(bfgs)
But when I define a program that calls probit, I get the error message "option tech() not allowed". Here is the code:
program probit_test, eclass
version 10.1
if replay() {
syntax [anything] [, Level(real 95) ]
eret di, level(`level')
}
else {
qui {
syntax [varlist] [if] [in], [Level(real 95) *]
tempvar touse e1
tempname beta var
mark `touse' `if' `in'
markout `touse'
gettoken depv vl:varlist
probit `depv' `vl' if `touse', tech(bfgs)
g `e1'=e(sample)
loc N=e(N)
matrix `beta'=e(b)
matrix `var'=e(V)
ereturn post `beta' `var', dep(`depv') e(`e1') obs(`N')
cap drop _d*
}
eret di, level(`level')
}
end
****
sysuse nlsw88, clear
probit_test collgrad age grade
Edit: I just wanted to add that this is not the actual program I am running, but a simple version that shows the problem.

Your probit_test program tells Stata to behave as if it were version 10.1. In 10.1, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm was not yet available, so the tech(bfgs) option is rejected. Change the version statement to something more current (bfgs was introduced in v11), and it should run fine.
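
As a minimal sketch of that fix, assuming Stata 11 or later: the wrapper below is hypothetical (the name probit_wrap and its structure are illustrative, not the asker's actual program). It declares version 11 and collects any extra options through the * wildcard, so maximization options such as technique() are handed on to probit.

```stata
* Hypothetical wrapper, for illustration only; assumes Stata 11 or later
capture program drop probit_wrap
program define probit_wrap
    version 11
    * Level() sets the confidence level; * collects all other options in `options'
    syntax varlist [if] [in] [, Level(cilevel) *]
    * Mark the estimation sample, honoring if/in and excluding missing values
    marksample touse
    * Pass the collected options (for example, technique(bfgs)) on to probit
    probit `varlist' if `touse', level(`level') `options'
end

sysuse nlsw88, clear
probit_wrap collgrad age grade, technique(bfgs)
```

Passing options through `options' rather than hard-coding tech(bfgs) lets the caller choose the technique, which seems closer to what the question is after.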

Related

Power analysis via simulations in Stata version 15.1

I have been trying to run this simulation code in Stata version 15.1, but am having issues running it as indicated below.
local num_clus 3 6 9 18 36
local clussize 5 10 15 20 25
*Model specifications
local intercept 17.87
local timecoeff1 -5.42
local timecoeff2 -5.72
local timecoeff3 -7.03
local timecoeff4 -6.13
local timecoeff5 -9.13
local intrvcoeff 5.00
local sigma_u3 25.77
local sigma_u2 120.62
local sigma_error 38.35
local nrep 1000
local alpha 0.05
*Generate multi-level data
capture program drop swcrt
program define swcrt, rclass
version 15.1
preserve
clear
args num_clus clussize intercept intrvcoeff timecoeff1 timecoeff2 timecoeff3 timecoeff4 timecoeff5 sigma_u3 sigma_error alpha
assert `num_clus' > 0 & `clussize' > 0 & `intercept' > 0 & `intrvcoeff' > 0 & `timecoeff1' < 0 & `timecoeff2' < 0 & `timecoeff3' < 0 & `timecoeff4' < 0 & `timecoeff5' < 0 & `sigma_u3' > 0 & `sigma_error' > 0 & `alpha' > 0
/*Generate simulated multi-level data*/
qui
clear
set obs `num_clus'
qui gen cluster = _n
qui gen group = 1+mod(_n-1,4)
/*Generate cluster-level errors*/
qui gen u_3 = rnormal(0,`sigma_u3')
expand `clussize'
bysort cluster: gen individual = _n
/*Set up time*/
expand 6
bysort cluster individual: gen time = _n-1
/*Set up intervention variable*/
gen intrv = (time>=group)
/*Generate residual errors*/
qui gen error = rnormal(0,`sigma_error')
/*Generate outcome y*/
qui gen y = `intercept' + `intrvcoeff'*intrv + `timecoeff1'*1.time + `timecoeff2'*2.time + `timecoeff3'*3.time + `timecoeff4'*4.time + `timecoeff5'*5.time + u_3 + error
/*Fit multi-level model to simulated dataset*/
mixed y intrv i.time ||cluster:, covariance(unstructured) reml dfmethod(kroger)
/*Return estimated effect size, bias, p-value, and significance dichotomy*/
tempname M
matrix `M' = r(table)
return scalar bias = _b[intrv] - `intrvcoeff'
return scalar p = `M'[1,4]
return scalar p_= (`M'[1,4] < `alpha')
exit
end swcrt
*Postfile to store results
tempname step
tempfile powerresults
capture postutil clear
postfile `step' num_clus clussize intrvcoeff p p_ bias using `powerresults', replace
ERROR: (note: file /var/folders/v4/j5kzzhc52q9fvh6w9pcx9fgm0000gn/T//S_00310.00000c not found)
*Loop over number of clusters
foreach c of local num_clus{
display as text "Number of clusters" as result "`c'"
foreach s of local clussize{
display as text "Cluster size" as result "`s'"
forvalue i = 1/`nrep'{
display as text "Iterations" as result `nrep'
quietly swcrt `num_clus' `clussize' `intercept' `intrvcoeff' `timecoeff1' `timecoeff2' `timecoeff3' `timecoeff4' `timecoeff5' `sigma_u3' `sigma_error' `alpha'
post `step' (`c') (`s') (`intrvcoeff') (`r(p)') (`r(p_)') (`r(bias)')
}
}
}
postclose `step'
ERROR:
Number of clusters3
Cluster size5
Iterations1000
r(9);
*Open results, calculate power
use `powerresults', clear
levelsof num_clus, local(num_clus)
levelsof clussize, local(clussize)
matrix drop _all
*Loop over combinations of clusters
*Add power results to matrix
foreach c of local num_clus{
foreach s of local clussize{
quietly ci proportions p_ if num_clus == `c' & clussize = `s'
local power `r(proportion)'
local power_lb `r(lb)'
local power_ub `r(ub)'
quietly ci mean bias if num_clus == `c' & clussize = `s'
local bias `r(mean)'
matrix M = nullmat(M) \ (`c', `s', `intrvcoeff', `power', `power_lb', `power_ub', `bias')
}
}
*Display the matrix
matrix colnames M = c s intrvcoeff power power_lb power_ub bias
ERROR:
matrix M not found
r(111);
matrix list M, noheader format(%3.2f)
ERROR:
matrix M not found
r(111);
There are a few things that seem to be amiss above.
I get a message after the postfile command saying that the file is not found. Nowhere in my code do I actually use that name so it seems to be generated by Stata.
After the loop and the post command I get error r(9).
Error r(111) says that the matrix is not found.
I have checked the following parts of the code to try and resolve the issue:
Specified local macros outside of the program and passed into it via the args statement of the program
Match between the variables in the call of the swcrt with the args statement in the program
Match between arguments in assert statement of the program with args command and whether the alligator clips are specified appropriately
Match between the number of variables in the post and postfile commands
I am not quite sure why I get these errors considering that the code did work previously and the program iterated (even when I take away the changes there is still the error). Does anyone know why this happens? If I had to guess, the matrix can't be found because of the error with the file not being found when I use postfile.

Using egen and ordering a variable simultaneously

In a previous question, I got an efficient solution to generate a variable and at the same time order it:
sysuse auto, clear
generate random = runiform(), before(make)
This solution does not seem to work if the egen command is used:
egen avgprice = mean(price), before(make)
option before() not allowed
r(198);
Is it possible to generate a variable and at the same time order it when using egen?

The egen command does not have an option similar to the before() option of generate.
However, you can accomplish what you want by writing a small program:
program define egen2
unab allvars : *
gettoken firstvar : allvars
tempvar var
gettoken firstarg 0 : 0, parse("=")
egen `var' `0'
generate `firstarg' = `var', before(`firstvar')
end
You could then do the following:
sysuse auto, clear
egen2 foo = mean(price)
EDIT:
The program can be reduced to the following if you do not mind using order:
program define egen2
gettoken firstarg 0 : 0, parse("=")
egen `firstarg' `0'
order `firstarg'
end
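
If a wrapper program is more than you need, the same placement can be sketched in two steps, assuming a Stata version in which order accepts the before() option (Stata 11 or later, as far as I know). The variable name avgprice comes from the question.

```stata
sysuse auto, clear
* Create the variable first; egen itself has no before() option
egen avgprice = mean(price)
* Then move it in front of make
order avgprice, before(make)
describe, simple
```

This costs one extra line per variable but avoids parsing the egen call yourself.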

How to display r-squared for multiple models using outreg2

I run two regressions for which I would like to show the r-squared:
logit y c.x1 c.x2
quietly est store e1
local r1 = e(r2_p)
logit y c.x1 c.x2 c.x3
quietly est store e2
local r2 = e(r2_p)
I tried to create a matrix to fill it but was not successful:
mat t1=J(1,2,0) //Defining empty matrix with 2 columns 1 row
local rsq `r*' //Trying to store r1 and r2 as numeric
local a=1
forval i=1/2{
mat t1[`i'+1,`a']= `r*' // filling each row, one at a time, this fails
loc ++a
}
mat li t1
Ultimately, I would like to export the results with the community-contributed Stata command outreg2:
outreg2 [e*] using "myfile", excel replace addstat(Adj. R^2:, `rsq')

The following works for me:
webuse lbw, clear
logit low age smoke
outreg2 using "myfile.txt", replace addstat(Adj. R^2, e(r2_p))
logit low age smoke ptl ht ui
outreg2 using "myfile.txt", addstat(Adj. R^2, e(r2_p)) append
type myfile.txt
                    (1)        (2)
VARIABLES           low        low

age             -0.0498    -0.0541
               (0.0320)   (0.0339)
smoke           0.692**      0.557
                (0.322)    (0.339)
ptl                        0.679**
                           (0.344)
ht                         1.408**
                           (0.624)
ui                          0.817*
                           (0.451)
Constant         0.0609     -0.168
                (0.757)    (0.806)

Observations        189        189
Adj. R^2         0.0315     0.0882
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
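
For the matrix approach attempted in the question, here is a hedged sketch on the same lbw data. The original loop addresses rows 2 and 3 of a one-row matrix, and `r*' does not expand to r1 and r2, because macro names do not take wildcards. Filling one cell per model directly from e(r2_p) avoids both problems. The matrix name t1 follows the question; the column names m1 and m2 are illustrative.

```stata
webuse lbw, clear
* One row, one column per model, initialised to missing
matrix t1 = J(1, 2, .)

quietly logit low age smoke
matrix t1[1, 1] = e(r2_p)

quietly logit low age smoke ptl ht ui
matrix t1[1, 2] = e(r2_p)

* Illustrative column names
matrix colnames t1 = m1 m2
matrix list t1
```

Note that e(r2_p) is McFadden's pseudo R-squared, so a label such as "Pseudo R^2" may describe it better than "Adj. R^2".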

Summarizing a variable in Stata and extracting standard deviation

I am trying to create a variable for each year in my data based on mathematical expressions of other variables (I have annual data and used "..." to avoid writing each year). I am using the summarize command in Stata to extract the standard deviation but Stata does not recognize the frac variable. I have tried to use egen but that results in an unknown function error. Using gen results in an already defined variable. I would appreciate anyone helping with the following code or pointing me to a link where this issue has been discussed.
foreach yr of numlist 1995...2012 {
local row = `yr' - 1994
local numerator = 100*(income - L1.income)
local denominator = ((abs(income) + abs(L1.income)) / 2)
local frac = (`numerator' / `denominator')
summarize frac
local sdfrac = r(sd)
matrix C[`row', 1] = `numerator'
matrix C[`row', 2] = `denominator'
matrix C[`row', 3] = `sdfrac'
}

If I understand your question correctly, you do not need the loop until the end: generate the variables first, then loop over the years and post each result to a postfile.
This is just a thought:
tempname memhold
tempfile filename
postfile `memhold' year sdfrac using `filename'
gen row=year-1994
gen numerator=100*(income-L1.income)
gen denominator=((abs(income)+abs(L1.income))/2)
gen frac=numerator/denominator
forvalues yr = 1995/2012 {
summarize frac if year==`yr'
local sdfrac=r(sd)
post `memhold' (`yr') (`sdfrac')
}
postclose `memhold'
clear all
use `filename'
*View Results
list
This code should give you a data set with the year and the standard deviation of frac as variables.

In a comment, OP added a question about code similar to this (but ignored the request to post it in a more civilised form). Note that backticks or left quotation marks in Stata clash with SO mark-up codes in comments. Presumably some
tempname memhold
definition preceded this.
postfile `memhold' year sdfrac sex race using myresults
levelsof sex, local(s)
levelsof race, local(r)
foreach a of local s {
foreach b of local r {
forval yr = 1995/2012 {
summarize frac if year == `yr' & sex == `a' & race == `b'
post `memhold' (`yr') (`r(sd)') (`a') (`b')
}
}
}
Let's focus on what the problem is. You want the standard deviations of frac for all combinations of sex, race and year in a separate file. That's one line
collapse (sd) frac, by(year sex race)
If you want to see a table alongside the data, consider
egen group = group(sex race year), label
and then
tab group, su(frac)
or
tabstat frac, by(group) stat(sd)
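
To see what collapse produces without the original data, here is a small sketch on simulated stand-in data. The variable names mirror the thread, but the values, the seed, and the number of observations are made up for illustration.

```stata
* Simulated stand-in data; variable names mirror the thread, values are random
clear
set seed 2718
set obs 720
generate year = 1995 + mod(_n - 1, 18)
generate sex  = runiform() < 0.5
generate race = 1 + floor(3 * runiform())
generate frac = rnormal()

* One row per year-sex-race cell, holding the standard deviation of frac
collapse (sd) sdfrac = frac, by(year sex race)
list in 1/5
```

The result is already a data set in memory, so a plain save can stand in for the postfile machinery.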

This code modifies that by @Pcarlitz, mostly by simplifying it. I can't check with your data, which I don't have.
It's too long to fit into a comment.
I would not use a temporary file as you want to save these results, it seems.
tempname memhold
postfile `memhold' year sdfrac using myresults
gen frac = (100*(income - L1.income))/((abs(income) + abs(L1.income))/2)
forval yr = 1995/2012 {
summarize frac if year==`yr'
post `memhold' (`yr') (`r(sd)')
}
postclose `memhold'
use myresults
list
UPDATE As in a later answer, consider collapse as a much simpler direct alternative here.

Is Conger's kappa available in Stata?

Is the modified version of kappa proposed by Conger (1980) available in Stata? Tried to google it to no avail.

This is an old question, but in case anyone is still looking--the SSC package kappaetc now calculates that, along with every other inter-rater statistic you could ever want.

Since no one has responded with a Stata solution, I developed some code to calculate Conger's kappa using the formulas provided in Gwet, K. L. (2012). Handbook of Inter-Rater Reliability (3rd ed.), Gaithersburg, MD: Advanced Analytics, LLC. See especially pp. 34-35.
My code is undoubtedly not as efficient as others could write, and I would welcome any improvements to the code or to the program format that others wish to make.
cap prog drop congerkappa
prog def congerkappa
* This program has only been tested with Stata 11.2, 12.1, and 13.0.
preserve
* Number of judges
scalar judgesnum = _N
* Subject IDs
quietly ds
local vlist `r(varlist)'
local removeit = word("`vlist'",1)
local targets: list vlist - removeit
* Sums of ratings by each judge
egen judgesum = rowtotal(`targets')
* Mean of each target's ratings
foreach i in `targets' {
quietly summarize `i', meanonly
scalar mean`i' = r(mean)
}
* % each target rating of all target ratings
foreach i in `targets' {
gen `i'2 = `i'/judgesum
}
* Variance of each target's % ratings
foreach i in `targets' {
quietly summarize `i'2
scalar s2`i'2 = r(Var)
}
* Mean of each target's % ratings
foreach i in `targets' {
quietly summarize `i'2, meanonly
scalar mean`i'2 = r(mean)
}
* Square of mean of each target's % ratings
foreach i in `targets' {
scalar mean`i'2sq = mean`i'2^2
}
* Sum of variances of each target's % ratings
scalar sumvar = 0
foreach i in `targets' {
scalar sumvar = sumvar + s2`i'2
}
* Sum of means of each target's % ratings
scalar summeans = 0
foreach i in `targets' {
scalar summeans = summeans + mean`i'2
}
* Sum of meansquares of each target's % ratings
scalar summeansqs = 0
foreach i in `targets' {
scalar summeansqs = summeansqs + mean`i'2sq
}
* Conger's kappa
scalar conkappa = summeansqs -(sumvar/judgesnum)
di _n "Conger's kappa = " conkappa
restore
end
The data structure required by the program is shown below. The variable names are not fixed, but the judge/rater variable must be in the first position in the data set. The data set should not include any variables other than the judge/rater and targets/ratings.
Judge S1 S2 S3 S4 S5 S6
Rater1 2 4 2 1 1 4
Rater2 2 3 2 2 2 3
Rater3 2 5 3 3 3 5
Rater4 3 3 2 3 2 3
If you would like to run this against a test data set, you can use the judges data set from StataCorp and reshape it as shown.
use http://www.stata-press.com/data/r12/judges.dta, clear
sort judge
list, sepby(judge)
reshape wide rating, i(judge) j(target)
rename rating* S*
list, noobs
* Run congerkappa program on demo data set in memory
congerkappa
I have run only a single validation test of this code against the data in Table 2.16 in Gwet (p. 35) and have replicated the Conger's kappa = .23343 as calculated by Gwet on p. 34. Please test this code on other data with known Conger's kappas before relying on it.

I don't know if Conger's kappa for multiple raters is available in Stata, but it is available in R via the irr package, using the kappam.fleiss function and specifying the exact option. For information on the irr package in R, see http://cran.r-project.org/web/packages/irr/irr.pdf#page.12 .
After installing and loading the irr package in R, you can view a demo data set and Conger's kappa calculation using the following code.
data(diagnoses)
print(diagnoses)
kappam.fleiss(diagnoses, exact=TRUE)
I hope someone else here can help with a Stata solution, as you requested, but this may at least provide a solution if you can't find it in Stata.

In response to Dimitriy's comment below, I believe Stata's native kappa command applies either to two unique raters or to more than two non-unique raters.
The original poster may also want to consider the icc command in Stata, which allows for multiple unique raters.
