Creating/Changing table format using either estout or est table? - stata

I would like to produce a table that shows panel bootstrapped standard errors below the normal standard errors, with the significance level at the end. I'm new to Stata and was able to produce the following structure:
est table RE REboot FE FEboot ,se stats(N)
where FEboot & REboot correspond to the panel bootstrapped standard errors. The desired output might look like:
              RE            FE
Var1       1.109541      1.109541
         (-.3294736)   (-.3294736)
         ( boot se )   ( boot se )

Use estout and matmap, both from SSC (ssc install <command>):
sysuse auto, clear
// clear previously stored estimates
estimates clear
eststo clear
// regression with bootstrap
eststo: regress price weight mpg, vce(bootstrap)
// compute non-bootstrap std errors
matrix evmodel = vecdiag(e(V_modelbased))
matmap evmodel evmodel, map(sqrt(#))
// add to stored results
estadd matrix evmodel
// VCV matrix (non-bootstrap)
matrix list e(V_modelbased)
// std error vector (non-bootstrap)
matrix list evmodel
// evmodel are non-bootstrap std errors; se are bootstrap std errors;
estout *, cells(b(star fmt(3)) evmodel(par fmt(2)) se(par fmt(2)))
When Stata computes bootstrapped std errors, it saves both the bootstrapped and non-bootstrapped VCV matrices. You can compute the necessary std errors, add them to the collection of stored results, and finally, output to a table.
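If installing matmap is not an option, the square-root step can probably be done with built-in matrix commands alone. The following is a minimal sketch, not a tested replacement: it assumes the same regression as above, and the reps(50) and seed(1) values are illustrative choices that do not come from the original answer. The loop replaces each element of the row vector with its square root, which should leave the column names intact so that estout can match them to the coefficients.

```stata
sysuse auto, clear
estimates clear
eststo clear
// regression with bootstrap (reps and seed are illustrative assumptions)
eststo: regress price weight mpg, vce(bootstrap, reps(50) seed(1))
// variances from the model-based (non-bootstrap) VCV matrix
matrix evmodel = vecdiag(e(V_modelbased))
// element-wise square root without matmap
local k = colsof(evmodel)
forvalues j = 1/`k' {
    matrix evmodel[1, `j'] = sqrt(el(evmodel, 1, `j'))
}
// add to stored results and tabulate as before
estadd matrix evmodel
estout *, cells(b(star fmt(3)) evmodel(par fmt(2)) se(par fmt(2)))
```

The same pattern should carry over to panel estimators run with vce(bootstrap), provided e(V_modelbased) is stored; check with ereturn list after estimation.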

Is there a code for including a p-value to test for normality of variables?

I'd like to include a statistic in my summary statistics table within Stata, with the summarize command. Is there any possibility or other convenient way to include p-values (of the normality test) of the variables included? Would be very helpful. I'm using Stata 17.
There are many tests for normality, and several are included in Stata. The Shapiro-Wilk test is a modern classic, and quite popular. The more recent Doornik-Hansen test has been calibrated over a wide range of situations. Here's a token example:
. sysuse auto, clear
(1978 automobile data)
. swilk mpg
                   Shapiro–Wilk W test for normal data

    Variable |        Obs       W           V         z       Prob>z
-------------+------------------------------------------------------
         mpg |         74    0.94821      3.335     2.627    0.00430
. mvtest normality mpg
Test for multivariate normality
Doornik-Hansen chi2(2) = 12.366 Prob>chi2 = 0.0021
The P-value is typically returned as an r-class value, so read the documentation for each command, and try return list after running each.
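As a rough illustration of picking up the stored P-value, the sketch below loops over a few variables and prints the Shapiro-Wilk statistic and P-value for each. It assumes swilk leaves its results in r(W) and r(p); the choice of variables is arbitrary, and return list after the command shows which names are actually available in your version.

```stata
sysuse auto, clear
// run the test quietly for each variable and display the stored results
foreach v of varlist price mpg weight {
    quietly swilk `v'
    display "`v'" _col(12) "W = " %6.4f r(W) "   p = " %6.4f r(p)
}
// inspect everything the last swilk call stored
return list
```

From there, the values can be collected in a matrix or posted to a file and merged into whatever summary table you build.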

Drop random effects parameters from output table in Stata

I want to create a regression table (using esttab) from a mixed-effects regression estimated via xtmixed in Stata, but I want the output without the random effects parameters. How can I drop the random effects parameters from the output table? E.g., in the case of two variables...
xtmixed (Dependent Variable) (Independent Variable) || (Grouping Variable):
... I don't want the lns1_1_1 and the lnsig_e values in my esttab-table. What is the best way to do this?
There is a keep() and a drop() option. For example:
webuse productivity, clear
xtmixed gsp private emp hwy water other unemp || region: || state:, mle
estimates store m1
// check result names
matrix list e(b)
// with -keep()- option
estout m1, keep(private emp hwy water other unemp gsp:_cons)
// with -drop()- option
estout m1, drop(lns1_1_1:_cons lns2_1_1:_cons lnsig_e:_cons)
In the context of multiple equation estimation, resulting matrices have elements with two-part names. The general form is equation-name:varname. The result of matrix list shows this. Afterwards, just use the appropriate names in the keep() and drop() options.
See [U] 14.2 Row and column names for more details on the naming conventions.
(Recall esttab is a wrapper for estout.)
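Since the question asks about esttab specifically, here is a short sketch of the same idea with that command. It assumes esttab passes keep() and drop() through to estout unchanged, and that an equation name followed by a colon (here gsp:, the equation named after the dependent variable) selects the whole equation; check help estout if the second form does not behave as expected.

```stata
webuse productivity, clear
xtmixed gsp private emp hwy water other unemp || region: || state:, mle
estimates store m1
// same drop() list as with estout
esttab m1, drop(lns1_1_1:_cons lns2_1_1:_cons lnsig_e:_cons)
// or keep only the fixed-effects equation
esttab m1, keep(gsp:)
```

The keep(gsp:) form avoids having to list every random-effects parameter, which helps when the number of levels changes between models.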

Use estimates from proc glm to make predictions on another dataset

I'm not very familiar with SAS proc glm. All I have done with proc glm so far is output parameter estimates and predicted values for the training dataset. But I also need to use the fitted model to make predictions (both point estimates and interval estimates) on a test dataset.
Here is my code.
ods output ParameterEstimates=Pi_Parameters FitStatistics=Pi_Summary PredictedValues=Pi_Fitted;
proc glm data=Train_Pi;
class Area Fo5 Tye M0 M1 M2 M3;
model Pi = Dow Area Fo5 Tye M0|HC M1|HC M2|HC M3|HC/solution p ss3 /*tolerance*/;
run;
But how do I proceed to the next step? I'm looking for something like predict(Model_from_Train_Pi,Test_Pi).
If you're on SAS 9.4 see Jake's answer from this question:
How to predict probability in logistic regression in SAS?
If you're not on 9.4, my answer there applies: add the new data to the original data set.
A third option is PROC SCORE - documentation has an example for proc reg that's almost identical to your question:
http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_score_sect018.htm

Emulate Stata 8 clustered bootstrapped regression

I'm trying to store a series of scalars alongside the coefficients of a bootstrapped regression model. The code below looks like the example from the Stata [P]rogramming manual for postfile, which is apparently intended for use with such procedures.
The problem is with the // commented lines, which fail to work. More specifically, the problem seems to be that the syntax below worked in Stata 8 but fails to work in Stata 9+ after some change in the bootstrap procedure.
cap pr drop bsreg
pr de bsreg
reg mpg weight gear_ratio
predict yhat
qui sum yhat
// sca mu = r(mean)
// post sim (mu)
end
sysuse auto, clear
postfile sim mu using results , replace
bootstrap, cluster(foreign) reps(5) seed(6112): bsreg
postclose sim
use results, clear
Adding version 8 to the code did not solve the issue. Would anyone know what is wrong with this procedure, and how to fix it for execution in Stata 9+? The problem has been raised in the past and more recently, but without finding an answer.
Sorry for the long description, it's a long problem.
I've presented the issue as if it's a programming one because I'm using this code to replicate some health inequalities research. It's necessary to bootstrap the entire procedure, not just the reg model. I have some quibbles with the methodology, but nothing that would stop me from replicating the analysis.
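Adding noisily to the bootstrap prefix exposed a problem with the predict command: predict yhat creates a permanent variable, so the call apparently fails as soon as yhat already exists from an earlier run of the program. Here's a fix that uses a tempvar instead.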
cap pr drop bsreg
pr de bsreg
reg mpg weight gear_ratio
tempvar yhat
predict `yhat'
qui sum `yhat'
sca mu = r(mean)
post sim (mu)
end
sysuse auto, clear
postfile sim mu using results , replace
bootstrap, cluster(foreign) reps(5) seed(6112): bsreg
postclose sim
use results, clear
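An alternative that avoids postfile altogether is to let bootstrap collect the statistic itself. This is only a sketch under a few assumptions: the program name bsreg2 is made up for illustration, the program is declared rclass so that it can return r(mu), and the replicates are written to results.dta through the saving() option.

```stata
capture program drop bsreg2
program define bsreg2, rclass
    regress mpg weight gear_ratio
    // temporary variable, dropped automatically when the program ends
    tempvar yhat
    predict `yhat'
    quietly summarize `yhat'
    // hand the statistic back to bootstrap
    return scalar mu = r(mean)
end
sysuse auto, clear
bootstrap mu = r(mu), cluster(foreign) reps(5) seed(6112) ///
    saving(results, replace): bsreg2
use results, clear
```

With this layout each replicate becomes one observation of mu in the saved dataset, and further scalars can be added as extra return scalar lines and extra expressions in the bootstrap list.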

Stata: adding coefficients to estout

I want to output the results of several regressions as a nicely formatted LaTeX table and am happy to see that for most cases the estout package on SSC seems to do exactly that.
However, what I want is a bit special: between the table section that shows the coefficient estimates and their standard errors, and the section that shows R^2 and the like, I would like to add a section that shows point estimates and standard errors (bonus points for stars) for particular linear combinations of the coefficients. Both point estimates and standard errors are easily computed via lincom, but the best solution I've found so far for getting these numbers into the table involves the massively hacky addition of these numbers, one estadd scalar ... at a time. Is there a more elegant way to do this?
Example code:
sysuse auto
eststo, title("Model 1"): regress price weight mpg
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
eststo, title("Model 2"): regress price weight mpg foreign
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
label variable foreign "Car type (1=foreign)"
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, labels("Linear Combination" "S.E." R-squared "N. of cases")) ///
label legend varlabels(_cons Constant)
You can use the layout, star and fmt sub-options to format the added scalars like this:
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, layout(# (#) # #) star(skal) labels("Linear Combination" "S.E." R-squared "N. of cases") fmt(%9.2f %9.2f %9.2f %12.0f)) ///
label legend varlabels(_cons Constant)
These options are documented in the estout help file. As far as I know, the method in your question is the only way to do this.
There's a problem with the answer above. I think the original poster wanted statistical significance stars based on a test of the lincom (weight+mpg) vs zero. That is not what star(skal) does.
As the documentation states: "star[(scalarlist)] to specify that the overall significance of the model be denoted by stars. The stars are attached to the scalar statistics specified in scalarlist." Emphasis mine.
star(skal) in the previous example will place significance stars next to the estimates of skal, but they will be from an F-test of the overall significance of the regression model (I think). Not from lincom.
When I use this strategy on other specifications, the stars attached are clearly not based on the ones from lincom. I'm not sure if it's possible to use lincom/esttab to get the stars from lincom. To say nothing of nlcom!
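One workaround, short of real stars, is to compute the P-value of the linear combination yourself and report it as an extra row. The sketch below is only that: it assumes lincom after regress leaves r(estimate), r(se) and r(df) behind, computes a two-sided t-test P-value from them, and uses the made-up scalar name skalp. It does not attach stars automatically.

```stata
sysuse auto, clear
eststo clear
eststo: regress price weight mpg
lincom weight + mpg
// copy results into locals before another command can overwrite r()
local est = r(estimate)
local se  = r(se)
local p   = 2*ttail(r(df), abs(r(estimate)/r(se)))
estadd scalar skal   = `est'
estadd scalar skalsd = `se'
estadd scalar skalp  = `p'
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
    stats(skal skalsd skalp r2 N, fmt(%9.2f %9.2f %9.3f %9.2f %12.0f) ///
    labels("Linear Combination" "S.E." "p-value" "R-squared" "N. of cases"))
```

The P-value row at least makes the significance of the combination visible, and the same three lines of locals would work after nlcom only if it stores comparable results, which I have not checked.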