Add column headings for coefficients and standard errors in an esttab table - stata

I am trying to use the community-contributed command estout to produce regression tables in wide format (i.e. a separate column for coefficients and a separate column for standard errors), with a heading (e.g. "coefficient" and "s.e.") above each column.
A reproducible example using the auto dataset:
sysuse auto, clear
regress mpg weight i.foreign
estimates store m1
regress mpg weight length i.foreign
estimates store m2
esttab m1 m2, wide b(3) se(3)
esttab m1 m2, wide plain b(3) se(3)
This produces output that is almost exactly what I am after, but it lacks the headings (e.g. "coefficient" and "s.e.") above each column:
esttab m1 m2, wide b(3) se(3)
----------------------------------------------------------------------
                           (1)                        (2)
                           mpg                        mpg
----------------------------------------------------------------------
weight                  -0.007***  (0.001)         -0.004**   (0.002)
0.foreign                0.000        (.)           0.000        (.)
1.foreign               -1.650     (1.076)         -1.708     (1.067)
length                                             -0.083     (0.055)
_cons                   41.680***  (2.166)         50.537***  (6.246)
----------------------------------------------------------------------
N                           74                         74
----------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
I suspect my preferred output is possible because, if I use the plain option, I do get headers ("b" and "se", although I would like to be able to rename both if possible):
esttab m1 m2, wide plain b(3) se(3)
                   m1                  m2
                    b        se         b        se
weight         -0.007     0.001    -0.004     0.002
0.foreign       0.000         .     0.000         .
1.foreign      -1.650     1.076    -1.708     1.067
length                             -0.083     0.055
_cons          41.680     2.166    50.537     6.246
N                  74                  74
My desired output would look like this:
----------------------------------------------------------------------
                           (1)                        (2)
                           mpg                        mpg
                   coefficient     s.e.       coefficient     s.e.
----------------------------------------------------------------------
weight                  -0.007***  (0.001)         -0.004**   (0.002)
0.foreign                0.000        (.)           0.000        (.)
1.foreign               -1.650     (1.076)         -1.708     (1.067)
length                                             -0.083     (0.055)
_cons                   41.680***  (2.166)         50.537***  (6.246)
----------------------------------------------------------------------
N                           74                         74
----------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Also, while the output above is shown as plain text for reproducibility, my real output needs to be in rich text format.

The following works for me:
sysuse auto, clear
regress mpg weight i.foreign
estimates store m1
regress mpg weight length i.foreign
estimates store m2
esttab m1 m2, cells("b(fmt(3) star) se(fmt(3) par)") collabels("coefficient" "s.e.")
----------------------------------------------------------------------
                           (1)                        (2)
                           mpg                        mpg
                   coefficient     s.e.       coefficient     s.e.
----------------------------------------------------------------------
weight                  -0.007***  (0.001)         -0.004**   (0.002)
0.foreign                0.000        (.)           0.000        (.)
1.foreign               -1.650     (1.076)         -1.708     (1.067)
length                                             -0.083     (0.055)
_cons                   41.680***  (2.166)         50.537***  (6.246)
----------------------------------------------------------------------
N                           74                         74
----------------------------------------------------------------------
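Since the real output needs to be in rich text format, the same cells()/collabels() combination can be written straight to an .rtf file with using; the filename below is just a placeholder:

```stata
* same wide table with custom column headings, exported to RTF
* "mytable.rtf" is a placeholder filename
esttab m1 m2 using mytable.rtf, replace ///
    cells("b(fmt(3) star) se(fmt(3) par)") ///
    collabels("coefficient" "s.e.")
```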

Related

Stata: estpost+ttest with "reverse" option?

I currently struggle to apply the reverse option to my ttest.
Here is a toy example:
sysuse auto,clear
local varlist mpg price weight
eststo foreign: quietly estpost summarize `varlist' if foreign==0
eststo domestic: quietly estpost summarize `varlist' if foreign==1
eststo diff: quietly estpost ttest mpg, by(foreign) unequal reverse
esttab foreign domestic diff
The following works:
sysuse auto,clear
local varlist mpg price weight
eststo foreign: quietly estpost summarize `varlist' if foreign==0
eststo domestic: quietly estpost summarize `varlist' if foreign==1
eststo diff: quietly estpost ttest mpg, by(foreign) unequal reverse
esttab foreign domestic diff
Note that
ttest mpg, by(foreign) unequal reverse
works on its own.
After reading the estpost documentation, it seems the package currently does not support the option.
Eventually, I need to create a table of t-tests for about 20 variables and reverse the result for b/t. I'm very thankful for workarounds!
You can do everything using regression like this (even in small samples):
. sysuse auto, clear
(1978 automobile data)
. /* reverse t-test */
. ttest mpg, by(foreign) unequal reverse
Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
 Foreign |      22    24.77273     1.40951    6.611187    21.84149    27.70396
Domestic |      52    19.82692     .657777    4.743297    18.50638    21.14747
---------+--------------------------------------------------------------------
Combined |      74     21.2973    .6725511    5.785503     19.9569    22.63769
---------+--------------------------------------------------------------------
    diff |            4.945804    1.555438                1.771556    8.120053
------------------------------------------------------------------------------
    diff = mean(Foreign) - mean(Domestic)                     t =       3.1797
H0: diff = 0                     Satterthwaite's degrees of freedom =  30.5463

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9983         Pr(|T| > |t|) = 0.0034          Pr(T > t) = 0.0017
. /* store the t-test degrees of freedom before another command overwrites r() */
. scalar df_t = r(df_t)
. /* calculate means */
. eststo means: qui reg mpg ibn.foreign, vce(hc2) nocons
. /* regression equivalent using robust variance with a bias correction and t-test DoF */
. eststo diff: qui margins, dydx(foreign) df(`=scalar(df_t)') post
. esttab means diff, label se
----------------------------------------------------
                          (1)             (2)
                 Mileage (m~)
----------------------------------------------------
Domestic                19.83***            0
                      (0.658)             (.)
Foreign                 24.77***        4.946**
                      (1.410)         (1.555)
----------------------------------------------------
Observations               74              74
----------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
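Since the goal is a table of about 20 variables, the regression approach above can be looped. A sketch with an illustrative variable list (for brevity it uses the default normal-based inference rather than the Satterthwaite degrees of freedom from the single-variable example):

```stata
* loop the regression-based "reverse" t-test over several variables
sysuse auto, clear
foreach v of varlist mpg price weight {
    qui reg `v' ibn.foreign, vce(hc2) nocons
    eststo diff_`v': qui margins, dydx(foreign) post
}
* collect the stored differences in one table
esttab diff_mpg diff_price diff_weight, se
```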

Difference between two means and medians in stata

I want to export the p-values for the difference in means and the difference in medians to LaTeX from Stata. I tried the code below and it works for the p-value of the means. However, I don't know how to add the analysis of the difference between medians. Can you please help me with this?
eststo control: quietly estpost summarize a b c if treated == 0
eststo treated: quietly estpost summarize a b c if treated == 1
eststo diff: quietly estpost ttest a b c, by(treated) unequal
esttab using means_medians.tex, replace mlabels("Treated" "Control" "Difference") ///
    cells("mean(pattern(1 1 0) fmt(2)) sd(pattern(1 1 0) fmt(3)) b(star pattern(0 0 1) fmt(2))") label
You can use epctile, which is implemented as a proper estimation command and returns e(V) and an estimation table. (Disregard the ugly output in the first part; I wrote it about 10 years ago for a specific project and never made it pretty enough for general use.)
. sysuse auto, clear
(1978 Automobile Data)
. epctile mpg, p(50)
Mean estimation                   Number of obs   =         74
--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    __000006 |   -.027027    .058435     -.1434878    .0894338
--------------------------------------------------------------
Percentile estimation
------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         p50 |         20        .75    26.67   0.000     18.53003    21.46997
------------------------------------------------------------------------------
This command is not on SSC; it is on my page, so to install it, type findit epctile and follow the prompts.
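If installing epctile is not convenient, one workaround for the median part is median regression: with qreg, the coefficient on the group dummy is the difference in medians between the two groups, and it comes with a standard error and p-value that eststo/esttab can pick up like any estimation result. A sketch using the auto data, where foreign stands in for the treated indicator in the question:

```stata
* difference in medians via median regression (qreg)
sysuse auto, clear
eststo med_diff: quietly qreg price i.foreign
esttab med_diff, se
```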

Stata multinomial regression - post-estimation Wald test

I've conducted a multinomial logistic regression analysis in Stata, followed by a Wald test, and was hoping someone could confirm that my code is doing what I think it's doing.
NB: I'm using some of Stata's example data to illustrate. The analysis I'm running for this illustration is completely meaningless, but uses the same procedure as my 'real' analysis, other than the fact that my real analysis also includes some probability weights and other covariates.
sysuse auto.dta
First, I run a multinomial logistic regression, predicting 'Repair Record' from 'Foreign' and 'Price':
mlogit rep78 i.foreign price, base(1) rrr nolog
Multinomial logistic regression                   Number of obs   =         69
                                                  LR chi2(8)      =      31.15
                                                  Prob > chi2     =     0.0001
Log likelihood = -78.116372                       Pseudo R2       =     0.1662
------------------------------------------------------------------------------
       rep78 |        RRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |  (base outcome)
-------------+----------------------------------------------------------------
2            |
     foreign |
     Foreign |   .7822853   1672.371    -0.00   1.000            0           .
       price |   1.000414   .0007027     0.59   0.556     .9990375    1.001792
       _cons |   .5000195   1.669979    -0.21   0.836      .000718    348.2204
-------------+----------------------------------------------------------------
3            |
     foreign |
     Foreign |     686842   1.30e+09     0.01   0.994            0           .
       price |   1.000462   .0006955     0.66   0.507     .9990996    1.001826
       _cons |   1.254303   4.106511     0.07   0.945     .0020494    767.6863
-------------+----------------------------------------------------------------
4            |
     foreign |
     Foreign |    6177800   1.17e+10     0.01   0.993            0           .
       price |   1.000421   .0006999     0.60   0.547     .9990504    1.001794
       _cons |   .5379627     1.7848    -0.19   0.852     .0008067    358.7452
-------------+----------------------------------------------------------------
5            |
     foreign |
     Foreign |   2.79e+07   5.29e+10     0.01   0.993            0           .
       price |   1.000386   .0007125     0.54   0.587     .9989911    1.001784
       _cons |    .146745   .5072292    -0.56   0.579     .0001676    128.4611
------------------------------------------------------------------------------
Second, I want to know whether the 'Foreign' coefficient for outcome category 4 is significantly different to the 'Foreign' coefficient for outcome category 5. So, I run a Wald test:
test [4]1.foreign = [5]1.foreign
 ( 1)  [4]1.foreign - [5]1.foreign = 0

           chi2(  1) =    2.72
         Prob > chi2 =    0.0988
From this, I conclude that the 'Foreign' coefficient for outcome category 4 is NOT significantly different to the 'Foreign' coefficient for outcome category 5. Put more simply, the association between 'Foreign' and 'Repair 4' (compared to 'Repair 1') is equal to the association between 'Foreign' and 'Repair 5' (compared to 'Repair 1') .
Is my code for the Wald test, and my inferences about what it's doing and showing, correct?
In addition to what was discussed in the comments, you can also perform a likelihood-ratio test using the following code.
sysuse auto.dta
qui mlogit rep78 i.foreign price, base(1) rrr nolog
estimate store unrestricted
constraint 1 [4]1.foreign = [5]1.foreign
qui mlogit rep78 i.foreign price, base(1) rrr nolog constraints(1)
estimate store restricted
lrtest unrestricted restricted
The output of the test shows the same conclusion as the Wald test, but it has better properties as explained below.
Likelihood-ratio test                                 LR chi2(1)  =      3.13
(Assumption: restricted nested in unrestricted)       Prob > chi2 =    0.0771
Quoting the official documentation from mlogit
The results produced by test are an approximation based on the estimated covariance matrix of the coefficients. Because the probability of being uninsured is low, the log-likelihood may be nonlinear for the uninsured. Conventional statistical wisdom is not to trust the asymptotic answer under these circumstances but to perform a likelihood-ratio test instead.

Stata xtoverid command error

I have panel data and my regression is of the form:
s_roa1 = s_roa + c_roa
I am new to Stata and I am trying to use the xtoverid command for a robust Hausman test to help me choose between a fixed- or random-effects model:
xtoverid s_roa1 s_roa c_roa, fe i (year)
However, I get the following error:
varlist not allowed
Can anyone help me understand what this error means?
First of all, xtoverid is a community-contributed command, something which you fail to make clear in your question. It is customary and useful to provide this information right from the start, so others know that you do not refer to an official, built-in command.
Second, this is a post-estimation command, which means you run it directly after you estimate your model using xtreg, xtivreg, xtivreg2 or xthtaylor.
The help file provided by the authors offers an enlightening example:
. webuse nlswork
(National Longitudinal Survey. Young Women 14-26 years of age in 1968)
. tsset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit
.
. gen age2=age^2
(24 missing values generated)
. gen black=(race==2)
.
. xtivreg ln_wage age (tenure = union south), fe i(idcode)
Fixed-effects (within) IV regression            Number of obs     =     19,007
Group variable: idcode                          Number of groups  =      4,134

R-sq:                                           Obs per group:
     within  =      .                                         min =          1
     between = 0.1261                                         avg =        4.6
     overall = 0.0869                                         max =         12

                                                Wald chi2(2)      =  142054.65
corr(u_i, Xb)  = -0.6875                        Prob > chi2       =     0.0000
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .2450528   .0382041     6.41   0.000     .1701741    .3199314
         age |  -.0650873   .0126167    -5.16   0.000    -.0898156     -.040359
       _cons |   2.826672   .2451883    11.53   0.000     2.346112    3.307232
-------------+----------------------------------------------------------------
     sigma_u |  .71990151
     sigma_e |  .64315554
         rho |  .55612637   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(4133,14871) =     1.53       Prob > F   = 0.0000
------------------------------------------------------------------------------
Instrumented:   tenure
Instruments:    age union south
------------------------------------------------------------------------------
.
. xtoverid
Test of overidentifying restrictions:
Cross-section time-series model: xtivreg fe
Sargan-Hansen statistic 0.965 Chi-sq(1) P-value = 0.3259
. xtoverid, robust
Test of overidentifying restrictions:
Cross-section time-series model: xtivreg fe robust
Sargan-Hansen statistic 0.960 Chi-sq(1) P-value = 0.3271
. xtoverid, cluster(idcode)
Test of overidentifying restrictions:
Cross-section time-series model: xtivreg fe robust cluster(idcode)
Sargan-Hansen statistic 0.495 Chi-sq(1) P-value = 0.4818
From Stata's command prompt, type help xtoverid for more details.
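Applied to the model in the question, a robust Hausman-type test could look like the sketch below. The panel identifiers (firm, year) are assumptions, since the question does not show them; running xtoverid after xtreg, re with a cluster-robust VCE reports the Sargan-Hansen statistic that serves as the robust alternative to the classic hausman test:

```stata
* robust alternative to the Hausman test: fit RE, then run xtoverid
xtset firm year                                  // panel ids are placeholders
xtreg s_roa1 s_roa c_roa, re vce(cluster firm)
xtoverid
```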

What command in Stata 12 do I use to interpret the coefficients of the Limited Dependent Variable model?

I am running the following code:
oprobit var1 var2 var3 var4 var5 var2##var3 var4##var5 var6 var7 etc.
Without the interaction terms I could have used the following code to interpret the coefficients:
mfx compute, predict(outcome(2))
[for outcome equaling 2 (in total I have 4 outcomes)]
But since mfx does not work with interaction terms, I get an error.
I tried the margins command, but it did not work either:
margins var2 var3 var4 var5 var2##var3 var4##var5 var6 var7 etc... , post
margins works only when I drop the interaction terms: (margins var2 var3 var4 var5, post)
What command do I use to be able to interpret BOTH interaction and regular variables?
Finally, to use simple language, my question is: given the regression model above, what command can I use to interpret the coefficients?
mfx is an old command that has been replaced by margins, which is why it does not work with the factor-variable notation you used to define the interactions. I am not clear what you actually intended to calculate with your margins command.
Here's an example of how you can get the average marginal effects on the probability of outcome 2:
. webuse fullauto
(Automobile Models)
. oprobit rep77 i.foreign c.weight c.length##c.mpg
Iteration 0: log likelihood = -89.895098
Iteration 1: log likelihood = -76.800575
Iteration 2: log likelihood = -76.709641
Iteration 3: log likelihood = -76.709553
Iteration 4: log likelihood = -76.709553
Ordered probit regression                       Number of obs     =         66
                                                LR chi2(5)        =      26.37
                                                Prob > chi2       =     0.0001
Log likelihood = -76.709553                     Pseudo R2         =     0.1467
--------------------------------------------------------------------------------
         rep77 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
     1.foreign |   1.514739   .4497962     3.37   0.001      .633155    2.396324
        weight |  -.0005104   .0005861    -0.87   0.384    -.0016593    .0006384
        length |   .0969601   .0348506     2.78   0.005     .0286542     .165266
           mpg |   .4747249   .2241349     2.12   0.034     .0354286    .9140211
               |
c.length#c.mpg |  -.0020602   .0013145    -1.57   0.117    -.0046366    .0005161
---------------+----------------------------------------------------------------
         /cut1 |   17.21885   5.386033                      6.662419    27.77528
         /cut2 |   18.29469   5.416843                      7.677877    28.91151
         /cut3 |   19.66512   5.463523                      8.956814    30.37343
         /cut4 |   21.12134   5.515901                      10.31038    31.93231
--------------------------------------------------------------------------------
. margins, dydx(*) predict(outcome(2))
Average marginal effects                        Number of obs     =         66
Model VCE    : OIM

Expression   : Pr(rep77==2), predict(outcome(2))
dy/dx w.r.t. : 1.foreign weight length mpg

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   1.foreign |  -.2002434   .0576487    -3.47   0.001    -.3132327     -.087254
      weight |   .0000828   .0000961     0.86   0.389    -.0001055    .0002711
      length |  -.0088956    .003643    -2.44   0.015    -.0160356   -.0017555
         mpg |   -.012849   .0085546    -1.50   0.133    -.0296157    .0039178
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
If you want the prediction, rather than the marginal effect, try
margins, predict(outcome(2))
The marginal effect of just the interaction term is harder to calculate in a non-linear model. Details here.
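One way to see the role of the interaction in this non-linear model is to evaluate the marginal effect of one variable at a few values of the other; the length values below are arbitrary illustration points, not a recommendation:

```stata
* marginal effect of mpg on Pr(rep77==2) at two illustrative lengths
margins, dydx(mpg) at(length=(180 220)) predict(outcome(2))
```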
The marginal effects for positive outcomes, Pr(depvar1=1, depvar2=1), are
. mfx compute, predict(p11)
The marginal effects for Pr(depvar1=1, depvar2=0) are
. mfx compute, predict(p10)
The marginal effects for Pr(depvar1=0, depvar2=1) are
. mfx compute, predict(p01)