Create table with variable means and differences using esttab - stata

I want to generate a table in Stata that contains means, differences, and t-values for four groups.
In particular, say you have a 2x2 study design. You want to display the mean and standard deviation of the outcome variable in each cell, add a column that tests the difference across treatment one, and add a row that shows the difference and t-value across treatment two.
I have the following code:
clear
sysuse auto
gen large_trunk = (trunk > 14)
gen price_large_trunk = price if large_trunk == 1
gen price_small_trunk = price if large_trunk == 0
eststo price_domestic: qui estpost sum price_large_trunk price_small_trunk if foreign == 0
eststo price_foreign: qui estpost sum price_large_trunk price_small_trunk if foreign == 1
eststo diff: qui estpost ttest price_large_trunk price_small_trunk, by(foreign)
eststo diff2: qui estpost ttest price if foreign == 0, by(large_trunk)
eststo diff3: qui estpost ttest price if foreign == 1, by(large_trunk)
esttab price_domestic price_foreign diff diff2 diff3, ///
cells("mean(pattern(1 1 0) fmt(2)) b(star pattern(0 0 1) fmt(2))" "sd(pattern(1 1 0) par) t(pattern(0 0 1) par)") ///
mtitle("Domestic" "Foreign" "Difference") ///
nonumbers noobs ///
coeflabels(price "Difference") ///
notes
The output is:
----------------------------------------------------------------------------------------------------------------
                 Domestic      Foreign   Difference        diff2                     diff3
                  mean/sd      mean/sd          b/t      mean/sd          b/t      mean/sd          b/t
price_larg~k      6900.13      6186.00       714.13
                (3164.76)    (2186.93)       (0.48)
price_smal~k      4850.57      6443.12     -1592.55
                (2608.98)    (2794.83)      (-1.81)
Difference                                                           -2049.56*                  257.12
                                                                      (-2.45)                   (0.19)
----------------------------------------------------------------------------------------------------------------
As you can see, this is a 7x3 table. Ideally, I would like the elements of the last row to appear in columns one and two, and to discard all columns after column three.
On a related note, I also wonder whether I can suppress the statistic labels in the table header (the mean/sd and b/t under the treatment names).

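You need to modify your code as follows: run each ttest right after the corresponding estpost sum, attach the difference and t-value to those results with estadd, and display them through the stats() option. The option collabels(none) suppresses the mean/sd and b/t line in the header: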
sysuse auto, clear
est clear
gen large_trunk = (trunk > 14)
gen price_large_trunk = price if large_trunk == 1
gen price_small_trunk = price if large_trunk == 0
qui estpost sum price_large_trunk price_small_trunk if foreign == 0
qui ttest price if foreign == 0, by(large_trunk)
estadd local diff2 `= round(r(mu_1) - r(mu_2), .01)'
estadd local tdiff2 (`= round(`r(t)', .01)')
eststo price_domestic
qui estpost sum price_large_trunk price_small_trunk if foreign == 1
qui ttest price if foreign == 1, by(large_trunk)
estadd local diff2 `= round(r(mu_1) - r(mu_2), .01)'
estadd local tdiff2 (`= round(`r(t)', .01)')
eststo price_foreign
eststo diff1: qui estpost ttest price_large_trunk price_small_trunk, by(foreign)
esttab price_domestic price_foreign diff1, ///
stats(diff2 tdiff2, label("Difference" " ") ) ///
cells("mean(pattern(1 1 0) fmt(2)) b(star pattern(0 0 1) fmt(2))" "sd(pattern(1 1 0) par) t(pattern(0 0 1) par)") ///
mtitle("Domestic" "Foreign" "Difference") collabels(none) ///
nonumbers noobs ///
coeflabels(price "Difference") ///
notes
Result:
------------------------------------------------------
Domestic Foreign Difference
------------------------------------------------------
price_larg~k 6900.13 6186.00 714.13
(3164.76) (2186.93) (0.48)
price_smal~k 4850.57 6443.12 -1592.55
(2608.98) (2794.83) (-1.81)
------------------------------------------------------
Difference -2049.56 257.12
(-2.45) (.19)
------------------------------------------------------
Or perhaps:
esttab price_domestic price_foreign diff1, ///
stats(diff2 tdiff2, label("Difference" " ") ) ///
cells("mean(pattern(1 1 0) fmt(2)) b(star pattern(0 0 1) fmt(2))" "sd(pattern(1 1 0) par) t(pattern(0 0 1) par)") ///
mtitle("Domestic" "Foreign" "Difference") collabels(none) ///
nonumbers noobs ///
coeflabels(price "Difference") ///
notes gaps prefoot(" ")
------------------------------------------------------
Domestic Foreign Difference
------------------------------------------------------
price_larg~k 6900.13 6186.00 714.13
(3164.76) (2186.93) (0.48)
price_smal~k 4850.57 6443.12 -1592.55
(2608.98) (2794.83) (-1.81)
Difference -2049.56 257.12
(-2.45) (.19)
------------------------------------------------------
Or even:
esttab price_domestic price_foreign diff1, ///
stats(diff2 tdiff2, label("Difference" " ") ) ///
cells("mean(pattern(1 1 0) fmt(2)) b(star pattern(0 0 1) fmt(2))" "sd(pattern(1 1 0) par) t(pattern(0 0 1) par)") ///
mtitle("Domestic" "Foreign" "Difference") collabels(none) ///
nonumbers noobs ///
coeflabels(price "Difference") ///
notes gaps plain
Domestic Foreign Difference
price_larg~k 6900.13 6186.00 714.13
(3164.76) (2186.93) (0.48)
price_smal~k 4850.57 6443.12 -1592.55
(2608.98) (2794.83) (-1.81)
Difference -2049.56 257.12
(-2.45) (.19)
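The footer above prints (.19) while the table body prints (0.19), because round() returns a plain number without a display format. A minimal sketch of one way to align the two, assuming the same variables as in the answer: format the numbers with string() before passing them to estadd. The macro names d and t are illustrative and not part of the original answer.

```stata
sysuse auto, clear
est clear
generate byte large_trunk = (trunk > 14)
generate price_large_trunk = price if large_trunk == 1
generate price_small_trunk = price if large_trunk == 0

quietly estpost summarize price_large_trunk price_small_trunk if foreign == 1
quietly ttest price if foreign == 1, by(large_trunk)

* string() with a fixed format keeps two decimals and the leading zero;
* both values are read from r() before estadd is called
local d = string(r(mu_1) - r(mu_2), "%9.2f")
local t = string(r(t), "%9.2f")

estadd local diff2 "`d'"
estadd local tdiff2 "(`t')"
eststo price_foreign

esttab price_foreign, cells("mean(fmt(2))" "sd(par)") ///
    stats(diff2 tdiff2, label("Difference" " ")) ///
    collabels(none) nonumbers noobs
```

The same two local lines can replace the round() calls in the domestic block as well.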

Frequency Table with Esttab

I'm interested in outputting the transpose of the tabulate command using Stata:
sysuse auto, clear
eststo: estpost tab foreign
esttab, cells(b)
so that the output would be:
--------------------------------------------------
Domestic Foreign Total
--------------------------------------------------
Domestic 52 22 74
--------------------------------------------------
I can do something similar with the following:
sysuse auto, clear
est clear
tab foreign, gen(newp_`i')
eststo : estpost tabstat newp_1 newp_2 , stat(sum ) column(variables)
drop newp*
esttab, cells("newp_1 newp_2") ///
compress unstack ///
nonumbers nodepvars noobs ///
mtitles("Domestic" "Foreign")
...
------------------------------
Domestic
newp_1 newp_2
------------------------------
sum 52 22
------------------------------
But I run into issues when I try to add counts of other variables. Ideally, I'd like est2 to be appended vertically instead of horizontally:
sysuse auto, clear
est clear
tab foreign, gen(newp_`i')
eststo : estpost tabstat newp_1 newp_2 , stat(sum ) column(variables)
drop newp*
esttab, cells("newp_1 newp_2") ///
compress unstack ///
nonumbers nodepvars noobs ///
mtitles("Domestic" "Foreign")
g example = (mpg >=22)
tab example, gen(newp_`i')
eststo : estpost tabstat newp_1 newp_2 , stat(sum ) column(variables)
esttab, cells("newp_1 newp_2") unstack ///
compress ///
nonumbers nodepvars noobs ///
mtitles("Domestic" "Foreign")
...
--------------------------------------------------
Domestic Foreign
newp_1 newp_2 newp_1 newp_2
--------------------------------------------------
sum 52 22 43 31
--------------------------------------------------
My desired output is:
---------------------------------
newp_1 newp_2
---------------------------------
sum (foreign) 52 22
sum (example) 43 31
---------------------------------
The following seems to work:
sysuse auto, clear
est clear
rename foreign binary1
g binary2 = (mpg >=22)
unab lst : binary*
foreach i in `lst'{
tab `i', gen(test_)
eststo: estpost tabstat test_*, statistics(sum) columns(statistics)
drop test_*
}
esttab , ///
replace cell(sum ) ///
compress ///
nonumbers rename("test_1" "Yes" "test_2" "No") ///
mtitles("Binary1" "Binary2")
matrix transp = r(coefs)'
esttab matrix(transp), compress eqlabels(,merge)
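A possible alternative, sketched under the assumption that every variable is a 0/1 indicator: collect the one-way frequencies with the matcell() option of tabulate and stack them row by row, so that each variable becomes one row of the table. The matrix names freq and F and the column names No and Yes are illustrative.

```stata
sysuse auto, clear
generate byte example = (mpg >= 22)

capture matrix drop F
foreach v of varlist foreign example {
    * matcell() returns the frequencies as a column vector
    quietly tabulate `v', matcell(freq)
    * transpose to a row and append it below the rows collected so far
    matrix F = nullmat(F) \ freq'
}

* first column holds the count for value 0, second column for value 1
matrix rownames F = foreign example
matrix colnames F = No Yes

esttab matrix(F), mlabels(none)
```

Because the table is built as a matrix, adding another indicator only requires listing it in the varlist and in the row names.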

Replace/delete line in tex output of esttab

I am trying to get rid of a line with the statistics label/name "mean" in the output of the Stata code:
**********************************************************
**** Produce table with mean values of hhi_r, jsi_r, top4_r, hhi, jsi in descending order of hhi_r
** sort output by mean hhi_r
drop mean group
egen mean = mean(hhi_r), by(name)
egen group = group(mean name)
replace group = -group
labmask group, values(name)
label var group "`: var label name'"
** done sorting
label var hhi_r "rank(\$HHI\$)"
label var jsi_r "rank(\$C1_I\$)"
label var top4_r "rank(\$T4\$)"
label var hhi "\$HHI\$"
label var jsi "\$C1_I\$"
eststo clear
eststo t1: estpost tabstat jsi_r, by(group) statistics(mean) columns(statistics) listwise nototal
eststo t2: estpost tabstat top4_r, by(group) statistics(mean) columns(statistics) listwise nototal
eststo t3: estpost tabstat hhi, by(group) statistics(mean) columns(statistics) listwise nototal
eststo t4: estpost tabstat jsi, by(group) statistics(mean) columns(statistics) listwise nototal
eststo t0: estpost tabstat hhi_r, by(group) statistics(count mean sd p10 p25 p50 p75 p90) columns(statistics) listwise nototal
esttab t0 t1 t2 t3 t4 using "$Tables/hhi_r.tex", replace style(tex) type ///
cells("mean(fmt(%15.0fc))") alignment(r) substitute(\_ _) ///
noobs nonumber varlabels(`e(labels)') varwidth(20) mlabels("rank(HHI)" "rank(C1)" "rank(T4)" "HHI" "C1")
The output is a TeX file that includes as the second line the label/name "mean" of the statistics:
{
\def\sym#1{\ifmmode^{#1}\else\(^{#1}\)\fi}
\begin{tabular}{l*{5}{r}}
\hline\hline
& rank(HHI)& rank(C1)& rank(T4)& HHI& C1\\
& mean& mean& mean& mean& mean\\
\hline
624: Social Assistance& 2,332& 4& 4& 10,000& 1\\
425: Wholesale Electronic Markets and Agents and Broker& 2,332& & & 10,000& \\
113: Forestry and Logging& 2,332& & & 10,000& \\
115: Support Activities for Agriculture and Forestry& 2,326& & & 9,477& \\
314: Textile Product Mills& 2,324& & & 7,501& \\
112: Animal Production and Aquaculture& 2,321& & & 6,968& \\
492: Couriers and Messengers& 2,318& 1,225& & 6,821& 484\\
811: Repair and Maintenance& 2,316& 2& 2& 5,976& 1\\
\hline\hline
\end{tabular}
}
How can I suppress this line or replace the names/labels with the strings in the mlabels() option of the community-contributed command esttab?
Using Stata's toy auto dataset as an example, the following works for me:
sysuse auto, clear
eststo clear
eststo t1: estpost tabstat price, by(foreign) statistics(mean) columns(statistics) listwise nototal
eststo t2: estpost tabstat mpg, by(foreign) statistics(mean) columns(statistics) listwise nototal
eststo t3: estpost tabstat weight, by(foreign) statistics(mean) columns(statistics) listwise nototal
eststo t4: estpost tabstat length, by(foreign) statistics(mean) columns(statistics) listwise nototal
eststo t0: estpost tabstat mpg, by(foreign) statistics(count mean sd p10 p25 p50 p75 p90) columns(statistics) listwise nototal
esttab t0 t1 t2 t3 t4 using "table.tex", replace style(tex) ///
type cells(`"mean(fmt(%15.0fc) label(" "))"') alignment(r) substitute(\_ _) ///
noobs nonumber varlabels(`e(labels)') varwidth(20) ///
mlabels("rank(HHI)" "rank(C1)" "rank(T4)" "HHI" "C1")
type table.tex
{
\def\sym#1{\ifmmode^{#1}\else\(^{#1}\)\fi}
\begin{tabular}{l*{5}{r}}
\hline\hline
& rank(HHI)& rank(C1)& rank(T4)& HHI& C1\\
& & & & & \\
\hline
Domestic & 20& 6,072& 20& 3,317& 196\\
Foreign & 25& 6,385& 25& 2,316& 169\\
\hline\hline
\end{tabular}
}
This replaces each statistic name with a single space: the label(" ") suboption inside cells(mean()) overrides the default column label.
Alternatively, to eliminate the line completely use the option collabels(none):
esttab t0 t1 t2 t3 t4 using "table.tex", replace style(tex) ///
type cells(`"mean(fmt(%15.0fc))"') alignment(r) substitute(\_ _) ///
noobs nonumber varlabels(`e(labels)') varwidth(20) ///
mlabels("rank(HHI)" "rank(C1)" "rank(T4)" "HHI" "C1") collabels(none)
type table.tex
{
\def\sym#1{\ifmmode^{#1}\else\(^{#1}\)\fi}
\begin{tabular}{l*{5}{r}}
\hline\hline
& rank(HHI)& rank(C1)& rank(T4)& HHI& C1\\
\hline
Domestic & 20& 6,072& 20& 3,317& 196\\
Foreign & 25& 6,385& 25& 2,316& 169\\
\hline\hline
\end{tabular}
}

How to produce summary statistics table

I would like to produce a summary statistics table looking like the one below:
Female Male
p1
p10
p50
p99
However, with estpost and esttab, I can only produce a table like the following:
(1) (2)
p1/p10/p5~99 p1/p10/p5~99
-3.756124 -4.159476
1.009338 -1.210738
.3221763 .2945236
.8658271 .8658271
.9871135 .9871135
The code I am using is the following:
estpost summarize math_std if female == 1 , detail
eststo female
estpost summarize math_std if female == 0 , detail
eststo male
esttab female male , cells(p1 p10 p50 p95 p99) noobs
How can I put the column labels in the desired place?
Here's a solution relying on the creation of a matrix with the relevant results:
sysuse auto, clear
quietly summarize price if foreign == 1 , detail
matrix foreign = r(p1) \ r(p10) \ r(p50) \ r(p95) \ r(p99)
quietly summarize price if foreign == 0 , detail
matrix domestic = r(p1) \ r(p10) \ r(p50) \ r(p95) \ r(p99)
matrix both = foreign , domestic
matrix rownames both = p1 p10 p50 p95 p99
matrix colnames both = foreign domestic
esttab matrix(both), mlabels(none)
--------------------------------------
foreign domestic
--------------------------------------
p1 3748 3291
p10 3895 3955
p50 5759 4782.5
p95 11995 13594
p99 12990 15906
--------------------------------------

How can I stack equations using estout?

I am creating a summary statistics table using the community-contributed command estout.
The code looks like this:
sysuse auto, clear
eststo clear
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==3, by(foreign)
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==4, by(foreign)
estout, cells("mu_1 mu_2 b(star)")
The result looks as follows:
--------------------------------------------------------------------------------------------
est1 est2
mu_1 mu_2 b mu_1 mu_2 b
--------------------------------------------------------------------------------------------
price 6607.074 4828.667 1778.407 5881.556 6261.444 -379.8889
mpg 19 23.33333 -4.333333 18.44444 24.88889 -6.444444**
weight 3442.222 2010 1432.222*** 3532.222 2207.778 1324.444***
headroom 3.222222 2.666667 .5555556 3.444444 2.5 .9444444*
trunk 15.59259 12.33333 3.259259 16.66667 10.33333 6.333333**
--------------------------------------------------------------------------------------------
I would like to know how I could stack est1 and est2 on top of each other.
The command estout cannot automatically stack results from stored estimates. Consequently, the use of eststo is redundant. In this case, the easiest way to obtain the desired output is to simply create two matrices with the results and stack one on top of the other.
For example:
sysuse auto, clear
matrix A = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
local ++i
ttest `var' if rep78 == 3, by(foreign)
matrix A[`i', 1] = r(mu_1)
matrix A[`i', 2] = r(mu_2)
matrix A[`i', 3] = r(mu_1) - r(mu_2)
local matnamesA `matnamesA' "rep78==3:`var'"
}
matrix rownames A = `matnamesA'
matrix B = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
local ++i
ttest `var' if rep78 == 4, by(foreign)
matrix B[`i', 1] = r(mu_1)
matrix B[`i', 2] = r(mu_2)
matrix B[`i', 3] = r(mu_1) - r(mu_2)
local matnamesB `matnamesB' "rep78==4:`var'"
}
matrix rownames B = `matnamesB'
matrix C = A \ B
esttab matrix(C), nomtitles collabels("mu_1" "mu_2" "diff")
---------------------------------------------------
mu_1 mu_2 diff
---------------------------------------------------
rep78==3
price 6607.074 4828.667 1778.407
mpg 19 23.33333 -4.333333
weight 3442.222 2010 1432.222
headroom 3.222222 2.666667 .5555556
trunk 15.59259 12.33333 3.259259
---------------------------------------------------
rep78==4
price 5881.556 6261.444 -379.8889
mpg 18.44444 24.88889 -6.444444
weight 3532.222 2207.778 1324.444
headroom 3.444444 2.5 .9444444
trunk 16.66667 10.33333 6.333333
---------------------------------------------------
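The two loops above differ only in the value of rep78, so they can be folded into one nested loop. A compact sketch of the same stacking idea, with quietly added to suppress the ttest output; the names thisrow and names are illustrative.

```stata
sysuse auto, clear

capture matrix drop C
local names
foreach r of numlist 3 4 {
    foreach var of varlist price mpg weight headroom trunk {
        quietly ttest `var' if rep78 == `r', by(foreign)
        * one row per variable: both group means and their difference
        matrix thisrow = (r(mu_1), r(mu_2), r(mu_1) - r(mu_2))
        matrix C = nullmat(C) \ thisrow
        * the equation name before the colon groups the rows by rep78
        local names `names' "rep78==`r':`var'"
    }
}
matrix rownames C = `names'

esttab matrix(C), nomtitles collabels("mu_1" "mu_2" "diff")
```

Adding another value of rep78 then only means extending the numlist.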

Exporting results from regressions in Excel

I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3{
foreach i3 of numlist 1 2 3 4{
quiet: eststo: reg dep covariates, robust
}
}
The end goal is to have a table in Excel composed of twelve rows (one for each model) and seven columns (number of observations, estimated constant, five estimated coefficients).
Any suggestions on how I can do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
                   weight          mpg        _cons            N
----------------------------------------------------------------
Model 1          2.044063                 -6.707353           74
Model 2          1.746559    -49.51222     1946.069           74
----------------------------------------------------------------
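For the double loop in the question, one option is to skip eststo and append one row per model inside the loop. A rough sketch, assuming every model has the same regressors so that the rows line up; the subsamples, the variables, and the names T and names are made up for illustration with the auto dataset.

```stata
sysuse auto, clear

capture matrix drop T
local names
foreach y of varlist price mpg {
    foreach f of numlist 0 1 {
        quietly regress `y' weight length if foreign == `f', vce(robust)
        * one row per model: N first, then the coefficients (constant last)
        matrix T = nullmat(T) \ (e(N), e(b))
        local names `names' `y'_f`f'
    }
}
matrix rownames T = `names'
matrix colnames T = N weight length _cons

* the .csv file can be opened directly in Excel
esttab matrix(T) using table.csv, replace mlabels(none)
```

If the models use different regressors, the approach in the answer (collecting r(coefs) from esttab) is the safer route, since it leaves blanks for coefficients that a model does not include.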