I would like to produce a summary statistics table looking like the one below:
Female Male
p1
p10
p50
p99
However, with estpost and esttab, I can only produce a table like the following:
(1) (2)
p1/p10/p5~99 p1/p10/p5~99
-3.756124 -4.159476
1.009338 -1.210738
.3221763 .2945236
.8658271 .8658271
.9871135 .9871135
The code I am using is the following:
estpost summarize math_std if female == 1 , detail
eststo female
estpost summarize math_std if female == 0 , detail
eststo male
esttab female male , cells(p1 p10 p50 p95 p99) noobs
How can I put the column labels in the desired place?
Here's a solution relying on the creation of a matrix with the relevant results:
sysuse auto, clear
quietly summarize price if foreign == 1 , detail
matrix foreign = r(p1) \ r(p10) \ r(p50) \ r(p95) \ r(p99)
quietly summarize price if foreign == 0 , detail
matrix domestic = r(p1) \ r(p10) \ r(p50) \ r(p95) \ r(p99)
matrix both = foreign , domestic
matrix rownames both = p1 p10 p50 p95 p99
matrix colnames both = foreign domestic
esttab matrix(both), mlabels(none)
--------------------------------------
                  foreign     domestic
--------------------------------------
p1                   3748         3291
p10                  3895         3955
p50                  5759       4782.5
p95                 11995        13594
p99                 12990        15906
--------------------------------------
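The same matrix can also be built with a loop, which may be easier to extend to other percentiles or groups. The following is only a minimal sketch and not part of the original answer; the macro name stats and the matrix names col0 and col1 are illustrative assumptions.

```stata
sysuse auto, clear

* Percentiles to report; each must be a result returned by summarize, detail
local stats p1 p10 p50 p95 p99
local nstats : word count `stats'

foreach g in 0 1 {
    quietly summarize price if foreign == `g', detail
    * One column per group, filled row by row from r()
    matrix col`g' = J(`nstats', 1, .)
    local r 0
    foreach s of local stats {
        local ++r
        matrix col`g'[`r', 1] = r(`s')
    }
}

* Same layout as above: foreign first, domestic second
matrix both = col1 , col0
matrix rownames both = `stats'
matrix colnames both = foreign domestic
matrix list both
```

For the data in the question, price and foreign would be replaced by math_std and female.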
Consider the following toy example:
sysuse auto, clear
tab foreign, sum(price)
            |          Summary of Price
   Car type |        Mean   Std. Dev.       Freq.
------------+------------------------------------
   Domestic |   6,072.423   3,097.104          52
    Foreign |   6,384.682   2,621.915          22
------------+------------------------------------
      Total |   6,165.257   2,949.496          74
How can I save the results in an Excel file?
Using the community-contributed command esttab, the following works for me:
sysuse auto, clear
egen m_total = mean(price)
egen s_total = sd(price)
scalar mtotal = m_total
scalar stotal = s_total
scalar N = _N
collapse (mean) Mean=price (sd) StdDev=price (count) Freq = price, by(foreign)
set obs 3
replace Mean = mtotal in 3
replace StdDev = stotal in 3
replace Freq = N in 3
mkmat Mean StdDev Freq, matrix(A)
esttab matrix(A) using myfilename.xls, varlabels(r1 Domestic r2 Foreign r3 Total) ///
title(" Summary of Price") mlabels(none)
Summary of Price
---------------------------------------------------
                     Mean       StdDev         Freq
---------------------------------------------------
Domestic         6072.423     3097.104           52
Foreign          6384.682     2621.915           22
Total            6165.257     2949.496           74
---------------------------------------------------
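If installing esttab is not an option, a similar table can probably be produced with built-in commands only. The sketch below is an alternative to the answer above, not a restatement of it: it duplicates the data to form a "Total" group before collapsing and then writes the result with export excel. The variable name is_copy and the file name summary_price.xlsx are hypothetical.

```stata
sysuse auto, clear

* Duplicate every observation; the copies will form the "Total" group
expand 2, generate(is_copy)
replace foreign = 2 if is_copy == 1

* Add a "Total" entry to whatever value label is attached to foreign
local lbl : value label foreign
label define `lbl' 2 "Total", add

collapse (mean) Mean=price (sd) StdDev=price (count) Freq=price, by(foreign)

* Write the three rows, with variable names as the header row
export excel using "summary_price.xlsx", firstrow(variables) replace
```

Because the copies contain all 74 cars, the statistics of the third group are the overall mean, standard deviation, and count.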
I have executed the code snippet below:
foreach yr in 2000 2001 2002 2003 2004 2005 2006 {
    eststo: ivregress 2sls y (var=z) c [aw=w] if yr==`yr'
    estimate store r`yr'
}
coefplot r2000 r2001 r2002 r2003 r2004 r2005 r2006 , vertical keep(var)
This produced the following graph:
However, I want to change the label in x-axis to 2000, 2001, ..., 2006.
As you can see, I am using the community-contributed command coefplot, but these coefficients come from separate regressions, and 2000 or 2001 are not variable names.
Is there any way to work around this?
Using Stata's auto toy dataset:
sysuse auto, clear
recode foreign (0 = 1) (1 = 2)
forvalues i = 1 / 2 {
    eststo: regress mpg price if foreign == `i'
    estimate store r`i'
}
The following does the trick:
coefplot (r1 \ r2), vertical keep(price) aseq swapnames
Or with custom labels:
coefplot (r1, aseq(Foreign 1) \ r2, aseq(Foreign 2)), vertical keep(price) swapnames
I have a simple question about the distinct command in Stata.
When used with a by prefix, can it return a one-dimensional matrix of r(N)?
For example:
sysuse auto,clear
bysort foreign: distinct rep78
Can I store a [2,1] matrix, with each row representing the number of distinct values of rep78?
The manual seems to suggest that it only stores the number of distinct values of the last by value.
You can easily create your own wrapper for that:
sysuse auto,clear
sort foreign
levelsof foreign, local(foreign_levels)
local number_of_foreign_levels : word count `foreign_levels'
matrix distinct_mat = J(`number_of_foreign_levels', 1, 0)
forvalues i = 1 / `number_of_foreign_levels' {
    quietly distinct rep78 if foreign == `i' - 1
    matrix distinct_mat[`i', 1] = r(ndistinct)
}
matrix list distinct_mat
distinct_mat[2,1]
    c1
r1   5
r2   3
Note that the number of distinct values is stored in r(ndistinct), not r(N).
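The wrapper above relies on foreign being coded 0 and 1 (hence the `i' - 1). A variant that loops over the actual levels may be safer when the grouping variable uses other codes. This is a sketch under the assumption that the community-contributed distinct command is installed; the matrix name D and the row names are illustrative.

```stata
sysuse auto, clear

levelsof foreign, local(levels)
local n : word count `levels'
matrix D = J(`n', 1, .)

local i 0
local rnames
foreach l of local levels {
    local ++i
    * Count distinct values of rep78 within the current level of foreign
    quietly distinct rep78 if foreign == `l'
    matrix D[`i', 1] = r(ndistinct)
    local rnames `rnames' foreign_`l'
}

matrix rownames D = `rnames'
matrix colnames D = ndistinct
matrix list D
```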
Here is another way to get numbers of distinct values into a matrix.
. sysuse auto
(1978 Automobile Data)
. egen tag = tag(foreign rep78)
. tab foreign if tag, matcell(foo)
   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |          5       62.50       62.50
    Foreign |          3       37.50      100.00
------------+-----------------------------------
      Total |          8      100.00
I am creating a summary statistics table using the community-contributed command estout.
The code looks like this:
sysuse auto, clear
eststo clear
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==3, by(foreign)
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==4, by(foreign)
estout, cells("mu_1 mu_2 b(star)")
The result looks as follows:
------------------------------------------------------------------------------------------------
                     est1                                      est2
                     mu_1         mu_2            b            mu_1         mu_2            b
------------------------------------------------------------------------------------------------
price            6607.074     4828.667     1778.407        5881.556     6261.444    -379.8889
mpg                    19     23.33333    -4.333333        18.44444     24.88889    -6.444444**
weight           3442.222         2010     1432.222***     3532.222     2207.778     1324.444***
headroom         3.222222     2.666667     .5555556        3.444444          2.5     .9444444*
trunk            15.59259     12.33333     3.259259        16.66667     10.33333     6.333333**
------------------------------------------------------------------------------------------------
I would like to know how I could stack est1 and est2 on top of each other.
The command estout cannot automatically stack results from stored estimates. Consequently, the use of eststo is redundant. In this case, the easiest way to obtain the desired output is to simply create two matrices with the results and stack one on top of the other.
For example:
sysuse auto, clear
matrix A = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
    local ++i
    ttest `var' if rep78 == 3, by(foreign)
    matrix A[`i', 1] = r(mu_1)
    matrix A[`i', 2] = r(mu_2)
    matrix A[`i', 3] = r(mu_1) - r(mu_2)
    local matnamesA `matnamesA' "rep78==3:`var'"
}
matrix rownames A = `matnamesA'
matrix B = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
    local ++i
    ttest `var' if rep78 == 4, by(foreign)
    matrix B[`i', 1] = r(mu_1)
    matrix B[`i', 2] = r(mu_2)
    matrix B[`i', 3] = r(mu_1) - r(mu_2)
    local matnamesB `matnamesB' "rep78==4:`var'"
}
matrix rownames B = `matnamesB'
matrix C = A \ B
esttab matrix(C), nomtitles collabels("mu_1" "mu_2" "diff")
---------------------------------------------------
                     mu_1         mu_2         diff
---------------------------------------------------
rep78==3
price            6607.074     4828.667     1778.407
mpg                    19     23.33333    -4.333333
weight           3442.222         2010     1432.222
headroom         3.222222     2.666667     .5555556
trunk            15.59259     12.33333     3.259259
---------------------------------------------------
rep78==4
price            5881.556     6261.444    -379.8889
mpg              18.44444     24.88889    -6.444444
weight           3532.222     2207.778     1324.444
headroom         3.444444          2.5     .9444444
trunk            16.66667     10.33333     6.333333
---------------------------------------------------
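The two near-identical loops can be folded into one nested loop that appends a row per variable. The following is a possible condensed sketch of the same matrix-stacking idea, not the original answer's code; the names vars, newrow, and names are illustrative assumptions.

```stata
sysuse auto, clear

capture matrix drop C
local names
local vars price mpg weight headroom trunk

foreach g in 3 4 {
    foreach var of local vars {
        quietly ttest `var' if rep78 == `g', by(foreign)
        * One row per variable: both group means and their difference
        matrix newrow = (r(mu_1), r(mu_2), r(mu_1) - r(mu_2))
        matrix C = nullmat(C) \ newrow
        local names `names' "rep78==`g':`var'"
    }
}

matrix rownames C = `names'
esttab matrix(C), nomtitles collabels("mu_1" "mu_2" "diff")
```

Adding another value of rep78 then only requires extending the list in the outer loop.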
I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3 {
    foreach i3 of numlist 1 2 3 4 {
        quiet: eststo: reg dep covariates, robust
    }
}
The end goal is to have a table in Excel made up of twelve rows (one for each model) and seven columns (number of observations, estimated constant, five estimated coefficients).
Any suggestions on how I can do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
    if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
                   weight          mpg        _cons            N
----------------------------------------------------------------
Model 1          2.044063                 -6.707353           74
Model 2          1.746559    -49.51222     1946.069           74
----------------------------------------------------------------
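When every model in the double loop has the same regressors, as in the question, the rows can also be collected directly from e(N) and e(b) inside the loop. This is a rough sketch under stated assumptions: the outcomes, the subsamples defined by foreign, and the regressors length and turn are hypothetical stand-ins for the question's dep and covariates, and the matrix name T is illustrative.

```stata
sysuse auto, clear

capture matrix drop T
local rnames

* Hypothetical outer loop over outcomes, inner loop over subsamples
foreach y in price mpg weight {
    foreach f in 0 1 {
        quietly regress `y' length turn if foreign == `f', robust
        * One row per model: N first, then the coefficient vector
        matrix b = e(b)
        matrix T = nullmat(T) \ (e(N), b)
        local rnames `rnames' `y'_f`f'
    }
}

matrix rownames T = `rnames'
matrix colnames T = N length turn _cons
matrix list T
```

The resulting matrix has one row per model and can be passed to esttab matrix(T) using table.csv in the same way as matrix C above.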