How do I use outreg2 to display value labels in its output? - stata

Take this code
sysuse auto, clear
reg price mpg c.mpg#i.foreign
outreg2 using "example.txt", stats(coef) replace
This outputs
(1)
VARIABLES price
price
mpg -329.0***
0b.foreign#co.mpg 0
1.foreign#c.mpg 78.33**
Constant 12,596***
Observations 74
R-squared 0.289
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Ideally, I'd like it to display the value labels, as is done in the console's regression output:
-------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
mpg | -329.0368 61.46843 -5.35 0.000 -451.6014 -206.4723
|
foreign#c.mpg |
Foreign | 78.32918 29.78726 2.63 0.010 18.93508 137.7233
|
_cons | 12595.97 1235.936 10.19 0.000 10131.58 15060.35
-------------------------------------------------------------------------------
I don't need any of the other stats at the moment; I'm strictly including that last piece of output to show what I mean with the value labels. Searching through the documentation for outreg2 tells me how to display variable labels, but not value labels.
Also posted on Statalist.

As @Dimitriy points out, you can use estout, from SSC. An example:
sysuse auto, clear
reg price mpg c.mpg#i.foreign
estimates store m1, title(Model 1)
estout m1, label
You can add other statistics, stars and more. After installation (ssc install estout), read help estout carefully.
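As a rough sketch of what "other statistics, stars and more" might look like, the following uses esttab (the wrapper that ships with estout) on the same model. The specific choices here (standard errors, the outreg2-style star thresholds, nobaselevels to hide the base-level row) are illustrative assumptions rather than part of the answer above; check help esttab for what your installed version supports.

```stata
* Sketch only: assumes estout is installed (ssc install estout)
* and that this is run from a do-file (/// is a line continuation)
sysuse auto, clear
regress price mpg c.mpg#i.foreign
estimates store m1, title(Model 1)

* label        : print labels instead of names (the answer above relies on this)
* nobaselevels : hide the base-level row (0b.foreign#co.mpg)
* se, star()   : standard errors and outreg2-style significance stars
esttab m1, label nobaselevels se star(* 0.10 ** 0.05 *** 0.01) ///
    stats(N r2, labels("Observations" "R-squared"))
```

Adding using "example.txt", replace after m1 should write the same table to a file instead of the Results window.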

If you decode your variables and use xi, this will do the trick. The solution requires converting your labeled variables to strings, but it is an easy option if you want to stick with outreg2.
sysuse auto, clear
set seed 1234
gen maxspeed = round(uniform()*3)+1
label define speed 1 "Light" 2 "Ridiculous" 3 "Ludicrous" 4 "Plaid"
label values maxspeed speed
decode maxspeed, gen(maxspeed_str)
decode foreign, gen(foreign_str)
xi: reg price mpg weight i.foreign_str*i.maxspeed_str
outreg2 using test, see text label
I used the example from your Statalist post, as it was your latest question.

Related

How to store confidence intervals from stata margins estimation?

Stata experts,
I have been trying to find a way to store marginal estimations, including the p value and confidence interval.
Below is the code I have. All I can get is the estimated marginal effect of variable I. It looks like I can't specify "ci" the way we can for usual regression models. Is there a way to also store and present the other numbers from the marginal estimations?
probit Y1 X
margins, dydx(X) post
est store m1
probit Y2 X
margins, dydx(X) post
est store m2
esttab m1 m2
esttab m1 m2, ci
Another related question is: how do I save marginal estimations for interaction terms? Example code below
probit Y2 year month year*month
margins year#month, asbalanced post
Thank you in advance!
Here's a way to grab p-values and confidence intervals after a margins command.
sysuse auto, clear
probit foreign price trunk
margins, dydx(price) post
eststo m1
The results from the margins command:
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
price | .0000268 .0000159 1.69 0.092 -4.36e-06 .000058
------------------------------------------------------------------------------
Then the p-value and confidence interval are recoverable from the stored matrices e(b) and e(V). To get the p-value, we need the z-score, which is the point estimate over the standard error: e(b)[1,1]/sqrt(e(V)[1,1]). The rest is calculating the area in the two tails using normal().
The confidence interval is the point estimate e(b)[1,1] plus or minus the standard error sqrt(e(V)[1,1]) times the critical value of z, invnormal(0.975).
Shown with the output so that you can see the numbers line up:
. di "P-value: " normal(-abs(e(b)[1,1]/sqrt(e(V)[1,1])))*2
P-value: .09186065
. di "Upper bound: " e(b)[1,1] + sqrt(e(V)[1,1])*invnormal(0.975)
Upper bound: .00005796
. di "Lower bound: " e(b)[1,1] - sqrt(e(V)[1,1])*invnormal(0.975)
Lower bound: -4.361e-06
To put the p-value in a table, for example, you could use estadd:
estadd scalar pvalue = normal(-abs(e(b)[1,1]/sqrt(e(V)[1,1])))*2
And then esttab:
esttab m1, stats(pvalue, label("P-value"))
. esttab m1, stats(pvalue, label("P-value"))
----------------------------
(1)
----------------------------
price 0.0000268
(1.69)
----------------------------
P-value 0.0919
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
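If your Stata version leaves an r(table) matrix behind after margins (treat this as an assumption and check return list on your installation), the same numbers can be read off directly instead of being recomputed by hand. A minimal sketch, where T is just an arbitrary matrix name:

```stata
sysuse auto, clear
probit foreign price trunk
margins, dydx(price)

* Assumption: margins returns r(table) with rows named b, se, z, pvalue, ll, ul
matrix T = r(table)
matrix list T

* Look rows up by name rather than by position
display "P-value:     " T[rownumb(T, "pvalue"), 1]
display "Lower bound: " T[rownumb(T, "ll"), 1]
display "Upper bound: " T[rownumb(T, "ul"), 1]
```

These values should agree with the hand calculation from e(b) and e(V) shown above.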

I want to create a table reporting results from two separate regressions using different variables

I have run two regressions on similar but different data sets with different regressors. I want to report the results next to each other for comparability, in a single table, using estout/esttab. The finished product should look something like this.
Table
-----------------------------------------
Dep. Var.: a
-----------------------------------------
Regression 1 |Regression 2
-----------------------------------------
x_1 coeff.|x_2 coeff.
y_1 coeff.|y_2 coeff.
z_1 coeff.|z_2 coeff.
l_1 coeff.|
m_1 coeff.|
-----------------------------------------
Obs value|Obs value
-----------------------------------------
Hypothesis |
x_1=1,p-value |x_2=1,p-value
-----------------------------------------
I am able to create individual tables like this just fine but I have honestly no idea where to start here and documentation hasn't been very helpful either. I hope someone here can point me in the right direction.
I think the eststo/esttab syntax will help. Here's code which I think produces what you're after:
ssc install estout, replace
sysuse auto, clear
eststo m1: regress price mpg
test mpg=1
estadd scalar pvalue = r(p)
eststo m2: regress price mpg trunk
test trunk=1
estadd scalar pvalue = r(p)
esttab m1 m2, b(3) se(3) stats(pvalue r2 N, fmt(3 3 0) label("P-value" "R-squared" "Observations")) mtitle("Regression 1" "Regression 2") nocons
It creates this table:
--------------------------------------------
(1) (2)
Regression 1 Regression 2
--------------------------------------------
mpg -238.894*** -220.165**
(53.077) (65.593)
trunk 43.559
(88.719)
--------------------------------------------
P-value 0.000 0.633
R-squared 0.220 0.222
Observations 74 74
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
The esttab guide is really great.

Why do customized stats() only show for the second regression in the Stata esttab table? [duplicate]

This question already has an answer here:
How can I add a fixed effect model label to an estout table?
Here is the code. I would like Year fixed effects to appear in both columns. However, it shows in the second column only. How can I make this happen? Thanks.
sysuse auto, clear
eststo: quietly regress price weight mpg
eststo: quietly regress price weight mpg foreign
estadd local year_fe Yes
esttab, cells(b(fmt(a3)) t(fmt(2) par)) stats(year_fe r2 N, fmt(%-#s 3 0) labels("Year FE"))
--------------------------------------
est1 est2
b/t b/t
--------------------------------------
weight 1.747 3.465
(2.72) (5.49)
mpg -49.51 21.85
(-0.57) (0.29)
foreign 3673.1
(5.37)
_cons 1946.1 -5853.7
(0.54) (-1.73)
--------------------------------------
Year FE Yes
r2 0.293 0.500
N 74 74
--------------------------------------
Just found out that I have to "estadd local year_fe Yes" after each regression.
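A minimal sketch of that fix, using the same example: estadd attaches the macro to whichever model is currently active, so it has to follow each regression in turn. The extra labels for r2 and N are an illustrative addition, and the fmt() suboption is left out here for simplicity.

```stata
sysuse auto, clear
eststo clear

eststo: quietly regress price weight mpg
estadd local year_fe "Yes"      // attaches to the first model (est1)

eststo: quietly regress price weight mpg foreign
estadd local year_fe "Yes"      // attaches to the second model (est2)

esttab, cells(b(fmt(a3)) t(fmt(2) par)) stats(year_fe r2 N, labels("Year FE" "R-squared" "Observations"))
```

With both estadd calls in place, the Year FE row should be filled in for both columns.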

esttab rename behavior with commas and colons

Say I have this data:
eststo clear
sysuse auto2, clear
I perform a regression:
reg mpg price turn
I would like to use esttab's rename functionality to rename the variables. However, when I do:
esttab, rename("price" "(1,2)" "turn" "(3,5)")
the commas disappear.
And when I do:
esttab, rename("price" "Var1: (1,2)" "turn" "Var2: (3,5)")
I get an error message.
I have tried creating a local and using that in rename, but I get an error there too.
local a "(1,2)"
display "`a'"
esttab, rename("price" "`a'")
But this just reproduces the comma problem.
How can I fix these two problems (especially the first one)?
esttab is a community-contributed command.
It looks like both , and : are used for internal parsing, so you are out of luck if you want to use the rename() option.
However, you can fix both problems by doing the following:
eststo clear
sysuse auto2, clear
label variable price "(1,2)"
label variable turn "(3,5)"
reg mpg price turn
esttab, label
------------------------------------
(1)
Mileage (m~)
------------------------------------
(1,2) -0.000534**
(-3.38)
(3,5) -0.835***
(-7.89)
Constant 57.69***
(14.32)
------------------------------------
Observations 74
------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
And:
label variable price "Var1: (1,2)"
label variable turn "Var2: (3,5)"
reg mpg price turn
esttab, label
------------------------------------
(1)
Mileage (m~)
------------------------------------
Var1: (1,2) -0.000534**
(-3.38)
Var2: (3,5) -0.835***
(-7.89)
Constant 57.69***
(14.32)
------------------------------------
Observations 74
------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
EDIT:
The help file of estout confirms:
rename(matchlist) changes the names of individual coefficients, where
matchlist is
oldname newname [oldname newname ...]
oldname can be a parameter name (e.g. price) or a full name including
an equation specification (e.g. mean:price)...
The comma is probably used as a delimiter.

Percentile for COMPLEX SURVEY DATA while using subpop option

I am using a survey sample and am trying to analyze a subpopulation.
I am trying to get the mean, median, 10th percentile and 90th percentile of a continuous variable for my subpopulation of interest.
The Stata website http://www.stata.com/support/faqs/statistics/percentiles-for-survey-data/ shows the method to obtain the median and other percentiles.
However, I am interested in a subpopulation and not the entire sample.
Can you please show me the appropriate commands to obtain any percentile while using a complex survey sample with the subpopulation option?
You can use _pctile to get percentiles for a subpopulation without svyset, because the percentiles depend only on the weights. However, to get standard errors and confidence intervals, you should download epctile by Stas Kolenikov (findit epctile in Stata) and svyset the data.
net describe epctile, from(http://web.missouri.edu/~kolenikovs/stata)
net install epctile.pkg
The auto data will provide the example, with the variable weight being the probability weight.
sysuse auto, clear
_pctile price if foreign==0 [pw = weight], p(25 50 75)
return list
scalars:
r(r1) = 4195
r(r2) = 5104
r(r3) = 6486
Compare to svysetting the data and calling epctile:
gen strat = rep78
gen mkr = substr(make,1,2)
svyset mkr [pw = weight], strata(strat)
epctile price, percentiles(25 50 75) subpop(if foreign==0) svy
Results:
Percentile estimation
------------------------------------------------------------------------------
| Linearized
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
p25 | 4195 108.5 38.66 0.000 3982.344 4407.656
p50 | 5104 320.5 15.93 0.000 4475.832 5732.168
p75 | 6486 2093 3.10 0.002 2383.795 10588.2
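Since the question also asks for the mean and for the 10th and 90th percentiles, here is a hedged sketch that reuses the same illustrative survey design. The svy: mean line and the scalar names p10, p50 and p90 are additions for illustration, not part of the answer above; only the _pctile call with different percentiles is taken from it.

```stata
sysuse auto, clear
gen strat = rep78
gen mkr = substr(make, 1, 2)
svyset mkr [pw = weight], strata(strat)

* Subpopulation mean with design-based standard errors
svy, subpop(if foreign==0): mean price

* Point estimates of the 10th, 50th and 90th percentiles (no standard errors)
_pctile price if foreign==0 [pw = weight], p(10 50 90)
scalar p10 = r(r1)
scalar p50 = r(r2)
scalar p90 = r(r3)
display "p10 = " p10 "  median = " p50 "  p90 = " p90
```

For standard errors on those percentiles, the same percentiles() list can be passed to epctile as in the call shown above.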