Could you help me with a fairly simple question: how can I save a statistic extracted from a regression in a separate dataset (or file), and later add statistics from other regressions to it?
For example, the statistic from one regression can be extracted as e(F), and the one from another regression is also stored as e(F).
Roger Newson's parmest is great for dealing with "resultsets," which are Stata datasets created from the output of a Stata command. The help file has a nice example of combining three regressions into a single file, which I modified here to include R^2 [stored in e(r2)]:
sysuse auto, clear
tempfile tf1 tf2 tf3
parmby "reg price weight", lab saving(`"`tf1'"', replace) idnum(1) idstr(M1) escal(r2)
parmby "reg price foreign", lab saving(`"`tf2'"', replace) idnum(2) idstr(M2) escal(r2)
parmby "reg price weight foreign", lab saving(`"`tf3'"', replace) idnum(3) idstr(M3) escal(r2)
drop _all
append using `"`tf1'"' `"`tf2'"' `"`tf3'"'
list idnum idstr es_1, noobs nodis
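If installing a user-written package is not an option, a similar results file can be built with Stata's built-in postfile commands. This is only a minimal sketch, not part of the parmest approach above; the file name fstats and the variable names model and F are placeholders chosen for illustration:
sysuse auto, clear
tempname handle
* open a results file with one string and one numeric column (names are hypothetical)
postfile `handle' str12 model double F using fstats, replace
regress price weight
post `handle' ("M1") (e(F))
regress price foreign
post `handle' ("M2") (e(F))
postclose `handle'
* load the collected statistics
use fstats, clear
list, noobs
Each post call adds one row, so statistics from further regressions can be collected as long as the file has not been closed with postclose.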
I want to create a forest plot with the metan ado in Stata. This works fine so far - but I can't solve two problems:
How do I make the value range on the x-axis start after the headings rather than underneath them?
How do I label the column on the far right with "p-value"?
Can someone assist with this?
The code I used is:
metan or_num lci_num uci_num, label (namevar=Variable) by(sort_variable) xlabel(0,1,2,3,4,5,6) xsize(10) ysize(8) null(1) effect(Odds ratio) labtitle(Subgroup) nohet nobetween nooverall nosubgroup nosecsub lcols() rcols(p_num) astext(40) boxscale(0)
I'm trying to print my CI with esttab in this order:
My code with the cars.csv dataset from https://gist.github.com/noamross/e5d3e859aa0c794be10b#file-cars-csv:
clear all
import excel "C:\Users\luism\Desktop\archive/cars_usa.xlsx", sheet("hoja1") firstrow
destring mileage, force replace
reg price mileage
estimates store model_1
esttab model_1 using "C:\Users\luism\Desktop\archive/results.csv", replace beta ci
This is what I get after splitting the results.csv file on commas.
I would like the upper bound printed below the lower bound, not next to it, once results.csv is split on commas. Thanks
estadd allows you to add anything as a scalar and include it as follows:
sysuse auto , clear
reg price mpg
matrix results = r(table)
estadd scalar upperCI = results[rownumb(results,"ul"),colnumb(results,"mpg")]
estadd scalar lowerCI = results[rownumb(results,"ll"),colnumb(results,"mpg")]
esttab, stats(lowerCI upperCI)
This is exactly what Dimitry was suggesting in his comment. Wouter's comment is true to the extent that this isn't, strictly speaking, an "option" of esttab or estout. However, estadd provides a large amount of flexibility to esttab and it's worth knowing about.
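As a possible alternative to adding scalars by hand, the cells() option that esttab passes through to estout can stack the interval bounds under the coefficient. This is a sketch on the auto data, not the asker's file, and model_1 is just the stored-estimates name reused from the question:
sysuse auto, clear
regress price mpg
estimates store model_1
* unquoted elements in cells() are placed in separate rows, one below the other
esttab model_1, cells(b ci_l ci_u)
Putting the elements inside one pair of double quotes, as in cells("b ci_l ci_u"), should place them side by side instead.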
I have the following model:
Y_{it} = alpha_i + B1*weight_{it} + B2*Dummy_Foreign_{i} + B3*(weight*Dummy_Foreign)_{it} + e_{it}
I am interested in the effect of weight on Y for foreign cars, and I want to graph the evolution of the relevant coefficient across quantiles, with its standard errors. That is, I need to see the evolution of the sum of coefficients (B1+B3). I know this is a non-linear effect, and would require some sort of delta method to obtain the variance-covariance matrix needed for the standard error of (B1+B3).
Before I delve into writing a program that attempts to do this, I thought I would try and ask if there is a way of doing it with grqreg. If this is not possible with grqreg, would someone please guide me into how they would start writing a code that computes the proper standard errors, and graphs the quantile coefficient.
For a cross section example of what I am trying to do, please see code below.
I use grqreg to generate the evolution of the separate coefficients, but I need the joint one: a single graph of the evolution of (B1+B3) with its standard errors.
Thanks.
(I am using Stata 14.1 on Windows 10):
clear
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
gen weight_foreign= weight*foreign
label var weight_foreign "Interaction weight and foreign car"
qreg gptm weight foreign weight_foreign , q(.5)
grqreg weight weight_foreign , ci ols olsci reps(40)
*** Question 1: How to construct the plot of the coefficient of interest?
Your second question is off-topic here since it is statistical. Try the CV SE site or Statalist.
Here's how you might do (1) in a cross section, using margins and marginsplot:
clear
set more off
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
sqreg gptm c.weight##i.foreign, q(10 25 50 75 95) reps(500) coefl
margins, dydx(weight) predict(outcome(q10)) predict(outcome(q25)) predict(outcome(q50)) predict(outcome(q75)) predict(outcome(q95)) at(foreign=(0 1))
marginsplot, xdimension(_predict) xtitle("Quantile") ///
legend(label(1 "Domestic") label(2 "Foreign")) ///
xlabel(none) xlabel(1 "Q10" 2 "Q25" 3 "Q50" 4 "Q75" 5 "Q95", add) ///
title("Marginal Effect of Weight By Origin") ///
ytitle("GPTM")
This produces a graph like this:
I didn't recast the CI here since it would look cluttered, but that would make it look more like your graph. Just add recastci(rarea) to the options.
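As a rough cross-check on the foreign line of the plot at the median, the sum of the weight coefficient and the interaction can also be computed directly with lincom after qreg. This is only a sketch: the point estimate should agree with the Q50 value from margins, but qreg's default standard errors will generally differ from the bootstrapped ones that sqreg reports.
sysuse auto, clear
gen gptm = 1000/mpg
qreg gptm c.weight##i.foreign, quantile(.5)
* effect of weight for foreign cars: main effect plus interaction
lincom weight + 1.foreign#c.weight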
Unfortunately, none of the panel quantile regression commands play nice with factor variables and margins. But we can hack something together. First, you can calculate the sums of coefficients with nlcom (instead of the more natural lincom, which lacks the post option), store them, and use Ben Jann's coefplot to graph them. Here's a toy example to give you the main idea, where we will look at the effect of tenure for union members:
set more off
estimates clear
webuse nlswork, clear
gen tXu = tenure*union
local quantiles 1 5 10 25 50 75 90 95 99 // K quantiles that you care about
local models "" // names of K quantile models for coefplot to graph
local xlabel "" // for x-axis labels
local j=1 // counter for quantiles
foreach q of numlist `quantiles' {
qregpd ln_wage tenure union tXu, id(idcode) fix(year) quantile(`q')
nlcom (me_tu:_b[tenure]+_b[tXu]), post
estimates store me_tu`q'
local models `"`models' me_tu`q' || "'
local xlabel `"`xlabel' `j++' "Q{sub:`q'}""'
}
di "`models'"
di `"`xlabel'"'
coefplot `models' ///
, vertical bycoefs rescale(100) ///
xlab(none) xlabel(`xlabel', add) ///
title("Marginal Effect of Tenure for Union Members On Each Conditional Quantile Q{sub:{&tau}}", size(medsmall)) ///
ytitle("Wage Change in Percent" "") yline(0) ciopts(recast(rcap))
This makes a dromedary curve, which suggests that the effect of tenure is larger in the middle of the wage distribution than at the tails:
Suppose I want to create the LaTeX output for the example in ivprobit. That is:
use http://www.stata-press.com/data/r11/laborsup.dta
ivprobit fem_work fem_educ kids (other_inc = male_educ), first
margins, dydx(_all) pred(pr)
I want to have one column with the first stage, and one with the marginal effects.
What is the best way of doing this?
There might be a cleverer way of doing this that does not involve re-estimating the first stage, but this approach works:
webuse laborsup
ivprobit fem_work fem_educ kids (other_inc = male_educ), first
margins, dydx(_all) predict(pr) post
estimates store ivp
reg other_inc male_educ fem_educ kids
estimates store first_stage
estout first_stage ivp, style(tex)
I need to calculate the Gini coefficient from disposable personal income data at LIS. According to a LIS training document, the Stata code to do this is:
di "** INCOME DISTRIBUTION II – Exercise 13 **"
program define bottop
qui sum ey [w=hweight*d4]
replace ey = .01*r(mean) if ey<.01*r(mean)
qui sum dpi [w=hweight*d4], de
replace ey = (10*r(p50)/(d4^.5)) if dpi>10*r(p50)
end
foreach file in $us00h $fi00h {
display "`file'"
use hweight d4 dpi if (!mi(dpi) & !(dpi==0)) using "`file'", clear
gen ey=dpi/(d4^0.5)
bottop
ineqdeco ey [w=hweight*d4]
}
I have simply copied and pasted this code from the training document. The snippets
qui sum ey [w=hweight*d4]
replace ey=0.01*r(mean) if ey<0.01*r(mean)
and
qui sum dpi [w=hweight*d4], de
replace ey=(10*r(p50)/(d4^0.5)) if dpi>10*r(p50)
are bottom and top coding, respectively.
When I tried to run this code, the variable hweight was not found. Does anyone know what the new name of hweight is at LIS? Or can anyone suggest how I might otherwise overcome this impasse?
I'm familiar with Stata, but the sophistication of this code is beyond my ken.
Much appreciated.
Based on the variable definition list at the LIS Documentation page, it looks like the variable is now called HWGT.
This is more of a second-best solution. However, the census of population provides income by brackets. If you are willing to use that, you can get the counts for every bracket, with a top-coded bracket for the last one. Use the median income value within each bracket. Then you can directly apply the formula for the Gini coefficient. It is second best because it only approximates the individual-level data.
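The bracket approach can be sketched as follows. The incomes and counts below are made-up illustration values, not census figures, and the variable names inc, n, p, and L are hypothetical. The Gini is approximated as one minus twice the area under a piecewise-linear Lorenz curve, which ignores inequality within each bracket and therefore tends to understate the true value:
clear
* inc = representative income of each bracket, n = households in it (made-up data)
input inc n
5000 20
15000 30
30000 30
60000 15
150000 5
end
sort inc
gen double tot = inc*n
quietly summarize n
gen double p = sum(n)/r(sum)
quietly summarize tot
gen double L = sum(tot)/r(sum)
* trapezoid rule: width in population share times sum of adjacent Lorenz ordinates
gen double area = (p - cond(_n==1, 0, p[_n-1])) * (L + cond(_n==1, 0, L[_n-1]))
quietly summarize area
display "Approximate Gini = " 1 - r(sum)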
Why don't you try the fastgini command:
http://www.stata.com/statalist/archive/2007-02/msg00524.html
ssc install fastgini
fastgini income
return list
This should give you the Gini for the variable income.
This package also allows for weights. Type
help fastgini
for more information