Plotting same coefficient over time - stata

I am using the coefplot package in Stata to plot how a coefficient changes depending on the model employed. In particular, I want to see how the coefficient of interest changes over time. I am plotting it vertically, so the x-axis could show the year to which each coefficient refers. However, I am not able to label the x-axis accordingly: instead of showing the name of the variable I am interested in (x1), it should show 1, 2 and 3. I would also like to omit the legend by using the option legend(off), but that does not work.
This is the code I am using:
reg y x1 x2 if year==1;
estimates store t1;
reg y x1 x2 if year==2;
estimates store t2;
reg y x1 x2 if year==3;
estimates store t3;
coefplot t1 t2 t3, drop(x2) vertical yline(0);
Any suggestion would be greatly appreciated.

A nonsensical example that uses two categories (and not time) can be easily adapted:
clear
set more off
sysuse auto
reg price weight rep78 if foreign
estimates store foreign
reg price weight rep78 if !foreign
estimates store not_foreign
// "||" starts a new subgraph; bycoefs then puts the models on the categorical axis
coefplot foreign || not_foreign, drop(rep78 _cons) vertical bycoefs
You can build the syntax within a loop, using a local, and then feed it to coefplot. To be precise, I mean something like this example syntax:
year1 || year2 || ... || yearn
The final command would look something like:
coefplot `allyears', drop(<some_stuff>) vertical bycoefs
A complete example that does involve time:
clear
set more off
use http://www.stata-press.com/data/r12/nlswork.dta
forvalues i = 70/73 {
regress ln_w grade age if year == `i'
estimates store year`i'
local allyears `allyears' year`i' ||
local labels `labels' `i'
}
// check
display "`allyears'"
display `"`labels'"'
coefplot `allyears', keep(grade) vertical bycoefs bylabels(`labels')
If coefplot doesn't turn out to be flexible enough, you can always try with statsby and graph commands (help graph).
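As a rough sketch of the statsby route, the following collects the coefficient on grade and its standard error for each year and then draws the graph by hand. The variable names b, se, df, lb and ub are chosen here for illustration only, and the intervals are ordinary 95% t-based intervals computed from the stored results:

```stata
use http://www.stata-press.com/data/r12/nlswork.dta, clear
keep if inrange(year, 70, 73)

// one regression per year; keep the coefficient, its standard error,
// and the residual degrees of freedom (names b, se, df are illustrative)
statsby b=_b[grade] se=_se[grade] df=e(df_r), by(year) clear: ///
    regress ln_w grade age

// 95% confidence limits from the t distribution
generate lb = b - invttail(df, 0.025) * se
generate ub = b + invttail(df, 0.025) * se

// capped spikes for the intervals, markers for the point estimates
twoway (rcap lb ub year) (scatter b year), ///
    yline(0) legend(off) xlabel(70(1)73) ///
    xtitle("Year") ytitle("Coefficient on grade")
```

Because this is a plain twoway graph, the x-axis labels and legend(off) behave as they do for any other twoway command.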

Related

How do I plot coefficients for multiple models in one graph using coefplot?

I have models that have different dependent variables (DVs), but using the same instrumental variable (IV), with two different types of identification strategies. In short I have:
Group1
model 1
model 2
model 3
model 4
Group2
model 1-2
model 2-2
model 3-2
model 4-2
I want to plot the coefficients where I can compare coefficient estimate of the IV of model 1 and model 1-2; model 2 and model 2-2 ... at once. I would like the eight coefficients to be plotted in one graph, since the IV of interest is the same for all models.
Is there a stylized way to do this?
For now, I only know how to plot the coefficients of model 1 and model 1-2 in one graph, but not all the other ones. I want the coefficients to be arranged along the x-axis according to model number (1, 2, 3, 4), so that I can compare the coefficients of model 1 and model 1-2, model 2 and model 2-2, and so on, for each model number.
The community-contributed command coefplot is not meant to be used like that.
Nevertheless, below is a toy example of how you could get what you want:
sysuse auto, clear
estimates clear
// code for identification strategy 1
regress price mpg trunk length turn if foreign == 0
estimates store A
regress price trunk turn if foreign == 0
estimates store B
regress price weight trunk turn if foreign == 0
estimates store C
regress price trunk if foreign == 0
estimates store D
// code for identification strategy 2
regress price mpg trunk length turn if foreign == 1
estimates store E
regress price trunk turn if foreign == 1
estimates store F
regress price weight trunk turn if foreign == 1
estimates store G
regress price trunk if foreign == 1
estimates store H
Then you draw the plot as follows:
local gap1 : display _dup(30) " "
local gap2 : display _dup(400) " "
local label1 |`gap1'|`gap1'|`gap1'|`gap2'
local gap3 : display _dup(25) " "
local gap4 : display _dup(400) " "
local label2 DV1`gap3'DV2`gap3'DV3`gap3'DV4`gap4'
coefplot (A, offset(-0.38)) (E, offset(-0.33)) ///
(B, offset(-0.15)) (F, offset(-0.1)) ///
(C, offset(0.09)) (G, offset(0.135)) ///
(D, offset(0.3)) (H, offset(0.35)), ///
vertical keep(trunk) coeflabels(trunk = `""`label1'""`label2'""') ///
xlabel(, notick labgap(0)) legend(off)
The code here uses Stata's toy auto dataset to run a number of simple regressions for each foreign category. The same dependent variable price is used for illustration but you can use different variables in its place. The variable trunk is assumed here to be the variable of interest.
What the above toy code snippet basically does is save each set of regression estimates and then simulate pairs of models. The DV labels are drawn based on a hack that I demonstrated recently in the following two questions:
coefplot: Putting names of regressions with vertical option
coefplot: Putting names of regressions on y-axis
The result is a very good approximation but will require experimentation on your part.

coefplot: several models with several coefficients each in one graph

I would like to display the coefficients (with their confidence intervals) of two regressions beneath one another.
Using Ben Jann's nice coefplot (ssc install coefplot), I can create a graph with one subgraph only where all coefficients from all models are included, but I do not succeed in ordering the coefficients by model rather than by coefficient.
Alternatively, I can create a graph with several subgraphs by coefficient, which is not what I need: there should be one subgraph only, and a common scale for the coefficients.
Here comes a minimum example illustrating my needs and what I just described:
sysuse auto.dta, clear
reg price mpg rep78
eststo model1
reg price mpg rep78 weight
eststo model2
*what do I have: 2 models with 2 coefficients each (plus constant)
*what do I want: 1 graph with 2 models beneath one another,
*2 coefficients per model, 1 colour and legend entry per coefficient (not model!)
*common scale
*what is easy to get:
coefplot model1 model2, ///1 graph with all coefficients and models,
keep(mpg rep78) //but order is by coefficient, not by model
//how to add model names as ylabels?
*or 1 graph with 2 subgraphs by coefficient:
coefplot model1 || model2, ///
keep(mpg rep78) bycoefs
Can anyone help me in getting the graph I want optimally using coefplot?
As you can read from the notes in the example, the perfect solution would include one colour and legend entry per coefficient (not model) and the ylabels using the model names, but this is secondary.
I already tried a couple of the coefplot options, but it seems to me that most of them are for several equations from one model rather than for coefficients from different models.
I am not sure how to deal with the model names, but for the first part of your question, it seems to me that you could just do something like:
sysuse auto.dta, clear
reg price mpg rep78
eststo m1
reg price mpg rep78 weight
eststo m2
coefplot (m1) || (m2), ///
drop(_cons) byopts(row(2)) keep(mpg rep78)
Or do I misunderstand what you want?
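If a single panel with a common scale is required, one possibility is to append the models within one plot and give each its own heading. The sketch below relies on the "\" append syntax and the asequation() option as I recall them from the coefplot documentation, so please check help coefplot before relying on it; the headings "Model 1" and "Model 2" are arbitrary. It orders the coefficients by model, but it does not by itself give one colour per coefficient:

```stata
sysuse auto, clear
quietly regress price mpg rep78
estimates store model1
quietly regress price mpg rep78 weight
estimates store model2

// "\" appends model2 to the same plot as model1;
// asequation() puts each model's coefficients under its own heading
coefplot (model1, asequation(Model 1) \ model2, asequation(Model 2)), ///
    keep(mpg rep78) xline(0)
```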

Combining marker symbol and error bar in Stata twoway legend

I often make graphs that plot a mean or coefficient with a 95% error bar using -twoway scatter- and -twoway rcap-. The code below produces a legend with two entries: one for the mean marker symbol and one for the error bar. But I want the legend to display a single entry, showing the marker symbol and the error bar combined. Below is an example of how I usually make a graph.
sysuse auto
gen b = .
gen se = .
mean mpg if foreign == 1
replace b = _b[mpg] in 1
replace se = _se[mpg] in 1
mean mpg if foreign == 0
replace b = _b[mpg] in 2
replace se = _se[mpg] in 2
gen lb = b - (1.96 * se)
gen ub = b + (1.96 * se)
gen index = _n in 1/2
twoway scatter b index || rcap lb ub index, legend(order(1 "Mean" 2 "95% Interval"))
Is there an option in -legend- to allow me to overlay two legend entries in the way I want?
I don't really know how to do exactly what you want. It seems hard.
I also hate to waste the legend real estate, so one alternative is to label the means instead of using a legend (and add "with 95% CIs" to the title):
sysuse auto
reg mpg i.foreign
margins foreign, post
estimates store means
marginsplot, recast(scatter) xscale(reverse)
coefplot means
Another is to just use ciplot without any regression/summarization:
ciplot mpg, by(foreign) xscale(reverse)
coefplot and ciplot are both user-written.
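With built-in commands only, the same idea (no legend, with the interval explained in the text of the graph) might look like the sketch below. It uses a note rather than the title, collapse replaces the manual bookkeeping from the question, and the variable names b, se, lb and ub are illustrative:

```stata
sysuse auto, clear

// one row per group: mean of mpg and the standard error of that mean
collapse (mean) b = mpg (semean) se = mpg, by(foreign)
generate lb = b - 1.96 * se
generate ub = b + 1.96 * se

// no legend; the axis labels identify the groups and the note explains the bars
twoway (rcap lb ub foreign) (scatter b foreign), ///
    legend(off) xlabel(0 "Domestic" 1 "Foreign") ///
    xscale(range(-0.5 1.5)) xtitle("") ytitle("Mean mpg") ///
    note("Markers are means; bars are 95% intervals (1.96 x SE)")
```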

Computing and plotting difference in group means

In what follows I plot the mean of an outcome of interest (price) by a grouping variable (foreign) for each possible value taken by the fake variable time:
sysuse auto, clear
gen time = rep78 - 3
bysort foreign time: egen avg_p = mean(price)
scatter avg_p time if (foreign==0 & time>=0) || ///
scatter avg_p time if (foreign==1 & time>=0), ///
legend(order(1 "Domestic" 2 "Foreign")) ///
ytitle("Average price") xlab(#3)
What I would like to do is to plot the difference in the two group means over time, not the two separate means.
I am surely missing something, but to me it looks complicated because the information about the averages is stored "vertically" (in avg_p).
Arguably the easiest way to do this is to use linear regression to estimate the differences:
/* Regression Way */
drop if time < 0 | missing(time)
reg price i.foreign##i.time
margins, dydx(foreign) at(time =(0(1)2))
marginsplot, noci title("Foreign vs Domestic Difference in Price")
If regression is hard to wrap your mind around, the other way involves mangling the data with a reshape:
/* Transform the Data */
keep price time foreign
collapse (mean) price, by(time foreign)
reshape wide price, i(time) j(foreign)
gen diff = price1-price0
tw connected diff time
Here is another approach. graph dot will happily plot means.
sysuse auto, clear
set scheme s1color
collapse price if inrange(rep78, 3, 5), by(foreign rep78)
reshape wide price, i(rep78) j(foreign)
rename price0 Domestic
label var Domestic
rename price1 Foreign
label var Foreign
graph dot (asis) Domestic Foreign, over(rep78) vertical ///
marker(1, ms(Oh)) marker(2, ms(+))

Wrap local macro in double quotes that will be visible to esttab's cells() option

I am trying to lazily create a table of means and standard errors for a longish list of variables. It seems that the estout package from SSC and tabstat are the best tools, but I can't get the local macros to work properly to specify esttab's cells() option.
sysuse auto, clear
* build macro for `cells()` option
local i = 1
foreach v of varlist price weight displacement {
local cells "`cells'" " `v'"
if (`i' == 1) local cells "`cells'(fmt(%9.3gc))"
local ++i
}
* properly built
display "`cells'"
* but does not work with `esttab`
estpost tabstat price weight displacement, statistics(mean semean)
esttab ., cells("`cells'")
This yields an "empty" table.
. esttab ., cells("`cells'")
-------------------------
                      (1)
                        b
-------------------------
-------------------------
N                      74
-------------------------
It seems that cells() needs to see double quotes, but my attempts to add them with single and double quotes at any point in the process have failed. Is there a way to make this approach work? I would like to avoid manually generating the cells() argument.
* The following approach does work.
esttab ., cells("price(fmt(%9.3gc)) weight displacement")
This yields the correct table.
. esttab ., cells("price(fmt(%9.3gc)) weight displacement")
---------------------------------------------------
                      (1)
                    price       weight displacement
---------------------------------------------------
mean                6,165        3,019          197
semean                343         90.3         10.7
---------------------------------------------------
N                      74
---------------------------------------------------
@Nick has already given a solution to the problem. He claims only stylistic changes were made, but I suspect more.
The double quotes originally used by the poster introduce an additional word in the definition of local cells. That is clear when we count the words contained in the local macro using the extended macro function word count. @Richard works with three variables, but we count four words. Careful inspection shows that the additional, surprise word is "", introduced in the loop its first time around.
In this case, using display to check the contents of the local is misleading, because the command will simply do away with "". As a result, we see "three" words on screen. Displaying each word one by one shows more clearly that the first one is a blank.
What this means is you are coding something like:
esttab ., cells(""" price weight displacement")
when you really mean
esttab ., cells("price weight displacement")
Below I post some code consistent with this hypothesis. To simplify exposition, I have stripped away unnecessary complications from the original code.
sysuse auto, clear
// build macro
local i = 1
foreach v of varlist price weight displacement {
local cells "`cells'" " `v'"
*local cells `cells' `v'
}
// check contents of macro cells
local wc : word count "`cells'"
display `wc'
forvalues i = 1/4 {
local w`i' : word `i' of "`cells'"
display "`w`i''"
}
// display a test
local test "" " price" " weight" " displacement"
local wct : word count "`test'"
display `wct' // four words also
// more displays
display "`cells'"
display """ price weight displacement" // same display result
// tables
// post
quietly estpost tabstat price weight displacement, statistics(mean semean)
// original with error
esttab ., cells("`cells'")
// original with error after dereferencing the local macro cells
esttab ., cells(""" price weight displacement")
Nick's solution, which doesn't use double quotes, solves the problem.
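Since display can hide stray quotes, a safer check is macro list, which prints a macro's stored contents verbatim. The sketch below builds one macro with the quoting pattern from the question and one without the inner quotes; the macro names demo and clean are hypothetical:

```stata
// quoting pattern from the question (macro name demo is illustrative)
local demo
foreach v in price weight displacement {
    local demo "`demo'" " `v'"
}
macro list _demo      // shows the stored string, embedded quotes included

// same loop without the inner double quotes
local clean
foreach v in price weight displacement {
    local clean `clean' `v'
}
macro list _clean
local n : word count `clean'
display `n'
```

Comparing the two macro list results shows directly whether anything other than the variable names ended up in the string passed to cells().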
One clue might be that the assert is not passing:
sysuse auto, clear
* build macro for `cells()` option
local i = 1
foreach v of varlist price weight displacement {
local cells "`cells'" " `v'"
if (`i' == 1) local cells "`cells'(fmt(%9.3gc))"
local ++i
}
* properly built
display "`cells'"
* but does not work with `esttab`
estpost tabstat price weight displacement, statistics(mean semean)
display "`cells'"
esttab ., cells("`cells'")
local cells2 price(fmt(%9.3gc)) weight displacement
esttab ., cells("`cells2'")
assert "`cells'" == "`cells2'"
esttab ., cells(price(fmt(%9.3gc)) weight displacement)
This works for me with Stata 13.1 and updated estout from SSC. My changes from your code were intended just as stylistic, but see an answer from @Roberto Ferrer.
I did get an error with an older version lurking on my machine, so updating appears to be at least part of the solution.
sysuse auto, clear
* build macro for `cells()` option
local i = 1
foreach v of varlist price weight displacement {
local cells `cells' `v'
if (`i' == 1) local cells `cells'(fmt(%9.3gc))
local ++i
}
* properly built
display "`cells'"
estpost tabstat price weight displacement, statistics(mean semean)
esttab ., cells("`cells'")