Controlling layout of multiple histograms - stata

I have used the by option of the histogram command to group my two categorical variables:
sysuse auto, clear
hist price, percent by(foreign rep78)
I would like to graph 8 histograms together on the same plot, with the following layout:
XXXX
XXXX
However, by default Stata arranges the histograms as follows:
XXX
XXX
XX
How can I achieve a 2x4 orientation of histograms?
I have played around with the aspectratio() option, but this changes the aspect ratio of the individual graphs, not of the whole plot.

You need to specify the cols() sub-option in by():
sysuse auto, clear
histogram price, percent by(foreign rep78, cols(4))
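If it is more natural to think in rows, the rows() suboption of by() should give the same arrangement. A minimal sketch, assuming the eight non-empty foreign-by-rep78 groups in the auto data:

```stata
sysuse auto, clear
* 8 non-empty groups laid out in 2 rows gives a 2x4 grid
histogram price, percent by(foreign rep78, rows(2))
```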

Related

Title of second y-axis in stata

I am trying to create a graph with two y axes (below code), but the title of the right-side y-axis does not appear. Can anyone please help me with that?
sysuse auto, clear
generate kpl=mpg*0.425144
twoway (scatter mpg weight, color(navy) yaxis(1)) (scatter kpl weight, color(navy) yaxis(2) ylabel(4.25 8.5 12.75 17, axis(2)) ytitle(Kilometres per Litre, axis(2))), by(foreign, legend(off) note(Graphs by Car origin))
I think I understand what you want but I would approach it quite differently.
If you want a second scale of km per litre to compare with miles per gallon, that is just the same data points expressed in different units, just as you could show Celsius and Fahrenheit temperatures on different axes, or calculate proportions and show percents, or vice versa.
Creating another variable holding km per litre makes this more difficult, not easier, because its values differ from mpg only by the conversion factor.
Here I use mylabels from SSC, which must be installed before you can use it.
Naturally you don't need to show zero, but the identity 0 miles per gallon = 0 km per litre may make the point easier to follow.
sysuse auto, clear
set scheme s1color
mylabels 0(4)16, myscale(@/0.425144) local(yla)
scatter mpg weight, yaxis(1 2) yla(`yla', axis(2) ang(h)) yla(0(10)40, axis(1) ang(h)) ytitle(km per litre, axis(2)) ms(Oh)

Creating box plots of the gap between two groups by deciles

I work with Stata and I have math grades for two different groups: A and B.
I want to see the gap that exists between both groups in each decile. In addition I want to do a box plot of this gap for each decile (I want to have 10 box plots, one for each decile which shows the gap between group grades).
What I first did was to compute the deciles using xtile for both groups:
xtile decileA= mat if group==1, nq(10)
xtile decileB= mat if group==0, nq(10)
However, groups A and B do not have the same number of observations nor the same distribution. I thought of computing quantiles for each decile and group and subtracting them to get the difference in each decile at each quartile to create the boxplot. But I do not know how to proceed afterwards to create the graph, and given that I have a different number of observations in each group decile I do not know if it is correct to proceed this way.
If I try to use the pctile command and compute the difference at each decile, I lose all the variance in the data inside each decile. I only get median differences and not all the quantiles I want.
Example:
pctile decileA= mat if group==1, nq(10)
pctile decileB= mat if group==0, nq(10)
gen qdiff= decileA- decileB if _n<10
gen qtau=_n/10 if _n<10
graph box qdiff, over(qtau)
I want to know if there is a way to do the graph I am intending to?
Cross-posted on Statalist.
There is certainly a way to accomplish what you want with a bit of effort, but if the goal is to make a comparison between the two groups at each decile with some notion of variability, you can easily get that from a simultaneous quantile regression and the SEs that it produces:
sysuse auto, clear
sqreg price i.foreign, quantile(.1 .2 .3 .4 .5 .6 .7 .8 .9)
margins, dydx(foreign) ///
predict(outcome(q10)) ///
predict(outcome(q20)) ///
predict(outcome(q30)) ///
predict(outcome(q40)) ///
predict(outcome(q50)) ///
predict(outcome(q60)) ///
predict(outcome(q70)) ///
predict(outcome(q80)) ///
predict(outcome(q90)) ///
post
marginsplot, yline(0) xlab(, grid) ylab(#10, grid angle(90))
This yields a graph showing that foreign origin is associated with a higher price at higher deciles, with the exception of the top decile, though probably none of the differences is significant here given how much the CIs overlap.
You can even conduct formal hypothesis tests that the effects are equal like this:
. test _b[1.foreign:9._predict] = _b[1.foreign:8._predict]
( 1) - [1.foreign]8._predict + [1.foreign]9._predict = 0
chi2( 1) = 3.72
Prob > chi2 = 0.0537
With 74 cars, we cannot reject at the 5% level that the effects on the 80th and 90th percentiles are the same, even though the point estimates have opposite signs and similar magnitudes.

How do I format the numerical labels in a legend?

In Stata, the following code produces a contour plot:
sysuse auto, clear
twoway (contour headroom mpg price), legend(on)
The plot looks as follows:
How do I change the legend to automatically format its labels as 1.0, 2.0 etc.?
I would like to do this without having to manually create labels and pass them to the legend option.
In my actual application, I'm trying to get this to work for a map that comes from the spmap command, but it's difficult to provide a minimal working example with that command.
It looks like you want to re-format the numbering in the clegend of the contour plot (as opposed to the 'traditional' legend).
In this case, you just need to specify the zlabel option and the required format:
sysuse auto, clear
twoway contour headroom mpg price, zlabel(#5, format(%2.1f))
This will produce the desired output:

Stata: Storing only part of a FE regression output for graphing

I am running a regression with two sets of fixed effects (country and year; the data are macroeconomic). Since I am using xtreg, the country effects are absorbed automatically, but the year effects enter as an indicator variable:
xtreg fiveyearyg taxratio i.year if taxratiocut == 1, i(wbcode1) fe cluster(wbcode1)
estimates store yi
I am running a number of these and I want to graph the coefficients for taxratio from each. But when I store the data, it stores both the taxratio coefficient, and the 50+ coefficients for the year fixed effects.
After a lot of searching, I cannot find any way to store (or recall) just part of the regression output, the one coefficient (with SEs) that I care about. Does anyone know a way to do that?
Here is how you can do that:
webuse grunfeld,clear
qui xtreg mvalue invest i.year,fe cluster(company)
//e(b) stores coefficient matrix and e(V) stores variance-covariance matrix. For details type: ereturn list after running the model
//let's say you want to extract only the coefficient on invest
mat coef_matrix=e(b)
scalar coef_invest=coef_matrix[1,1]
dis coef_invest
1.7178414
//to extract the se of the coefficient on invest
mat var_matrix=e(V)
mat diag_var_matrix=vecdiag(var_matrix) //diagonal elements are variances and the standard errors are square roots of these variances
matmap diag_var_matrix se_matrix, m(sqrt(@)) //matmap must be installed first (ssc install matmap); you will get an error if a variance is negative
scalar se_invest=se_matrix[1,1]
dis se_invest
.14082153
Accessing coefficients is as easy as calling _b[varname]; analogously the corresponding standard errors: _se[varname].
An example:
webuse grunfeld, clear
qui xtreg mvalue invest i.year,fe cluster(company)
// coef for invest
display _b[invest]
// std error for invest
display _se[invest]
// displayed results in matrix
matrix list r(table)
For multiple-equation models use [eqno]_b[varname] where the preceding bracket contains an equation number.
More detail can be found in [U] 13.5 Accessing coefficients and standard errors.
Starting with Stata 12, estimation commands also store results in r() [and not just e()]. Notice I listed r(table), which contains most results displayed by the estimation command xtreg.
You show interest in plotting coefficients, so you should read about the user-written command coefplot. Run ssc install coefplot to download it and help coefplot to get started. It has many options.
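As a rough sketch of pulling individual cells out of r(table): the matrix has to be copied before another command overwrites r(), and its rows are named b, se, ll, ul, and so on. The matrix name T and the scalar names are illustrative only; column 1 is taken to be invest because it is the first regressor in this model.

```stata
webuse grunfeld, clear
quietly xtreg mvalue invest i.year, fe cluster(company)
* copy r(table) right away; later commands may replace r()
matrix T = r(table)
* rownumb() looks up a row by name; column 1 holds invest
scalar b_invest  = T[rownumb(T, "b"), 1]
scalar ll_invest = T[rownumb(T, "ll"), 1]
scalar ul_invest = T[rownumb(T, "ul"), 1]
display b_invest "  [" ll_invest ", " ul_invest "]"
```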
Edit
A complete example that plots only coefficients for invest (leaving out those for year), using coefplot, and based on conditional regressions is:
clear
set more off
webuse grunfeld
xtreg mvalue invest i.year if time <= 10,fe cluster(company)
estimates store before10
xtreg mvalue invest i.year if time > 10,fe cluster(company)
estimates store after10
coefplot before10 after10, keep(invest)

Saving confidence limits in Stata

After running glm I can type matrix list r(table) and see a table of all of my results. If I wish, I can write slopes and SEs to variables, e.g., gen B=_b[x1] or gen se=_se[x1]. However, this does not work with the confidence limits, ll and ul. How can I access them in a similar manner?
I am not sure that the _b[] and _se[] results are associated with r(table); I had thought they come from e(b) and e(V).
Anyway, since you have r(table), you can just save the results into another matrix, and then use the regular matrix operations to put the lower bounds and upper bounds into new matrices. If for some reason transformation into variables is desired (for example, plotting), there's always -svmat-.
sysuse auto,clear
glm price mpg foreign, f(gaussian)
mat r=r(table)
matrix ll=r["ll",1...]' // see -help matrix extraction-; transposed for svmat
svmat ll,names(ll) // lower bounds are in variable ll1
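For a single coefficient, the limits can also be computed directly from _b[] and _se[]. This is a sketch under the assumption that glm is reporting its default normal-based (z) intervals at the 95% level; with a different level() the critical value would change, so it is worth comparing the result against matrix list r(table). The variable names ll_mpg and ul_mpg are illustrative.

```stata
sysuse auto, clear
glm price mpg foreign, f(gaussian)
* 95% normal-based limits: b -/+ z(0.975)*se
generate ll_mpg = _b[mpg] - invnormal(0.975)*_se[mpg]
generate ul_mpg = _b[mpg] + invnormal(0.975)*_se[mpg]
* each new variable is a constant; show the first observation
list ll_mpg ul_mpg in 1
```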