How to shade recession periods in Stata?

I'm trying to shade recession periods as a backdrop to a line plot, using a downloaded NBER dummy.
I use this line of code:
twoway area usrec year
and get this linked graph:
How can I make the graph use bars rather than spikes? (Variable is a dummy, it is 4 or zero)

Nick's comment is really the answer to this question.
Here is an example from help nbercycles by Christopher Baum.
First install nbercycles and freduse (for the MWE data) from SSC:
ssc install nbercycles
ssc install freduse
Then load the data, generate a monthly date, and tsset the data on it:
freduse MPRIME, clear
generate ym = mofd(daten)
tsset ym, monthly
Then plot the prime rates and shade the recessions:
nbercycles MPRIME if tin(1970m1,1990m1), ///
file(nber2.do) replace
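If you would rather draw the shading by hand from a 0/1 dummy, as in the question, a rough sketch is to rescale the dummy to the height of the plot and draw it with twoway bar instead of twoway area, so each recession month becomes a full-height rectangle. The data below are made up purely for illustration (mdate, rate, usrec and shade are hypothetical names, not the NBER series):

```stata
* toy monthly data, for illustration only
clear
set obs 120
generate mdate = ym(1970, 1) + _n - 1
format mdate %tm
tsset mdate
generate rate = 8 + 2*sin(_n/10)             // made-up series to plot
generate byte usrec = inrange(_n, 30, 45)    // made-up recession dummy

* scale the dummy so the bars reach the top of the plotted series
quietly summarize rate
generate shade = usrec * r(max)

* bars one month wide sit behind the line
twoway (bar shade mdate, barwidth(1) color(gs14)) ///
       (line rate mdate), legend(off) ytitle("rate")
```

With barwidth(1) adjacent recession months touch, so each recession shows up as one solid block rather than a row of spikes.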

Tabulate relative frequencies in Stata

I am trying to tabulate frequencies for a variable divided into two groups. That is, I would like to see how often a variable takes the value "Yes", broken down by both region and sex. This is easy to do in Stata using "tab" with the row option, but I have trouble exporting it. To make it clear, I am able to build the table with absolute frequencies in this way:
eststo formalyes: estpost tab regionwb_c female if fin22a==1
eststo formalno: estpost tab regionwb_c female if fin22a==0
eststo formalt: estpost tab regionwb_c female
estout formalyes formalno formalt using summformal.tex, replace varlabels(`e(labels)') unstack booktabs ///
mgroups("Yes" "No" "Tot", pattern(1 1 1) prefix(\multicolumn{#span}{c}{) suffix(}) span erepeat(\cmidrule(lr){#span})) fragment
This, put in my latex code, produces this relatively nice table:table1
Now what I would like to do is to reproduce the exact same table, but to have the relative and not absolute frequencies there. Now normally to my understanding if you want the relative frequencies you can have
tab x y, row nofreq
but if you try to combine this with estpost it does not work. Are there any hints? I tried working it out with tabout, but all I was able to produce is this:
tabout regionwb_c female using trial.tex, replace percent style(tex) c(mean fin22a) sum
Which gives this:table2
Where, as you can see, I am pretty lost. I am sorry if the question sounds silly but I struggled finding results online or on the tabout manual. I hope somebody can help me.
I have not worked with tabout before, but one workaround could be to create new variables containing the male and female relative frequencies by regionwb_c, for example with the egen command. Then you could pass these relative-frequency variables to your table.
Could that maybe help you? Good luck!
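Another thing that may be worth trying: as far as I can tell from the estout documentation, estpost tabulate for a two-way table stores percentages next to the counts (e(pct), e(colpct) and e(rowpct)), so esttab can be asked for the row percentages directly. A minimal sketch on the auto data, with rep78 and foreign standing in for regionwb_c and female:

```stata
* requires the estout package: ssc install estout
sysuse auto, clear

* two-way table; estpost keeps the row percentages in e(rowpct)
estpost tabulate rep78 foreign

* show row percentages instead of counts, one column per category of foreign
esttab, cell(rowpct(fmt(2))) unstack noobs
```

The same cell() specification should carry over to an estout or esttab call that writes to a .tex file, but I have not checked it against the mgroups() layout used in the question.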

Moving average using forvalues - Stata

I am struggling with a question in Cameron and Trivedi's "Microeconometrics using Stata". The question concerns a cross-sectional dataset with two key variables, log of annual earnings (lnearns) and annual hours worked (hours).
I am struggling with part 2 of the question, but I'll type the whole thing for context.
A moving average of y after data are sorted by x is a simple case of nonparametric regression of y on x.
Sort the data by hours.
Create a centered 25-period moving average of lnearns with ith observation yma_i = (1/25) * sum_{j=-12}^{12} y_{i+j}. This is easiest using the command forvalues.
Plot this moving average against hours using the twoway connected graph command.
I'm unsure what command(s) to use for a moving average of cross-sectional data. Nor do I really understand what a moving average over one-period data shows.
Any help would be great and please say if more information is needed.
Thanks!
Edit1:
Should be able to download the dataset from here https://www.dropbox.com/s/5d8qg5i8xdozv3j/mus02psid92m.dta?dl=0. It is a small extract from the 1992 Individual-level data from the Panel Study of Income Dynamics - used in the textbook.
Still getting used to the syntax, but here is my attempt at it
sort hours
gen yma=0
forvalues i = 1/4290 {
quietly replace yma = yma + (1/25)(lnearns[`i'-12] to lnearns[`i'+12])
}
There are other ways to do this, but I created a variable for each lag and lead, then take the sum of all of these variables and the original then divide by 25 as in the equation you provided:
sort hours
// generate variables for the 12 leads and lags
forvalues i = 1/12 {
gen lnearns_plus`i' = lnearns[_n+`i']
gen lnearns_minus`i' = lnearns[_n-`i']
}
// get the sum of the lnearns variables
egen yma = rowtotal(lnearns_* lnearns)
// get the number of nonmissing lnearns variables
egen count = rownonmiss(lnearns_* lnearns)
// get the average
replace yma = yma/count
// clean up
drop lnearns_* count
This gives you the variable you are looking for (the moving average) and also does not simply divide by 25 because you have many missing observations.
As to your question of what this shows, my interpretation is that it shows the local average of lnearns around each value of hours. If you graph lnearns on the y axis and hours on the x axis, you get something that looks crazy because there is a lot of variation, but if you plot the moving average the trend is much clearer.
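Since the exercise hints at forvalues, here is a sketch of the more literal version: loop over the offsets j = -12, ..., 12 and add one 25th of the observation j rows away each time. It assumes mus02psid92m.dta is in the working directory. Unlike the rowtotal approach above, it leaves yma missing wherever any of the 25 values is missing, which includes the first and last 12 observations.

```stata
use mus02psid92m, clear    // assumed to be in the current directory
sort hours

* accumulate (1/25) * lnearns[i+j] for j = -12, ..., 12
generate double yma = 0
forvalues j = -12/12 {
    quietly replace yma = yma + lnearns[_n + `j'] / 25
}

* out-of-range subscripts return missing, so the edges end up missing
twoway connected yma hours
```

The loop runs over the 25 offsets rather than over the 4,290 observations, because replace already works on every row at once.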
In fact this dataset can be read into a suitable directory by
net from http://www.stata-press.com/data/musr
net install musr
net get musr
u mus02psid92m, clear
This smoothing method is problematic in that sort hours doesn't have a unique result in terms of values of the response being smoothed. But an implementation with similar spirit is possible with rangestat (SSC).
sort hours
gen counter = _n
rangestat (mean) mean=lnearns (count) n=lnearns, interval(counter -12 12)
There are many other ways to smooth. One is
gen binhours = round(hours, 50)
egen binmean = mean(lnearns), by(binhours)
scatter lnearns hours, ms(Oh) mc(gs8) || scatter binmean binhours , ms(+) mc(red)
Even better would be to use lpoly.

Weighted binomial confidence interval in Stata

I am trying to compute a binomial confidence interval for a dummy variable after specifying the survey design in Stata with the svyset command but I get the following error: ci is not supported by svy with vce(linearized)
svyset [pweight=My_weight]
svy: ci Variable, binomial
I have also tried the following code:
ci Variable [pweight=My_weight], binomial
But got the error: pweight not allowed
Binomial confidence intervals are calculated as proportions in Stata 14 (Stata 13 uses binomial). This makes sense because the mean of a dummy variable is the proportion of 1's. Look at the help file here: http://www.stata.com/help.cgi?ci
So you likely want a command like:
ci proportions Variable [pweight=My_weight]
From the help file, it looks like only fweights may be allowed here.
Originally I thought that a better way might be to grab your CI from the means output. Here is an example modified from the svy help file.
webuse nhanes2f
svyset psuid [pweight=finalwgt]
svy: mean sex
But OP is right, this doesn't adjust for the binomial distribution.
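One more thing that might be worth a look is proportion, which does work with the svy prefix. As far as I know, recent versions of Stata compute its confidence limits on a logit scale by default, so they stay inside [0, 1], but check help proportion for the interval types your version offers. A sketch with the same example data:

```stata
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt]

* design-based proportions of each category of sex, with confidence intervals
svy: proportion sex
```

This is not the exact (Clopper-Pearson) interval that ci reports for unweighted data, so treat it as an alternative, not a drop-in replacement.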

Emulate Stata 8 clustered bootstrapped regression

I'm trying to store a series of scalars alongside the coefficients of a bootstrapped regression model. The code below looks like the example from the Stata [P]rogramming manual for postfile, which is apparently intended for use with such procedures.
The problem is with the // commented lines, which fail to work. More specifically, the problem seems to be that the syntax below worked in Stata 8 but fails to work in Stata 9+ after some change in the bootstrap procedure.
cap pr drop bsreg
pr de bsreg
reg mpg weight gear_ratio
predict yhat
qui sum yhat
// sca mu = r(mean)
// post sim (mu)
end
sysuse auto, clear
postfile sim mu using results , replace
bootstrap, cluster(foreign) reps(5) seed(6112): bsreg
postclose sim
use results, clear
Adding version 8 to the code did not solve the issue. Would anyone know what is wrong with this procedure, and how to fix it for execution in Stata 9+? The problem has been raised in the past and more recently, but without finding an answer.
Sorry for the long description, it's a long problem.
I've presented the issue as if it's a programming one because I'm using this code to replicate some health inequalities research. It's necessary to bootstrap the entire procedure, not just the reg model. I have some quibbles with the methodology, but nothing that would stop me from replicating the analysis.
Adding noisily to the bootstrap showed a problem with the predict command. Here's a fix using a tempvar macro.
cap pr drop bsreg
pr de bsreg
reg mpg weight gear_ratio
tempvar yhat
predict `yhat'
qui sum `yhat'
sca mu = r(mean)
post sim (mu)
end
sysuse auto, clear
postfile sim mu using results , replace
bootstrap, cluster(foreign) reps(5) seed(6112): bsreg
postclose sim
use results, clear

Stata: adding coefficients to estout

I want to output the results of several regressions as a nicely formatted LaTeX table and am happy to see that for most cases the estout package on SSC seems to do exactly that.
However, what I want is a bit special: between the table section that shows the coefficient estimates and their standard errors, and the section that shows R^2 and the like, I would like to add a section that shows point estimates and standard errors (bonus points for stars) for particular linear combinations of the coefficients. Both point estimates and standard errors are easily computed via lincom, but the best solution I've found so far for getting these numbers into the table involves the massively hacky addition of these numbers, one estadd scalar ... at a time. Is there a more elegant way to do this?
Example code:
sysuse auto
eststo, title("Model 1"): regress price weight mpg
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
eststo, title("Model 2"): regress price weight mpg foreign
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
label variable foreign "Car type (1=foreign)"
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, labels("Linear Combination" "S.E." R-squared "N. of cases")) ///
label legend varlabels(_cons Constant)
You can use the layout, star and fmt sub-options to format the added scalars like this:
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, layout(# (#) # #) star(skal) labels("Linear Combination" "S.E." R-squared "N. of cases") fmt(%9.2f %9.2f %9.2f %12.0f)) ///
label legend varlabels(_cons Constant)
These options are documented here. As far as I know, the method in your question is the only way to do this.
There's a problem with the answer above. I think the original poster wanted statistical significance stars based on a test of the lincom (weight+mpg) vs zero. That is not what star(skal) does.
As the documentation states: "star[(scalarlist)] to specify that the overall significance of the model be denoted by stars. The stars are attached to the scalar statistics specified in scalarlist." Emphasis mine.
star(skal) in the previous example will place significance stars next to the estimates of skal, but they will be from an F-test of the overall significance of the regression model (I think). Not from lincom.
When I use this strategy on other specifications, the stars attached are clearly not based on the ones from lincom. I'm not sure if it's possible to use lincom/esttab to get the stars from lincom. To say nothing of nlcom!