I want to make a table that displays all levels of my variables. For that I use the ibn notation. However, it seems that factor variable notation is not compatible with estpost and esttab. Is there a way to export summary statistics with factor variable notation to excel/word?
estpost sum ibn.male ibn.edu_lvl ibn.age_cat
Error: factor variables and time series operators not allowed.
All of my variables are not categorical on their own, but encoded, so I think the issue is the ibn prefix. Is it possible to have it somehow exported to excel?
UPD.: xi, noomit: estpost sum i.varname displayed summary statistics by levels, however it displayed the values of variables, e.g. (male == 1, male ==0) instead of labels of values (male, female)
estpost summarize expects a varlist as an input (see help estpost), and ibn.male, ibn.edu_lvl and ibn.age_cat are not variables, which is the cause of your error. You could calculate summary statistics by the levels of a variable, but estpost summarize doesn't support a by() option. estpost tabstat does, so you could use this instead.
Related
I want to create in Stata a coefplot variable: however, in one of the models I want to show there is no value for the estimate which I report, but instead I want to report the constant.
How is that possible?
sysuse auto, clear
regress price weight
coefplot, drop(weight) rename(_cons = abcdef)
I am trying to tabulate frequencies for a variable divided in two groups. That is, I would like to see how much a variable takes value "Yes" divided by both region and sex. Now, this is easy to do in Stata using "tab" and option row, but I have trouble exporting it. To make it clear, I am able to build the table with absolute frequencies in this way:
eststo formalyes: estpost tab regionwb_c female if fin22a==1
eststo formalno: estpost tab regionwb_c female if fin22a==0
eststo formalt: estpost tab regionwb_c female
estout formalyes formalno formalt using summformal.tex, replace varlabels(`e(labels)') unstack booktabs ///
mgroups("Yes" "No" "Tot", pattern(1 1 1) prefix(\multicolumn{#span}{c}{) suffix(}) span erepeat(\cmidrule(lr){#span})) fragment
This, put in my latex code, produces this relatively nice table:table1
Now what I would like to do is to reproduce the exact same table, but to have the relative and not absolute frequencies there. Now normally to my understanding if you want the relative frequencies you can have
tab x y, row nofreq
but if you try to combine this with estpost it does not work. Are there any hints? I tried working it out with tabout, but all i was able to produce is this:
tabout regionwb_c female using trial.tex, replace percent style(tex) c(mean fin22a) sum
Which gives this:table2
Where, as you can see, I am pretty lost. I am sorry if the question sounds silly but I struggled finding results online or on the tabout manual. I hope somebody can help me.
I have not worked with tabout before, but maybe one way to work around it could be to just program new variables containing the male and female relative frequencies by regionwb_c using the egen command for example (like in this link enter link description here. Then you could just pass these relative frequencies variables in your table.
Could that maybe help you? Good Luck!
I'm doing an analysis of the Current Population Survey. I have a wage variable (wage), a time-series variable (qtr), and an observational weight (pworwgt). Each quarter has thousands of observations.
I can easily make a table showing the weighted average wage in each quarter:
table qtr [iw=pworwgt], contents(mean wage)
What I want to do, however, is graph this easily within Stata. I tried to use egen to make a variable containing the mean by qtr, but egen mean() does not allow for weights.
One of the many ways this can be done is with a regression followed by two margins* commands:
webuse hanley
table rating [iw=pop], contents(mean disease)
reg disease i.rating [iw=pop]
margins rating
marginsplot, noci
This has the advantage of not altering your data in any way.
I want to create a table in Stata with the estout package to show the mean of a variable split by 2 groups (year and binary indicator) in an efficient way.
I found a solution, which is to split the main variable cash_at into 2 groups by hand through the generation of new variables, e.g. cash_at1 and cash_at2. Then, I can generate summary statistics with tabstat and get output with esttab.
estpost tabstat cash_at1 cash_at2, stat(mean) by(year)
esttab, cells("cash_at1 cash_at2")
Link to current result: http://imgur.com/2QytUz0
However, I'd prefer a horizontal table (e.g. year on the x axis) and a way to do it without splitting the groups by hand - is there a way to do so?
My preference in these cases is for year to be in rows and the statistic (e.g. mean) in the columns, but if you want to do it the other way around, there should be no problem.
For a table like the one you want it suffices to have the binary variable you already mention (which I name flag) and appropriate labeling. You can use the built-in table command:
clear all
set more off
* Create example data
set seed 8642
set obs 40
egen year = seq(), from(1985) to (2005) block(4)
gen cash = floor(runiform()*500)
gen flag = round(runiform())
list, sepby(year)
* Define labels
label define lflag 0 "cash0" 1 "cash1"
label values flag lflag
* Table
table flag year, contents(mean cash)
In general, for tables, apart from the estout module you may want to consider also the user-written command tabout. Run ssc describe tabout for more information.
On the other hand, it's not clear what you mean by "splitting groups by hand". You show no code for this operation, but as long as it's general enough for your purposes (and practical) I think you should allow for it. The code might not be as elegant as you wish but if it's doing what it's supposed to, I think it's alright. For example:
clear all
set more off
set seed 8642
set obs 40
* Create example data
egen year = seq(), from(1985) to (2005) block(4)
gen cash = floor(runiform()*500)
gen flag = round(runiform())
* Data management
gen cash0 = cash if flag == 0
gen cash1 = cash if flag == 1
* Table
estpost tabstat cash*, stat(mean) by(year)
esttab, cells("cash0 cash1")
can be used for a table like the one you give in your original post. It's true you have two extra lines and variables, but they may be harmless. I agree with the idea that in general, efficiency is something you worry about once your program is behaving appropriately; unless of course, the lack of it prevents you from reaching that state.
I want to output the results of several regressions as a nicely formatted LaTeX table and am happy to see that for most cases the estout package on SSC seems to do exactly that.
However, what I want is a bit special: Between the table section that shows the coefficient estimates and their standard errors, and the section that shows R^2 and the like, I would like to add a section that shows point estimates and standard errors (bonus points for stars) for particular linear combinations of the coefficients. Both point estimates and standard errors are easily computed via lincom but the best solution I've found so far for getting these numbers into the table involves the massily hacky addition of these numbers, one estadd scalar ... at a time. Is there a more elegant way to do this?
Example code:
sysuse auto
eststo, title("Model 1"): regress price weight mpg
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
eststo, title("Model 2"): regress price weight mpg foreign
lincom weight+mpg
estadd scalar skal r(estimate)
estadd scalar skalsd r(se)
label variable foreign "Car type (1=foreign)"
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, labels("Linear Combination" "S.E." R-squared "N. of cases")) ///
label legend varlabels(_cons Constant)
You can use the layout, star and fmt sub-options to format the added scalars like this:
estout, cells(b(star fmt(3)) t(par fmt(2))) ///
stats(skal skalsd r2 N, layout(# (#) # #) star(skal) labels("Linear Combination" "S.E." R-squared "N. of cases") fmt(%9.2f %9.2f %9.2f %12.0f)) ///
label legend varlabels(_cons Constant)
These options are documented here. As far as I know, the method in your question is the only way to do this.
There's a problem with the answer above. I think the original poster wanted statistical significance stars based on a test of the lincom (weight+mpg) vs zero. That is not what star(skal) does.
As the documentation states: star[(scalarlist)] to specify that the overall significance of the model be denoted by stars. The stars are attached to the scalar statistics specified in scalarlist." Emphasis mine.
star(skal) in the previous example will place significance stars next to the estimates of skal, but they will be from an F-test of the overall significance of the regression model (I think). Not from lincom.
When I use this strategy on other specifications, the stars attached are clearly not based on the ones from lincom. I'm not sure if it's possible to use lincom/esttab to get the stars from lincom. To say nothing of nlcom!