I am running the following fixed-effects regression in a loop, and I always get the error "option if is not allowed":
levelsof Sic, local(Sic)
xtset Year
foreach i of local Sic {
xtreg y mq r d, fe if Sic == `i'
eststo
}
If I run the same loop with a normal OLS regression, it works without any problems. Why?
The if qualifier should come before the comma, options afterwards.
levelsof Sic, local(Sic)
foreach i of local Sic {
eststo: xtreg y mq r d if Sic == `i', fe
}
I assume it worked with OLS because you did not have to specify any options.
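The ordering rule can be checked on a shipped dataset. The following is only a minimal sketch, assuming the auto data and a robust-variance option chosen purely for illustration (neither comes from the question):

```stata
sysuse auto, clear

* Qualifier (if) first, then the comma, then the options
regress price weight if foreign == 1, vce(robust)

* Putting the qualifier after the comma makes Stata parse "if" as an option,
* which should reproduce the kind of error reported in the question:
* regress price weight, vce(robust) if foreign == 1
```

The second command is left commented out so that the snippet runs without stopping.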
I am constructing a table of means and p-values from ttest. How can I get all of this in the same esttab table? Here is an MWE:
Get the sample, save it as a temporary file, create a local with the variables we will consider, create a local that is the length of the first local:
sysuse auto2, clear
*create two groups: 0 and 1
gen group = _n<37
tempfile a
save `a'
local vars "price headroom trunk weight"
local vars_n: word count `vars'
ssc install estout
eststo clear
Calculate the means of group 0 (column 1) and group 1 (column 2):
*group 0 means
use `a', clear
keep if group==0
eststo: estpost sum `vars'
*group 1 means
use `a', clear
keep if group==1
eststo: estpost sum `vars'
Conduct t-tests for each variable (is there an easier way to do this?):
*t-test
*create blank matrix
matrix pval = J(`vars_n',1,.)
use `a', clear
forvalues i=1/`vars_n' {
local var `: word `i' of `vars''
ttest `var', by(group)
*add the two-sided p-value to matrix
matrix pval[`i',1]=r(p)
}
The previous block of code saves the p-values (column 3) into a matrix.
Use esttab to output the results:
esttab, cells(mean(fmt(2))) collabels(none) nodepvars nonumber replace label
esttab matrix(pval, fmt(2 0))
My issue is that I need to have the p-values in the same esttab as the means, but I currently have them in a matrix. How can I use something like eststo: estpost to get them so that I can use esttab (as opposed to esttab matrix)? Or is there a better way to do all of this? My goal is to run esttab, cells(mean(fmt(2))) collabels(none) nodepvars nonumber replace label and have it create a table with the first two columns being the means and the third column being the p-values.
All the information you need is in estpost ttest, so an easy solution would be this:
sysuse auto2, clear
gen group = _n<37
local vars price headroom trunk weight
estpost ttest `vars', by(group)
esttab ., cells("mu_1 mu_2 p") nonumber label
-----------------------------------------------------------
                             mu_1         mu_2            p
-----------------------------------------------------------
Price                    5847.526     6500.639     .3445597
Headroom (in.)           2.828947     3.166667     .0861206
Trunk space (cu. ft.)    12.39474     15.19444     .0041618
Weight (lbs.)            2654.474     3404.722     .0000115
-----------------------------------------------------------
Observations                   74
-----------------------------------------------------------
I need to run both OLS and fixed-effects panel regressions. The observations are organized by a group variable, and I want one regression per group. The OLS proceeds like this:
sysuse data, clear
bysort group: reg depVar expVar1 expVar2
That works as it should. However, I have not managed to make this work with panel data:
sysuse data, clear
xtset id year
bysort group: xtreg depVar expVar1 expVar2, fe
However, an error terminates the process after defining the panel variables because there are duplicate observations. That is not a "real" error because after sorting by group there would be no duplicates.
I know that I could reshape the data to wide format and type a separate line for each estimation, but I am wondering whether there are other, more convenient ways around this.
In principle it works (see code below):
webuse airacc, clear
xtset airline time, delta(1)
xtreg relsize pmi ait, fe
gen indicator = round(runiform())
bys indic: xtreg relsize pmi ait, fe
The problem seems to be the duplicates. I have never come across this problem myself. However, you could run separate regressions: i) preserve the data, ii) drop the observations from group x, iii) run the regression, iv) restore the data, then repeat from step i) for the next group. This should yield identical results as long as the groups are mutually exclusive.
webuse airacc, clear
xtset airline time, delta(1)
xtreg relsize pmi ait, fe
gen indicator = round(runiform())
preserve
drop if indic == 1
xtreg relsize pmi ait, fe
restore
preserve
drop if indic == 0
xtreg relsize pmi ait, fe
restore
bys indic: xtreg relsize pmi ait, fe
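If there are many groups, the same idea can be written as a loop. This is only a sketch of an alternative, not the approach above: it swaps preserve/restore for an if qualifier, and the seed value is an arbitrary assumption added so that the random indicator is reproducible.

```stata
webuse airacc, clear
xtset airline time, delta(1)

* Arbitrary seed (assumption) so the random grouping can be reproduced
set seed 12345
gen indicator = round(runiform())

* One fixed-effects regression per group; the if qualifier goes before the comma
levelsof indicator, local(groups)
foreach g of local groups {
    xtreg relsize pmi ait if indicator == `g', fe
}
```

Whether this avoids the duplicates error in the asker's data depends on how that data was xtset, so it should be treated as something to try rather than a guaranteed fix.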
I am using the coefplot package in Stata to plot how a coefficient changes depending on the model employed. In particular, I want to see how the coefficient of interest changes over time. I am plotting it vertically, so the x-axis could show the year to which each coefficient refers. However, I am not able to label the x-axis accordingly: instead of showing the name of the variable I am interested in, x1, it should show 1, 2 and 3. I would also like to omit the legend with the option legend(off), but that does not work.
This is the code I am using:
reg y x1 x2 if year==1;
estimates store t1;
reg y x1 x2 if year==2;
estimates store t2;
reg y x1 x2 if year==3;
estimates store t3;
coefplot t1 t2 t3, drop(x2) vertical yline(0);
Any suggestion would be greatly appreciated.
A nonsensical example that uses two categories (and not time) can be easily adapted:
clear
set more off
sysuse auto
reg price weight rep78 if foreign
estimates store foreign
reg price weight rep78 if !foreign
estimates store not_foreign
matrix at = (1 / 2)
coefplot foreign || not_foreign, drop(rep78 _cons) vertical bycoefs
You can build the syntax within a loop, using a local, and then feed it to coefplot. To be precise, I mean something like this example syntax:
year1 || year2 || ... || yearn
The final command would look something like:
coefplot `allyears', drop(<some_stuff>) vertical bycoefs
A complete example that does involve time:
clear
set more off
use http://www.stata-press.com/data/r12/nlswork.dta
forvalues i = 70/73 {
regress ln_w grade age if year == `i'
estimates store year`i'
local allyears `allyears' year`i' ||
local labels `labels' `i'
}
// check
display "`allyears'"
display `"`labels'"'
coefplot `allyears', keep(grade) vertical bycoefs bylabels(`labels')
If coefplot doesn't turn out to be flexible enough, you can always try with statsby and graph commands (help graph).
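A rough sketch of that statsby route, using the same nlswork data, might look like the following. The variable names b_grade, se_grade, lo and hi are made up for illustration, and the 1.96 multiplier is a normal-approximation assumption rather than what coefplot computes.

```stata
use http://www.stata-press.com/data/r12/nlswork.dta, clear
keep if inrange(year, 70, 73)

* One regression per year; keep the coefficient and standard error of grade
statsby b_grade=_b[grade] se_grade=_se[grade], by(year) clear: ///
    regress ln_w grade age

* Approximate 95% interval (assumes a normal approximation)
gen lo = b_grade - 1.96*se_grade
gen hi = b_grade + 1.96*se_grade

* Year on the x-axis, no legend
twoway (rcap lo hi year) (scatter b_grade year), ///
    legend(off) yline(0) xlabel(70(1)73) ytitle("Coefficient on grade")
```

Because the plot is an ordinary twoway graph, axis labels and the legend are controlled with the usual graph options.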
I am trying to reproduce the output of predict after boxcox in Stata 13 with my own code, using the steps described in the Stata manual (page 5).
Following is the sample code I used:
sysuse auto,clear
local indepvar weight foreign length
qui boxcox price `indepvar' ,model(lhsonly)lrtest
qui predict yhat1
qui predict resid1, residuals
//yhat2 and resid2 computed using the procedure described in Stata manual
set more off
set type double
mat coef=e(b)
local nosvar=colsof(coef)-2
qui gen constant=1
local varname weight foreign length constant
local coefname weight foreign length _cons
//step 1: compute residuals first
forvalues k = 1/`nosvar'{
local varname1 : word `k' of `varname'
local coefname1 : word `k' of `coefname'
qui gen xb`varname1'=`varname1'*_b[`coefname1']
}
qui egen xb=rowtotal(xb*)
qui gen resid=(price^(_b[theta:_cons]))-xb
//step 2: compute predicted value
qui gen yhat2=.
local noobs=_N
local theta=_b[theta:_cons]
forvalues j=1/`noobs'{
qui gen temp`j'=.
forvalues i=1/`noobs'{
qui replace temp`j'=((`theta'*(xb[`j']+resid[`i']))+1)^(1/`theta') if _n==`i'
}
qui sum temp`j'
local tempmean`j'=r(mean)
qui replace yhat2=`tempmean`j'' if _n==`j'
drop temp`j'
}
drop resid
qui gen double resid2=price-yhat2
sum yhat* resid*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       yhat1 |        74    6254.224    2705.175   3428.361   21982.45
       yhat2 |        74    1.000035    8.13e-06   1.000015   1.000054
      resid1 |        74   -88.96723    2094.162  -10485.45   6980.013
      resid2 |        74    6164.257    2949.496       3290      15905
Note: yhat1 and resid1 are based on Stata predict, while yhat2 and resid2 are based on my sample code. The comparison is needed to make sure the marginal effect I computed is correct (margins doesn't compute the marginal effect after boxcox).
Your definition of the first residual is wrong, because you missed the definition of y^(\lambda) on page 3 of the Manual. See also the Methods and Formulas section in the Manual entry for boxcox itself.
Translated to your problem, in the line
qui gen resid=(price^(_b[theta:_cons]))-xb
the term
price^(_b[theta:_cons])
should be:
(price^(_b[theta:_cons])-1)/_b[theta:_cons]
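For reference, the correction can be written out in symbols. The notation here is an assumption chosen to match the question's code: θ is the left-hand-side transformation parameter (theta in the code), x_iβ is the linear prediction xb, and the prediction formula is simply a transcription of the loop in the question, not a quotation from the manual.

```latex
% Box-Cox transform of the dependent variable (lhsonly model)
y^{(\theta)} =
\begin{cases}
  \dfrac{y^{\theta} - 1}{\theta}, & \theta \neq 0 \\[1ex]
  \ln y, & \theta = 0
\end{cases}

% Residual on the transformed scale (step 1 of the question's code, corrected)
\hat{e}_i = \frac{y_i^{\hat{\theta}} - 1}{\hat{\theta}} - x_i \hat{\beta}

% Predicted value for observation j (step 2 of the question's code)
\hat{y}_j = \frac{1}{N} \sum_{i=1}^{N}
  \left\{ \hat{\theta} \left( x_j \hat{\beta} + \hat{e}_i \right) + 1 \right\}^{1/\hat{\theta}}
```

The back-transformation in the last line is the inverse of the first, which is why the residual has to be built from the full transform and not from the raw power of price.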
I am trying to run regressions in Stata by company_id using a large dataset. The goal is to get one line per company_id with the results of the regression. I am using the following code, which gives me the beta coefficient, standard error, adjusted R-squared and N. I also need to include the Durbin-Watson statistic, but have not been successful so far. Can someone help? Thanks.
statsby _b _se r2 = e(r2_a) _N, by (company_id) saving($path\SC_results_`i'.dta, replace): regress ret sptr_ret
A small program that combines regress and dwstat into one command should help. Here's an attempt.
capture program drop reg_dw
program reg_dw, rclass
syntax varlist
regress `varlist'
dwstat
return scalar dw=r(dw)
end
webuse invest2,clear
gen index=_n
tsset index
statsby _b _se r2 = e(r2_a) dw=r(dw) _N, by (company) saving(x.dta, replace): reg_dw invest market
use x, clear
tab _eq2_dw