Using mean versus qui mean in Stata

I have the following code, shown with its output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
qui mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price . . .
weight . . .
length . . .
---------------------------------------------------
Now, if I change only qui mean `i' to mean `i', I get the expected output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.1936
length 74 187.9324 22.2663
---------------------------------------------------
I was wondering why I am not getting the output when I use qui. Note this has nothing to do with estout. It has to do with mean.

I don't use estout (SSC/SJ), but, like others who have reported the same, I completely fail to reproduce the problem: the matrix of desired results is not empty when quietly is applied. Here is my session:
. about
Stata/SE 13.1 for Windows (64-bit x86-64)
Revision 03 Jul 2014
Copyright 1985-2013 StataCorp LP
local and personal details edited out
. sysuse auto,clear
(1978 Automobile Data)
. clear matrix
. local vars price weight length
. foreach i of local vars{
2. qui mean `i'
3. mat `i'=r(table)
4. scalar mean_`i'=`i'[1,1]
5. scalar se_`i'=`i'[2,1]
6. scalar n_`i'=`i'[7,1]+1
7. scalar sd_`i'=se_`i'*sqrt(n_`i')
8. mat des_`i'=(n_`i',mean_`i',sd_`i')
9. mat colnames des_`i'=Observations Mean SD
10. mat rownames des_`i'=`i'
11. mat des_result=nullmat(des_result)\des_`i'
12. }
. mat li des_result
des_result[3,3]
Observations Mean SD
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.19357
length 74 187.93243 22.26634
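If r(table) turns out to be the fragile part on a given installation, one way to sidestep it is to take the same three statistics from summarize, which returns r(N), r(mean), and r(sd) directly. This is only a sketch of an alternative, not a diagnosis of the reported problem; it reuses the matrix names from the question and uses capture matrix drop instead of clear matrix so that other matrices are left alone.

```stata
sysuse auto, clear
capture matrix drop des_result          // start from an empty result matrix

foreach v of varlist price weight length {
    quietly summarize `v'
    // summarize leaves N, mean, and SD in r(); no r(table) indexing needed
    matrix des_`v' = (r(N), r(mean), r(sd))
    matrix colnames des_`v' = Observations Mean SD
    matrix rownames des_`v' = `v'
    // nullmat() lets the first iteration stack onto a matrix that does not exist yet
    matrix des_result = nullmat(des_result) \ des_`v'
}

matrix list des_result
```

Because the standard error reported by mean is the standard deviation divided by the square root of N, the SD column here should agree with the se*sqrt(n) calculation in the question.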

Related

Rename one particular column of a matrix in Stata

I am working on modifying, renaming, and arranging estimation results for final publication in Stata. What I have so far is:
sysuse auto, clear
* interaction model
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
mat b=e(b) // store estimation results in matrix b
matrix list b // see the problem, column 3 "int_mpg_mpg"
b[1,5]
weight mpg int_mpg_mpg foreign _cons
y1 3.0379251 -298.14637 5.8617627 3420.2156 -527.04589
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo my_model
esttab my_model, label
This gives a nice version of the interaction term:
------------------------------------
(1)
Price
------------------------------------
b
Weight (lbs.) 3.038***
(3.84)
Mileage (mpg) -298.1
(-0.82)
Mileage (mpg) # Mi~) 5.862
(0.90)
Car type 3420.2***
(4.62)
Constant -527.0
(-0.08)
------------------------------------
Observations 74
------------------------------------
But imagine that I have many more models and variables. Then it becomes very unwieldy to list all variables, all interactions, etc. in this local coln:
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
I am trying to fix only particular column names in the estimation matrix:
local colfix "b:c.mpg#c.mpg"
mat colnames b[1,3]= `colfix'
How can I access a particular column of a matrix in Stata?
Your intended new name seems the same as the existing name.
I don't know a way to change individual column names, but this may help:
. matrix foo = J(3, 4, 42)
. matrix colname foo = frog toad DRAGON griffin
. mat li foo
foo[3,4]
frog toad DRAGON griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
. local colnames : colnames foo
. local colnames : subinstr local colnames "DRAGON" "newt", all
. matrix colname foo = `colnames'
. matrix li foo
foo[3,4]
frog toad newt griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
It's a modest exercise to make that into a subroutine. The bigger deal is whether there is a way to do this in the estout suite, wonderful commands I never use.
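As a rough sketch of that subroutine: the program below wraps the three steps shown above (read the column names, substitute one, write them back). The name rencol and its argument order are made up for illustration. It matches whole names only (the word option), and because it reads names with the colnames extended function it does not carry over equation names, so treat it as a starting point rather than a general tool.

```stata
capture program drop rencol
program define rencol
    version 13
    // hypothetical helper, usage: rencol matname oldname newname
    args matname old new
    // read the current column names (equation names are not included)
    local names : colnames `matname'
    // swap the one name, matching whole words only
    local names : subinstr local names "`old'" "`new'", word all
    matrix colnames `matname' = `names'
end

matrix foo = J(3, 4, 42)
matrix colnames foo = frog toad DRAGON griffin
rencol foo DRAGON newt
matrix list foo
```

For a matrix with equation names, such as the b:... columns in the question, the same idea could presumably be applied to the colfullnames extended function instead.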

How can I stack equations using estout?

I am creating a summary statistics table using the community-contributed command estout.
The code looks like this:
sysuse auto, clear
eststo clear
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==3, by(foreign)
eststo: estpost ttest price mpg weight headroom trunk if rep78 ==4, by(foreign)
estout, cells("mu_1 mu_2 b(star)")
The result looks as follows:
--------------------------------------------------------------------------------------------
est1 est2
mu_1 mu_2 b mu_1 mu_2 b
--------------------------------------------------------------------------------------------
price 6607.074 4828.667 1778.407 5881.556 6261.444 -379.8889
mpg 19 23.33333 -4.333333 18.44444 24.88889 -6.444444**
weight 3442.222 2010 1432.222*** 3532.222 2207.778 1324.444***
headroom 3.222222 2.666667 .5555556 3.444444 2.5 .9444444*
trunk 15.59259 12.33333 3.259259 16.66667 10.33333 6.333333**
--------------------------------------------------------------------------------------------
I would like to know how I could stack est1 and est2 on top of each other.
The command estout cannot automatically stack results from stored estimates. Consequently, the use of eststo is redundant. In this case, the easiest way to obtain the desired output is to simply create two matrices with the results and stack one on top of the other.
For example:
sysuse auto, clear
matrix A = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
local ++i
ttest `var' if rep78 == 3, by(foreign)
matrix A[`i', 1] = r(mu_1)
matrix A[`i', 2] = r(mu_2)
matrix A[`i', 3] = r(mu_1) - r(mu_2)
local matnamesA `matnamesA' "rep78==3:`var'"
}
matrix rownames A = `matnamesA'
matrix B = J(5, 3, .)
local i 0
foreach var of varlist price mpg weight headroom trunk {
local ++i
ttest `var' if rep78 == 4, by(foreign)
matrix B[`i', 1] = r(mu_1)
matrix B[`i', 2] = r(mu_2)
matrix B[`i', 3] = r(mu_1) - r(mu_2)
local matnamesB `matnamesB' "rep78==4:`var'"
}
matrix rownames B = `matnamesB'
matrix C = A \ B
esttab matrix(C), nomtitles collabels("mu_1" "mu_2" "diff")
---------------------------------------------------
mu_1 mu_2 diff
---------------------------------------------------
rep78==3
price 6607.074 4828.667 1778.407
mpg 19 23.33333 -4.333333
weight 3442.222 2010 1432.222
headroom 3.222222 2.666667 .5555556
trunk 15.59259 12.33333 3.259259
---------------------------------------------------
rep78==4
price 5881.556 6261.444 -379.8889
mpg 18.44444 24.88889 -6.444444
weight 3532.222 2207.778 1324.444
headroom 3.444444 2.5 .9444444
trunk 16.66667 10.33333 6.333333
---------------------------------------------------
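The two nearly identical loops above can also be folded into one. The following is a sketch under the same setup (auto data, rep78 levels 3 and 4), building the stacked matrix row by row with nullmat() instead of preallocating A and B; the macro name rnames is arbitrary.

```stata
sysuse auto, clear
capture matrix drop C
local rnames

foreach k of numlist 3 4 {
    foreach var of varlist price mpg weight headroom trunk {
        quietly ttest `var' if rep78 == `k', by(foreign)
        // one row per variable: group means and their difference
        matrix C = nullmat(C) \ (r(mu_1), r(mu_2), r(mu_1) - r(mu_2))
        // the equation part of the row name marks the rep78 block
        local rnames `rnames' "rep78==`k':`var'"
    }
}

matrix rownames C = `rnames'
matrix colnames C = mu_1 mu_2 diff
matrix list C
```

The resulting C can be passed to esttab matrix(C) as in the answer; adding further rep78 levels only requires extending the numlist.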

Write a Stata eclass program that only returns a vector of p-values

Edit: After one day I cross-posted this to Statalist.
I want to write my own eclass program to work with eststo and esttab from SSC's estout package.
I want to write a wrapper for ranksum so that I can get the p-values and add them to a table alongside the p-values from t-tests. From the ereturn help file it seems like I need to return a b matrix to comply with eclass, but I can't figure out how.
I commented the bad calls below. Making the table with means and t-tests is no problem (i.e., first 3 columns). But my Wilcoxon3 command gives an error. I have tried with and without returning b, but I can't figure out the error, which is invalid syntax.
I am OK with any solution that puts the ranksum p-values into an e() container.
capture program drop Wilcoxon3
program define Wilcoxon3, eclass
version 13
syntax varlist [if] [in], by(varname)
marksample touse
local names
foreach v of varlist `varlist' {
ranksum `v' if `touse', by(`by')
matrix b = nullmat(b), 1
matrix p = nullmat(p), 2*normprob(-abs(r(z)))
local names `names' "`v'"
}
matrix colnames p = `names'
// store results
ereturn post b
ereturn matrix p
end
sysuse auto, clear
eststo clear
/* store means and t-tests */
estpost summarize price weight if !foreign
eststo
estpost summarize price weight if foreign
eststo
estpost ttest price weight, by(foreign)
eststo
/* try to store rank-sum test */
/* Wilcoxon3 price weight, by(foreign) */
/* eststo */
/* table of means and t-tests */
esttab, cells("mean(pattern(1 0 0) fmt(3) label(!Foreign)) mean(pattern(0 1 0) fmt(3) label(Foreign)) p(star pattern(0 0 1) fmt(3) label(p))")
/* try table of means, t-tests, and rank-sum tests */
/* esttab, cells("mean(pattern(1 0 0 0) fmt(3) label(!Foreign)) mean(pattern(0 1 0 0) fmt(3) label(Foreign)) p(star pattern(0 0 1 0) fmt(3) label(p)) p(star pattern(0 0 0 1) fmt(3) label(p))") */
I received an answer on Statalist here.
This was a case of RTFM. ereturn post takes just the matrix to post, but ereturn matrix needs a name for the stored result followed by the source matrix: ereturn matrix name [=] matname. This is clear if you focus on the syntax diagram at the top of the ereturn help page. The = is optional, but a new name is not.
The correct code is ereturn matrix p = p.
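Putting that fix into the program gives something like the sketch below. Beyond the one-line correction, it makes a few changes that are my own assumptions rather than part of the Statalist answer: temporary names for the working matrices so that repeated calls do not collide with matrices already in memory, normal() in place of the older normprob(), and the p-value computed as a scalar before it is appended.

```stata
capture program drop Wilcoxon3
program define Wilcoxon3, eclass
    version 13
    syntax varlist [if] [in], by(varname)
    marksample touse
    tempname b p pv
    foreach v of varlist `varlist' {
        quietly ranksum `v' if `touse', by(`by')
        // two-sided p-value from the normal approximation, as in the question
        scalar `pv' = 2*normal(-abs(r(z)))
        matrix `b' = nullmat(`b'), 1        // placeholder coefficients
        matrix `p' = nullmat(`p'), `pv'
    }
    matrix colnames `b' = `varlist'
    matrix colnames `p' = `varlist'
    // store results: post needs only the matrix, matrix needs name and source
    ereturn post `b'
    ereturn matrix p = `p'
end

sysuse auto, clear
Wilcoxon3 price weight, by(foreign)
matrix list e(p)
```

After this, e(p) holds one p-value per variable, which is the e() container the question asks for.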

Exporting results from regressions in Excel

I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3{
foreach i3 of numlist 1 2 3 4{
quiet: eststo: reg dep covariates, robust
}
}
The end goal is to have a table in Excel composed of twelve rows (one for each model) and seven columns (number of observations, estimated constant, five estimated coefficients).
Any suggestions on how I can do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
weight mpg _cons N
----------------------------------------------------------------
Model 1 2.044063 -6.707353 74
Model 2 1.746559 -49.51222 1946.069 74
----------------------------------------------------------------

Is there a way to keep only those variables with a certain label?

I have a very large dataset that needs to be refined to only certain variables. These variables' names are all over the place, but each wanted variable's label starts with "AW". Is there a way to use keep or an analogous command to keep only these variables?
For this you probably need a loop.
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"
describe
// do the job
foreach var of varlist x-z {
local lbl : variable label `var'
if substr("`lbl'",1,2)!="AW" drop `var'
}
describe
The second describe will show that only the variable x remains.
The official command ds and the user-written command findname (Stata Journal; use search findname to download the latest version) will identify such variables, after which keep can be used directly.
. sysuse auto, clear
(1978 Automobile Data)
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2007 17:45
size: 3,478 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
--------------------------------------------------------------------------------------
Sorted by: foreign
. ds, has(varlabel Mile*)
mpg
. findname , varlabeltext(Mile*)
mpg
. keep `r(varlist)'
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 1 13 Apr 2007 17:45
size: 444 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
mpg int %8.0g Mileage (mpg)
--------------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
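Applied to the original question (labels starting with "AW"), the same ds approach might look like the sketch below, which reuses the toy data from the loop-based answer. The pattern AW* is matched against the variable labels.

```stata
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"

// list variables whose label starts with AW, then keep only those
ds, has(varlabel AW*)
keep `r(varlist)'
describe
```

If no label matches, r(varlist) is empty, so in real use it is worth checking that macro before calling keep.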