I am working on modyfinyg, renaming and arranging estimation results for final publication in Stata. What I have so far is:
sysuse auto, clear
* interaction model
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
mat b=e(b) // store estimation results in matrix b
matrix list b // see the problem, colum 3 "int_mpg_mpg"
b[1,5]
weight mpg int_mpg_mpg foreign _cons
y1 3.0379251 -298.14637 5.8617627 3420.2156 -527.04589
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo my_model
esttab my_model, label
This gives a nice version of the interaction term:
------------------------------------
(1)
Price
------------------------------------
b
Weight (lbs.) 3.038***
(3.84)
Mileage (mpg) -298.1
(-0.82)
Mileage (mpg) # Mi~) 5.862
(0.90)
Car type 3420.2***
(4.62)
Constant -527.0
(-0.08)
------------------------------------
Observations 74
------------------------------------
But imagine that I have many more models and variables. Then it becomes very unhandy to list all variables, all interactions etc. in this local coln
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
I am trying to only fix particular column names in the estimation matrix:
local colfix "b:c.mpg#c.mpg"
mat colnames b[1,3]= `colfix'
How can I access a particular column of a matrix in Stata?
Your intended new name seems the same as the existing name.
I don't know a way to change individual column names, but this may help:
. matrix foo = J(3, 4, 42)
. matrix colname foo = frog toad DRAGON griffin
. mat li foo
foo[3,4]
frog toad DRAGON griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
. local colnames : colnames foo
. local colnames : subinstr local colnames "DRAGON" "newt", all
. matrix colname foo = `colnames'
. matrix li foo
foo[3,4]
frog toad newt griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
It's a modest exercise to make that into a subroutine. The bigger deal is whether there is a way to do this in the estout suite, wonderful commands I never use.
Related
for simplification, let's assume following script to create a simple regression table:
sysuse auto
eststo clear
qui regress price weight mpg
esttab using "table.rtf", cells(t) mtitles onecell nogap ///
stats(N, labels("Observations")) label ///
compress replace
eststo clear
Output:
(1)
.
t
Weight (lbs.) 2.723238
Mileage (mpg) -.5746808
Constant .541018
Observations 74
Question:
Would it be possible to mark every t-value above 0.5 or below 0.5 with an asterisk? (= greater than absolute value 0.5)
Please note: In the specific application case, I can't work with given p-values, and need a custom solution that works with thresholds of t.
Desired outcome:
(1)
.
t
Weight (lbs.) 2.723238*
Mileage (mpg) -.5746808*
Constant .541018*
Observations 74
Crossposting can be found here:
Thank you for your help!
You cannot do this directly with estout but the following works all the same:
sysuse auto, clear
regress price weight mpg
quietly esttab, mtitles onecell nogap stats(N, labels("Observations")) label ///
compress replace star staraux
matrix A = r(coefs)
matrix A = A[1...,2]
svmat A
generate A2 = "*" if abs(A1) >= 0.5
generate A4 = string(A1) + A2
local names : rownames A
generate A3 = ""
forvalues i = 1 / `: word count `names'' {
replace A3 = `"`: word `i' of `names''"' in `i'
}
list A3 A4 if !missing(A3)
+---------------------+
| A3 A4 |
|---------------------|
1. | weight 2.723238* |
2. | mpg -.5746808* |
3. | _cons .541018* |
+---------------------+
preserve
keep if !missing(A3)
export delimited A3 A4 using table.txt, delimiter(" ") novarnames
restore
You will have to do some more gymnastics to get the variable labels etc.
I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3{
foreach i3 of numlist 1 2 3 4{
quiet: eststo: reg dep covariates, robust
}
}
The end goal is to have a table in Excel composed by twelve rows (one for each model) and seven columns (number of observations, estimated constant, five estimated coefficients).
Any suggestion on how can I do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
weight mpg _cons N
----------------------------------------------------------------
Model 1 2.044063 -6.707353 74
Model 2 1.746559 -49.51222 1946.069 74
----------------------------------------------------------------
I have a very large dataset that needs to be refined to only certain variables. These variables' names are all over the place, but each wanted variable's label starts with "AW". Is there a way to use keep or an analogous command to keep only these variables?
For this you probably need a loop.
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"
describe
// do the job
foreach var of varlist x-z {
local lbl : variable label `var'
if substr("`lbl'",1,2)!="AW" drop `var'
}
describe
The second describe will show that only the variable x remains.
Official command ds and user-written command findname (Stata Journal: use search findname to download latest version) will identify such variables, after which keep can be used directly.
. sysuse auto, clear
(1978 Automobile Data)
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2007 17:45
size: 3,478 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
--------------------------------------------------------------------------------------
Sorted by: foreign
. ds, has(varlabel Mile*)
mpg
. findname , varlabeltext(Mile*)
mpg
. keep `r(varlist)'
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 1 13 Apr 2007 17:45
size: 444 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
mpg int %8.0g Mileage (mpg)
--------------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
I have a following code with the output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
qui mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price . . .
weight . . .
length . . .
---------------------------------------------------
Now if I only change qui mean `i' to mean `i', I have an output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.1936
length 74 187.9324 22.2663
---------------------------------------------------
I was wondering why I am not getting the output when I use qui. Note this has nothing to do with estout. It has to do with mean.
I don't use estout (SSC/SJ) but this shows (as others have reported) my complete failure to reproduce the reported problem that the matrix of desired results is empty when quietly is applied:
. about
Stata/SE 13.1 for Windows (64-bit x86-64)
Revision 03 Jul 2014
Copyright 1985-2013 StataCorp LP
local and personal details edited out
. sysuse auto,clear
(1978 Automobile Data)
. clear matrix
. local vars price weight length
. foreach i of local vars{
2. qui mean `i'
3. mat `i'=r(table)
4. scalar mean_`i'=`i'[1,1]
5. scalar se_`i'=`i'[2,1]
6. scalar n_`i'=`i'[7,1]+1
7. scalar sd_`i'=se_`i'*sqrt(n_`i')
8. mat des_`i'=(n_`i',mean_`i',sd_`i')
9. mat colnames des_`i'=Observations Mean SD
10. mat rownames des_`i'=`i'
11. mat des_result=nullmat(des_result)\des_`i'
12. }
. mat li des_result
des_result[3,3]
Observations Mean SD
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.19357
length 74 187.93243 22.26634
When using Stata's describe (docs):
sysuse auto
d
I get the following output:
Contains data from /usr/local/stata13/ado/base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2013 17:45
size: 3,182 (_dta has notes)
----------------------------------------------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
----------------------------------------------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
----------------------------------------------------------------------------------------------------------------------------
Sorted by: foreign
Is it possible to remove the value label from the displayed results?
If you want to see everything but value label names, that's programmable but cannot I think be achieved by option choices.
Here is a first approximation to such a program:
program mydescribe
version 8.2
syntax [varlist]
di
foreach v of local varlist {
di abbrev("`v'", 20) _c
di "{col 22}`: type `v''" _c
di "{col 30}`: format `v''" _c
di "{col 38}`: variable label `v''"
}
end