Rename one particular column of a matrix in Stata

I am working on modifying, renaming, and arranging estimation results in Stata for final publication. What I have so far is:
sysuse auto, clear
* interaction model
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
mat b=e(b) // store estimation results in matrix b
matrix list b // see the problem: column 3 is named "int_mpg_mpg"
b[1,5]
weight mpg int_mpg_mpg foreign _cons
y1 3.0379251 -298.14637 5.8617627 3420.2156 -527.04589
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo my_model
esttab my_model, label
This gives a nice version of the interaction term:
------------------------------------
(1)
Price
------------------------------------
b
Weight (lbs.) 3.038***
(3.84)
Mileage (mpg) -298.1
(-0.82)
Mileage (mpg) # Mi~) 5.862
(0.90)
Car type 3420.2***
(4.62)
Constant -527.0
(-0.08)
------------------------------------
Observations 74
------------------------------------
But imagine that I have many more models and variables. Then it becomes very cumbersome to list every variable, every interaction, etc. in this local coln:
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
I would like to fix only particular column names in the estimation matrix, along these lines:
local colfix "b:c.mpg#c.mpg"
mat colnames b[1,3]= `colfix'
How can I access a particular column of a matrix in Stata?

Your intended new name seems the same as the existing name.
I don't know a way to change individual column names, but this may help:
. matrix foo = J(3, 4, 42)
. matrix colname foo = frog toad DRAGON griffin
. mat li foo
foo[3,4]
frog toad DRAGON griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
. local colnames : colnames foo
. local colnames : subinstr local colnames "DRAGON" "newt", all
. matrix colname foo = `colnames'
. matrix li foo
foo[3,4]
frog toad newt griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
It's a modest exercise to make that into a subroutine. The bigger deal is whether there is a way to do this in the estout suite, wonderful commands I never use.
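The subroutine mentioned above could be sketched roughly as follows. The program name renamecol and its three positional arguments are hypothetical, not part of official Stata or of estout; the sketch simply rebuilds the full list of column names with one position swapped, since matrix colnames only accepts a complete list.

```stata
* Hypothetical helper, not an official command:
* renamecol matname position newname
capture program drop renamecol
program define renamecol
    version 11
    args matname pos newname
    confirm matrix `matname'
    confirm integer number `pos'
    * full names, including any equation prefix such as b:
    local names : colfullnames `matname'
    local n : word count `names'
    if `pos' < 1 | `pos' > `n' {
        display as error "column `pos' is out of range"
        exit 198
    }
    * rebuild the name list, swapping in the new name at position pos
    local new
    forvalues j = 1/`n' {
        if `j' == `pos' {
            local new `new' `newname'
        }
        else {
            local new `new' `: word `j' of `names''
        }
    }
    matrix colnames `matname' = `new'
end

* usage on the toy matrix from the answer
matrix foo = J(3, 4, 42)
matrix colnames foo = frog toad DRAGON griffin
renamecol foo 3 newt
matrix list foo
```

Addressing the column by position rather than by its old name avoids the problem that subinstr would also change any other name containing the same text.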

Related

Stata output table: Conditional symbols in estout

for simplification, let's assume following script to create a simple regression table:
sysuse auto
eststo clear
qui regress price weight mpg
esttab using "table.rtf", cells(t) mtitles onecell nogap ///
stats(N, labels("Observations")) label ///
compress replace
eststo clear
Output:
(1)
.
t
Weight (lbs.) 2.723238
Mileage (mpg) -.5746808
Constant .541018
Observations 74
Question:
Would it be possible to mark every t-value above 0.5 or below -0.5 with an asterisk (that is, every t-value whose absolute value exceeds 0.5)?
Please note: In the specific application case, I can't work with given p-values, and need a custom solution that works with thresholds of t.
Desired outcome:
(1)
.
t
Weight (lbs.) 2.723238*
Mileage (mpg) -.5746808*
Constant .541018*
Observations 74
Thank you for your help!
You cannot do this directly with estout but the following works all the same:
sysuse auto, clear
regress price weight mpg
quietly esttab, mtitles onecell nogap stats(N, labels("Observations")) label ///
compress replace star staraux
matrix A = r(coefs)
matrix A = A[1...,2]
svmat A
generate A2 = "*" if abs(A1) >= 0.5
generate A4 = string(A1) + A2
local names : rownames A
generate A3 = ""
forvalues i = 1 / `: word count `names'' {
replace A3 = `"`: word `i' of `names''"' in `i'
}
list A3 A4 if !missing(A3)
+---------------------+
| A3 A4 |
|---------------------|
1. | weight 2.723238* |
2. | mpg -.5746808* |
3. | _cons .541018* |
+---------------------+
preserve
keep if !missing(A3)
export delimited A3 A4 using table.txt, delimiter(" ") novarnames
restore
You will have to do some more gymnastics to get the variable labels etc.
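One way the variable-label step might look is sketched below. This is a swapped-in variant, not the method above: it avoids svmat and esttab and takes the t statistics from r(table), which assumes a Stata version in which regress leaves r(table) behind. The label "Constant" for _cons is an assumption chosen to mimic esttab's label option.

```stata
sysuse auto, clear
quietly regress price weight mpg
* row 3 of r(table) holds the t statistics; transpose to a column vector
matrix T = r(table)
matrix T = T[3, 1...]'
local names : rownames T
local n = rowsof(T)
forvalues i = 1/`n' {
    local nm : word `i' of `names'
    * default to the coefficient name, then try the variable label
    local lbl "`nm'"
    capture confirm variable `nm'
    if _rc == 0 {
        local vl : variable label `nm'
        if `"`vl'"' != "" local lbl `"`vl'"'
    }
    * assumed display name for the intercept
    if "`nm'" == "_cons" local lbl "Constant"
    * custom threshold on |t| instead of p-values
    local star = cond(abs(T[`i', 1]) >= 0.5, "*", "")
    display as text %-20s `"`lbl'"' as result %12.6f T[`i', 1] "`star'"
}
```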

Exporting results from regressions in Excel

I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3{
foreach i3 of numlist 1 2 3 4{
quiet: eststo: reg dep covariates, robust
}
}
The end goal is a table in Excel with twelve rows (one for each model) and seven columns (number of observations, estimated constant, and five estimated coefficients).
Any suggestions on how I can do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
                  weight         mpg       _cons           N
----------------------------------------------------------------
Model 1         2.044063               -6.707353          74
Model 2         1.746559   -49.51222    1946.069          74
----------------------------------------------------------------
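For the double loop in the question, a sketch without esttab could collect one row per model in a matrix and write it with putexcel. Everything specific here is an assumption made for illustration: the five covariates, the subsample rule that makes the twelve models differ, the file name results.xlsx, and the putexcel syntax, which is that of Stata 14 or later.

```stata
sysuse auto, clear
capture matrix drop R
local rnames
foreach i2 of numlist 1 2 3 {
    foreach i3 of numlist 1 2 3 4 {
        * hypothetical subsample rule, only so that the 12 models differ
        quietly regress price weight mpg length turn headroom ///
            if rep78 >= `i2' & trunk >= 4 + `i3', vce(robust)
        * one row per model: N, constant, five coefficients
        matrix R = nullmat(R) \ (e(N), _b[_cons], _b[weight], _b[mpg], ///
            _b[length], _b[turn], _b[headroom])
        local rnames `rnames' m`i2'_`i3'
    }
}
matrix rownames R = `rnames'
matrix colnames R = N _cons weight mpg length turn headroom
matrix list R

* write the 12 x 7 table to Excel (Stata 14+ syntax assumed)
putexcel set results.xlsx, replace
putexcel A1 = matrix(R), names
```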

Is there a way to keep only those variables with a certain label?

I have a very large dataset that needs to be refined to only certain variables. These variables' names are all over the place, but each wanted variable's label starts with "AW". Is there a way to use keep or an analogous command to keep only these variables?
For this you probably need a loop.
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"
describe
// do the job
foreach var of varlist x-z {
local lbl : variable label `var'
if substr("`lbl'",1,2)!="AW" drop `var'
}
describe
The second describe will show that only the variable x remains.
Official command ds and user-written command findname (Stata Journal: use search findname to download latest version) will identify such variables, after which keep can be used directly.
. sysuse auto, clear
(1978 Automobile Data)
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2007 17:45
size: 3,478 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
--------------------------------------------------------------------------------------
Sorted by: foreign
. ds, has(varlabel Mile*)
mpg
. findname , varlabeltext(Mile*)
mpg
. keep `r(varlist)'
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 1 13 Apr 2007 17:45
size: 444 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
mpg int %8.0g Mileage (mpg)
--------------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
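Applied to the labels in the question, which start with "AW", the ds route might look like this. It is a minimal sketch on the toy data from the loop answer above; the pattern AW* is the only part that would need to change for other prefixes.

```stata
* toy data with one label starting with "AW"
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"

* ds leaves the matching variable names in r(varlist)
ds, has(varlabel "AW*")
keep `r(varlist)'
describe
```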

Using mean versus qui mean in Stata

I have the following code, shown with its output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
qui mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price . . .
weight . . .
length . . .
---------------------------------------------------
Now if I change only qui mean `i' to mean `i', I get the expected output:
sysuse auto,clear
clear matrix
local vars price weight length
foreach i of local vars{
mean `i'
mat `i'=r(table)
scalar mean_`i'=`i'[1,1]
scalar se_`i'=`i'[2,1]
scalar n_`i'=`i'[7,1]+1
scalar sd_`i'=se_`i'*sqrt(n_`i')
mat des_`i'=(n_`i',mean_`i',sd_`i')
mat colnames des_`i'=Observations Mean SD
mat rownames des_`i'=`i'
mat des_result=nullmat(des_result)\des_`i'
}
estout matrix(des_result,fmt(0 4 4))
---------------------------------------------------
des_result
Observations Mean SD
---------------------------------------------------
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.1936
length 74 187.9324 22.2663
---------------------------------------------------
I was wondering why I am not getting the output when I use qui. Note this has nothing to do with estout. It has to do with mean.
I don't use estout (SSC/SJ), but like others I completely failed to reproduce the reported problem, namely that the matrix of desired results is empty when quietly is applied:
. about
Stata/SE 13.1 for Windows (64-bit x86-64)
Revision 03 Jul 2014
Copyright 1985-2013 StataCorp LP
local and personal details edited out
. sysuse auto,clear
(1978 Automobile Data)
. clear matrix
. local vars price weight length
. foreach i of local vars{
2. qui mean `i'
3. mat `i'=r(table)
4. scalar mean_`i'=`i'[1,1]
5. scalar se_`i'=`i'[2,1]
6. scalar n_`i'=`i'[7,1]+1
7. scalar sd_`i'=se_`i'*sqrt(n_`i')
8. mat des_`i'=(n_`i',mean_`i',sd_`i')
9. mat colnames des_`i'=Observations Mean SD
10. mat rownames des_`i'=`i'
11. mat des_result=nullmat(des_result)\des_`i'
12. }
. mat li des_result
des_result[3,3]
Observations Mean SD
price 74 6165.2568 2949.4959
weight 74 3019.4595 777.19357
length 74 187.93243 22.26634
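The code in the question recovers the number of observations as the degrees of freedom in row 7 of r(table) plus one, and the standard deviation as the standard error times the square root of that number. If r(table) is ever the suspect, a different route to the same table is summarize, which returns r(N), r(mean) and r(sd) directly. This is a swapped-in alternative, sketched here, not a diagnosis of the reported problem.

```stata
sysuse auto, clear
capture matrix drop des_result
foreach v in price weight length {
    * summarize returns N, mean and SD directly in r()
    quietly summarize `v'
    matrix des_`v' = (r(N), r(mean), r(sd))
    matrix colnames des_`v' = Observations Mean SD
    matrix rownames des_`v' = `v'
    matrix des_result = nullmat(des_result) \ des_`v'
}
matrix list des_result
```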

How can I modify output of Stata's describe command?

When using Stata's describe:
sysuse auto
d
I get the following output:
Contains data from /usr/local/stata13/ado/base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2013 17:45
size: 3,182 (_dta has notes)
----------------------------------------------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
----------------------------------------------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
----------------------------------------------------------------------------------------------------------------------------
Sorted by: foreign
Is it possible to remove the value label from the displayed results?
If you want to see everything except the value label names, that can be programmed, but I don't think it can be achieved through describe's own options.
Here is a first approximation to such a program:
program mydescribe
version 8.2
syntax [varlist]
di
foreach v of local varlist {
di abbrev("`v'", 20) _c
di "{col 22}`: type `v''" _c
di "{col 30}`: format `v''" _c
di "{col 38}`: variable label `v''"
}
end