How can I modify output of Stata's describe command? - stata

When using Stata's describe (docs):
sysuse auto
d
I get the following output:
Contains data from /usr/local/stata13/ado/base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2013 17:45
size: 3,182 (_dta has notes)
----------------------------------------------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
----------------------------------------------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
----------------------------------------------------------------------------------------------------------------------------
Sorted by: foreign
Is it possible to remove the value label from the displayed results?

If you want to see everything but value label names, that's programmable but cannot I think be achieved by option choices.
Here is a first approximation to such a program:
program mydescribe
version 8.2
syntax [varlist]
di
foreach v of local varlist {
di abbrev("`v'", 20) _c
di "{col 22}`: type `v''" _c
di "{col 30}`: format `v''" _c
di "{col 38}`: variable label `v''"
}
end

Related

Rename one particular column of a matrix in Stata

I am working on modyfinyg, renaming and arranging estimation results for final publication in Stata. What I have so far is:
sysuse auto, clear
* interaction model
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
mat b=e(b) // store estimation results in matrix b
matrix list b // see the problem, colum 3 "int_mpg_mpg"
b[1,5]
weight mpg int_mpg_mpg foreign _cons
y1 3.0379251 -298.14637 5.8617627 3420.2156 -527.04589
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo my_model
esttab my_model, label
This gives a nice version of the interaction term:
------------------------------------
(1)
Price
------------------------------------
b
Weight (lbs.) 3.038***
(3.84)
Mileage (mpg) -298.1
(-0.82)
Mileage (mpg) # Mi~) 5.862
(0.90)
Car type 3420.2***
(4.62)
Constant -527.0
(-0.08)
------------------------------------
Observations 74
------------------------------------
But imagine that I have many more models and variables. Then it becomes very unhandy to list all variables, all interactions etc. in this local coln
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
I am trying to only fix particular column names in the estimation matrix:
local colfix "b:c.mpg#c.mpg"
mat colnames b[1,3]= `colfix'
How can I access a particular column of a matrix in Stata?
Your intended new name seems the same as the existing name.
I don't know a way to change individual column names, but this may help:
. matrix foo = J(3, 4, 42)
. matrix colname foo = frog toad DRAGON griffin
. mat li foo
foo[3,4]
frog toad DRAGON griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
. local colnames : colnames foo
. local colnames : subinstr local colnames "DRAGON" "newt", all
. matrix colname foo = `colnames'
. matrix li foo
foo[3,4]
frog toad newt griffin
r1 42 42 42 42
r2 42 42 42 42
r3 42 42 42 42
It's a modest exercise to make that into a subroutine. The bigger deal is whether there is a way to do this in the estout suite, wonderful commands I never use.

Stata output table: Conditional symbols in estout

for simplification, let's assume following script to create a simple regression table:
sysuse auto
eststo clear
qui regress price weight mpg
esttab using "table.rtf", cells(t) mtitles onecell nogap ///
stats(N, labels("Observations")) label ///
compress replace
eststo clear
Output:
(1)
.
t
Weight (lbs.) 2.723238
Mileage (mpg) -.5746808
Constant .541018
Observations 74
Question:
Would it be possible to mark every t-value above 0.5 or below 0.5 with an asterisk? (= greater than absolute value 0.5)
Please note: In the specific application case, I can't work with given p-values, and need a custom solution that works with thresholds of t.
Desired outcome:
(1)
.
t
Weight (lbs.) 2.723238*
Mileage (mpg) -.5746808*
Constant .541018*
Observations 74
Crossposting can be found here:
Thank you for your help!
You cannot do this directly with estout but the following works all the same:
sysuse auto, clear
regress price weight mpg
quietly esttab, mtitles onecell nogap stats(N, labels("Observations")) label ///
compress replace star staraux
matrix A = r(coefs)
matrix A = A[1...,2]
svmat A
generate A2 = "*" if abs(A1) >= 0.5
generate A4 = string(A1) + A2
local names : rownames A
generate A3 = ""
forvalues i = 1 / `: word count `names'' {
replace A3 = `"`: word `i' of `names''"' in `i'
}
list A3 A4 if !missing(A3)
+---------------------+
| A3 A4 |
|---------------------|
1. | weight 2.723238* |
2. | mpg -.5746808* |
3. | _cons .541018* |
+---------------------+
preserve
keep if !missing(A3)
export delimited A3 A4 using table.txt, delimiter(" ") novarnames
restore
You will have to do some more gymnastics to get the variable labels etc.

Exporting results from regressions in Excel

I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3{
foreach i3 of numlist 1 2 3 4{
quiet: eststo: reg dep covariates, robust
}
}
The end goal is to have a table in Excel composed by twelve rows (one for each model) and seven columns (number of observations, estimated constant, five estimated coefficients).
Any suggestion on how can I do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
weight mpg _cons N
----------------------------------------------------------------
Model 1 2.044063 -6.707353 74
Model 2 1.746559 -49.51222 1946.069 74
----------------------------------------------------------------

Is there a way to keep only those variables with a certain label?

I have a very large dataset that needs to be refined to only certain variables. These variables' names are all over the place, but each wanted variable's label starts with "AW". Is there a way to use keep or an analogous command to keep only these variables?
For this you probably need a loop.
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"
describe
// do the job
foreach var of varlist x-z {
local lbl : variable label `var'
if substr("`lbl'",1,2)!="AW" drop `var'
}
describe
The second describe will show that only the variable x remains.
Official command ds and user-written command findname (Stata Journal: use search findname to download latest version) will identify such variables, after which keep can be used directly.
. sysuse auto, clear
(1978 Automobile Data)
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 12 13 Apr 2007 17:45
size: 3,478 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
make str18 %-18s Make and Model
price int %8.0gc Price
mpg int %8.0g Mileage (mpg)
rep78 int %8.0g Repair Record 1978
headroom float %6.1f Headroom (in.)
trunk int %8.0g Trunk space (cu. ft.)
weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
turn int %8.0g Turn Circle (ft.)
displacement int %8.0g Displacement (cu. in.)
gear_ratio float %6.2f Gear Ratio
foreign byte %8.0g origin Car type
--------------------------------------------------------------------------------------
Sorted by: foreign
. ds, has(varlabel Mile*)
mpg
. findname , varlabeltext(Mile*)
mpg
. keep `r(varlist)'
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
obs: 74 1978 Automobile Data
vars: 1 13 Apr 2007 17:45
size: 444 (99.9% of memory free) (_dta has notes)
--------------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------------
mpg int %8.0g Mileage (mpg)
--------------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved

use xtile by year using weights

I have data with income variable, with weight, and I want to calculate the 5% quantiles by year.
Is there a way to do that?
For the weight I can use regular xtile:
xtile quan = salary [aw=weight], n(20)
And for the years I can use xtile from egenmore:
egen quan = xtile(salary), by(year) nq(20)
But how can I do it for weights and by year together?
There is a weights() option, as stated in help egenmore:
clear
set more off
sysuse auto
keep mpg foreign weight
// egenmore
egen mpg4 = xtile(mpg), by(foreign) nq(4) weights(weight)
// compare with xtile
xtile mpg4_1 = mpg [aweight=weight] if foreign, nq(4)
xtile mpg4_2= mpg [aweight=weight] if !foreign, nq(4)
egen mpg42 = rowtotal(mpg4_1 mpg4_2)
assert mpg4 == mpg42
sort foreign mpg weight
list, sepby(foreign)
In the ado-file for egen's xtile function, you can check how weights are set:
if "`weights'" ~= "" {
local weight "[aw = `weights']"
}
See viewsource _gxtile.ado.