When using Stata's describe command:
sysuse auto
d
I get the following output:
Contains data from /usr/local/stata13/ado/base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2013 17:45
 size:         3,182                          (_dta has notes)
----------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
----------------------------------------------------------------------------------------------------------------------------
Sorted by: foreign
Is it possible to remove the value label from the displayed results?
If you want to see everything but the value label names, that is programmable but cannot, I think, be achieved through option choices.
Here is a first approximation to such a program:
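program mydescribe
    version 8.2
    syntax [varlist]
    di
    foreach v of local varlist {
        * variable name, abbreviated to at most 20 characters
        di abbrev("`v'", 20) _c
        * storage type, display format and variable label in fixed columns
        di "{col 22}`: type `v''" _c
        di "{col 30}`: format `v''" _c
        di "{col 38}`: variable label `v''"
    }
end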
I am working on modifying, renaming and arranging estimation results for final publication in Stata. What I have so far is:
sysuse auto, clear
* interaction model
gen int_mpg_mpg = mpg*mpg // generate interaction manually
qui regress price weight mpg int_mpg_mpg foreign
mat b=e(b) // store estimation results in matrix b
matrix list b // see the problem, column 3 "int_mpg_mpg"
b[1,5]
         weight          mpg  int_mpg_mpg      foreign        _cons
y1    3.0379251   -298.14637    5.8617627    3420.2156   -527.04589
* rename interaction with additional package erepost
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
capt prog drop replace_b
program replace_b, eclass
erepost b= b, rename
end
replace_b
eststo my_model
esttab my_model, label
This gives a nice version of the interaction term:
------------------------------------
                              (1)
                            Price
------------------------------------
b
Weight (lbs.)               3.038***
                           (3.84)
Mileage (mpg)              -298.1
                          (-0.82)
Mileage (mpg) # Mi~)        5.862
                           (0.90)
Car type                   3420.2***
                           (4.62)
Constant                   -527.0
                          (-0.08)
------------------------------------
Observations                   74
------------------------------------
But imagine that I have many more models and variables. It then becomes unwieldy to list every variable and every interaction in the local coln:
local coln "b:weight b:mpg b:c.mpg#c.mpg b:foreign b:_cons"
mat colnames b= `coln'
I am trying to only fix particular column names in the estimation matrix:
local colfix "b:c.mpg#c.mpg"
mat colnames b[1,3]= `colfix'
How can I access a particular column of a matrix in Stata?
Your intended new name seems the same as the existing name.
I don't know a way to change individual column names, but this may help:
. matrix foo = J(3, 4, 42)
. matrix colname foo = frog toad DRAGON griffin
. mat li foo
foo[3,4]
       frog     toad   DRAGON  griffin
r1       42       42       42       42
r2       42       42       42       42
r3       42       42       42       42
. local colnames : colnames foo
. local colnames : subinstr local colnames "DRAGON" "newt", all
. matrix colname foo = `colnames'
. matrix li foo
foo[3,4]
       frog     toad     newt  griffin
r1       42       42       42       42
r2       42       42       42       42
r3       42       42       42       42
It's a modest exercise to make that into a subroutine. The bigger deal is whether there is a way to do this in the estout suite, wonderful commands I never use.
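As a rough sketch of such a subroutine, the following renames one column by position rather than by its old name. The program name renamecol and its three arguments are hypothetical, not part of Stata or estout, and the sketch assumes Stata 11 or later so that factor-variable names such as c.mpg#c.mpg are accepted as column names.

```stata
* renamecol is a hypothetical helper, not an official or estout command
capture program drop renamecol
program define renamecol
    version 11
    * matname: existing matrix; pos: column number; newname: replacement name
    args matname pos newname
    confirm matrix `matname'
    * full names keep any equation prefix, e.g. b:weight
    local old : colfullnames `matname'
    local new
    local j 0
    foreach name of local old {
        local ++j
        if `j' == `pos' {
            local new `new' `newname'
        }
        else {
            local new `new' `name'
        }
    }
    matrix colnames `matname' = `new'
end

* usage with the toy matrix from above
matrix foo = J(3, 4, 42)
matrix colnames foo = frog toad DRAGON griffin
renamecol foo 3 newt
matrix list foo
```

For the estimation example, something like renamecol b 3 b:c.mpg#c.mpg would then replace only the third column name, leaving the others untouched.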
For simplicity, assume the following script, which creates a simple regression table:
sysuse auto
eststo clear
qui regress price weight mpg
esttab using "table.rtf", cells(t) mtitles onecell nogap ///
stats(N, labels("Observations")) label ///
compress replace
eststo clear
Output:
                       (1)
                         .
                         t
Weight (lbs.)     2.723238
Mileage (mpg)    -.5746808
Constant           .541018
Observations            74
Question:
Would it be possible to mark every t-value above 0.5 or below -0.5 with an asterisk (that is, every t-value greater than 0.5 in absolute value)?
Please note: In the specific application case, I can't work with given p-values, and need a custom solution that works with thresholds of t.
Desired outcome:
                       (1)
                         .
                         t
Weight (lbs.)     2.723238*
Mileage (mpg)    -.5746808*
Constant           .541018*
Observations            74
Thank you for your help!
You cannot do this directly with estout but the following works all the same:
sysuse auto, clear
regress price weight mpg
quietly esttab, mtitles onecell nogap stats(N, labels("Observations")) label ///
compress replace star staraux
matrix A = r(coefs)
matrix A = A[1...,2]
svmat A
generate A2 = "*" if abs(A1) >= 0.5 & !missing(A1)
generate A4 = string(A1) + A2
local names : rownames A
generate A3 = ""
forvalues i = 1 / `: word count `names'' {
    replace A3 = `"`: word `i' of `names''"' in `i'
}
list A3 A4 if !missing(A3)
     +---------------------+
     |     A3           A4 |
     |---------------------|
  1. | weight    2.723238* |
  2. |    mpg   -.5746808* |
  3. |  _cons     .541018* |
     +---------------------+
preserve
keep if !missing(A3)
export delimited A3 A4 using table.txt, delimiter(" ") novarnames
restore
You will have to do some more gymnastics to get the variable labels etc.
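One way to get at the variable labels is to skip estout altogether and loop over the coefficient names left behind by regress. This is a different technique from the one above, shown only as a minimal sketch: the t-values are computed as _b[]/_se[], the 0.5 threshold comes from the question, and the "Constant" label for _cons is an assumption. It also assumes plain variable names (no factor-variable terms) among the regressors.

```stata
sysuse auto, clear
quietly regress price weight mpg

* names of the coefficients, including _cons
local names : colnames e(b)
foreach name of local names {
    * t-value computed directly from the stored results
    local t = _b[`name'] / _se[`name']
    * star rule from the question: |t| above 0.5
    local star = cond(abs(`t') > 0.5, "*", "")
    if "`name'" == "_cons" {
        local lab "Constant"
    }
    else {
        local lab : variable label `name'
    }
    display as text %-20s "`lab'" as result %12.0g (`t') "`star'"
}
```

The display lines could be redirected with file write, or the pieces stored in variables as above, if a file rather than screen output is needed.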
I want to store results from ordinary least squares (OLS) regressions in Stata within a double loop.
Here is the structure of my code:
foreach i2 of numlist 1 2 3 {
    foreach i3 of numlist 1 2 3 4 {
        quiet: eststo: reg dep covariates, robust
    }
}
The end goal is a table in Excel consisting of twelve rows (one for each model) and seven columns (number of observations, estimated constant, and five estimated coefficients).
Any suggestions on how I can do this?
Such a table can be created simply by using the community-contributed command esttab:
sysuse auto, clear
eststo clear
eststo m1: quietly regress price weight
eststo m2: quietly regress price weight mpg
quietly esttab
matrix A = r(coefs)'
matrix C = r(stats)'
tokenize "`: rownames A'"
forvalues i = 1 / `=rowsof(A)' {
    if strmatch("``i''", "*b*") matrix B = nullmat(B) \ A[`i', 1...]
}
matrix C = B , C
matrix rownames C = "Model 1" "Model 2"
Result:
esttab matrix(C) using table.csv, eqlabels(none) mlabels(none) varlabels("Model 1" "Model 2")
----------------------------------------------------------------
                  weight          mpg        _cons            N
----------------------------------------------------------------
Model 1         2.044063                 -6.707353           74
Model 2         1.746559    -49.51222     1946.069           74
----------------------------------------------------------------
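For the double loop in the question, an alternative that does not rely on esttab is to stack one row per model in a matrix and hand that to putexcel. The following is only a sketch under stated assumptions: the regressors, the rule that drops a different twelfth of the sample in each model, the row names and the file name results.xlsx are all placeholders, and the putexcel syntax shown is the Stata 14 or later form. Note that the constant ends up in the last column rather than the second.

```stata
sysuse auto, clear
capture matrix drop R
local rnames
foreach i2 of numlist 1 2 3 {
    foreach i3 of numlist 1 2 3 4 {
        * placeholder sample rule (an assumption): leave out one twelfth of the data
        local fold = (`i2' - 1) * 4 + `i3' - 1
        quietly regress price weight mpg length turn headroom ///
            if mod(_n, 12) != `fold', vce(robust)
        * one row per model: N, the five slopes, then the constant
        matrix newrow = (e(N), e(b))
        matrix R = nullmat(R) \ newrow
        local rnames `rnames' m`i2'_`i3'
    }
}
matrix colnames R = N weight mpg length turn headroom _cons
matrix rownames R = `rnames'
matrix list R

* Stata 14+ syntax; results.xlsx is a hypothetical file name
putexcel set "results.xlsx", replace
putexcel A1 = matrix(R), names
```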
I have a very large dataset that needs to be refined to only certain variables. These variables' names are all over the place, but each wanted variable's label starts with "AW". Is there a way to use keep or an analogous command to keep only these variables?
For this you probably need a loop.
// create sample data
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"
describe
// do the job
foreach var of varlist x-z {
    local lbl : variable label `var'
    if substr("`lbl'",1,2)!="AW" drop `var'
}
describe
The second describe will show that only the variable x remains.
Official command ds and user-written command findname (Stata Journal: use search findname to download latest version) will identify such variables, after which keep can be used directly.
. sysuse auto, clear
(1978 Automobile Data)
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2007 17:45
 size:         3,478 (99.9% of memory free)   (_dta has notes)
--------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
--------------------------------------------------------------------------------------
Sorted by: foreign
. ds, has(varlabel Mile*)
mpg
. findname , varlabeltext(Mile*)
mpg
. keep `r(varlist)'
. d
Contains data from C:\Program Files\Stata10\ado\base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:             1                          13 Apr 2007 17:45
 size:           444 (99.9% of memory free)   (_dta has notes)
--------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------
mpg             int     %8.0g                 Mileage (mpg)
--------------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved
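Applied to the original question, the same idea might look like the sketch below, which uses only the official ds command. The toy variables x, y and z and their labels are made up for illustration; the guard on r(varlist) is there because keep needs at least one variable name.

```stata
* toy data: only x has a label starting with "AW"
clear
set obs 1
generate x = 1
generate y = 2
generate z = 3
label variable x "AW test"
label variable y "XW test"
label variable z "AQ test"

* ds leaves the matching names in r(varlist)
ds, has(varlabel AW*)
if "`r(varlist)'" != "" {
    keep `r(varlist)'
}
describe
```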
I have data with an income variable and a weight variable, and I want to calculate 5% quantiles (20 groups) by year.
Is there a way to do that?
For the weights, I can use the official xtile command:
xtile quan = salary [aw=weight], n(20)
And for the years I can use xtile from egenmore:
egen quan = xtile(salary), by(year) nq(20)
But how can I do it for weights and by year together?
There is a weights() option, as stated in help egenmore:
clear
set more off
sysuse auto
keep mpg foreign weight
// egenmore
egen mpg4 = xtile(mpg), by(foreign) nq(4) weights(weight)
// compare with xtile
xtile mpg4_1 = mpg [aweight=weight] if foreign, nq(4)
xtile mpg4_2= mpg [aweight=weight] if !foreign, nq(4)
egen mpg42 = rowtotal(mpg4_1 mpg4_2)
assert mpg4 == mpg42
sort foreign mpg weight
list, sepby(foreign)
In the ado-file for egen's xtile function, you can check how weights are set:
if "`weights'" ~= "" {
    local weight "[aw = `weights']"
}
See viewsource _gxtile.ado.
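If installing egenmore is not an option, much the same result can be had with official xtile inside a loop over the groups. This is just a sketch generalizing the comparison code above to any number of groups; it uses foreign from the auto data as a stand-in for year, and quan is a made-up variable name.

```stata
sysuse auto, clear
generate byte quan = .

* loop over the distinct values of the grouping variable
levelsof foreign, local(groups)
foreach g of local groups {
    tempvar q
    * weighted quartiles within this group only
    xtile `q' = mpg [aweight = weight] if foreign == `g', nq(4)
    quietly replace quan = `q' if foreign == `g'
    drop `q'
}
tabulate quan foreign
```

For the data in the question, foreign would become year, mpg would become salary, and nq(4) would become nq(20).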