Storing a numerical variable in a local macro? - list

I have a variable called "count," which contains the number of subjects who attend each of 1300 study visits. I would like to store these values in a local macro and display them one by one using a for loop.
E.g.,
local count_disp = count
forvalues i = 1/1300 {
disp `i' of `count_disp'
}
However, I'm unsure how to store the entire list of the count variable in a macro or how to call each "word" in the macro.
Is this possible in Stata?

In case you only want to display all values in order, then it is easier to skip the intermediate step of creating the macro. You can just display the values row by row like this:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte count
11
22
33
end
* Loop over the number of observation and display
* the value of variable count for that row number
count
forvalues i = 1/`r(_N)' {
* Display value
display count[`i']
}

Related

How to take maximum of rolling window without SSC packages

How can I create a variable in Stata that contains the maximum of a dynamic rolling window applied to another variable? The rolling window must be able to change iteratively within a loop.
max(numvar, L1.numvar, L2.numvar) will give me what I want for a single window size, but how can I change the window size iteratively within the loop?
My current code for calculating the rolling sum (credit to #Nick Cox for the algorithm):
generate var1lagged = 0
forval k = -2(1)2 {
if `k' < 0 {
local k1 = -(`k')
replace var1lagged = var1lagged + L`k1'.var1
}
else replace var1lagged = var1lagged + F`k'.var1
}
How would I achieve the same sort of flexibility but with the maximum, minimum, or average of the window?
In the simplest case suppose K at least 1 is given as the number of lags in the window
local arg L1.numvar
forval k = 2/`K' {
local arg `arg', L`k'.numvar
}
gen wanted = max(`arg')
If the window includes the present value, that is just a twist
local arg numvar
forval k = 1/`K' {
local arg `arg', L`k'.numvar
}
gen wanted = max(`arg')
More generally, numvar would not be a specific variable name, but would be a local macro including such a name.
EDIT 1
This returns missing as a result only if all arguments are missing. If you wanted to insist on missing as a result if any argument is missing, then go
gen wanted = cond(missing(`arg'), ., max(`arg'))
EDIT 2
Check out rolling more generally. Otherwise for a rolling mean you calculate directly you need to work out (1) the sum as in the question (2) the number of non-missing values.
The working context of the OP evidently rules out installing community-contributed commands; otherwise I would recommend rangestat and rangerun (SSC). Note that many community-contributed commands have been published via the Stata Journal, GitHub or user sites.

Day of the week effect - excluding dummy variables not individually

I want to test the day of the week effect of stock returns. The stata code I have written works, but looks fairly inefficient.
// 1) Monday effect
eststo:reg return day_dummy2 day_dummy3 day_dummy4 day_dummy5
// 2) Tuesday effect
eststo:reg return day_dummy1 day_dummy3 day_dummy4 day_dummy5
// 3) Wednesday effect
eststo:reg return day_dummy1 day_dummy2 day_dummy4 day_dummy5
and so on.
Is there a way to write a code with the same function (excluding one day at a time) with e.g. a foreach loop?
Thank you very much for your help!
A bit clunky, perhaps, but you could use Stata's macro (see help extended_fcn) functions to iteratively exclude one of your listed variables and generate the list of remaining variables.
local vars "day1 day2 day3 day4 day5 day6 day7"
forvalues i = 1/7 {
local varexclude : word `i' of `vars'
local varsout`i' : subinstr local vars "`varexclude'" ""
// insert -estout- command here
}
macro list // to verify the individual `varsout`i'' local macros
You can obtain the initial varlist with ds day*, which stores the variable list in r(varlist).

Extract the mean from svy mean result in Stata

I am able to extract the mean into a matrix as follows:
svy: mean age, over(villageid)
matrix villagemean = e(b)'
clear
svmat village
However, I also want to merge this mean back to the villageid. My current thinking is to extract the rownames of the matrix villagemean like so:
local names : rownames villagemean
Then try to turn this macro names into variable
foreach v in names {
gen `v' = "``v''"
}
However, the variable names is empty. What did I do wrong? Since a lot of this is copied from Stata mailing list, I particularly don't understand the meaning of local names : rownames villagemean.
It's not completely clear to me what you want, but I think this might be it:
clear
set more off
*----- example data -----
webuse nhanes2f
svyset [pweight=finalwgt]
svy: mean zinc, over(sex)
matrix eb = e(b)
*----- what you want -----
levelsof sex, local(levsex)
local wc: word count `levsex'
gen avgsex = .
forvalues i = 1/`wc' {
replace avgsex = eb[1,`i'] if sex == `:word `i' of `levsex''
}
list sex zinc avgsex in 1/10
I make use of two extended macro functions:
local wc: word count `levsex'
and
`:word `i' of `levsex''
The first one returns the number of words in a string; the second returns the nth token of a string. The help entry for extended macro functions is help extended_fcn. Better yet, read the manuals, starting with: [U] 18.3 Macros. You will see there (18.3.8) that I use an abbreviated form.
Some notes on your original post
Your loop doesn't do what you intend (although again, not crystal clear to me) because you are supplying a list (with one element: the text name). You can see it running and comparing:
local names 1 2 3
foreach v in names {
display "`v'"
}
foreach v in `names' {
display "`v'"
}
foreach v of local names {
display "`v'"
}
You need to read the corresponding help files to set that right.
As for the question in your original post, : rownames is another extended macro function but for matrices. See help matrix, #11.
My impression is that for the kind of things you are trying to achieve, you need to dig deeper into the manuals. Furthermore, If you have not read the initial chapters of the Stata User's Guide, then you must do so.

SAS function that will only use non-missing values for a variable?

I am trying to create a new variable that is the sum of other variables. Should be simple enough, however if one of the variables that is being used in the calculation of the new variable has a missing value, then the new variable has a missing value as well, when I want it to just sum across the remaining non-missing variables. For example, the data may look like:
a b c d e
1 . 3 2 6
The new variable is calculated as
newvar=a+b+c+d+e
For the above row, SAS returns a missing value for newvar because b is missing, when I would like it to return
newvar=a+c+d+e
as the answer. Is there a simple way to get SAS to do this?
Sure thing: just use the SUM function:
data _null_;
a=1;
b=.;
c=3;
d=2;
e=6;
newvar = sum(a,b,c,d,e);
put newvar=;
run;

Perform Fisher Exact Test from aggregated using Stata

I have a set of data like below:
A B C D
1 2 3 4
2 3 4 5
They are aggregated data which ABCD constitutes a 2x2 table, and I need to do Fisher exact test on each row, and add a new column for the p-value of the Fisher exact test for that row.
I can use fisher.exact and loop to do it in R, but I can't find a command in Stata for Fisher exact test.
You are thinking in R terms, and that is often fruitless in Stata (just as it is impossible for a Stata guy to figure out how to do by ... : regress in R; every package has its own paradigm and its own strengths).
There are no objects to add columns to. May be you could say a little bit more as to what you need to do, eventually, with your p-values, so as to find an appropriate solution that your Stata collaborators would sympathize with.
If you really want to add a new column (generate a new variable, speaking Stata), then you might want to look at tabulate and its returned values:
clear
input x y f1 f2
0 0 5 10
0 1 7 12
1 0 3 8
1 1 9 5
end
I assume that your A B C D stand for two binary variables, and the numbers are frequencies in the data. You have to clear the memory, as Stata thinks about one data set at a time.
Then you could tabulate the results and generate new variables containing p-values, although that would be a major waste of memory to create variables that contain a constant value:
tabulate x y [fw=f1], exact
return list
generate p1 = r(p_exact)
tabulate x y [fw=f2], exact
generate p2 = r(p_exact)
Here, [fw=variable] is a way to specify frequency weights; I typed return list to find out what kind of information Stata stores as the result of the procedure. THAT'S the object-like thing Stata works with. R would return the test results in the fisher.test()$p.value component, and Stata creates returned values, r(component) for simple commands and e(component) for estimation commands.
If you want a loop solution (if you have many sets), you can do this:
forvalues k=1/2 {
tabulate x y [fw=f`k'], exact
generate p`k' = r(p_exact)
}
That's the scripting capacity in which Stata, IMHO, is way stronger than R (although it can be argued that this is an extremely dirty programming trick). The local macro k takes values from 1 to 2, and this macro is substituted as ``k'` everywhere in the curly bracketed piece of code.
Alternatively, you can keep the results in Stata short term memory as scalars:
tabulate x y [fw=f1], exact
scalar p1 = r(p_exact)
tabulate x y [fw=f2], exact
scalar p2 = r(p_exact)
However, the scalars are not associated with the data set, so you cannot save them with the
data.
The immediate commands like cci suggested here would also have returned values that you can similarly retrieve.
HTH, Stas
Have a look the cci command with the exact option:
cci 10 15 30 10, exact
It is part of the so-called "immediate" commands. They allow you to do computations directly from the arguments rather than from data stored in memory. Have a look at help immediate
Each observation in the poster's original question apparently consisted of the four counts in one traditional 2 x 2 table. Stas's code applied to data of individual observations. Nick pointed out that -cci- can analyze a b c d data. Here's code that applies -cci to each table and, like Stas's code, adds the p-values to the data set. The forvalues i = 1/`=_N' statement tells Stata to run the loop from the first to the last observation. a[`i'] refers to the the value of the variable `a' in the i-th observation.
clear
input a b c d
10 2 8 4
5 8 2 1
end
gen exactp1 = .
gen exactp2 =.
label var exactp1 "1-sided exact p"
label var exactp2 "2-sided exact p"
forvalues i = 1/`=_N'{
local a = a[`i']
local b = b[`i']
local c = c[`i']
local d = d[`i']
qui cci `a' `b' `c' `d', exact
replace exactp1 = r(p1_exact) in `i'
replace exactp2 = r(p_exact) in `i'
}
list
Note that there is no problem in giving a local macro the same name as a variable.