I'm trying to nest IF statements together, so that when it reads through the data, it will be able to categorize it into 3 separate types. The screenshot shows my formula thus far, but it's returning #VALUE! after I enter it.
Since you didn't include your formula, this pattern should be helpful:
=IF(ISNUMBER(MATCH(C6,{"A","B","C"},0)),"Slip...",IF(ISNUMBER(MATCH(C6,{"D","E","F"},0)),"Personal...","Not"))
Per your comment, your formula should be:
=IF(ISNUMBER(MATCH(C6,{"Slip Trip Fall","Fall down Stairs","Knock, Trip or Fall within bus"},0)),"Slip Trip Fall", IF(ISNUMBER(MATCH(C6,{"Alighting incident","Boarding incident","Struck by object","Personal Injury Event","Wheelchair / Buggy incident"},0)),"Personal Injury Event",IF(ISNUMBER(MATCH(C6,{"Collision","Collision incident"},0)),"Personal Injury Event",FALSE)))
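For readers who find the nested formula hard to follow, here is a rough Python sketch of the same lookup logic. It is only an illustration, not a replacement for the spreadsheet formula: the names GROUPS and categorize are made up for this example, and the third group keeps the "Personal Injury Event" label exactly as the formula above is posted. Matching is made case-insensitive on the assumption that MATCH ignores case for text.

```python
# Hypothetical mirror of the IF(ISNUMBER(MATCH(...))) chain; names are illustrative.
GROUPS = [
    ({"Slip Trip Fall", "Fall down Stairs", "Knock, Trip or Fall within bus"},
     "Slip Trip Fall"),
    ({"Alighting incident", "Boarding incident", "Struck by object",
      "Personal Injury Event", "Wheelchair / Buggy incident"},
     "Personal Injury Event"),
    # Label kept as in the posted formula, which also maps collisions here.
    ({"Collision", "Collision incident"}, "Personal Injury Event"),
]


def categorize(value):
    """Return the label of the first group containing value, else False."""
    key = value.casefold()  # assumption: case-insensitive, like MATCH on text
    for members, label in GROUPS:
        if key in {m.casefold() for m in members}:
            return label
    return False  # same fallback as the last argument of the formula
```

Each (members, label) pair plays the role of one IF(ISNUMBER(MATCH(...)), label, ...) level, so adding a category means adding one pair rather than one more level of nesting.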
I have a column of values that are a number out of 10. So, it could be 2/10, 3/10, 4/10 and so on, all the way up to 10/10. To be clear, these are not dates, but simply showing how many questions the student answered correctly out of 10.
I'm trying to use conditional formatting to highlight them a certain color depending upon the score they got. For 9/10 and 10/10, I'm wanting to use a certain color, but it doesn't seem to be working with REGEXMATCH or with OR. Also wanting to highlight all scores that are 6/10 or lower. I know that I could make this work by applying conditional formatting for each and every score with text contains but the problem I'm finding is that it thinks it's a date.
Is there a way to match multiple scores out of 10 using REGEXMATCH?
Select the column and change its formatting to Plain text, so the scores are no longer interpreted as dates.
Now you can use a formula like:
=REGEXMATCH(A1; "^9|10\/")
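To check the patterns outside the sheet, here is a small Python sketch. It assumes the scores are plain text of the form "n/10"; the names HIGH, LOW and band are invented for this example. Python's re module is not the same engine as the one behind REGEXMATCH, but patterns this simple should behave the same way. Note the parentheses: "^(9|10)/" anchors both alternatives at the start, whereas in "^9|10\/" the "^" applies only to the 9.

```python
import re

# Illustrative patterns, assuming scores are stored as text like "7/10".
HIGH = re.compile(r"^(9|10)/10$")  # 9/10 and 10/10
LOW = re.compile(r"^[0-6]/10$")    # 6/10 or lower


def band(score):
    """Classify a score string as 'high', 'low' or 'middle'."""
    if HIGH.search(score):
        return "high"
    if LOW.search(score):
        return "low"
    return "middle"
```

One conditional formatting rule per band would then use the corresponding pattern.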
I am trying to setup a formula in one line that will calculate the proper date that a contract can be cancelled based on the Texas Addendum for Property Subject to Mandatory Owner's Association. Depending on 3 possible selections, Section A1, Section A2, or Section A3, the calculations for the possible termination of a contract vary.
My formulas work on their own, but not when combined into one long IF statement.
Here are the 3 formulas. All work properly on their own.
=if(E12="A1",if(B17="",B20,B17+3),)
=if(E12="A2",if(B17="","",B17+3),)
=if(E12="A3",if(B17="",B20,""),)
However, when combined into one statement I get an #ERROR!.
I've tried multiple ways to write the formula but all get the same #ERROR!.
=if((E12="A2",if(B17="","",B17+3)),if(e12="A1",if(B17="",B20,B17+3)),if(E12="A3",if(B17="",B20,"")),)
=if((E12="A2",if(B17="","",B17+3)),if(e12="A1",if(B17="",B20,B17+3)),if(E12="A3",if(B17="",B20,""),))
=if((E12="A2",if(B17="","",B17+3),),if(e12="A1",if(B17="",B20,B17+3),),if(E12="A3",if(B17="",B20,""),))
For now this works, because I have a final calculation in the necessary cell that takes the one value greater than zero.
=if(D31>0,D31,if(D32>0,D32,if(D33>0,D33)))
But it's not as clean as I'd like to have it. I'd prefer to have this as one single line calculation instead of in 4 different cells.
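IF takes a condition, a value if true and a value if false; each further IF belongs in the value-if-false slot of the previous one, not wrapped in extra parentheses. Proper nesting is done like this: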
=IF(E12="A1", IF(B17="", B20, B17+3),
IF(E12="A2", IF(B17="",, B17+3),
IF(E12="A3", IF(B17="", B20, ), )))
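The branching in that formula can also be read as the following Python sketch. This is only an illustration under assumptions: the function name termination_date is made up, B17 and B20 are assumed to hold dates, and an empty cell is represented by None.

```python
from datetime import date, timedelta


def termination_date(section, b17, b20):
    """Hypothetical mirror of the nested IF; b17/b20 stand for cells B17/B20."""
    if section == "A1":
        # Empty B17: fall back to B20; otherwise three days after B17.
        return b20 if b17 is None else b17 + timedelta(days=3)
    if section == "A2":
        # Empty B17: no date at all; otherwise three days after B17.
        return None if b17 is None else b17 + timedelta(days=3)
    if section == "A3":
        # Only applies when B17 is empty.
        return b20 if b17 is None else None
    return None  # any other selection leaves the result empty
```

For example, termination_date("A1", date(2024, 1, 10), date(2024, 2, 1)) gives date(2024, 1, 13), and the same call with None in place of the first date gives date(2024, 2, 1).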
I have a list of circumstances and effects:
I want to generate a matrix containing the estimated betas. I am going to run the loop 10 times, because I am in fact going to bootstrap my observations.
So far I have tried:
local circumstances height weight
local effort training diet
foreach i in 1 10 {
reg outcome `circumstances' `effects'
* store in column i the values of betas of circumstances
* store in column i the values of betas of effort
}
Does anyone know what the code should look like in order to store those values?
Thank you
The pseudocode would first store in "column 1" the first lot of betas and then overwrite them (column 1) with the second lot of betas. Then it would do the same again for column 10 with the first lot of betas and the second lot of betas. That is a long way from anything that makes sense. Nothing in your pseudocode takes bootstrap samples from the dataset, although perhaps you are intending to add code for that later.
Stata doesn't really work with any idea of column numbers, although the idea makes sense to Mata.
Unless there are very specific reasons -- which you would need to spell out -- there is no need to write your own code ab initio for bootstrapping, as the whole point of bootstrap is to do that for you.
Here is complete code for a reproducible example of bootstrapping a silly regression:
sysuse auto, clear
bootstrap b_weight=_b[weight] b_price=_b[price] , reps(1000) seed(2803) : regress mpg weight price
See also the help for bootstrap to learn about its other options, including saving().
10 repetitions would be regarded as absurdly small for the number of bootstrap samples.
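To make concrete what bootstrap is doing for you, here is a deliberately simplified Python sketch of the idea: resample the observations with replacement, re-estimate, and keep one estimate per replication. It is not how Stata implements it. It uses a single predictor to stay short, and the function names ols_slope and bootstrap_slopes are invented for this illustration.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slopes(xs, ys, reps=1000, seed=2803):
    """Return one slope estimate per bootstrap replication."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    n = len(xs)
    slopes = []
    while len(slopes) < reps:
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: slope undefined, draw again
        slopes.append(ols_slope(bx, by))
    return slopes
```

The standard deviation of the returned list, for example statistics.stdev(slopes), is then the bootstrap standard error of the slope.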
I am using PROC GLIMMIX to analyze repeated measures data about specific sexual events. The original data came from a weekly diary study of about 400 people. During each week they reported on behaviours from their most recent sexual encounter. We also have baseline data on their demographics. 12 weeks of observation were collected and we had a high completion rate.
I would like to create a mixed effect model, but I am unsure exactly how this is done in SAS. I want to test the effect of event-specific factors as well as some person-level demographics and would like to get odds ratios for each factor of interest. The outcome is whether or not drugs were used during the event, and the explanatory factors will be things like age and gender, as well as characteristics of the event (e.g., partner HIV status, whether the partner was a regular sexual partner).
The code I'm working with follows this pattern:
PROC GLIMMIX DATA=work.dataset oddsratio;
CLASS VISIT_NUMBER PARTICIPANT_ID BINARY_EVENTLEVEL_OUTCOME BINARY_EVENTLEVEL_EXPLANATORY_FACTOR CATEGORICAL_PERSONLEVEL_EXPLANATORY_FACTOR;
MODEL BINARY_EVENTLEVEL_OUTCOME = BINARY_EVENTLEVEL_EXPLANATORY CATEGORICAL_PERSONLEVEL_EXPLANATORY_FACTOR /DIST=binary link=logit CL S ddfm=kr;
RANDOM ?????;
RUN;
option 1 for ?????: residual / subject=PARTICIPANT_ID
option 2 for ?????: INTERCEPT / subject=PARTICIPANT_ID
option 3 for ?????: VISIT_NUM / subject=PARTICIPANT_ID residual type=ar(1)
INTERCEPT / subject=VISIT_NUM(PARTICIPANT_ID)
option 4 for ?????: Other?
I am also unclear whether I should use ddfm=kr in my model statement or method=laplace in my proc statement -- both have been recommended elsewhere for this sort of repeated measures analysis.
I've come across several potential options for modelling this which often give similar results, but option 1 gives a statistically significant result for an event-level factor, while the others give non-significant results. Including ddfm=kr makes the result of interest more significant. The method=laplace option does not allow for option 1.
I may not be answering your question, but might be able to provide a couple of directions:
To start with the simplest part, your MODEL statement looks correct to me as you want to test event-level factors and person-level demographics which are thus considered as fixed effects.
Now, as far as the random effects are concerned:
the RANDOM statements you propose for options (1) and (2):
(1) RANDOM _residual_ / subject=PARTICIPANT_ID;
or
(2) RANDOM intercept / subject=PARTICIPANT_ID;
are modeling two different parts of the random effects: the R-side and the G-side, respectively.
If you are already familiar with PROC MIXED, note that your option (1) of using RANDOM _residual_ in PROC GLIMMIX is equivalent to using the REPEATED statement in PROC MIXED, which declares that you have repeated measures for PARTICIPANT_ID. That is clearly your case (Ref: "Comparing the GLIMMIX and MIXED Procedures").
As for option (3):
RANDOM VISIT_NUM / subject=PARTICIPANT_ID residual type=ar(1);
RANDOM INTERCEPT / subject=VISIT_NUM(PARTICIPANT_ID);
here you are modeling the time component of the repeated measures (visit_num) as a random effect, and this should be included when you believe that there would be a random variation of the response at each of the measurement times (i.e. at each event). At first glance, I don't believe this is relevant in your case, since you are already taking this into account through the fixed effects... but of course I may be wrong, as I have not seen your data.
Up to here is what I can contribute at this time.
As next steps for you to have a better understanding, I would suggest that you:
Read the Overview of the PROC GLIMMIX documentation, in particular the mathematical model specification and all 3 sections therein:
The Basic Model
G-Side and R-Side Random Effects and Covariance Structures
Relationship with Generalized Linear Models
If you are still unsure, ask your question at communities.sas.com, where people might be able to help you better.
HTH
I'm trying to save output from several hundred eststo's storing results of bivariate probability models into one Excel file using esttab. It works for xtlogit (both ,re and ,pa), xtprobit (both ,re and ,pa) and for the linear probability model xtreg (both standard and ,fe). However, when I use xtreg y x i.year, fe I get the error message too many base levels specified. Google doesn't help me much.
I've been trying for an hour to create a reproducible example but the Stata datasets all work fine. It does not seem to be due to the number of years or the fact that different specifications have data for different years. Still, the normal xtreg, fe works; the problem only appears with time dummies. The weirdest thing is that it works for all subsets of my variables but not for the whole list (again just the time fixed effects specifications).
Does anyone have an idea how to proceed? Using drop(*.year) works whenever the problem does not arise (so in specifications where it works, I get outputs without the year dummies) but does not prevent the too many base levels specified error; ,nobaselevels has no apparent effect as well. Is there a way to remove the time fixed effects from eststo before I pass those on to esttab? Any workaround would be appreciated as well.
The problem you might be facing is that of Stata creating different base levels for the factor variable year, in different regressions.
Try fixing the factor variable base level beforehand with fvset:
fvset base <some_number> year
Check help fvset and the manual entry for details. Also, read the source given below, which contains more information.
Source: two posts from Statalist; one from Tim Wade and another by Jeff Pitblado.