SAS logical operators help (if at least two are true then...) - sas

SAS question. I am creating a variable where if different logic rules are met it takes on one of two values. Let's say the binary variable I'm creating is RiskScore and there are three conditions A, B, and C that determine which risk score an observation takes on. How would I do this in SAS?
Condition A: Age > 70
Condition B: Cholesterol > 200
Condition C: Has diabetes
If at least two of conditions A, B, or C are TRUE then RiskScore=High;
Else RiskScore=Low;
Thanks for your help!

In SAS, true/false evaluate to 1/0, so you can add up your conditions and test whether the total is greater than or equal to 2.
if sum(age>70, chol>200, diabetes=1)>=2 then RiskScore='High';
else RiskScore='Low';
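The same counting trick can be sketched outside SAS. The Python below only illustrates the logic; the names `risk_score` and `has_diabetes` are hypothetical, and it relies on Python booleans also adding up as 1/0.

```python
def risk_score(age, chol, has_diabetes):
    """Return "High" when at least two of the three conditions hold."""
    # Each comparison is True/False; sum() counts every True as 1.
    conditions_met = sum([age > 70, chol > 200, bool(has_diabetes)])
    return "High" if conditions_met >= 2 else "Low"


print(risk_score(75, 180, True))   # two conditions hold
print(risk_score(65, 210, False))  # only one condition holds
```

Counting the true conditions scales better than spelling out every pairwise combination with and/or.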

Related

Use of the '<' operator in SAS

I have to convert some SAS code. In other programming languages I am used to < being used in comparisons e.g. in pseudo-code: If x < y then z
In SAS, what is the < operator achieving here:
intck(month,startdate,enddate)-(day(enddate)<day(startdate))
I have been able to understand the functions using the reference documentation but I can't see anything relating to how '<' is being used here.
Just to go into a little more detail about what the code you have there is doing, it's an old school method to determine the number of months from one date to the next (possibly to calculate a birthday, for example).
Originally, the SAS functions intck and intnx only counted the number of "firsts of the month" between two dates (or the equivalent boundary for other intervals). So INTCK('month','31OCT2020'd,'01NOV2020'd) = 1, while INTCK('month','01OCT2020'd,'30NOV2020'd) = 1 as well. Not ideal! So you'd add in this particular bit of code, -(day(enddate)<day(startdate)), which says "if it has not been a full month yet, subtract one". It's equivalent to this:
if day(enddate) < day(startdate) then diff = intck('month',startdate,enddate) - 1;
else diff = intck('month',startdate,enddate);
There's now a better way to do this (yay!). intck and intnx take slightly different arguments, but it's the same idea. For intck the argument is method, where 'c' for "continuous" is what you want: it counts whole intervals measured from the start date instead of boundaries crossed. For intnx it is the alignment argument, where 's' means "same" (so, move to the same point in the month).
So your code now should be:
intck('month',startdate,enddate,'c')
The symbol < is an operator in that expression. It is not a function call, like INTCK() is in your expression.
SAS evaluates boolean expressions (like the less than test in your example) to 1 for TRUE and 0 for FALSE.
So your expression is subtracting 1 when the day of month of ENDDATE is smaller than the day of month of STARTDATE.
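As a rough illustration of that arithmetic, here is a Python sketch. The function name `whole_months_between` is hypothetical, and it mirrors only the "boundaries crossed minus the comparison" logic, not every edge case of INTCK.

```python
from datetime import date


def whole_months_between(start, end):
    """Month boundaries crossed, minus 1 if the final month is incomplete."""
    boundaries = (end.year - start.year) * 12 + (end.month - start.month)
    # The comparison is a bool, so subtracting it removes 1 (True) or 0 (False).
    return boundaries - (end.day < start.day)


print(whole_months_between(date(2020, 10, 31), date(2020, 11, 1)))
print(whole_months_between(date(2020, 10, 1), date(2020, 11, 30)))
```

The first call crosses one month boundary but covers only one day, so the comparison cancels the boundary; the second covers a full month, so nothing is subtracted.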
Note: You can also do the reverse, treat a number as a boolean expression. For example in a statement like:
if (BASELINE) then PERCENT_CHANGE = (VALUE-BASELINE) / BASELINE ;
A missing value or a value of zero in BASELINE will be treated as FALSE and so in those cases the assignment statement does not run.
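For readers coming from other languages, the rule "missing or zero is false" can be sketched in Python. The helper `sas_truthy` is a hypothetical name, and treating `None` and NaN as the stand-ins for a SAS missing value is an assumption made for this illustration. Note that Python itself treats NaN as true, so the explicit check is needed.

```python
import math


def sas_truthy(x):
    """Mimic the rule described above: missing or zero counts as false."""
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return False
    return x != 0


def percent_change(value, baseline):
    # Only divide when the baseline is present and nonzero.
    if sas_truthy(baseline):
        return (value - baseline) / baseline
    return None  # the assignment is skipped, so the result stays missing


print(percent_change(110, 100))
print(percent_change(5, 0))
```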

Is there a way to generate a new variable only if it meets certain criteria?

I am trying to replicate the following Stata code in R:
gen UAPDL_1=sqrt((((Sanchez_1-Iglesias_1)^2)+((Casado_1-Iglesias_1)^2)+((Rivera_1-Iglesias_1)^2))/3) if maxIglesias_1==1
replace UAPDL_1=sqrt((((Sanchez_1-Rivera_1)^2)+((Casado_1-Rivera_1)^2)+((Iglesias_1-Rivera_1)^2))/3) if maxRivera_1==1
In other words, I am trying to make different calculations and generate a new variable with different values depending on certain conditions (in this case, whether another variable has the value 1). I managed to create the variables that must be met for the calculation (maxIglesias==1 and maxRivera==1), but I am stuck on generating the UAPDL variable. I tried case_when and ifelse, but these commands only seem to let you define a fixed value. Is there a way with mutate or dplyr (or any other package) to achieve this goal?
Welcome to SO!
Let me try to 'parse' your question for the sake of clarity.
You want to generate a variable UAPDL depending on the value of two distinct variables (maxIglesias_1 and maxRivera_1, which let's say correspond to values f(I) and f(R), respectively). Here I note that, according to the snippet of code you posted, there is no guarantee that the two variables are mutually exclusive, i.e., you may have records with maxIglesias_1 == 1 AND maxRivera_1 == 1. In those cases the order in which you run the commands matters: those records all end up valued f(R), or f(I) if you swap the commands.
However, in order to replicate the Stata commands that you posted (issue with the ordering included!) you should run
UAPDL_1 <- rep(NA_real_, length(maxIglesias_1)) # start as missing, as Stata's gen ... if does
UAPDL_1[maxIglesias_1 == 1] <- f(I)
UAPDL_1[maxRivera_1 == 1] <- f(R)
where I assume that maxIglesias_1 and maxRivera_1 are two R objects of the same length as the original Stata dataset.
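The ordering issue can be made concrete with a small Python sketch. Everything here is illustrative: `assign_in_order` is a hypothetical name, and `f_i` and `f_r` stand for the already computed values of the two formulas.

```python
def assign_in_order(max_iglesias, max_rivera, f_i, f_r):
    """Apply the two conditional assignments one after the other."""
    result = [None] * len(max_iglesias)  # start as missing
    for i, flag in enumerate(max_iglesias):
        if flag == 1:
            result[i] = f_i[i]
    for i, flag in enumerate(max_rivera):
        if flag == 1:
            result[i] = f_r[i]  # overwrites any value set by the first pass
    return result


# The third record satisfies both conditions, so the later assignment wins.
print(assign_in_order([1, 0, 1], [0, 0, 1], [10, 20, 30], [100, 200, 300]))
```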
Good luck!

How to combine two IsNull condition in IF expression in Qlikview?

I have three columns: A, B, and C.
I'm writing an expression for column D in Qlikview: whenever columns A and B are null, D should take the value of C, and likewise for the other combinations of columns.
Expression:
=if((IsNull(A) and IsNull(B)), C,if((IsNull(B) and IsNull(C)), A,.....)
But I'm not getting the values in my output.
Is there any issue in the above expression?
Can someone help me with it?
try
if (rangesum(len(A),len(B))=0,C,if (rangesum(len(B),len(C))=0,A,.....
IsNull is a problematic function and often does not behave as expected.
It is best practice to use Len() instead.
Also make sure you have a single value in A, B, and C per row; otherwise it will not work.
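The idea behind the Len() test, treating a value as empty when its length is zero, can be sketched in Python. This only illustrates the logic and is not Qlikview syntax; `is_blank` and `pick_d` are hypothetical names, and the branches elided in the original expression are left unfilled.

```python
def is_blank(value):
    """Treat both a missing value and an empty string as blank."""
    return value is None or len(str(value)) == 0


def pick_d(a, b, c):
    if is_blank(a) and is_blank(b):
        return c
    if is_blank(b) and is_blank(c):
        return a
    return None  # the remaining branches are elided in the original expression


print(pick_d(None, "", "x"))
print(pick_d("y", None, ""))
```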

SAS - How to determine the number of variables in a used range?

I imagine what I'm asking is pretty basic, but I'm not entirely certain how to do it in SAS.
Let's say that I have a range of variables, or an array, x1-xn. I want to be able to run a program that uses the number of variables within that range as part of its calculation. But I want to write it in such a way that, if I add variables to that range, it will still function.
Essentially, I want to be able to create a variable that if I have x1-x6, the variable value is '6', but if I have x1-x7, the value is '7'.
I know that :
var1=n(of x1-x6)
will return the number of non-missing numeric variables, but I want this to work even if there are missing values.
I hope I explained that clearly and that it makes sense.
Couple of things.
First off, when you put a range like you did:
x1-x7
That will always evaluate to seven items, whether or not those variables exist. That simply evaluates to
x1 x2 x3 x4 x5 x6 x7
So it's not very interesting to ask how many items are in that, unless you're generating that through a macro (and if you are, you probably can have that macro indicate how many items are in it).
But the range x1--x7 or x: both are more interesting problems, so we'll continue.
If the variables are all of a single type (even an unknown one), the easiest way to do this is to create an array and then use the dim function.
data _null_;
x3=3; /* numeric, so every array member has the same type */
array _temp x1-x7;
count = dim(_temp);
put count=;
run;
That doesn't work, though, if there are multiple types (numeric and character) at hand. If there are, then you need to do something more complex.
The next easiest solution is to combine nmiss and n. This works if they're all numeric, or if you're tolerant of the log messages this will create.
data _null_;
x3='ABC';
count = nmiss(of x1-x7) + n(of x1-x7);
put count=;
run;
nmiss counts the missing values and n counts the nonmissing numeric values, so their sum is the total. Here x3 is counted in the nmiss group, because 'ABC' cannot be converted to a number.
Unfortunately, there is not a c version of n, or we'd have an easier time with this (combining c and cmiss). You could potentially do this in a macro function, but that would get a bit messy.
Fortunately, there is a third option that is tolerant of character variables: combining countw with catq.
data _null_;
x3='ABC';
x4=' ';
count = countw(catq('dm','|',of x1-x7),'|','q');
put count=;
run;
This will count all variables, numeric or character, with no conversion notes.
What you're doing here is concatenating all of the variables together with a delimiter between them, so [x1]|[x2]|[x3]..., and then counting the number of "words" in that string, where a word is anything delimited by "|". Even missing values will create something, so .|.|ABC|.|.|.|. will have 7 "words".
The 'm' argument to CATQ tells it to even include missing values (spaces) in the concatenation. The 'q' argument to COUNTW tells it to ignore delimiters inside quotes (which CATQ adds by default).
If you use a version before CATQ is available (sometime in 9.2 it was added I believe), then you can use CATX, but you lose the modifiers, meaning you have more trouble with empty strings and embedded delimiters.
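The join-then-count idea can be sketched in Python with the standard csv module, which quotes any field containing the delimiter, much as the quoting is described above. This is an analogy rather than a translation: `count_fields` is a hypothetical name, and in Python `len(values)` would of course give the answer directly.

```python
import csv
import io


def count_fields(values, delimiter="|"):
    """Join values with a delimiter, then count the delimited fields."""
    buffer = io.StringIO(newline="")
    # Missing values (None) become empty fields, so they still count.
    csv.writer(buffer, delimiter=delimiter).writerow(
        ["" if v is None else v for v in values]
    )
    joined = io.StringIO(buffer.getvalue(), newline="")
    # The reader ignores delimiters that sit inside quoted fields.
    return len(next(csv.reader(joined, delimiter=delimiter)))


print(count_fields([None, None, "ABC", "", None, None, None]))
print(count_fields(["a|b", "c"]))
```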

Stata using if statements

I have a situation where I need to check a set of variables and replace a value. For example consider I have three variables a, b, c. I need to create a variable z = 1 if c or b or a == 1.
In other words I need to create a if loop that check c first to see whether c is 1; if not I want to check whether b is 1; if not a is 1. And if c is 1 then the loop has to stop or if b is 1 it need to stop from checking a.
My code is
gen z=.
foreach var in c b a{
if `var'=1 & z!=. {
replace z=1
}
else
z=.
}
}
This code fails to do what I require and I cannot wrap my head around it. I understand I could use a command like this
replace z=1 if (a==1|b==1|c==1)
but to my understanding this code check the condition
(a==1|b==1|c==1)
simultaneously. I require the loop to check each variable a b c step by step.
This is confused or confusing on several levels.
(0) There appears to be buried inside this a problem depending on when 1 first occurs in the dataset. If so, you need to spell that out much more clearly.
(1) You start by saying that you want a new value 1 if any of a, b or c is 1. For that, correct code could be (as you end)
replace z = 1 if a == 1 | b == 1 | c == 1
or alternatively
replace z = 1 if inlist(1, a, b, c)
But then you deny that is what you want and talk about a loop. But no loop is required to solve the problem you posed.
(2) The longer code segment you say you used contains illegal statements and could not possibly run. It also (almost certainly) confuses the if command you do not want and the if qualifier you may want. See this FAQ for explanation
(3) The correct spelling is "Stata", not "STATA", as has been corrected already several times in your posts. Please pay attention to this simplest of details.
(4) Please pay attention to the principle of a minimal reproducible example. https://stackoverflow.com/help/mcve
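To illustrate point (1), here is a Python sketch of the row-wise test. It is not Stata, and the name `flag_any_one` is hypothetical. Checking the three values "simultaneously" or one after another gives the same result for each row, which is why no loop over variables is needed.

```python
def flag_any_one(a, b, c):
    """Return 1 if any of the three values equals 1, else None (missing)."""
    return 1 if 1 in (a, b, c) else None


rows = [(0, 0, 1), (0, 1, 0), (0, 0, 0), (None, 0, 1)]
print([flag_any_one(a, b, c) for a, b, c in rows])
```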