SAS: Difference between IF-THEN and IF-THEN-DO Statements? - sas

I am new to SAS and would like to know: what is the difference between "IF-THEN" and "IF-THEN-DO" statements in SAS?

Put simply, IF-THEN controls a single statement, while IF-THEN-DO controls a block of statements. An IF without THEN in a DATA step is a subsetting IF: observations that do not meet the condition are not written to the output data set.
Example:
data x;
set y;
if a = 1 then /* one statement follows */
b=2;
if a = 1 then do; /* a block of statements follows, up to the END statement, similar to brackets in other programming languages */
b=2;
c=3;
end;
if a = 1; /* only observations with a = 1 are written to x */
run;

SAS evaluates the expression in an IF-THEN statement to produce a result that is either non-zero, zero, or missing. A non-zero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false.
If the conditions that are specified in the IF clause are met, the IF-THEN statement executes a SAS statement for observations that are read from a SAS data set, for records in an external file, or for computed values. An optional ELSE statement gives an alternative action if the THEN clause is not executed. The ELSE statement, if used, must immediately follow the IF-THEN statement.
Using IF-THEN statements without the ELSE statement causes SAS to evaluate all IF-THEN statements. Using IF-THEN statements with the ELSE statement causes SAS to execute IF-THEN statements until it encounters the first true statement. Subsequent IF-THEN statements are not evaluated. (Source: support.sas.com)
The DO statement is the simplest form of DO group processing. The statements between the DO and END statements are called a DO group. You can nest DO statements within DO groups.
A simple DO statement is often used within IF-THEN/ELSE statements to designate a group of statements to be executed depending on whether the IF condition is true or false. (Source: support.sas.com)
Regards,
Vasilij
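The truth rule quoted above (non-zero and nonmissing is true; zero or missing is false) can be modelled in a few lines. This is only a Python sketch of the rule, not SAS code: the function name is made up for illustration, and None is assumed to stand in for a SAS missing value.

```python
def sas_condition_is_true(value):
    """Model of the quoted rule. None stands in for a SAS missing value (an assumption)."""
    return value is not None and value != 0


# Negative and fractional values count as true; only zero and missing are false.
for v in (1, -3, 0.5, 0, None):
    print(v, sas_condition_is_true(v))
```

The point that tends to surprise newcomers is that any non-zero number, including a negative one, satisfies the IF condition.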

Related

C++ Single line If-Else within Loop

I read Why I don't need brackets for loop and if statement, but I don't have enough reputation points to reply with a follow-up question.
I know it's bad practice, but I have been challenged to minimise the lines of code I use.
Can you do this in any version of C++?
a_loop()
if ( condition ) statement else statement
i.e. does the if/else block count as one "statement"?
Similarly, does if/else if.../else count as one "statement"? Though doing so would become totally unreadable.
The post I mentioned above, only says things like:
a_loop()
if(condition_1) statement_a; // is allowed.
You can use ternary operator instead of if...else
while(true) return condition_1 ? a : b;
while seems redundant here if the value of its argument is always true so you can simply write
return condition_1 ? a : b;
Yes, syntactically you can do that.
if/else block is a selection-statement, which is a kind of statement.
N3337 6.4 Selection statements says:
selection-statement:
if ( condition ) statement
if ( condition ) statement else statement
switch ( condition ) statement

Why does my if/else code print only one condition when more than one of the else-if conditions is true?

In an if/else statement, my code prints only one condition and the rest are skipped. What does the compiler execute if two or more consecutive conditions are true?
I think you are confusing if, else-if and else.
The if is the primary test: it is checked first, and if it is true its code is executed. An else-if is a secondary test that is checked (when present) only if the if condition is false. The else holds the code to be executed if neither the if nor any else-if condition is true. I think you are trying to run both the if and the else-if branches in your program. Keep in mind that if the "if" condition is true, the else-if and else branches are not executed; control passes to the code after the chain once the if block is done.
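Since the question does not name a language, here is a minimal sketch in Python (where else-if is spelled elif) contrasting a chained test with independent tests. The function names and the sample value 50 are made up for illustration.

```python
def chained(n):
    """Return the branches that run in an if / elif / else chain."""
    ran = []
    if n > 0:
        ran.append("positive")
    elif n > 10:  # true for n = 50, but never tested once the first branch ran
        ran.append("greater than 10")
    else:
        ran.append("zero or negative")
    return ran


def independent(n):
    """Return the branches that run when each test is its own if statement."""
    ran = []
    if n > 0:
        ran.append("positive")
    if n > 10:
        ran.append("greater than 10")
    return ran


print(chained(50))      # ['positive']
print(independent(50))  # ['positive', 'greater than 10']
```

If every true condition should produce output, write the tests as separate if statements rather than as one chain.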

Skipping lines in Fortran

Which lines are executed after a Fortran IF statement? I wondered which of these three lines will be executed depending on the IF statement's outcome:
IF (x.GT.y) GOTO 100
x= y+1
100 x = y
Will the indexed row 100 always be read? Or is it skipped if the IF-statement returns True and x= y+1 is stored?
The three lines in the example form three distinct executable statements. They are not part of a single IF statement.
The IF statement is formed from the single (non-continued) line
IF (x.GT.y) GOTO 100
Once the IF statement has been executed, the flow of execution either moves on to the next line/statement (if the condition is false) or jumps to the statement labelled 100 (if true). Once execution has flowed, the IF statement exerts no further influence.
If execution reaches the second statement there is no further flow control to stop execution reaching the third statement (other than any defined assignment, defined + operator, or error caused by the evaluation of the right-hand side, etc).
On reaching the third line:
recall that the IF statement is in the long-forgotten past: the label 100 and the "jump target" are irrelevant to the present
the statement having a label does not change how the statement is evaluated (just how it can be reached).
That is, there is nothing which says "this line should be treated in a special way".
Yes, the third line will ("always") be executed in full.
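The control flow described above can be traced with a structured equivalent. This is a Python sketch rather than Fortran, and the function name and trace list are made up for illustration; it only records which of the three statements are reached.

```python
def run(x, y):
    """Trace the three statements for given x and y; returns (final x, statements reached)."""
    trace = ["IF"]
    jump = x > y          # IF (x.GT.y) GOTO 100
    if not jump:
        x = y + 1         # second statement: skipped only when the jump is taken
        trace.append("x = y + 1")
    x = y                 # statement labelled 100: reached on both paths
    trace.append("100 x = y")
    return x, trace


print(run(5, 2))  # (2, ['IF', '100 x = y'])
print(run(1, 2))  # (2, ['IF', 'x = y + 1', '100 x = y'])
```

Either way the labelled statement runs, so x ends up equal to y; the assignment x = y + 1 is simply overwritten when it is reached.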

SAS WHERE statement with several conditions

In SAS PROC FREQ, using a WHERE statement with multiple conditions, I would like to understand why adding a condition causes a frequency to increase.
The first instance:
PROC FREQ;
WHERE X=1 AND Y=1;
TABLE YEARS;
RUN;
Outputs N=100 for a particular year.
But:
PROC FREQ;
WHERE (X=1 AND Y=1) AND A=2 OR B=2;
TABLE YEARS;
RUN;
Outputs larger N than the previous WHERE for the same year, e.g., N=200.
In the second FREQ and WHERE statement, I think the condition in parentheses should be evaluated first, before the AND...OR, and should select the same N=100 as the first WHERE statement. The remaining criteria in the line, AND A=2 OR B=2, should then select the subset of those N=100 that have either A=2 or B=2. Consequently, the selected N should be less than or equal to 100, not greater than 100.
This is what I want--the subset of (X=1 AND Y=1) that also has either A=2 OR B=2--but it does not seem to be what I am getting. Suggestions?
Is this the correct statement for what I want?
WHERE (X=1 AND Y=1 AND A=2) OR (X=1 AND Y=1 AND B=2);
Thank you.
Adding an un-nested OR to a logical expression will always cause the result set to remain the same or become larger.
You need parentheses to change the order of evaluation. When there are no parentheses, all the AND expressions are evaluated first, then the OR expressions.
From the documentation, "Combining Expressions By Using Logical Operators":
Processing Compound Expressions
When SAS encounters a compound WHERE
expression (multiple conditions), the software follows rules to
determine the order in which to evaluate each expression. When WHERE
expressions are combined, SAS processes the conditions in a specific
order:
The NOT expression is processed first.
Then the expressions joined by AND are processed.
Finally, the expressions joined by OR are processed.
Using Parentheses to Control Order of Evaluation
Even though SAS evaluates logical operators in a specific order, you can
control the order of evaluation by nesting expressions in parentheses.
That is, an expression enclosed in parentheses is processed before one
not enclosed. The expression within the innermost set of parentheses
is processed first, followed by the next deepest, moving outward until
all parentheses have been processed.
For example, suppose you want a
list of all the Canadian sites that have both SAS/GRAPH and SAS/STAT
software, so you issue the following expression:
where product='GRAPH' or product='STAT' and country='Canada';
The result, however, includes all sites that license SAS/GRAPH software along with the Canadian
sites that license SAS/STAT software. To obtain the correct results,
you can use parentheses, which causes SAS to evaluate the comparisons
within the parentheses first, providing a list of sites with either
product licenses, then the result is used for the remaining condition:
where (product='GRAPH' or product='STAT') and country='Canada';
So your
WHERE (X=1 AND Y=1) AND A=2 OR B=2;
is the same as
WHERE (X=1 AND Y=1 AND A=2) OR B=2;
What you describe in the question as "this is what I want" is
WHERE (X=1 AND Y=1) AND (A=2 OR B=2);
which is the same, by the distributive law of logic, as
WHERE (X=1 AND Y=1 AND A=2) OR (X=1 AND Y=1 AND B=2);
No matter how you state the expression, adding an OR always has the possibility of increasing the number of items that meet it. An un-nested OR can select more items than a nested (parenthesized) OR.
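The equivalences claimed above can be checked by brute force over a small truth table. This is a sketch in Python rather than SAS; it relies on Python's and/or having the same relative precedence as the AND-before-OR order quoted from the documentation, and the function names and the value set {1, 2} are made up for illustration.

```python
from itertools import product


def as_written(x, y, a, b):
    # No parentheses around the OR: AND binds tighter, so B=2 stands alone.
    return (x == 1 and y == 1) and a == 2 or b == 2


def how_it_parses(x, y, a, b):
    return (x == 1 and y == 1 and a == 2) or b == 2


def intended(x, y, a, b):
    return (x == 1 and y == 1) and (a == 2 or b == 2)


def distributed(x, y, a, b):
    return (x == 1 and y == 1 and a == 2) or (x == 1 and y == 1 and b == 2)


# Every combination of X, Y, A, B drawn from {1, 2}: 16 rows.
rows = list(product([1, 2], repeat=4))

assert all(as_written(*r) == how_it_parses(*r) for r in rows)
assert all(intended(*r) == distributed(*r) for r in rows)

base = sum(x == 1 and y == 1 for x, y, a, b in rows)
print(base, sum(as_written(*r) for r in rows), sum(intended(*r) for r in rows))
# prints: 4 9 3
```

On this toy table the un-nested OR selects 9 rows against a base of 4, the same kind of growth reported in the question, while the parenthesized version selects 3.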

IF-THEN vs IF in SAS

What is the difference between IF and IF-THEN?
For example, the following statement:
if type='H' then output;
vs
if type='H';
output;
An if-then statement conditionally executes code. If the condition is met for a given observation, whatever follows the 'then' before the ; is executed, otherwise it isn't. In your example, since what follows is output, only observations with type 'H' are output to the data set(s) being built by the data step. You can also have an if-then-do statement, such as in the following code:
if type = 'H' then do;
i=1;
output;
end;
If-then-do statements conditionally execute code between the do; and the end;. Thus the above code executes i=1; and output; only if type equals 'H'.
An if without a then is a "subsetting if". According to SAS documentation:
A subsetting IF statement tests the condition after an observation is
read into the Program Data Vector (PDV). If the condition is true, SAS
continues processing the current observation. Otherwise, the
observation is discarded, and processing continues with the next
observation.
Thus if the condition of a subsetting if (ex. type='H') is not met, the observation is not output to the data set being created by the data step. In your example, only observations where type is 'H' will be output.
In summary, both of your example codes produce the same result, but by different means. if type='H' then output; only outputs observations where type is 'H', while if type='H'; output; discards observations where type is not 'H'. Note that in the latter you don't need the output; because there is an implicit output in the SAS data step, which is only overridden if there is an explicit output; command.
They're similar but not identical. In a data step, if is a subsetting statement, and all records not satisfying the condition are dropped. From the documentation:
"Continues processing only those observations that meet the condition of the specified expression."
if-then functions more like the if statement in other languages: it conditionally executes the statement that follows then. A somewhat contrived example:
data baz;
set foo;
if type = 'H';
x = x + 1;
run;
data baz;
set foo;
if type='H' then x = x + 1;
run;
In both examples x will be incremented by 1 if type = 'H', but in the first data step baz will not contain any observations with type not equal to 'H'.
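To make the difference concrete, here is a rough Python model of the two data steps. It is not SAS: the list foo, its sample rows, and both function names are made up for illustration, and the DATA step's implicit loop and implicit output are imitated with an ordinary for loop and append.

```python
foo = [{"type": "H", "x": 1}, {"type": "K", "x": 5}, {"type": "H", "x": 7}]


def subsetting_if(rows):
    """Model of: if type = 'H'; x = x + 1;"""
    out = []
    for row in rows:
        if row["type"] != "H":
            continue  # observation dropped; the rest of the step is skipped
        out.append({**row, "x": row["x"] + 1})  # implicit output at the end of the step
    return out


def if_then(rows):
    """Model of: if type = 'H' then x = x + 1;"""
    out = []
    for row in rows:
        new_row = dict(row)
        if new_row["type"] == "H":
            new_row["x"] += 1
        out.append(new_row)  # every observation reaches the implicit output
    return out


print(subsetting_if(foo))  # [{'type': 'H', 'x': 2}, {'type': 'H', 'x': 8}]
print(if_then(foo))        # [{'type': 'H', 'x': 2}, {'type': 'K', 'x': 5}, {'type': 'H', 'x': 8}]
```

The 'K' row survives only in the second version, unchanged, which matches the description above.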
Nowadays it seems like most things that used to be accomplished by if are done using where.