Why the IF statement works when WHERE doesn't

Can anyone tell me why this line doesn't work? It's in a datastep in a macro.
where 1*substr(Sample_ID,6,6)<201704; (ERROR: where clause requires numeric bla bla)
Whereas the same thing with an if statement works.
if 1*substr(Sample_ID,6,6)<201704;

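A where clause cannot perform implicit type conversion (number to character, or vice versa), whereas an if statement within the data step can. Multiplying by 1 relies on that implicit conversion, so it fails in where.
Convert explicitly with input() instead. As a where statement:
where input(substr(Sample_ID,6,6),8.) < 201704;
or as a data set option:
where=(input(substr(Sample_ID,6,6),8.) < 201704)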

What does the parenthesis operator do on its own in C++

While writing some code I made a typo that led to unexpected compilation results and prompted me to test what the compiler (VS 2010) would accept.
I wrote an expression consisting of only the parenthesis operator with a number in it (empty parentheses give a compilation error):
(444);
When I ran the code in debug mode, it seems that the line is simply skipped by the program. What is the meaning of the parenthesis operator when it appears by itself?
If I can answer informally,
(444);
is a statement. It can be written wherever the language allows you to write a statement, such as in a function. It consists of an expression 444, enclosed in parentheses (which is also an expression) followed by the statement terminator ;.
Of course, any sane compiler operating in accordance with the as-if rule will remove it during compilation.
One place where at least one statement is required is in a switch block (even if program control never reaches that point):
switch (1){
case 0:
; // Removing this statement causes a compilation error
}
(444); is a statement consisting of a parenthesized expression (444) and a statement terminator ;
(444) consists of parentheses () and a prvalue expression 444
A parenthesized expression (E) is a primary expression whose type, value, and value category are identical to those of E. The parenthesized expression can be used in exactly the same contexts as those where E can be used, and with the same meaning, except as otherwise indicated.
So in this particular case, parentheses have no additional significance,
so (444); becomes 444; which is then optimized out by the compiler.
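As a rough check of the quoted rule, the following C++11 sketch compares (444) with 444 at compile time. The variable n and the function count_side_effects are hypothetical names added for illustration, and decltype is shown as one of the cases where parentheses do change the meaning.

```cpp
#include <type_traits>

// For a literal, adding parentheses changes neither the type nor the value.
static_assert(std::is_same<decltype((444)), decltype(444)>::value,
              "(444) and 444 both have type int");
static_assert((444) == 444, "same value");

// Hypothetical variable, not from the question. decltype is one of the
// "otherwise indicated" cases: for a plain name it gives the declared type,
// for the parenthesized name it gives an lvalue reference.
int n = 0;
static_assert(std::is_same<decltype(n), int>::value, "decltype(n) is int");
static_assert(std::is_same<decltype((n)), int&>::value, "decltype((n)) is int&");

// Hypothetical function: parentheses do not suppress the side effects of the
// expression they enclose.
int count_side_effects() {
    int calls = 0;
    (444);      // expression statement: evaluated, result discarded
    (++calls);  // same as ++calls;
    return calls;
}
```

Compilers typically warn that (444); has no effect, which matches the observation that the debugger skips the line.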

List of special characters in SAS macros

I'd like to know which characters are safe for any use in SAS macros.
So what I mean by special characters here is any character (or group of characters) that can have a specific role in SAS in any context. I'm not that interested in keywords (made of a-z 0-9 chars).
For example = ^= ; % , # are special (not sure if # is actually used in SAS, but it's used for doc, so it still counts as a character that is not 'safe for all uses').
But what about $ ! ~ § { } ° etc ?
This should include characters that are special in PROC SQL as well.
I'd like to use some of these characters and give them a special meaning in my code, but I'd rather not conflict with any existing use (I'm especially interested in ~).
A bit of general reference:
Reserved macro words
Macro word rules
SAS operators and mnemonics
Rules for SAS names
I think the vast majority of the characters on a standard English keyboard are used somewhere or other in the SAS language.
To address your examples:
$ Used in format names, put/input statements, regular expression definitions...
! 'or' operator in some environments
~ 'not' operator
§ Not used as far as I know
{} Can be used for data step array references & definitions
° Not used as far as I know
None of the above do anything special in a macro context, as Tom has already made clear in his answer.
For ~, the documentation page SAS Operators in Expressions may help; look at its tables of Comparison Operators and Logical Operators.
The main triggers in macro code are & and % which are used to trigger macro variable references and macro statements, functions or macro calls.
The ; (semi-colon) is used in macro code (as in SAS code) to indicate the end of a statement.
When passing values into macro parameters you mainly need to worry about , (comma). But you will also want to avoid unbalanced (). You should avoid using = when passing parameter values by position.
You can protect them by adding quotes or extra () around the values. But those characters become part of the value passed. You can use macro quoting to protect them.
%mymac(parm1='1,200',parm2=(1,200),parm3=%str(1,200),parm4="a(b")
Equal signs can be included without quoting as long as your call is using named parameters.
%mymac(parm1=a=b)
In addition to the previous answers:
% also introduces the %include statement, which includes files in your program.
Using special characters may cause your code to get stuck because of unbalanced quotes (there is a SAS Note on this).
If you run into this just submit the magic string below:
*';*";*/;run;

Data block Fortran 77 clarification

I am trying to access a data block. It is defined as follows:
      DATA NAME /'X1','X2','X3','X4','X5','X6','X7','X8','X9','10','11',
     1'12','13','14','15','16','17','18','19','20','21','22','23','24'/
The code is on paper. Note that this is old code; the only thing I am trying to do is understand how the array is being indexed. I am not trying to compile it.
The way it is accessed is as follows
I = 0
Loop
I = I + 1
write (06,77) (NAME(J,I),J=1,4) //this is inside a write statement.
end loop //77 is a format statement.
Not sure how it is being indexed, if you guys can shed some light that would be great.
The syntax (expr, intvar=int1,int2[,int3]) denotes an implied DO loop. Such a loop may appear in several places, and an input/output statement is one of them.
An implied DO loop evaluates the expression expr repeatedly, with the integer control variable intvar starting at int1 and advancing in steps of int3 until it passes int2. The loop control works exactly as in a DO statement.
In the case of the question, the expression is name(j,i), the integer variable j is the loop variable, taking values between the bounds 1 and 4. [The step size int3 is not provided so is treated as 1.] The output statement is therefore exactly like
write(6,77) name(1,i), name(2,i), name(3,i), name(4,i)
as we should note that elements of the implied loop are expanded in order. i itself comes from the loop containing this output statement.
name here may refer to a function, but given the presence of a data statement initializing it, it must somehow be declared as a rank-2 (character) array. The initialization is not otherwise important.
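For readers more used to C-family languages, the loop control described above can be sketched in C++. This is an illustration rather than Fortran: implied_do_values is a hypothetical helper that only lists the values the control variable takes, using the Fortran 77 trip-count rule max((int2 - int1 + int3) / int3, 0) and assuming int3 is nonzero.

```cpp
#include <vector>

// Hypothetical helper (not part of any Fortran or C++ library): returns the
// values taken by intvar in (expr, intvar = int1, int2[, int3]).
// Assumes int3 != 0, as Fortran requires.
std::vector<int> implied_do_values(int int1, int int2, int int3 = 1) {
    std::vector<int> values;
    // The trip count is fixed before the first iteration.
    int trips = (int2 - int1 + int3) / int3;
    if (trips < 0) trips = 0;
    int v = int1;
    for (int k = 0; k < trips; ++k) {
        values.push_back(v);  // expr would be evaluated with intvar == v
        v += int3;
    }
    return values;
}
```

With the bounds from the question, implied_do_values(1, 4) gives 1, 2, 3, 4, which corresponds to the four elements name(1,i) to name(4,i) written in that order.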

SAS: Difference between IF-THEN and IF-THEN-DO Statements?

I am new to SAS and would like to know the difference between "IF-THEN" and "IF-THEN-DO" statements in SAS.
Put simply, if then applies to a single statement, while if then do applies to a block of statements. If you use if without then in a data step, observations that do not meet the condition are not written to the output data set.
Example:
data x;
set y;
if a = 1 then /* one statement follows */
b=2;
if a = 1 then do; /* a block of statements follows, up to the end statement, similar to braces in other programming languages */
b=2;
c=3;
end;
if a = 1; /*only when a = 1 data will be written to x*/
run;
SAS evaluates the expression in an IF-THEN statement to produce a result that is either non-zero, zero, or missing. A non-zero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false.
If the conditions that are specified in the IF clause are met, the IF-THEN statement executes a SAS statement for observations that are read from a SAS data set, for records in an external file, or for computed values. An optional ELSE statement gives an alternative action if the THEN clause is not executed. The ELSE statement, if used, must immediately follow the IF-THEN statement.
Using IF-THEN statements without the ELSE statement causes SAS to evaluate all IF-THEN statements. Using IF-THEN statements with the ELSE statement causes SAS to execute IF-THEN statements until it encounters the first true statement. Subsequent IF-THEN statements are not evaluated. (Source: support.sas.com)
The DO statement is the simplest form of DO group processing. The statements between the DO and END statements are called a DO group. You can nest DO statements within DO groups.
A simple DO statement is often used within IF-THEN/ELSE statements to designate a group of statements to be executed depending on whether the IF condition is true or false. (Source: support.sas.com)
Regards,
Vasilij

IF-THEN vs IF in SAS

What is the difference between IF and IF-THEN?
For example the following statement
if type='H' then output;
vs
if type='H';
output;
An if-then statement conditionally executes code. If the condition is met for a given observation, whatever follows the 'then' before the ; is executed, otherwise it isn't. In your example, since what follows is output, only observations with type 'H' are output to the data set(s) being built by the data step. You can also have an if-then-do statement, such as in the following code:
if type = 'H' then do;
i=1;
output;
end;
If-then-do statements conditionally execute code between the do; and the end;. Thus the above code executes i=1; and output; only if type equals 'H'.
An if without a then is a "subsetting if". According to SAS documentation:
A subsetting IF statement tests the condition after an observation is
read into the Program Data Vector (PDV). If the condition is true, SAS
continues processing the current observation. Otherwise, the
observation is discarded, and processing continues with the next
observation.
Thus if the condition of a subsetting if (e.g. type='H') is not met, the observation is not output to the data set being created by the data step. In your example, only observations where type is 'H' will be output.
In summary, both of your examples produce the same result, but by different means. if type='H' then output; only outputs observations where type is 'H', while if type='H'; output; discards observations where type is not 'H'. Note that in the latter you don't need the output; because there is an implicit output in the SAS data step, which is only overridden if there is an explicit output; statement.
They're similar but not identical. In a data step, if is a subsetting statement, and all records not satisfying the condition are dropped. From the documentation:
"Continues processing only those observations that meet the condition of the specified expression."
if then functions more like the if statement in other languages: it conditionally executes the statement that follows then. A somewhat contrived example:
data baz;
set foo;
if type = 'H';
x = x + 1;
run;
data baz;
set foo;
if type='H' then x = x + 1;
run;
In both examples x will be incremented by 1 if type = 'H', but in the first data step baz will not contain any observations with type not equal to 'H'.
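For readers more at home in C++, the two data steps above can be modelled roughly as follows. This is only a sketch of the observation flow, not SAS: Obs, subsetting_if and if_then are hypothetical names, and foo is assumed to hold just type and x.

```cpp
#include <vector>

// Hypothetical stand-in for one observation of data set foo.
struct Obs {
    char type;
    int x;
};

// Models: data baz; set foo; if type = 'H'; x = x + 1; run;
std::vector<Obs> subsetting_if(const std::vector<Obs>& foo) {
    std::vector<Obs> baz;
    for (Obs obs : foo) {
        if (obs.type != 'H') continue;  // failed subsetting if: drop the observation
        obs.x = obs.x + 1;
        baz.push_back(obs);             // implicit output at the end of the step
    }
    return baz;
}

// Models: data baz; set foo; if type = 'H' then x = x + 1; run;
std::vector<Obs> if_then(const std::vector<Obs>& foo) {
    std::vector<Obs> baz;
    for (Obs obs : foo) {
        if (obs.type == 'H') obs.x = obs.x + 1;  // only this statement is conditional
        baz.push_back(obs);                      // every observation is output
    }
    return baz;
}
```

In this model the subsetting if behaves like an early continue in the implicit loop over observations, while if then guards a single statement and leaves the implicit output untouched.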
Nowadays it seems like most things that used to be accomplished by if are done using where.