error in do loop SAS - sas

I need my data temp dataset to generate 2 columns.
word1 and word2. Both will have blank values. The value in the do loop will change. 2 is just a dummy number.
Can someone tell me how to interpret this error?
data temp(drop=k);
do k=1 to 2;
word&k=.;
output;
end;
run;
Logs -
180
WARNING: Apparent symbolic reference K not resolved.
ERROR 180-322: Statement is not valid or it is used out of proper order.

You need to use an array, not a macro variable; you're misunderstanding how macro variables work.
data temp(drop=k);
array word[2];
do k=1 to 2;
word[k]=.;
output;
end;
run;
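Macro variables belong to an entirely different system: they are resolved as text before the DATA step is compiled, so a reference like &k cannot see the value of the DATA step variable k. They require a different kind of loop (%do), and that loop has to be inside a macro to work the way you're trying to do it.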
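For comparison, here is a minimal sketch of what the macro-language version of the same idea might look like. The macro name make_words and its parameter n are made up for illustration. The %do loop runs while the macro generates code, so word&k has already resolved to word1 and word2 by the time the DATA step compiles. Unlike the array version, this sketch writes a single row.

```sas
/* Hypothetical macro: generates one assignment statement per variable. */
%macro make_words(n=2);
  %local k;                 /* keep the loop counter local to this macro */
  data temp;
    %do k = 1 %to &n;
      word&k = .;           /* becomes word1 = .; word2 = .; ... */
    %end;
    output;
  run;
%mend make_words;

%make_words(n=2)
```

For this particular task the array is still the simpler choice; the macro loop only makes sense when the generated code itself has to vary.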

Related

difference between using if _n_ =1 then do; and not using it with PRXPARSE() in SAS

Created a dataset :
data x;
infile datalines truncover;
input name $100.;
datalines;
Deepanshu
How are you, deepanshu
dipanshu
deepanshu is a good boy
My name is deepanshu
Deepanshu Bhalla
Deepanshuuu
DeepanshuBhalla
Bhalla Deepanshu
;
run;
Wrote the following code :
data test;
set x;
if _n_ =1 then do;
retain re;
re = prxparse("s/(Deepanshu\s?Bhalla|bhalla\s?Deepanshu|Deepanshu)/Soumya Pandey/i");
end;
new_data = prxchange(re, -1, name);
proc print;
run;
and a similar one but without the
if _n_ =1 then do; end; retain;
data test;
set x;
re = prxparse("s/(Deepanshu\s?Bhalla|bhalla\s?Deepanshu|Deepanshu)/Soumya Pandey/i");
new_data = prxchange(re, -1, name);
proc print;
run;
Both of the testing codes gave the same result. What is the difference between them?
DATA Step code blocks with the construct
if _n_ = 1 then do;
...
end;
cause the interior statements to occur during only the 1st iteration of the implicit loop.
Retaining a (non DATA SET) variable prevents its value from being reset to missing at the top of the implicit loop. RETAIN can also initialize a variable with a literal value at compilation time, which does not need an if _n_=1 guard. Initializations from a computed assignment or INPUT necessarily require the guard (except in special situations such as prxparse).
For the case of the interior statement re = prxparse(...):
As stated by @whymath, the DATA Step compiler is being improved in each release of SAS, and there are now some implicit guards against recompiling a static regular expression pattern.
You will see the same code construct used for initializing hash objects.
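A minimal sketch of that hash-object pattern, for illustration only. The datasets LOOKUP and HAVE and the variables id and label are assumptions, not something from the question.

```sas
data want;
  /* Hypothetical datasets: LOOKUP(id, label) and HAVE(id, ...). */
  if 0 then set lookup;                  /* adds LOOKUP's variables to the PDV; never executes */
  if _n_ = 1 then do;
    declare hash h(dataset: 'lookup');   /* load LOOKUP into memory once */
    h.defineKey('id');
    h.defineData('label');
    h.defineDone();
  end;
  set have;
  if h.find() = 0;                       /* keep only rows whose id exists in LOOKUP */
run;
```

Here the guard matters: without it, the hash object would be declared and loaded again on every iteration of the implicit loop.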
Tip of the day:
If you do not specify an argument, the RETAIN statement causes the values of all variables that are created with INPUT or assignment statements to be retained from one iteration of the DATA step to the next.
The first one uses the if _n_ = 1 then ...; retain; statements, which is a very good and practical programming technique called an initialization block. It executes only on the first row when reading data, which avoids compiling the regular expression on every iteration of the data step.
However, this technique may be considered old-fashioned. In recent versions of SAS (mine is SAS 9.4M5), we don't need to write this initialization block any more, because there is some internal parser optimization. Here is the specification from the SAS Help Center:
If perl-regular-expression is a constant or if it uses the /o option, the Perl regular expression is compiled only once. Successive calls to PRXPARSE do not cause a recompile, but returns the regular-expression-id for the regular expression that was already compiled. This behavior simplifies the code because you do not need to use an initialization block (IF _N_ =1) to initialize Perl regular expressions.
So I prefer your second way; it is a true SASor's way.

how to get macro variable to evaluate math?

I have the following SAS macro snippet:
%macro processLink(uuid=, name=, cluster_external_ipaddress=);
%let unix_starttime = 1000000*(&starttime - '01JAN1970:00:00'dt);
%let unix_endtime = 1000000*(&endtime - '01JAN1970:00:00'dt);
...
When this runs it just creates the variable as a string, i.e.
=1000000*(dhms(today()-1,0,0,0) - '01JAN1970:00:00'dt)
instead of the Unix timestamp in microseconds.
Using unix_starttime = 1000000*(&starttime - '01JAN1970:00:00'dt); outside the macro in a data step works.
Do I need a null data step in the macro for this to work as intended?
Thanks
In general if you want to work with DATA you are better off using SAS code and not MACRO code. You can use CALL SYMPUTX() to generate a macro variable if you need it later.
data _null_;
call symputx('unix_starttime',1000000*(&starttime - '01JAN1970:00:00'dt));
...
run;
You can use %eval() to do simple integer arithmetic and comparisons. If you need to use floating point numbers (or date/time/datetime literals) then you need to use %sysevalf().
%let unix_starttime=%sysevalf(1000000*(&starttime - '01JAN1970:00:00'dt));
In general, anything after a %let statement is treated as pure text. However, there are functions you can wrap around the text to tell SAS to perform a mathematical operation.
These are %eval, used for integer calculations, and %sysevalf, used where calculations involve decimals or date/time literals.
Because the expression here contains a datetime literal, %eval would fail, so you could put %let unix_starttime = %sysevalf(1000000*(&starttime - '01JAN1970:00:00'dt));
If you ever need to call a function in a %let statement, wrap the call in %sysfunc(). That applies here if &starttime itself resolves to a function call such as dhms(today()-1,0,0,0), because %sysevalf cannot execute DATA step functions.
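A short sketch of how the three macro functions differ, run in open code. The macro variable name run_date is made up for illustration.

```sas
%put %eval(7/2);                       /* integer arithmetic: writes 3 */
%put %sysevalf(7/2);                   /* floating point: writes 3.5 */
%put %sysevalf('01JAN1970:00:00'dt);   /* datetime literal becomes seconds since 01JAN1960 */

/* %sysfunc runs a DATA step function; the second argument is an optional format. */
%let run_date = %sysfunc(today(), date9.);
%put &=run_date;
```

Without one of these wrappers, %let stores the right-hand side exactly as typed.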

SAS Macro Works Standalone, But Not When Looped

I have a large dataset where I am storing macro parameters. The macro is itself used to call a number of other macros, each of which runs a number of operations.
Ideally, I'd like to use another macro to loop over each row of the dataset, construct (using PROC SQL) a macro call, store it in a macro variable :CALL, and call the variable at every iteration of the loop (with a PUT &CALL.;) That is:
%macro OUTER_LOOP(DS);
%let K = ;
%COUNT_ROWS(DS, K); /* This stores the number of rows in DS in K. */
%do i = 1 %to &K.;
proc sql noprint; ...; quit; /* Create the macro call, and store it in :CALL. */
%put &CALL.;
%end;
%mend;
%OUTER_LOOP;
This doesn't work as expected: some of the internal checks that exist in my macro indicate several datasets created by the macro are missing. Curiously, when I don't run this in a macro loop (i.e. I manually create a macro call, row-by-row, and execute it), no error occurs.
Has anyone experienced this issue? If so, is anyone familiar with a solution that would still allow me to loop over macro calls? I know that CALL EXECUTE(); (in the data step) runs different parts of the macro at different times--is that what is occurring in this case, as well?
I would add %put Loop iterating: i=&i k=&k ; inside the DO loop. That will let you see how many times the loop iterates. One possibility is the loop is exiting earlier than you intend it to. If that is the case, the cause could be a collision between the macro variable i you use for the looping in %Outer_Loop and another macro variable i you use in one of the inner macros you call. As a general rule, it's a good idea to define macro variables as %LOCAL to the macro they are defined in. Doing that will prevent such macro variable collisions. But without seeing the inner macros, that's just one possibility.
You could also add %put %superq(Call) ; inside the do loop. That will show you the macro calls that are being generated, so you can check you are getting the expected parameter values in each call.
Most likely a scoping issue. Your sub-macros are likely overwriting the values of your macro variables in your calling-macros.
You can fix this by declaring all your variables as local variables using the %local statement. If there are macro variables that you need to access after the macros have run, explicitly declare them as %global.
So for the macro you have listed above you will need the below line:
%local k i;
Don't forget you need to do this for any sub-macros that are called, and so on...
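A minimal sketch of the collision described above, using two made-up macros, outer and inner, that both loop on i. With the %local statements in place each macro gets its own i. If you remove them, inner reuses outer's i, and the outer loop stops after its first pass.

```sas
%macro inner;
  %local i;                      /* this i belongs to INNER only */
  %do i = 1 %to 3;
    %put NOTE: inner i=&i;
  %end;
%mend inner;

%macro outer;
  %local i;                      /* this i belongs to OUTER only */
  %do i = 1 %to 2;
    %put NOTE: outer i=&i;
    %inner
  %end;
%mend outer;

%outer
```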
You can avoid a lot of these types of problems by generating the code yourself. For your example you could move the logic that generates the code from SQL to a data step and then instead of a macro you just need a data step. You don't even need know the number of observations in the dataset in advance.
filename code temp ;
data _null_;
set DS ;
file code ;
put '.... generated code based on values in current data ... ' ;
run;
%include code / source2 ;

Changing Value of Macro Variable inside SAS macro

I am defining a macro variable inside a macro. Then, I am feeding it into a second macro. Inside macro2 counter changes value to 200. However, when I check what is inside the macro variable that I put in after macro 2 runs, it still says 0. I would like it to store the value 200? is this possible?
%macro macro1();
%let variable1= 0;
macro2(counter=&variable1)
%put &variable1;
%mend macro1;
%macro1;
You have a couple of issues here. First of all, you are missing the % before your call to macro2, but I suspect that's just a typo. The main issue is that you are trying to do what is referred to in other languages as call-by-reference. You can do this in SAS macro by passing the name of your variable rather than the value of your variable, and then use some funky & syntax to set the variable of that name to a new value.
Here is some sample code that does this:
%macro macro2(counter_name);
/* The following code translates to:
"Let the variable whose name is stored in counter_name equal
the value of the variable whose name is stored in counter_name
plus 1." */
%let &counter_name = %eval(&&&counter_name + 1);
%mend;
%macro macro1();
%let variable1= 0;
/* Try it once - see a 1 */
/* Notice how we're passing 'variable1', not '&variable1' */
%macro2(counter_name = variable1)
%put &variable1;
/* Try it twice - see a 2 */
/* Notice how we're passing 'variable1', not '&variable1' */
%macro2(counter_name = variable1)
%put &variable1;
%mend macro1;
%macro1;
I actually have another post on StackOverflow that has an explanation of the &&& syntax; you can have a look at it here. Note that the %EVAL call has nothing to do with call-by-reference, it is just there to do the addition.
Sparc_Spread explains how to "call by reference" in the SAS macro language, which may solve your problem.
In this particular case, though, call by reference isn't strictly necessary, and I'd argue it isn't idiomatic in the SAS macro language. There's nothing wrong with it, and it is intentionally supported, but it looks a bit odd and is a bit harder to use because it isn't really a native concept. There are two ways to get around this that are both very easy to use.
First of all, let's say you know the variable name you want to increment, and the starting value is the only interesting thing. Thanks to how SAS macro language handles scoping, with something not exactly lexical scoping and not exactly functional, it automatically will use the variable that already exists in the most local scope, when it does already exist (with some minor caveats, such as macros using DOSUBL).
So this works as expected:
%macro macro2(counter=);
%do variable1 =&counter. %to 200;
%if %sysfunc(mod(&variable1.,50))=0 %then %put &=variable1;
%end;
%mend macro2;
%macro macro1();
%let variable1= 0;
%macro2(counter=&variable1.);
%put &=variable1;
%mend macro1;
%macro1;
(Of course, that is if you expect &variable1 to have the value of 201 - because %do loops, like do loops, always get incremented one higher than their ending value. I assume your real procedure works differently.)
That's because the &variable1. referred to in %macro2 automatically is the one present in the most local scope - which in this case is the scope of %macro1.
Alternatively, if you're using this %macro2 for the purpose of incrementing a counter, I would use a function-style macro method.
A function-style macro by definition is one that returns only a single value - and by returns I mean has a single value at the end of the macro's code that is presented in plain text (since a macro is, after all, only intended to create text that will then be parsed by the normal SAS language parser).
This can then be used on the right side of an equal sign in an assignment statement. The key is that it uses only macro language elements - %do loops and such - and no data step, proc, etc., language that would prevent it from being on the right side of an equal sign in an assignment statement (ie, x=%macrostuff(); cannot be x=proc sql(select...)).
So the following accomplishes the goal: increment a counter some amount, return the value (201, in this case, just like before), and then that can be assigned to a macro variable.
%macro macro2(counter=);
%do internal_counter =&counter. %to 200;
%if %sysfunc(mod(&internal_counter.,50))=0 %then %put &=internal_counter.;
%end;
&internal_counter.
%mend macro2;
%macro macro1();
%let variable1= %macro2(counter=0);
%put &=variable1;
%mend macro1;
%macro1;
I would suggest that this is the most idiomatic way to accomplish this, and the simplest: you pass the value you want as input, the function-style macro operates on it and returns a value, and you then assign that value to a variable in your macro however you want.

IF-THEN vs IF in SAS

What is the difference between IF and IF-THEN?
For example the following statement
if type='H' then output;
vs
if type='H';
output;
An if-then statement conditionally executes code. If the condition is met for a given observation, whatever follows the 'then' before the ; is executed, otherwise it isn't. In your example, since what follows is output, only observations with type 'H' are output to the data set(s) being built by the data step. You can also have an if-then-do statement, such as in the following code:
if type = 'H' then do;
i=1;
output;
end;
If-then-do statements conditionally execute code between the do; and the end;. Thus the above code executes i=1; and output; only if type equals 'H'.
An if without a then is a "subsetting if". According to SAS documentation:
A subsetting IF statement tests the condition after an observation is
read into the Program Data Vector (PDV). If the condition is true, SAS
continues processing the current observation. Otherwise, the
observation is discarded, and processing continues with the next
observation.
Thus if the condition of a subsetting if (ex. type='H') is not met, the observation is not output to the data set being created by the data step. In your example, only observations where type is 'H' will be output.
In summary, both of your example codes produce the same result, but by different means. if type='H' then output; only outputs observations where type is 'H', while if type='H'; output; discards observations where type is not 'H'. Note that in the latter you don't need the output; because there is an implicit output in the SAS data step, which is only overridden if there is an explicit output; command.
They're similar but not identical. In a data step, if is a subsetting statement, and all records not satisfying the condition are dropped. From the documentation:
"Continues processing only those observations that meet the condition of the specified expression."
if-then functions more like the if statement in other languages: it conditionally executes the statement that follows then. A somewhat contrived example:
data baz;
set foo;
if type = 'H';
x = x + 1;
run;
data baz;
set foo;
if type='H' then x = x + 1;
run;
In both examples x will be incremented by 1 if type = 'H', but in the first data step baz will not contain any observations with type not equal to 'H'.
Nowadays it seems like most things that used to be accomplished by if are done using where.
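For illustration, a sketch of the first data step above rewritten with where, reusing the same foo and baz names. The difference is that where filters observations as they are read from foo, before they reach the program data vector, so it can only reference variables that already exist in the input dataset.

```sas
data baz;
  set foo;
  where type = 'H';   /* only rows with type 'H' are read from FOO */
  x = x + 1;
run;
```

A subsetting if is still needed when the condition depends on a variable computed inside the data step.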