I have a data set like this:
DATA work.faminc;
INPUT famid faminc1-faminc12 ;
CARDS;
1 3281 3413 3114 2500 2700 3500 3114 3319 3514 1282 2434 2818
2 4042 3084 3108 3150 3800 3100 1531 2914 3819 4124 4274 4471
3 6015 6123 6113 6100 6100 6200 6186 6132 3123 4231 6039 6215
;
RUN;
I can create a macro variable and use it in a calculation like this:
%let N=12;
DATA faminc1b;
SET faminc ;
ARRAY Afaminc(12) faminc1-faminc12 ;
ARRAY Ataxinc(&N) taxinc1-taxinc&N ;
DO month = 1 TO &N;
Ataxinc(month) = Afaminc(month) * .10 ;
END;
RUN;
But I also want to divide each month's family income by the following month's.
The results should be faminc1/faminc2, faminc2/faminc3, faminc3/faminc4, and so on.
So the main problem is how to apply arithmetic operators (+, -, *, /) to the macro variable N that I have created.
When I simply tried this, it didn't work:
%let N=12;
DATA faminc1b;
SET faminc ;
ARRAY Afaminc(12) faminc1-faminc12 ;
ARRAY Afamdiv(&N) famdiv1-famdiv&N ;
DO month = 1 TO &N+1;
Afamdiv(month) = faminc&N/faminc&N+1 ;
END;
RUN;
Thanks for the help.
I am not exactly sure what you want to achieve, so I can only answer your question about performing an operation on a macro variable. To get your sample working, put it in a separate macro; then you can use the %EVAL function on the macro variable to add 1.
As far as I can see, you must use month as your looping variable rather than N. You also have to stop at 11, because there is no variable 13 to divide variable 12 by.
%let N=12;
%macro calc;
DATA faminc1b;
SET faminc ;
ARRAY Afaminc(12) faminc1-faminc12 ;
ARRAY Afamdiv(&N) famdiv1-famdiv&N ;
%DO month = 1 %TO %eval(&N-1);
Afamdiv(&month) = faminc&month/faminc%eval(&month+1) ;
%END;
RUN;
%mend;
%calc;
You do not need to use the macro variable for anything other than defining the upper bound of your variable list.
Everything else you can do with normal SAS code. Use the DIM() function to find the upper bound of an array, and use the arrays in your calculations. I am not sure why you are hardcoding one upper bound and using the macro variable for the other, but if they can differ, you need to consider the length of both arrays to find the upper bound for your DO loop.
%let N=12;
DATA faminc1b;
SET faminc ;
ARRAY Afaminc faminc1-faminc12 ;
ARRAY Afamdiv famdiv1-famdiv&N ;
DO month = 1 TO min(dim(afaminc)-1,dim(afamdiv));
Afamdiv(month) = afaminc(month)/afaminc(month+1) ;
END;
RUN;
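As a side note on the arithmetic itself, here is a minimal sketch of what goes wrong in the question and how %EVAL helps. In the original attempt, faminc&N+1 resolves to the text faminc12+1, which the DATA step reads as "faminc12 plus 1" rather than as a variable named faminc13. %EVAL does integer arithmetic on macro text and can be used in open code; the macro variable name last is only an illustrative choice.

```sas
%let N = 12;

/* %EVAL does integer arithmetic on the resolved text (12 - 1) */
%let last = %eval(&N - 1);
%put NOTE: N=&N last=&last;

data _null_;
  /* &last resolves before the step compiles, so this reads: do month = 1 to 11 */
  do month = 1 to &last;
    next_month = month + 1;   /* run-time arithmetic needs no macro code */
    put month= next_month=;
  end;
run;
```

%EVAL handles integers only; for non-integer values %SYSEVALF is the usual alternative.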
I want to apply a macro I have written to each individual row in SAS
DATA cars1;
INPUT make $ model $ mpg weight price;
CARDS;
AMC Concord 22 2930 4099
AMC Pacer 17 3350 4749
AMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
;
RUN;
%macro calculate1 (var_name, var_value);
%If &var_name < 20 %then
%do;
&var_value + &var_name;
%end;
%else %if &var_name >= 20 %then
%do;
&var_value - &var_name;
%end;
%mend ;
Data cars2; Set cars1;
varnew = %calculate1(mpg, weight);
Run;
When I run this code, I get the difference between the two columns even when the MPG value is below 20, although I want the sum of the columns whenever MPG is < 20.
I know I could use IF conditions on the columns, but I want to try to use macros to do this.
Please help me apply my macro on the columns.
Thanks in advance.
You most likely do not need macro code yet.
Macro writes SAS source code before run time; it does not evaluate DATA step expressions at run time.
Learn to write DATA step code before attempting to abstract it into a macro.
DATA Step
This might contain source code statements (the if/then/else) that you want macro to generate
data cars2;
set cars1;
* calculate;
if mpg < 20 then varnew = weight + mpg;
else
if mpg >= 20 then varnew = weight - mpg;
run;
How would this be abstracted? Determine which components of the if/then/else would be reused in a different context or with different variables. If you can't determine re-use, don't code a macro.
Consider abstraction #1 (as pseudo-code) that is to generate a complete statement
if PARAMETER_2 < 20 then RESULT_VAR = PARAMETER_1 + PARAMETER_2;
else
if PARAMETER_2 >= 20 then RESULT_VAR = PARAMETER_1 - PARAMETER_2;
or abstraction #2 that is to generate source code for an expression
ifn (PARAMETER_2 < 20, PARAMETER_1 + PARAMETER_2, PARAMETER_1 - PARAMETER_2)
But why hardcode 20 into the macro? Why not make that a parameter as well? If you go that route, the abstraction goes too far and ends up templating language elements that should simply be used in a non-macro way. (One might suppose that in a functional language such as Lisp there is no such thing as too much abstraction.)
Abstraction #1 as macro
%macro calculate(result_var, parameter_1, parameter_2);
/* generate DATA Step source code, using the passed parameters */
if &PARAMETER_2 < 20 then &RESULT_VAR = &PARAMETER_1 + &PARAMETER_2;
else
if &PARAMETER_2 >= 20 then &RESULT_VAR = &PARAMETER_1 - &PARAMETER_2;
%mend;
data cars2;
set cars1;
%calculate(varnew,weight,mpg);
run;
Abstraction #2 as macro
%macro calculate(parameter_1, parameter_2);
/* generate source code that is valid as right hand side of assignment */
ifn (&PARAMETER_2 < 20, &PARAMETER_1 + &PARAMETER_2, &PARAMETER_1 - &PARAMETER_2)
%mend;
data cars2;
set cars1;
varnew = %calculate(weight,mpg);
run;
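To see why the macro in the question always produced the difference, it may help to watch what %IF actually compares. The sketch below uses a hypothetical helper, show_branch, that is not part of the original post. When it is called with mpg, the macro processor sees only the three letters "mpg", not the values in the column, and since that text is not an integer it falls back to a character comparison against "20".

```sas
/* Hypothetical helper for illustration: reports which branch the macro
   processor picks for the text it receives. No data set is read here. */
%macro show_branch(var_name);
  %if &var_name < 20 %then %put &var_name is treated as less than 20;
  %else %put &var_name is NOT treated as less than 20;
%mend show_branch;

/* Text "mpg" vs text "20": a character comparison. On ASCII platforms
   letters sort after digits, so the %ELSE branch is taken. */
%show_branch(mpg)

/* Both operands are integers, so this one is a numeric comparison */
%show_branch(17)
```

Because the decision is made once, while the source code is being generated, every row of cars1 gets the same generated statement, which matches the behaviour described in the question.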
Does somebody know why the number stored in "numero" isn't the same as the one I put in the %let?
I use SAS Enterprise Guide 7.1.
Here's my program:
%let ident = 4644968792486317489 ;
data _null_ ;
numero= put(&ident.,z19.);
call symputx('numero',numero);
run;
%put &numero. ;
And the log :
30 %let ident = 4644968792486317489 ;
31
32 data _null_ ;
33 numero= put(&ident.,z19.);
34 call symputx('numero',numero);
35 run;
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
36
37 %put &numero. ;
4644968792486317056
Thanks in advance!
SAS stores numbers as 8-byte floating-point values, so there is a limit to the largest integer that can be stored exactly (more precisely, the largest integer below which every integer can be represented without gaps). SAS even publishes a table with the maximum value.
And a function you can use to determine the maximum value.
3 %put %sysfunc(constant(exactint),comma23.);
9,007,199,254,740,992
Looks like your "number" is really an identifier. So store it as character to begin with and you will not have these problems.
data want;
length numero $19;
numero = "&ident";
numero = translate(right(numero),'0',' ');
run;
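A small side-by-side sketch of the two storage choices, reusing the identifier from the question. The variable names as_number and as_text are illustrative assumptions. The numeric copy should show the same rounded value that appears in the question's log, while the character copy keeps every digit.

```sas
%let ident = 4644968792486317489;

data _null_;
  length as_text $19;
  as_number = &ident;      /* stored as an 8-byte float: low-order digits are lost */
  as_text   = "&ident";    /* stored as text: all 19 digits survive */
  put as_number= z19.;
  put as_text=;
run;
```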
Use the SAS MD5 function to anonymize strings. Don't forget MACRO is really just text processing.
%let ident = 4644968792486317489 ;
%let numero = %sysfunc(MD5(&ident));
or in DATA Step
data ... ;
numero = MD5("&ident");
In certain situations you might associate a monotonic serial value to an identity value.
%let ident = 4644968792486317489 ;
%if not %symexist(i&ident) %then %do;
%let i&ident = %sysfunc(monotonic());
%put new serial;
%end;
%put i&ident=&&i&ident;
----- LOG -----
i4644968792486317489=1
I have a dataset holding parameters like this:
Parameters
year threshold1 threshold2
1 100 200
2 150 300
....
7 200 390
I can do
data output;
set input;
if 0 then set parameters;
array thresholds [2] threshold:;
%do year = 1 %to 7;
year = &year.;
set parameters point=year;
array my_thresholds&year. [2] _temporary_;
do i = 1 to 2;
my_thresholds&year.[i] = thresholds[i];
end;
%end;
For every observation in INPUT, this would read threshold1 and threshold2 for each year and set up a temporary array my_thresholds&year. holding them.
The problem, however, arises when the number of thresholds is unknown. I can't use dim(thresholds) or * to size the temporary array.
How can I get SAS to know at compile time how to set up the array?
To my knowledge, you cannot dynamically set the size of an array at compile time.
One way to get this done is to use PROC CONTENTS and PROC SQL to find out how many threshold variables there are in the parameters data set, and then pass that information to the DATA step through a macro variable.
data parameters;
do year=1 to 7;
threshold1 = 1;
threshold2 = 2;
threshold3 = 3;
output;
end;
run;
proc contents data=parameters out=cont noprint;
run;
proc sql noprint;
select count(*) into :thr_count
from cont
where name like "threshold%";
quit;
%put &thr_count.;
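A minimal sketch of how the count might then be used to size the arrays. It assumes the parameters data set and the thr_count macro variable created just above; the array name my_thresholds and the variables i and n are illustrative.

```sas
/* INTO : leaves leading blanks in the value; re-assigning with %LET strips them */
%let thr_count = &thr_count;

data _null_;
  set parameters;
  /* The macro variable resolves before compilation, so the array size is a constant */
  array thresholds [&thr_count] threshold1-threshold&thr_count;
  array my_thresholds [&thr_count] _temporary_;
  do i = 1 to dim(thresholds);
    my_thresholds[i] = thresholds[i];
  end;
  if _n_ = 1 then do;
    n = dim(my_thresholds);
    put 'Temporary array size: ' n=;
  end;
run;
```

The same substitution should work inside the %do year loop from the question, with [&thr_count] replacing the hardcoded [2].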
My initial dataset has 14000 distinct values of the variable STID, with 10^5 observations for each.
I would like to run some procedures BY each STID, write the modifications to a dataset per STID, and then stack all the STID datasets into one big dataset WITHOUT having to keep all the temporary STID datasets.
I started writing a macro:
data HAVE;
input stid $ NumVar1 NumVar2;
datalines;
a 5 45
b 6 2
c 5 3
r 2 5
f 4 4
j 7 3
t 89 2
e 6 1
c 3 8
kl 1 6
h 2 3
f 5 41
vc 58 4
j 5 9
ude 7 3
fc 9 11
h 6 3
kl 3 65
b 1 4
g 4 4
;
run;
/* to save all distinct values of THE VARIABLE stid into macro variables
where &N_VAR - total number of distinct variable values */
proc sql;
select count(distinct stid)
into :N_VAR
from HAVE;
select distinct stid
into :stid1 - :stid%left(&N_VAR)
from HAVE;
quit;
%macro expand_by_stid;
/*STEP 1: create datasets by STID*/
%do i=1 %to &N_VAR.;
data stid&i;
set HAVE;
if stid="&&stid&i";
run;
/*STEP 2: from here data modifications for each STID-data (with procs and data steps, e.g.)*/
data modified_stid&i;
set stid&i;
NumVar1_trans=NumVar1**2;
NumVar2_trans=NumVar1*NumVar2;
run;
%end;
/*STEP 3: from here should be some code lines that set together all created datsets under one another and delete them afterwards*/
data total;
set %do n=1 %to &N_VAR.;
modified_stid&n;
%end;
run;
proc datasets library=usclim;
delete <ALL DATA SETS by SPID>;
run;
%mend expand_by_stid;
%expand_by_stid;
But the last step does not work. How can I do it?
You're very close - all you need to do is remove the semicolon in the macro loop and put it after the %end in step 3, as below:
data total;
set
%do n=1 %to &N_VAR.;
modified_stid&n
%end;;
run;
This then produces the statement you were after:
set modified_stid1 modified_stid2 .... ;
instead of what your macro was originally generating:
set modified_stid1; modified_stid2; ...;
Finally, you can delete all the temporary datasets by using the stid: and modified_stid: name prefixes in the delete statement:
proc datasets library=usclim;
delete stid: modified_stid: ;
run;
quit;
I'm trying to write robust code to assign values to macro variables. I want the names of the macro variables to depend on values coming from the variable 'subgroup'. So subgroup could equal 1, 2, or 45 etc., giving macro variable names trta_1, trta_2, trta_45 etc.
Where I am having difficulty is calling the macro variable name. So instead of calling e.g. &trta_1 I want to call &trta_%SCAN(&subgroups, &k), which resolves to trta_1 on the first iteration. I've used a %SCAN function in the macro variable name, which is throwing up a warning 'WARNING: Apparent symbolic reference TRTA_ not resolved.'. However, the macro variables have been created with values assigned.
How can I resolve the warning? Is there a function I could run with the %SCAN function to get this to work?
data data1 ;
input subgroup trta trtb ;
datalines ;
1 30 58
2 120 450
3 670 3
run;
%LET subgroups = 1 2 3 ;
%PUT &subgroups;
%MACRO test;
%DO k=1 %TO 3;
DATA test_&k;
SET data1;
WHERE subgroup = %SCAN(&subgroups, &k);
CALL SYMPUTX("TRTA_%SCAN(&subgroups, &k)", trta, 'G');
CALL SYMPUTX("TRTB_%SCAN(&subgroups, &k)", trtb, 'G');
RUN;
%PUT "&TRTA_%SCAN(&subgroups, &k)" "&TRTB_%SCAN(&subgroups, &k)";
%END;
%MEND test;
%test;
Using the structure you've provided the following will achieve the result you're looking for.
data data1;
input subgroup trta trtb;
datalines;
1 30 58
2 120 450
3 670 3
;
run;
%LET SUBGROUPS = 1 2 3;
%PUT &SUBGROUPS;
%MACRO TEST;
%DO K=1 %TO 3;
%LET X = %SCAN(&SUBGROUPS, &K) ;
data test_&k;
set data1;
where subgroup = &X ;
call symputx(cats("TRTA_",&X), trta, 'g');
call symputx(cats("TRTB_",&X), trtb, 'g');
run;
%PUT "&&TRTA_&X" "&&TRTB_&X";
%END;
%MEND TEST;
%TEST;
However, I'm not sure this approach is particularly robust. If your list of subgroups changes, you'd need to change the 'K' loop manually. It would be better to determine the upper bound of the loop dynamically by counting the elements in your subgroup list.
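One way to do that counting is the COUNTW function called through %SYSFUNC, sketched below. The macro name list_subgroups and the variable n_subgroups are illustrative assumptions, and the list is assumed to be blank-delimited.

```sas
%let subgroups = 1 2 45;

/* Count blank-delimited words in the list; %str( ) passes a single blank as the delimiter */
%let n_subgroups = %sysfunc(countw(&subgroups, %str( )));
%put NOTE: n_subgroups=&n_subgroups;

%macro list_subgroups;
  %local k x;
  /* The upper bound now follows the list instead of being hardcoded */
  %do k = 1 %to &n_subgroups;
    %let x = %scan(&subgroups, &k, %str( ));
    %put NOTE: element &k is &x;
  %end;
%mend list_subgroups;
%list_subgroups
```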
If you want to call the macro variables you've created later in your code, you could use a similar method.
data data2;
input subgroup value;
datalines;
1 20
2 25
3 15
45 30
;
run ;
%MACRO TEST2;
%DO K=1 %TO 3;
%LET X = %SCAN(&SUBGROUPS, &K) ;
data data2 ;
set data2 ;
if subgroup = &X then percent = value/&&TRTB_&X ;
format percent percent9.2 ;
run ;
%END;
%MEND TEST2;
%TEST2 ;
Effectively, you're re-writing data2 on each iteration of the loop.
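Both macros above rely on the double ampersand, so a short sketch of how that reference resolves may be useful. The values here are made up for illustration.

```sas
%let trta_1 = 30;
%let x = 1;

/* Pass 1: && becomes & and &x becomes 1, leaving &trta_1.
   Pass 2: &trta_1 resolves to 30. */
%put &&trta_&x;

/* With a single ampersand, the macro processor would instead look for a
   macro variable literally named TRTA_, which is what the warning in the
   question reports. */
```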
This should cover your requirements. You can load and unload an array of macro variables without a macro. I have included an alternative method of unloading a macro variable array with a macro for comparison.
Load values into macro variables including Subgroup number within macro variable name e.g. TRTA_45.
data data1;
input subgroup trta trtb;
call symput ('TRTA_'||compress (subgroup), trta);
call symput ('TRTB_'||compress (subgroup), trtb);
datalines;
1 30 58
2 120 450
3 670 3
45 999 111
;
run;
No need for macro to load or refer to macro variables.
%put TRTA_45: &TRTA_45.;
%let Subgroup_num = 45;
%put TRTB_&subgroup_num.: &&TRTB_&subgroup_num.;
If you need to loop through the macro variables then you can use Proc SQL to generate a list of subgroups.
proc sql noprint;
select subgroup
, count (*)
into :subgroups separated by ' '
, :No_Subgroups
from data1
;
quit;
%put Subgroups: &subgroups.;
%put No_Subgroups: &No_Subgroups.;
Use a macro to loop through the macro variable array and populate a table.
%macro subgroups;
data subgroup_data_macro;
%do i = 1 %to &no_subgroups.;
%PUT TRTA_%SCAN(&subgroups, &i ): %cmpres(&TRTA_%SCAN(&subgroups, &i ));
%PUT TRTB_%SCAN(&subgroups, &i ): %cmpres(&TRTB_%SCAN(&subgroups, &i ));
subgroup = %SCAN(&subgroups, &i );
TRTA = %cmpres(&TRTA_%SCAN(&subgroups, &i ));
TRTB = %cmpres(&TRTB_%SCAN(&subgroups, &i ));
output;
%end;
run;
%mend subgroups;
%subgroups;
Or use a data step (outside a macro) to loop through the macro variable array and populate a table.
data subgroup_data_sans_macro;
do i = 1 to &no_subgroups.;
subgroup = SCAN("&subgroups", i );
TRTA = input (symget (compress ('TRTA_'||subgroup)),20.);
TRTB = input (symget (compress ('TRTB_'||subgroup)),20.);
output;
end;
run;
Ensure both methods (within and without a macro) produce the same result.
proc compare
base = subgroup_data_sans_macro
compare = subgroup_data_macro
;
run;