Several if statements

I want to flag Komp and Bauspar: '-' if the value is < 1, '+' if it is > 1, and no flag if it is blank.
I tried the following, but somehow it produces two 2022_Bauspar_flag columns.
Can you give me a hint?
Thanks a lot.
Kind regards,
Ben
%macro target_years2(table,type);
%local name_Bauspar name_Komp;
data &table ;
set work.&table;
%let name_Komp = "2022_ZZ_Komp"n;
%let name_Bauspar = "2022_ZZ_Bauspar"n;
&name_Komp = (1+("2022_Komposit"n-"2022_Komposit_Ziel"n)/"2022_Komposit_Ziel"n);
&name_Bauspar = (1+("2022_Bausparen"n-"2022_Bausparen_Ziel"n)/"2022_Bausparen_Ziel"n);
/*create ZZ_flags*/
if &name_Komp > 1 THEN do;
"2022_ZZ_Komp_flag"n = '+';
end;
else if &name_Komp < 1 and &name_Komp <> . THEN do;
"2022_ZZ_Komp_flag"n = '-';
end;
else if &name_Bauspar > 1 THEN do;
"2022_ZZ_Baupar_flag"n = '+';
end;
else if &name_Bauspar < 1 and &name_Bauspar <> . THEN do;
"2022_ZZ_Bauspar_flag"n = '-';
end;
else do;
end;
run;
%mend;
%target_years2(Produktion_temp,Produktion)

It is difficult to help because you do not show any output or explain in detail what is wrong.
Note that if you want to compute both columns for each observation, you need to split your IF statement into two. In a single IF/ELSE IF chain, the Bauspar conditions are never evaluated when one of the Komp conditions is true.
I understand you want to compute two derived columns, 2022_ZZ_Komp_flag and 2022_ZZ_Bauspar_flag, with the following conditions:
if associated macro variable &name_ > 1 then flag is +
if associated macro variable &name_ < 1 then flag is -
if associated macro variable &name_ = . then flag is missing
With the following dataset
data have;
input zz_komp zz_baupar;
cards;
0.9 1.1
1.1 0.8
. 2
0.8 .
;
The following code
data want;
set have;
"2022_ZZ_Komp_flag"n = ifc(zz_komp > 1, '+', '-');
"2022_ZZ_Baupar_flag"n = ifc(zz_baupar > 1, '+', '-');
if missing(zz_komp) then "2022_ZZ_Komp_flag"n = '';
if missing(zz_baupar) then "2022_ZZ_Baupar_flag"n = '';
run;
Produces
Is it the expected result?

You have a typo in your code: you assign to Baupar_flag in one branch and to Bauspar_flag in the other, which is why you end up with two columns.
else if &name_Bauspar > 1 THEN do;
"2022_ZZ_Baupar_flag"n = '+';
------
end;
else if &name_Bauspar < 1 and &name_Bauspar <> . THEN do;
"2022_ZZ_Bauspar_flag"n = '-';
-------
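Combining the two answers above, a minimal sketch of the data step with the typo fixed and each flag in its own IF/ELSE chain might look like the following. It assumes VALIDVARNAME=ANY is in effect (needed for the name literals) and that the `<> .` in the question was meant as "not equal". In a DATA step expression `<>` is the MAX operator, so missing() is used here instead.

```sas
options validvarname=any;  /* assumption: required for names such as "2022_ZZ_Komp"n */

data Produktion_temp;
  set work.Produktion_temp;
  length "2022_ZZ_Komp_flag"n "2022_ZZ_Bauspar_flag"n $1;

  "2022_ZZ_Komp"n    = 1 + ("2022_Komposit"n  - "2022_Komposit_Ziel"n)  / "2022_Komposit_Ziel"n;
  "2022_ZZ_Bauspar"n = 1 + ("2022_Bausparen"n - "2022_Bausparen_Ziel"n) / "2022_Bausparen_Ziel"n;

  /* Komp flag: its own chain, so it never blocks the Bauspar flag */
  if "2022_ZZ_Komp"n > 1 then "2022_ZZ_Komp_flag"n = '+';
  else if not missing("2022_ZZ_Komp"n) and "2022_ZZ_Komp"n < 1
    then "2022_ZZ_Komp_flag"n = '-';

  /* Bauspar flag: same logic, same column name in both branches */
  if "2022_ZZ_Bauspar"n > 1 then "2022_ZZ_Bauspar_flag"n = '+';
  else if not missing("2022_ZZ_Bauspar"n) and "2022_ZZ_Bauspar"n < 1
    then "2022_ZZ_Bauspar_flag"n = '-';
run;
```

Values that are missing or exactly equal to 1 leave the corresponding flag blank.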

Related

How can I transform this code into macro?

I'd like to write a macro that mixes PROC SQL and a data step. I have the following code in SAS:
data work.calendar;
set work.calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
run;
I'd like to put it into macro code, but it doesn't work.
My macro code:
%macro1();
data work.job_calendar;
set work.job_calendar;
%if business_day_count^=-1 %then %do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
%end;
else
business_day_count = -1;
run;
%mend;
The whole code is:
%macro update_day(date);
proc sql;
update work.job_calendar
set business_day_count =
case when datepart(calendar_date) = "&date"d then -1
else business_day_count
end;
quit;
proc sql;
update work.job_calendar
set status_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set daily_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set weekly_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set monthly_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
%mend;
The error is: ERROR 180-322: Statement is not valid or it is used out of proper order. I don't know what I'm doing wrong.
You're mixing data step and macro code. I'm not sure what you're trying to achieve, so I can't propose an alternative, but replacing the macro %IF/%THEN with a data step IF/THEN will allow your code to work.
%macro macro1();
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
run;
%mend;
%macro1;
Here is a tutorial on converting working code to a macro and one on overall macro programming.
To define a macro you need to use the %MACRO statement. Also, why did you change lines 3 and 7 from data step statements to macro statements?
That code cannot work. First, the %IF is always true, since the string business_day_count is never going to equal the string -1. Second, you have an ELSE statement without any preceding IF statement.
Try something like this instead.
%macro macro1();
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else business_day_count = -1;
run;
%mend macro1;
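To illustrate the difference between the two kinds of IF, here is a hedged sketch. The macro name renumber and the parameters ds and debug are hypothetical and not from the question. The data step IF is evaluated for every observation at run time, while the macro %IF is evaluated once, while the code is being generated, and can only test macro values.

```sas
%macro renumber(ds=work.job_calendar, debug=N);
  data &ds;
    set &ds;
    /* Data step IF: checked against the data, row by row */
    if business_day_count ^= -1 then do;
      num_seq + 1;
      business_day_count = num_seq;
    end;
    else business_day_count = -1;
    drop num_seq;
    /* Macro %IF: decides whether this statement is generated at all */
    %if %upcase(&debug) = Y %then %do;
      putlog business_day_count=;
    %end;
  run;
%mend renumber;

%renumber(debug=Y)
```

With debug=N, the PUTLOG statement is never part of the generated data step.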

How to create variables whose names concatenate two array variable names

I have an HCC dataset, DATA_HCC, with a member ID and 79 binary variables:
Member_ID HCC1 HCC2 HCC6 HCC8 ... HCC189
XXXXXXX1 1 0 1 0 ... 0
XXXXXXX2 0 0 1 0 ... 0
XXXXXXX3 0 1 0 0 ... 1
I am trying to create an output dataset with a new binary variable for every pairwise combination of those 79 variables. Each new variable indicates whether a member had both variables equal to 1.
%LET hccList = HCC1 HCC2 HCC6 HCC8 HCC9 HCC10 HCC11 HCC12 HCC17 HCC18 HCC19 HCC21 HCC22 HCC23 HCC27
HCC28 HCC29 HCC33 HCC34 HCC35 HCC39 HCC40 HCC46 HCC47 HCC48 HCC54 HCC55 HCC57 HCC58
HCC70 HCC71 HCC72 HCC73 HCC74 HCC75 HCC76 HCC77 HCC78 HCC79 HCC80 HCC82 HCC83 HCC84
HCC85 HCC86 HCC87 HCC88 HCC96 HCC99 HCC100 HCC103 HCC104 HCC106 HCC107 HCC108 HCC110
HCC111 HCC112 HCC114 HCC115 HCC122 HCC124 HCC134 HCC135 HCC136 HCC137 HCC157 HCC158
HCC161 HCC162 HCC166 HCC167 HCC169 HCC170 HCC173 HCC176 HCC186 HCC188 HCC189;
DATA COUNT_HCC; SET DATA_HCC;
ARRAY HCC [*] &hccList.;
DO i = 1 TO DIM(HCC);
DO j = i+1 TO DIM(HCC);
%LET HCC_COMBO = CATX('_', VARNAME(HCC[i]), VARNAME(HCC[j]));
&HCC_COMBO. = MIN(HCC[i], HCC[j]);
END;
END;
RUN;
I tried to use the CATX function to simply concatenate the two variable names, but it didn't work.
Here is the log error that I got:
ERROR: Undeclared array referenced: CATX.
ERROR: Variable CATX has not been declared as an array.
ERROR 71-185: The VARNAME function call does not have enough arguments.
The output should look like this:
Member_ID HCC1_HCC2 HCC1_HCC6 HCC1_HCC8 ... HCC188_HCC189
XXXXXXX1 0 1 0 ... 0
XXXXXXX2 0 0 0 ... 0
XXXXXXX3 0 0 0 ... 1
To achieve dynamic variable name generation, use a macro to create the variables that you need. The below code generates dynamic variable names and generates data step code to create the variables.
%macro get_hcc_combo_mins;
%do i = 1 %to %sysfunc(countw(&hccList.));
%do j = %eval(&i.+1) %to %sysfunc(countw(&hccList.));
%let hcc1 = %scan(&hccList., &i.);
%let hcc2 = %scan(&hccList., &j.);
&hcc1._&hcc2. = min(&hcc1., &hcc2.);
%end;
%end;
%mend;
DATA COUNT_HCC; SET DATA_HCC;
ARRAY HCC [*] &hccList.;
%get_hcc_combo_mins;
RUN;
The macro %get_hcc_combo_mins generates this code in the data step:
HCC1_HCC2 = min(HCC1, HCC2);
HCC1_HCC6 = min(HCC1, HCC6);
HCC1_HCC8 = min(HCC1, HCC8);
...
There may be other ways to do this all within one data step that I'm not aware of, but macros can get the job done.
A DATA step with CALL LEXCOMB can generate the variable name pairs, and CALL EXECUTE can then submit a statement using those names.
Example:
Assume the variables are named HCC:, with the specific ones not known a priori.
data have;
call streaminit(1234);
do id = 1 to 100;
array hcc hcc1 hcc3 hcc5 hcc7 hcc10-hcc79 hcc150 hcc155 hcc180 hcc190-hcc191;
do over hcc;
hcc = rand('uniform', dim(hcc)) < _i_;
end;
output;
end;
run;
data _null_;
set have;
array hcc hcc:;
do _n_ = 1 to dim(hcc);
hcc(_n_) = _n_;
end;
call execute("data pairwise; set have;");
do _n_ = 1 to comb(dim(hcc),2);
call lexcomb(_n_, 2, of hcc(*));
index1 = hcc(1);
index2 = hcc(2);
name1 = vname(hcc(index1));
name2 = vname(hcc(index2));
put name1=;
call execute (cats(
catx( '_',name1,name2),
'=',
catx(' and ',name1,name2),
';'
));
end;
call execute('run;');
stop;
run;
See if you can use this as a template.
/* Example data */
data have (drop = i j);
array h {*} HCC1 HCC2 HCC6 HCC8 HCC9 HCC10 HCC11 HCC12 HCC17 HCC18 HCC19 HCC21 HCC22 HCC23 HCC27
HCC28 HCC29 HCC33 HCC34 HCC35 HCC39 HCC40 HCC46 HCC47 HCC48 HCC54 HCC55 HCC57 HCC58
HCC70 HCC71 HCC72 HCC73 HCC74 HCC75 HCC76 HCC77 HCC78 HCC79 HCC80 HCC82 HCC83 HCC84
HCC85 HCC86 HCC87 HCC88 HCC96 HCC99 HCC100 HCC103 HCC104 HCC106 HCC107 HCC108 HCC110
HCC111 HCC112 HCC114 HCC115 HCC122 HCC124 HCC134 HCC135 HCC136 HCC137 HCC157 HCC158
HCC161 HCC162 HCC166 HCC167 HCC169 HCC170 HCC173 HCC176 HCC186 HCC188 HCC189;
do i = 1 to 10;
do j = 1 to dim (h);
h [j] = rand('uniform') > .5;
end;
output;
end;
run;
/* Create long version of output data */
data temp (drop = i j);
set have;
array a {*} HC:;
do i = 1 to dim (a)-1;
do j = i+1 to dim (a);
v = catx('_', vname (a[i]), vname (a[j]));
d = a [i] * a [j];
n = _N_;
output;
end;
end;
run;
/* Transpose to wide format */
proc transpose data=temp out=temp2 (drop=_: n);
by n;
id v;
var d;
run;
/* Merge back with original data */
data want;
merge have temp2;
run;

Create a variable inside WHEN-DO statement SAS

I want to create two macro variables, "list" and "list2", depending on the value of prod, but they always end up with the values from the last branch.
Thanks
%let prod=WC;
SELECT ;
WHEN (WC = &prod)
DO;
%let list = (60 , 63 );
%let list2= ("6A","6B","6C") ;
END;
WHEN (MT = &prod)
DO;
%let list = (33 , 34);
%let list2= ("3A","3B");
END;
OTHERWISE;
END;
RUN;
The macro processor works before the generated code is passed to SAS to interpret. So your code is evaluated in this order:
%let prod=WC;
%let list = (60 , 63 );
%let list2= ("6A","6B","6C") ;
%let list = (33 , 34);
%let list2= ("3A","3B");
data ...
SELECT ;
WHEN (WC = &prod) DO;
END;
WHEN (MT = &prod) DO;
END;
OTHERWISE;
END;
...
RUN;
To set macro variables from a running data step, use the CALL SYMPUTX routine. Also, are you really trying to compare the variable WC to the variable MT? Does the data in your data step even have those variables? Or did you want to compare the text WC to the text MT?
when ("WC" = "&prod") do;
call symputx('list','(60,63)');
call symputx('list2','("6A","6B","6C")') ;
end;
Use CALL SYMPUT in a data step (see CALL SYMPUT in the SAS documentation).
So your statements will be something like:
call symput("list", "(60 , 63 )");
Hope this helps :-)
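Putting the pieces together, a complete sketch could look like this. It assumes the intent is to compare the text in &prod against the literals WC and MT, and it uses a DATA _NULL_ step only as a place to run the logic.

```sas
%let prod = WC;

data _null_;
  /* Compare the resolved macro value as text, at data step run time */
  select ("&prod");
    when ("WC") do;
      call symputx('list',  '(60, 63)');
      call symputx('list2', '("6A","6B","6C")');
    end;
    when ("MT") do;
      call symputx('list',  '(33, 34)');
      call symputx('list2', '("3A","3B")');
    end;
    otherwise;  /* leave the macro variables untouched */
  end;
run;

/* The macro variables are available once the step has finished */
%put list=&list list2=&list2;
```

Because CALL SYMPUTX runs only inside the branch that is actually selected, only one pair of values is assigned.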

Automate check for number of distinct values SAS

Looking to automate some checks and print some warnings to a log file. I think I've gotten the general idea but I'm having problems generalising the checks.
For example, I have two datasets my_data1 and my_data2. I wish to print a warning if nobs_my_data2 < nobs_my_data1. Additionally, I wish to print a warning if the number of distinct values of the variable n in my_data2 is less than 11.
Some dummy data and an attempt of the first check:
%LET N = 1000;
DATA my_data1(keep = i u x n);
a = -1;
b = 1;
max = 10;
do i = 1 to &N - 100;
u = rand("Uniform"); /* decimal values in (0,1) */
x = a + (b-a) * u; /* decimal values in (a,b) */
n = floor((1 + max) * u); /* integer values in 0..max */
OUTPUT;
END;
RUN;
DATA my_data2(keep = i u x n);
a = -1;
b = 1;
max = 10;
do i = 1 to &N;
u = rand("Uniform"); /* decimal values in (0,1) */
x = a + (b-a) * u; /* decimal values in (a,b) */
n = floor((1 + max) * u); /* integer values in 0..max */
OUTPUT;
END;
RUN;
DATA _NULL_;
FILE "\\filepath\log.txt" MOD;
SET my_data1 NOBS = NOBS1 my_data2 NOBS = NOBS2 END = END;
IF END = 1 THEN DO;
PUT "HERE'S A HEADER LINE";
END;
IF NOBS1 > NOBS2 AND END = 1 THEN DO;
PUT "WARNING!";
END;
IF END = 1 THEN DO;
PUT "HERE'S A FOOTER LINE";
END;
RUN;
How can I set up the check for the number of distinct values of n in my_data2?
A PROC SQL way to do it:
%macro nobsprint(tab1,tab2);
options nonotes; *suppresses all notes;
proc sql;
select count(*) into:nobs&tab1. from &tab1.;
select count(*) into:nobs&tab2. from &tab2.;
select count(distinct n) into:distn&tab2. from &tab2.;
quit;
%if &&nobs&tab2. < &&nobs&tab1. %then %put |WARNING! &tab2. has less recs than &tab1.|;
%if &&distn&tab2. < 11 %then %put |WARNING! distinct VAR n count in &tab2. less than 11|;
options notes; *overrides the previous option;
%mend nobsprint;
%nobsprint(my_data1,my_data2);
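This would break if you had to specify librefs with the dataset names, because the period in a two-level name such as work.my_data1 is not valid in a macro variable name. You can also use PROC PRINTTO with LOG= to send the log to a file.
To write only the %PUT messages to that file, wrap the macro call like this: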
filename mylog temp;
proc printto log=mylog; run;
options nomprint nomlogic;
%nobsprint(my_data1,my_data2);
proc printto; run;
This won't print any extraneous text to the SAS log other than your custom warnings.
@samkart provided perhaps the most direct and easily understood way to compare the obs counts. Another consideration is performance: if your data sets have millions of obs, you can get the counts without reading the entire data set.
One method is to use the NOBS= option in the SET statement as you did in your code, but your step reads the data sets unnecessarily. The following gets the counts and compares them without reading the observations.
data _null_;
if nobs1 ne nobs2 then putlog 'WARNING: Obs counts do not match.';
stop;
set sashelp.cars nobs=nobs1;
set sashelp.class nobs=nobs2;
run;
WARNING: Obs counts do not match.
Another option is to get the counts from sashelp.vtable or dictionary.tables. Note that you can only query dictionary.tables with proc sql.
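As a hedged sketch of the dictionary.tables route combined with the distinct-value check from the question: it assumes both tables live in WORK, that SAS 9.3 or later is available for the TRIMMED keyword, and that the file path is the placeholder from the question. The macro variable names nobs1, nobs2 and n_distinct are illustrative.

```sas
proc sql noprint;
  /* Row counts come from metadata, so the tables are not scanned */
  select nobs into :nobs1 trimmed
    from dictionary.tables
    where libname = 'WORK' and memname = 'MY_DATA1';
  select nobs into :nobs2 trimmed
    from dictionary.tables
    where libname = 'WORK' and memname = 'MY_DATA2';
  /* Counting distinct values of n does require reading my_data2 */
  select count(distinct n) into :n_distinct trimmed
    from my_data2;
quit;

data _null_;
  file "\\filepath\log.txt" mod;  /* placeholder path from the question */
  put "HERE'S A HEADER LINE";
  if &nobs2 < &nobs1 then
    put "WARNING: my_data2 has fewer obs than my_data1.";
  if &n_distinct < 11 then
    put "WARNING: my_data2 has only &n_distinct distinct values of n.";
  put "HERE'S A FOOTER LINE";
run;
```

The DATA _NULL_ step has no SET statement, so it executes exactly once and writes the header, any warnings, and the footer.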

Find the next non blank value in SAS

I am trying to linearly interpolate values within a panel data set. To do so, I need to find the next non-missing value of a variable whenever its current value is ".".
For example, if X = {1, 2, ., ., ., 7}, I want to store 7 in a variable "Y" and subtract the lag value of X from it to get the numerator of the slope. Can anyone help with this step?
If you cannot transpose your data, here is a way that will work for your given example:
data test;
input id $3. x best12.;
datalines;
AAA 1
BBB 2
CCC .
DDD .
EEE .
FFF 7
;
run;
data test2;
set test;
n = _n_;
if x ne .;
run;
data test3;
set test2;
lagx = lag(x);
lagn = lag(n);
if _n_ > 1 and n ne lagn + 1 then do;
positiondiff = n - lagn;
valuediff = x - lagx;
do i = (lagx + ((x-lagx)/(n-lagn))) to x by ((x-lagx)/(n-lagn));
x = i;
output;
end;
end;
else output;
keep x;
run;
data test4;
merge test test3 (rename = (x=newx));
run;
So we are basically rebuilding the variable with the interpolated values and then merging it back into the original dataset without a BY variable, which lines up the new interpolated values with the missing points.
Is there a way you could transpose all your data? Interpolating like that is much easier when all the data you need is in a single observation. Like this:
data test;
input x best12.;
datalines;
1
2
.
.
.
7
;
run;
proc transpose data = test
out = test2;
run;
data test3;
set test2;
array xvalues {*} COL1-COL6;
array interpol {4,10} begin1-begin10 end1-end10 begposition1-begposition10 endposition1-endposition10;
rangenum = 1;
* Find the endpoints of the missing ranges;
do i = 1 to dim(xvalues);
if xvalues{i} ne . then lastknownx = xvalues{i};
else do;
interpol{1,rangenum} = lastknownx;
if interpol{3,rangenum} = . then interpol{3,rangenum} = i - 1;
end;
if i > 1 and xvalues{i} ne . then do;
if xvalues{i-1} = . then do;
interpol{2,rangenum} = xvalues{i};
interpol{4,rangenum} = i;
rangenum = rangenum + 1;
end;
end;
end;
* Interpolate;
rangenum = 1;
do j = 1 to dim(xvalues);
if xvalues{j} = . then do;
xvalues{j} = interpol{1,rangenum} + (j-interpol{3,rangenum})*((interpol{2,rangenum}-interpol{1,rangenum})/(interpol{4,rangenum}-interpol{3,rangenum}));
end;
else if j > 1 and xvalues{j} ne . then do;
if xvalues{j-1} = . then rangenum = rangenum + 1;
end;
end;
keep col1-col6;
run;
That can handle up to 10 different missing ranges per observation, though you could tweak the code to handle much more than that by creating bigger arrays.
The SAS data step reads a dataset one record at a time from top to bottom. So at record i, it can't access i+1 because it hasn't read it yet; it can only access i-1. Assume you have a dataset with a variable x.
data intrpl;
retain _x;
set yourdata;
by x notsorted;
if not missing(x) then do;
_x = x;
if last.x then do;
slope = _x - lag(_x);
output;
end;
end;
run;
Transposing can get kind of messy if x takes on a lot of values, so I recommend this method. I hope it helps!
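Another possible way to get the next non-missing value is to read the data backwards with the POINT= option and carry the most recent non-missing x along. This is only a sketch: it assumes the dataset is the `test` dataset from the first answer, and the names next_x and row are made up for illustration.

```sas
data next_known;
  /* Walk from the last observation to the first */
  do p = nobs to 1 by -1;
    set test point=p nobs=nobs;
    /* In reverse order, the last non-missing x seen is the "next" one */
    if not missing(x) then next_x = x;
    row = p;  /* keep the original position; the POINT= variable is not written out */
    output;
  end;
  stop;  /* required with POINT=, otherwise the step never ends */
run;

/* Restore the original order */
proc sort data=next_known;
  by row;
run;
```

For rows where x is missing, next_x holds the next known value (7 in the example), which can then be combined with the lagged known value to form the slope.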