write conditional in SAS with DATA _NULL_ - sas

I am writing a conditional in SAS starts with DATA NULL
%LET today = today();
DATA _NULL_;
if day(today) ge 1 and day(today) le 15 then do;
date1=put(intnx('month',today,-1,'E'), date11.);
date2=put(intnx('month',today,-1,'L'), date11.);
end;
if day(today) > 15 then do;
date1=put(intnx('month',today,0,'B'), date11.);
date2=put(intnx('month',today,0,'L'), date11.);
end;
call symput('report_date',date1);
call symput('report_date2',date2);
RUN;
but with above, I am not getting any values for my report_dates.
the condition is:
date 1 = If the current date is greater than or equal to 1 and less than 16, set the date1 to the 16th of the previous month, otherwise set it to the 1st of the current month
date2 = If the current date is 16 and above, set the date2 to the 15th of the current month, otherwise set date2 to the last day of the previous month

The IF/THEN logic does not account for the missing value you passing to the DAY() function calls.
The variable TODAY is never created in the data step. So just remove the %LET statement and add an actual assignment statement instead.
DATA _NULL_;
today=today();
if day(today) ge 1 and day(today) le 15 then do;
...
Just because you used the same name for the macro variable in the %LET statement as you used for the variable in the data step does not imply that the two have anything at all to do with each other.
If you wanted to use the macro variable to generate the code for the data step you would need to replace the TODAY with &TODAY.
if day(&today) ge 1 and day(&today) le 15 then do;
So for the value you set to the macro variable TODAY it would mean that the SAS code you are trying to run is:
if day(today()) ge 1 and day(today()) le 15 then do;
Note that is actually not a good way to handle this problem because it is calling the TODAY() function multiple times. That could cause strange results if the data step started right before midnight. So the IF condition might be run on the 31st but when you get to the ELSE condition the clock has ticked over to the first of the next month.

Related

I am having difficulty using an IF statement within a SAS Macro

I am setting up a batch script in SAS, I want it to run monthly and output to table, the code works when running manually but I am having difficulty setting up the macros.
I have created 3 macros, one for the previous month (please ignore the hard coded 12 and the commented out code, I was just testing if it worked or not, the commented out code will remain and the hard coded 12 will be removed when I can get it working), one for the current year and one for the previous year. The issue I will have is that the script grabs month end data, so come January the year will be set to 2020 but the month will be looking for December data.
`%let pmonth = 12 /*%sysfunc(month(%sysfunc(intnx(month,"&sysdate"d ,-1))))*/;`
`%put &pmonth;`
`%let year1 = %sysfunc(year("&sysdate"d));`
`%put &year1;`
`%let year2 = %sysfunc(year(%sysfunc(intnx(year,"&sysdate"d ,-1))));`
`%put &year2;`
`%macro year;
%if &pmonth = 12 %then %do;
&year2;
%end;
%else %do;
&year1;
%end;
%mend;
%year;`
What I would like is for the macro year, which is used in my code, to select correctly between year1 and year2 based on the month macro pmonth. I have played around a bit with the macros but if I'm honest, its not an area of SAS I've used that often.
A little terminology first. Your code has one macro defined, %YEAR. It does have three macro variables named YEAR1, YEAR2 and PMONTH. And there is no macro variable named YEAR that your code is either attempting to create or use. And your code does not have any IF statements. But there is one %IF statement inside the definition of the YEAR macro.
You probably will want to convert your macro variable values into actual DATE values to deal with the year issue. So if your input is two macro variables with YEAR and MONTH you can convert that into a date.
%let year1=2018;
%let month=1;
%let date=%sysfunc(mdy(&month,1,&year1));
Then if you want to find the end of the previous month you can use the date in the INTNX() function call.
%let end_of_prev_month=%sysfunc(intnx(month,&date,-1,e));
Then if you need to you can get the YEAR and MONTH for that new date.
%let year2=%sysfunc(year(&end_of_prev_month));
%let month2=%sysfunc(month(&end_of_prev_month));

SAS Call SYMPUT - one macro variable, two values? (conditional)

I would like to create one macro called 'currency_rate' which calls the correct value depending on the conditions stipulated
(either 'new_rate' or static value of 1.5):
%macro MONEY;
%Do i=1 %to 5;
data get_currency_&i (keep=month code new_rate currency_rate);
set Table1;
if month = &i and code = 'USD' then currency_rate=new_rate;
else currency_rate=1.5;
run;
data _null_;
set get_currency_&i;
if month = &i and code = 'USD' then currency_rate=new_rate;
else currency_rate=1.5;
call symput ('currency_rate', ???);
run;
%End;
%mend MONEY;
%MONEY
I am happy with the do loop and first data step. It is the call symput I am stuck on. Is call symput the correct function to use, to assign two possible values to one macro?
A snippet example of the way I will be using 'currency_rate' in a proc sql:
t1.income/&currency_rate.
I am a beginner level SAS user, any guidance would be great!
Thanks
Let's simulate your case. Suppose we have 3 datasets, as shown below -
data get_currency_1;
input month code $ new_rate currency_rate;
cards;
1 USD 2 2
2 CHF 2 1.5
3 GBP 1 1.5
;
data get_currency_2;
input month code $ new_rate currency_rate;
cards;
1 USD 3 1.5
2 USD 4 4
3 JPY 0.5 1.5
;
data get_currency_3;
input month code $ new_rate currency_rate;
cards;
1 USD 1 1.5
2 USD 3 1.5
3 USD 2.5 2.5
;
Now, let's run your code where we assign a value to currency_rate.
Let i=1 So, the dataset get_currency_1 will be accessed. As we run the step, each and every row will be accessed and the value of currency_rate will be assigned to the macro variable currency_rate and this iteration will continue till the end of the data step. At this time, the last value will be of currency_rate will be the final value of macro variable currency_rate because beyond that the step ends.
%let i=1; /*Let's assign 1 to i*/
data _null_;
set get_currency_&i;
if month = &i and code = 'USD' then currency_rate=new_rate;
else currency_rate=1.5;
call symput ('currency_rate', currency_rate);
run;
%put Currency rate is: &currency_rate;
Currency rate is: 1.5
Let i=3:
%let i=3; /*Let's assign 3 to i*/
data _null_;
set get_currency_&i;
if month = &i and code = 'USD' then currency_rate=new_rate;
else currency_rate=1.5;
call symput ('currency_rate', currency_rate);
run;
%put Currency rate is: &currency_rate;
Currency rate is: 2.5
You cannot have multiple values on one macro variable.
You say you are a beginner, so the best course of action is to avoid macro programming at this point. You would be better served learning about where, merge (or join) and by statements.
You state you will need to use a currency_rate in a statement such as
t1.income / &currency_rate.
The t1. to me suggests t1 is an alias in a SQL join and thus the far more likely scenario is that you need to left join table t1 that contains incomes with table1 (call it monthly_datum) that contains the monthly currency rates.
select
t1.income / coaslesce(monthly_datum.currency_rates,1.5)
, …
from
income_data as t1
left join
monthly_datum
on t1.month = monthly_datum.month
The rate of 1.5 would be used when the income is associated with a month that is not present in monthly_datum.
A macro variable can only hold a single value.
Since you're only ever assigning a single value, you can easily use CALL SYMPUTX().
call symputx('currency_rate', currency_rate);
But if your data has more than one row, then the value will be the last value set in the data set.

SAS Retain statement - how to retain previous non missing values into new column for comparison

I am simply looking to create a new column retaining the value of a specific column within the previous record, so that I can compare the existing column against the new column. Further down the line I want to be able to output records whereby the values in both columns are different and lose the records where the values are the same
Essentially I want my dataset to look like this, where the first idr has the retained date set to null:
Idr Date1 Date2
1 20/01/2016 .
1 20/01/2016 20/01/2016
1 18/10/2016 20/01/2016
2 07/03/2016 .
2 18/05/2016 07/03/2016
2 21/10/2016 18/05/2016
3 29/01/2016 .
3 04/02/2016 29/01/2016
3 04/02/2016 04/02/2016
I have used code along the following lines in the past whereby I have created a temporary variable referencing the data I want to retain:
date_temp=date1;
data example2;
set example1;
by idr date1;
date_temp=date1;
retain date_temp ;
if first.idr then do;
date_temp=date1;
end;else do;
date2=date_temp;
end;
run;
I have searched the highs and lows of google - Any help would be greatly appreciated
The trick here is to set the value of the retained variable ready for the next row after you've already output the current row, rather than relying on the default implicit output at the end of the data step:
data example2;
set example1;
by idr;
retain date2;
if first.idr then call missing(date2);
output;
date2 = date1;
format date2 ddmmyy10.;
run;
Logic that executes after the output statement doesn't make any difference to the row that's just been output, but until the data step proceeds to the next iteration, everything is still in the PDV, including variables that are being retained across to the next row, so this is an opportunity to update them. More details on the PDV.
Another way of doing this, using the lag function:
data example3;
set example1;
date2 = lag(date1);
if idr ne lag(idr) then call missing(date2);
run;
Be careful when using lag - it returns the value from the last time that instance of the lag function was executed during your data step, not necessarily the value of that variable from the previous row, so weird things tend to happen if you do something like if condition then mylaggedvar=lag(var);
To achieve your final outcome (remove records where the idr and date are the same as the previous row), you can easily achieve this without creating the extra column. Providing the data is sorted by idr and date1, then just use first.date1 to keep the required records.
data have;
input Idr Date1 :ddmmyy10.;
format date1 ddmmyy10.;
datalines;
1 20/01/2016
1 20/01/2016
1 18/10/2016
2 07/03/2016
2 18/05/2016
2 21/10/2016
3 29/01/2016
3 04/02/2016
3 04/02/2016
;
run;
data want;
set have;
by idr date1;
if first.date1 then output;
run;

SAS Data Step | Between 2 Dates

Probably a simple question. I have a simple dataset with scheduled payment dates in it.
DATA INFORM2;
INFORMAT previous_pmt_date scheduled_pmt_date MMDDYY10.;
INPUT previous_pmt_date scheduled_pmt_date;
FORMAT previous_pmt_date scheduled_pmt_date MMDDYYS10.;
DATALINES;
11/16/2015 12/16/2015
12/17/2015 01/16/2016
01/17/2016 02/16/2016
;
What I'm trying to do is to create a binary latest row indicator. For example, If I wanted to know the latest row as of 1/31/2016 I'd want row 2 to be flagged as the latest row. What I had been doing before is finding out where 1/31/2016 is between the previous_pmt_date and the scheduled_pmt_date, but that isn't correct for my purposes. I'd like to do this in an data step as opposed to SQL subqueries. Any ideas?
Want:
previous_pmt_date scheduled_pmt_date latest_row_ind
11/16/2015 12/16/2015 0
12/17/2015 01/16/2016 1
01/17/2016 02/16/2016 0
Here's a solution that does it all in the single existing datastep without any additional sorting. First I'm going to modify your data slightly to include account as the solution really should take that into account as well:
DATA INFORM2;
INFORMAT previous_pmt_date scheduled_pmt_date MMDDYY10.;
INPUT account previous_pmt_date scheduled_pmt_date;
FORMAT previous_pmt_date scheduled_pmt_date MMDDYYS10.;
DATALINES;
1 11/16/2015 12/16/2015
1 12/17/2015 01/16/2016
1 01/17/2016 02/16/2016
2 11/16/2015 12/16/2015
2 12/17/2015 01/16/2016
2 01/17/2016 02/16/2016
;
run;
Specify a cutoff date:
%let cutoff_date = %sysfunc(mdy(1,31,2016));
This solution uses the approach from this question to save the variables in the next row of data, into the current row. You can drop the vars at the end if desired (I've commented out for the purposes of testing).
data want;
set inform2 end=eof;
by account scheduled_pmt_date;
recno = _n_ + 1;
if not eof then do;
set inform2 (keep=account previous_pmt_date scheduled_pmt_date
rename=(account = next_account
previous_pmt_date = next_previous_pmt_date
scheduled_pmt_date = next_scheduled_pmt_date)
) point=recno;
end;
else do;
call missing(next_account, next_previous_pmt_date, next_scheduled_pmt_date);
end;
select;
when ( next_account eq account and next_scheduled_pmt_date gt &cutoff_date ) flag='a';
when ( next_account ne account ) flag='b';
otherwise flag = 'z';
end;
*drop next:;
run;
This approach works by using the current observation in the dataset (obtained via _n_) and adding 1 to it to get the next observation. We then use a second set statement with the point= option to load in that next observation and rename the variables at the same time so that they don't overwrite the current variables.
We then use some logic to flag the necessary records. I'm not 100% of the logic you require for your purposes, so I've provided some sample logic and used different flags to show which logic is being triggered.
Some notes...
The by statement isn't strictly necessary but I'm including it to (a) ensure that the data is sorted correctly, and (b) help future readers understand the intent of the datastep as some of the logic requires this sort order.
The call missing statement is simply there to clean up the log. SAS doesn't like it when you have variables that don't get assigned values, and this will happen on the very last observation so this is why we include this. Comment it out to see what happens.
The end=eof syntax basically creates a temporary variable called eof that has a value of 1 when we get to the last observation on that set statement. We simply use this to determine if we're at the last row or not.
Finally but very importantly, be sure to make sure you are keeping only the variables required when you load in the second dataset otherwise you will overwrite existing vars in the original data.

How can I use different sets of date values depending on the date

I am looking to automate a daily report for my company but I have run in to a bit of trouble. The report gets updated only on the 2nd working day of each month. I found some code on the SAS website which works out what the 2nd working day of any month is.
data scdwrk;
/* advance date to the first day of the month using the INTNX function */
second=intnx('month',today(),0);
/* determine the day of the week using the WEEKDAY function */
day=weekday(second);
/* if day=Monday then advance by 1 */
if day=2 then second+1;
/* if day=Sunday then advance by 2 */
else if day=1 then second+2;
format second date9.;
run ;
I have also set a flag that compares todays date to the date from this generated by this piece of code.
I now need to find a way that if the code is run on the first working day of the month then it runs a particular set of macro date variables
%let start_date="&prevmnth;
%let end_date= &endprevmnth;
%let month= &prevyearmnth;
and then when its run on the 2nd working day of the month it uses the other set of macro date variables (calender month)
%let start_date="&currmnth;
%let end_date= &endcurrmnth;
%let month= &curryearmnth;
Any help on this would be greatly appreciated.
I have some recent code that does just this. Here is how I tackled it.
First, create a table of holidays. This can be maintained yearly.
Second, create a table with the first 5 days of the month that are not weekend days.
Third, delete holidays.
Finally, get the second value in the data set.
data holidays;
format holiday_date date9.;
informat holiday_date date9.;
input holiday_date;
datalines;
01JAN2015
19JAn2015
16FEB2015
03APR2015
25MAY2015
03JUL2015
07SEP2015
26NOV2015
25DEC2015
;
data _dates;
firstday = intnx('month',today(),0);
format firstday date date9.;
do date=firstday to firstday+5;
if 1 < weekday(date) < 7 then
output;
end;
run;
proc sql noprint;
delete from _dates
where date in (select holiday_date from holidays);
quit;
data _null_;
set _dates(firstobs=2);
call symput("secondWorkDay",put(date,date9.));
stop;
run;
%put &secondWorkDay;