Check for valid dates using SAS macro - sas

I have written a macro to check for invalid dates and set them to '11111111', but I got an unexpected NOTE: Invalid argument to function INPUT. The reason for the note is that the source data has the date "0212-04-26", which is outside the range of SAS dates (A.D. 1582 to A.D. 19,900). So now I'm looking for a way to check for invalid dates without any message in the log.
My Code:
%macro chkdate(datefld=, num_date=) ;
/* invalid date gets set to '11111111' */
if &datefld ne '0001-01-01' then do ;
t_date = input(compress(&datefld, '-'), yymmdd8.) ;
if t_date eq . then do ;
%errors( key_desc=Invalid Date, fname=&datefld,)
&datefld = '11111111' ;
end;
end;
%mend chkdate ;
thanks

Different approaches exist...
I'd recommend reading this paper which illustrates the ? and ?? informat modifiers to be used with input().
You could, among other avenues, start by using input() with the informat 8. and check that the result is between, for instance, 19600101 and 20250101 (or whichever numbers are appropriate), and only then do the input() with yymmdd8..
If you want to be more thorough, you could use input() with 3 substrings and check separately the year, month and day parts (using between 1960 and 2020, between 1 and 12, etc.).
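As a minimal sketch of the two ideas above, assuming the date arrives as a character value in yyyy-mm-dd form: the ?? modifier suppresses both the log note and the setting of _ERROR_, and a plain numeric range check runs before the date informat is applied. The dataset name check, the variable ymd, the sample rows and the range limits are made up for illustration.

```sas
/* Hypothetical sample data; datefld stands in for the real character date field */
data check;
  length datefld $10;
  input datefld $10.;

  /* Read the digits as a plain number first; ?? keeps the log quiet on bad input */
  ymd = input(compress(datefld, '-'), ?? 8.);

  /* Only attempt the date conversion when the number is in a plausible range */
  if 19600101 <= ymd <= 20250101 then
    t_date = input(compress(datefld, '-'), ?? yymmdd8.);
  else t_date = .;

  /* Anything that did not convert is flagged the same way as in the question */
  if t_date = . then datefld = '11111111';

  format t_date yymmdd10.;
  datalines;
2021-04-26
0212-04-26
2021-02-30
;
run;
```

The range check catches values such as 0212-04-26 before the date informat sees them, while ?? covers values that are in range but still not real dates, such as 2021-02-30.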

Related

SAS character and numeric change with set statement

I am working to merge two data sets and get the following error:
Variable DOB has been defined as both character and numeric.
Here is my code. I know I need a set statement to change the character to numeric. I was thinking:
DATA Merged1;
SET Aug21 Aug22;
RUN;
set (rename=(DOB=DOBnum));
length DOB $ 10.;
DOB= put(DOBnum,f10. -L);
drop DOBnum;
Would this be placed before my Set statement to merge to Aug 21 Aug 22?
Thank you!
I tried to run the code but it would not merge, and I am unsure where the SET statement for DOB would go.
You do not need the second SET statement. You need to add the RENAME= dataset option to the dataset where it is mentioned in the first SET statement.
So something like:
DATA BOTH;
SET Aug21 Aug22(in=in2 rename=(DOB=DOBnum));
if in2 then DOB= put(DOBnum,f10. -L);
drop DOBnum;
RUN;
To get a more detailed answer provide more details about the variables and the types of values they contain. For example if DOB means Date of Birth then it does not make much sense to use the F format. If DOB should be an actual DATE then it should be numeric and not character. And if the version that is numeric has actual date values then converting them to text using the F format is going to generate strings that will be confusing for humans.
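To illustrate the point about the F format, here is a small sketch that writes the same date value as text in two ways. The variable names and the sample date are made up for illustration.

```sas
data _null_;
  /* A numeric SAS date: the number of days since 01JAN1960 */
  dob = '26APR2021'd;

  /* F format: just the digits of the day count, hard for a human to read */
  as_f = put(dob, f10. -L);

  /* A date format: the same value shown as a calendar date */
  as_date = put(dob, yymmdd10.);

  put as_f= as_date=;
run;
```

If DOB really is a date, it is usually easier to keep it numeric in both datasets and attach a date format, rather than converting it to text.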
If you're a beginner I recommend doing this in separate steps so you can trace the work:
Convert dob from character to numeric
Append the two datasets together (assume you're stacking the data sets)
Use format to control how the date is displayed
*convert character to numeric SAS date;
data aug21_convert2num;
set aug21(rename=(dob=dobchar));
dob = input(dobchar, anydtdte.);
drop dobchar;
run;
*append the two data sets;
data want;
set aug21_convert2num aug22;
format dob yymmdd10.;
run;

SAS macro to get the last date of the current month

I would like to use a macro in SAS to calculate the last day of the current month when executed.
As I'm quite new to SAS macros, I've tried to create one based on the information I found on the internet.
%let last_day = %sysfunc(putn(%sysfunc(intnx(month,%sysfunc(today()),e), date9.));
However it does not seem to work when i execute it.
You left out the number of intervals in the INTNX() function call.
To create a macro variable with the string that looks like the last day of the current month in the style produced by the DATE9. format just use:
%let last_day = %sysfunc(intnx(month,%sysfunc(today()),0,e), date9.);
You could then use that macro variable to generate strings. Such as in a TITLE statement.
TITLE "End of the month is &last_day";
If you want to use it as an actual date you will need to convert it to a date literal by adding quotes and the letter d.
...
where date <= "&last_day"d ;
And if so it is probably simpler to not use the DATE9. format at all and just store the raw number of days since 1960 in the macro variable.
%let last_day = %sysfunc(intnx(month,%sysfunc(today()),0,e));
...
where date <= &last_day ;

Convert Timestamp to Numeric value in SAS

How can I convert the default timestamp "0001-01-01-00.00.00.000000" in SAS? I have tried the code below, but it returned a missing value. Can someone help with this, please?
data _NULL_;
x = "0001-01-01-00.00.00.000000";
rlstime = input(x,anydtdtm26.);
call symput('rlstime',rlstime);
run;
%put rlst: &rlstime;
As far as I remember, SAS cannot do that. Any date or timestamp before the year 1582 does not exist for SAS. Do you need it, or can you just replace it with a missing value? If you really need it, you could transform it into another valid timestamp, split it into different columns (year, month, etc.) or just keep it as a string. In your example you only write the timestamp to the log, so there is no need to convert it.
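A minimal sketch of the "replace it with a missing value" option, assuming the default timestamp always starts with 0001-01-01. The ?? modifier is only there to keep the log quiet if some other value fails to convert.

```sas
data _null_;
  x = "0001-01-01-00.00.00.000000";

  /* Treat the default timestamp as missing instead of trying to convert it */
  if x =: '0001-01-01' then rlstime = .;
  else rlstime = input(x, ?? anydtdtm26.);

  /* CALL SYMPUTX converts the number to text and strips blanks */
  call symputx('rlstime', rlstime);
run;
%put rlst: &rlstime;
```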
The earliest date that SAS will handle is 1st January, 1582. Additionally, a colon character should be used to delimit the time from the date, as well as the hours, minutes and seconds. Therefore, your code may be adjusted to the following:
data _NULL_;
x = "1582-01-01:00:00:00.000000";
rlstime = input(x,anydtdtm26.);
call symput('rlstime',rlstime);
run;
%put rlst: &rlstime;

SAS HANA Date from data step in PRO

A little new to SAS here. I am using the following data step to get the first and last day of the month.
Data _NULL_;
begindt=IntNX("Month", Date(), 0) ;
enddt=IntNX("Month", Date(),0,'E');
PUT begindt=E8601DA. enddt=E8601DA.;
Run;
The data step gets the results of begindt=2021-09-01 and enddt=2021-09-30.
However, I am having trouble converting the value to a date format to use in a WHERE clause in a PROC SQL statement later in the program. The commented-out code works, but I can't get the date from the data step into the correct format for the PROC SQL statement to work.
/* AND "DETAILAR"."CLEAR_DOC_POSTING_DATE" = '2021-09-01' */
AND "DETAILAR"."CLEAR_DOC_POSTING_DATE" = begindt
SAS has date and time literals that make dealing with dates and times easy. SAS dates are the number of days since Jan 1 1960, and SAS datetimes are the number of seconds since Jan 1 1960. SAS automatically converts date and datetime literals to these numeric values for you. Some examples of date and datetime literals:
'04SEP2021'd
'04SEP2021:00:00'dt
You don't have to use these all the time*, but they make debugging way easier. In your case, you simply need to feed a date literal into proc sql. If you're connecting to SAP HANA through SAS, the SAS/ACCESS engine to SAP will handle the conversion for you.
data _null_;
begindt = intnx('month', today(), 0, 'B');
enddt = intnx('month', today(), 0, 'E');
call symputx('begindt', put(begindt, date9.) );
call symputx('enddt', put(enddt, date9.) );
run;
Or, equivalently:
%let begindt = %sysfunc(intnx(month, %sysfunc(today()), 0, B), date9.);
%let enddt = %sysfunc(intnx(month, %sysfunc(today()), 0, E), date9.);
Now you have two macro variables that you can not only easily read, but SAS will convert them for you. You can view them below:
%put &begindt;
%put &enddt;
Simply add them as date literals to your where clause in proc sql and let SAS do the rest.
proc sql;
create table want as
select *
from have
where CLEAR_DOC_POSTING_DATE BETWEEN "&begindt"d AND "&enddt"d
;
quit;
There are other literals too, like time literals, hex literals and name literals for variables with spaces in them.
'10:00't - Time Literal
'32'x - Hex literal
'this is a var'n - Name literal
*proc timeseries, proc timedata, and proc tsmodel require date/datetime literals for the start and end options. But those are the only ones I know of.
If you want to generate code like '2021-09-01' then why not create a macro variable with that string in it?
In your data _null_ step use:
call symputx('begindt',quote(put(intnx('month',date(),0),yymmdd10.),"'"));
Working from the inside out that statement will:
calculate today's date
convert to the start of the month
convert to a 10 character string representing that date
add single quotes around the string
store the value into a macro variable named begindt
Now you reference the macro variable to generate the code you want
and "DETAILAR"."CLEAR_DOC_POSTING_DATE" = &begindt.
Which will generate the code:
and "DETAILAR"."CLEAR_DOC_POSTING_DATE" = '2021-09-01'

Input date format YYYYMMDD

I need to input dates in YYYYMMDD format and create macro variables from these dates to use in a WHERE clause. The FINAL dataset should select one record from Sales but 0 observations are returned.
data work.FiscalYear2019;
input @1 fiscalYear $4. @5 StartDate mmddyy8.;
retain diff;
if fiscalYear = '2019' then do;
tday = today();
diff = tday - StartDate;
call symputx('FYTD_days',diff);
call symputx('CY_StartDate', StartDate);
call symputx('CY_EndDate', put(today(),mmddyy8.));
end;
else if fiscalYear = '2018' then do;
PY_EndDate = StartDate + diff;
call symput('PY_EndDate', put(PY_EndDate,mmddyy8.));
call symput('PY_StartDate', put(StartDate,mmddyy8.));
end;
datalines;
201912312018
201801012018
;
data work.Sales;
input @1 fiscalYear $4. @5 orderDate mmddyy8.;
format orderDate mmddyy6.;
datalines;
201902042019
201801012018
;
data final (WHERE=(orderDate >= &PY_StartDate AND
orderDate <= &PY_EndDate));
set Sales;
run;
I expect the FINAL dataset to contain one record from the Sales dataset but FINAL has 0 observations.
To use your macro variables as date values you need to either generate the macro variables as the raw number of days values, like you did with CY_StartDate, or generate them using the DATE format and enclose them in quotes and append the letter D to make a date literal.
Like this:
call symputX('PY_StartDate', put(StartDate,date9.));
call symputX('PY_EndDate', PY_EndDate);
...
data final;
set Sales;
WHERE orderDate >= "&PY_StartDate"d
AND orderDate <= &PY_EndDate
;
run;
Also your subject line mentions YYYYMMDD informat and it does not appear in your code. Are you interpreting your source data properly? Does 201801012018 represent an 8 digit date in YMD order plus a four digit year? Or a four digit year plus an 8 digit date in MDY order?
The problem is how you refer to your macro variables in the last data step. &PY_StartDate and &PY_EndDate are just text once they are resolved, so you need to refer to them as date constants. Note that a date literal only accepts text in DATE style (for example 01JAN2018), so the macro variables must be created with the DATE9. format rather than MMDDYY8. With that change, this should fix the issue:
data final (WHERE=(orderDate >= "&PY_StartDate"d AND
orderDate <= "&PY_EndDate"d));
set Sales;
run;
In the future, I recommend including options mprint; at the start of your code. This option displays the text generated by macro execution in your log, which can help greatly with debugging macros.
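For checking macro variable values specifically, a small sketch along these lines may also help. The value assigned with %let is a hypothetical stand-in for the one created by CALL SYMPUTX in the question.

```sas
/* SYMBOLGEN logs each macro variable as it resolves; MPRINT logs code generated by macros */
options symbolgen mprint;

/* Hypothetical value, standing in for the one created by CALL SYMPUTX */
%let PY_StartDate = 01JAN2018;

/* Writes the macro variable name and its value to the log */
%put &=PY_StartDate;

data _null_;
  /* The macro variable resolves inside double quotes, giving a date literal */
  d = "&PY_StartDate"d;
  put d= date9.;
run;
```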