I have PROC SQL code that needs to run twice each month, on the 1st and the 16th.
There is a where clause in the proc sql.
When the report runs on, say, 1 January 2022, the where clause filters records that lie between 16 December 2021 and 31 December 2021.
And when the report runs on 16 January 2022, the where clause filters records that lie between 01 January 2022 and 15 January 2022.
I have been manually updating these filters every time I run it, but now I need to automate it. There should be only one schedule, which checks the report run date and sets the where clause accordingly.
To automate date selection, add the following code to your program. It creates two macro variables, start_date and end_date, whose values depend on the current day of the month.
data _null_;
/* Beginning, end, 15th, and 16th days of this month and last month */
this_month_b = intnx('month', today(), 0, 'B');
last_month_e = intnx('month', today(), -1, 'E');
this_month_15 = mdy(month(this_month_b), 15, year(this_month_b) );
last_month_16 = mdy(month(last_month_e), 16, year(last_month_e) );
/* Assign start/end dates to macro variables based on the current day of month */
if(day(today() ) < 16) then do;
call symputx('start_date', put(last_month_16, date9.) );
call symputx('end_date', put(last_month_e, date9.) );
end;
else do;
call symputx('start_date', put(this_month_b, date9.) );
call symputx('end_date', put(this_month_15, date9.) );
end;
run;
For example, running it today on Jan 3rd:
%put &start_date;
%put &end_date;
16DEC2021
31DEC2021
Add these macro variables to the WHERE clause of your SQL query.
where date BETWEEN "&start_date"d AND "&end_date"d
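For context, here is a minimal sketch of how that clause might sit inside a complete query. The table names work.transactions and work.report are placeholders, not anything from the question, and the step assumes the DATA step above has already run so that &start_date and &end_date exist.

```sas
proc sql;
    /* work.transactions and work.report are hypothetical names */
    create table work.report as
    select *
    from work.transactions
    /* double quotes are needed so the macro variables resolve inside the literal */
    where date between "&start_date"d and "&end_date"d;
quit;
```

If the column holds datetime values rather than dates, the comparison would need datepart(date) instead of date.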
As for scheduling, there are numerous ways you can schedule SAS processes, whether it's through cron, Viya Jobs or another schedule manager. There are a lot of papers out there on how to schedule SAS jobs in batch. How you do that is up to you, but the above code will handle dynamically selecting data when it runs.
Related
Say you download data for stocks or bonds, so that you have the stock price or yield for every trading day. You have two variables: stock price (or yield, for a bond) and date. What is the quickest way to add weekends and holidays to the date variable, using the value from the previous open day for those missing days?
For example, on July 1, 2022 there would be a stock price, let's say $100, corresponding to that date, but during the long weekend (4th of July) there are no observations in the data for July 2nd through 4th. How do you add those dates with the stock price equal to $100 until the next trading day, July 5th?
I used a do loop to create the dates, then merged and retained, but I feel like there's got to be a quicker method.
You could just add an OUTPUT statement in a DO loop. The tricky part is getting the next date. Here is a method using a second SET statement that is offset by one observation.
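data want;
set have;
by date;
/* look ahead one row; the extra have(obs=1) row keeps this SET from ending the step before the last date is written */
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
/* on the last row next_date is missing, so fall back to the current date */
next_date = coalesce(next_date,date);
/* stop the day before the next trading day so that day is not written twice */
do date=date to max(date, next_date-1);
output;
end;
run;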
But your real data probably has multiple stocks. So add some BY group processing.
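data want;
set have;
by stock date;
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
/* the last row for a stock has no following trading day to fill towards */
if last.stock then next_date=date;
/* stop the day before the next trading day so that day is not written twice */
do date=date to max(date, next_date-1);
output;
end;
run;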
If I have two dates e.g. 22MAR2005 to 01MAR2006 and I want to create season intervals (spring, summer, autumn, winter) based on this interval, how can this be done in a data step?
Seasons are defined as:
Spring: March to May
Summer: June to August
Autumn: September to November
Winter: December to February
I need to calculate how long they spent in each season.
You need to convert that to MONTH first and then to SEASON. This exact question was asked recently so it's relatively easy to find via search (I'm assuming some course is using this as homework?).
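data want;
set have;
*create month;
month_date = month(date);
*assign to season;
if month_date in (3, 4, 5) then season = 'Spring';
else if month_date in (6, 7, 8) then season = 'Summer';
else if month_date in (9, 10, 11) then season = 'Autumn';
else if month_date in (12, 1, 2) then season = 'Winter';
run;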
You could also use a format but since you're just starting out this is likely easier.
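For comparison, the format-based approach might look roughly like the sketch below. The format name seasonfmt is made up for this illustration, and the step assumes the same have data set with a numeric SAS date variable called date.

```sas
proc format;
    /* map month numbers (1-12) to the seasons defined in the question */
    value seasonfmt
        3 - 5    = 'Spring'
        6 - 8    = 'Summer'
        9 - 11   = 'Autumn'
        12, 1, 2 = 'Winter';
run;

data want;
    set have;
    /* put() applies the format to the month number and returns the label */
    season = put(month(date), seasonfmt.);
run;
```

The mapping then lives in one place, so changing a season boundary only means editing the VALUE statement.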
Another user's question, which seems very similar:
https://communities.sas.com/t5/SAS-Enterprise-Guide/proc-glm/m-p/492142
EDIT: Use INTCK() to calculate the number of intervals.
Then use INTNX to increment across the intervals and count your days.
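Where the intervals begin is controlled by the first argument to the INTNX() function: 'Month3.3' means three-month intervals shifted to start in the third month (March), which matches the season boundaries. You can check the documentation for the exact specifics.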
data want;
start_date="22Mar2005"d;
end_date="01Mar2006"d;
*count season boundaries crossed, using the same interval as INTNX below;
num_intervals=intck('Month3.3', start_date, end_date);
do interval=0 to num_intervals;
*clip the first and last season to the actual start and end dates;
season_start=max(start_date, intnx('Month3.3', start_date, interval, 'b'));
season_end=min(end_date, intnx('Month3.3', start_date, interval, 'e'));
Number_Days=season_end - season_start + 1;
output;
end;
format start_date end_date season: yymmddd10.;
run;
Hi all!
Here is the case I need to solve; I would be very grateful for your help.
I have a data set that contains only one variable, in date format.
Example:
01JAN2016
06JAN2016
15FEB2016
The second data set lists the holidays over a period of 5 years.
Example:
01JAN2016
02JAN2016
and so on; all of these days are non-working days.
I need to count the number of working days from each date in the first data set until now. It seems that I need to compute
today's date minus the date from the first data set, minus the number of holidays from the second data set that fall in between (count(date) where date from the first data set < date < now).
You can define your own type of interval to use with the SAS functions intck and intnx. Here's how to do it:
First create a table of weekdays for whichever years you have holidays for, up to present (or a future) year.
Here we'll start by including all weekdays from 2014 to 2016. This is assuming you don't want to count weekend days. If that's not the case, just modify the code so that the condition "weekday(date) in (2:6)" is not applied. You'll get the full 365 days of the year.
data mon_fri;
do date = "01JAN2014"d to "31DEC2016"d;
if weekday(date) in (2:6) then output;
end;
format date date9.;
run;
Then we'll create a table having all those dates we just created, minus the holidays we have in the table Holidays. We'll place the table in a library called myLib, and rename the date column to "Begin" for compliance with SAS custom intervals.
libname myLib "some/place/on/your/drive";
data mylib.workdays(RENAME=(date=Begin));
merge mon_fri (in=weekday)
Holidays (in=holiday);
by date;
if weekday and not holiday then output;
run;
Now we set up a custom interval which we'll simply call "workdays".
options intervalds=(workdays=mylib.workdays);
From there, all you have left to do is something like this:
data dateCalculations;
set mydata;
numOfDays = intck("workdays", theDate, today());
run;
SAS will take care of counting the number of dates (rows in the workdays data set) separating the start date (the column called theDate) from the end date (today's date).
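The same custom interval can also be passed to intnx, for instance to step forward a number of working days. This is only a sketch: dueDate and the offset of 5 are made up for illustration, and the result should only be relied on for dates inside the range covered by the workdays table.

```sas
data dueDates;
    set mydata;
    /* date that falls 5 working days after theDate, using the custom interval */
    dueDate = intnx("workdays", theDate, 5);
    format dueDate date9.;
run;
```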
Et voilà!
This is wonderful and very helpful. I use two different SAS systems (both on remote Unix servers). Setting the intervalds option only seems to work on one of them. I copy/paste the same code and on the other nothing happens - no warning, no error, it simply doesn't work.
Here is how I'm setting it (download the CSV from Yahoo! Finance for the S&P500, daily data, starting January 1950):
PROC IMPORT DATAFILE="sp500_1950_2016.csv"
OUT=sp500_1950_2016
DBMS=DLM
REPLACE;
delimiter=',';
getnames=yes;
RUN;
data trading_days;
set sp500_1950_2016 (keep = date rename=(date=begin));
where year(begin) < 2017;
run;
options intervalds=(TradingDay=trading_days) ;
Then I call it like so to count number of observations I should have from fund inception to Dec 31, 2016 or when the fund closed, whichever is sooner:
data ops2;
set operations_master;
where ~missing(inception);
if missing(enddate) then enddate = '31dec2016'd;
datadays = INTCK('TradingDay',inception,enddate);
run;
proc univariate;
var datadays;
run;
On system 1, this works just fine. On system 2, I get 0 for the variable datadays. I've already checked to see if there is a sys admin override on setting the intervalds option, and there is not. Is there another reason why this might not work on a given system?
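A few checks that might help narrow this down on the system where it fails. This is a sketch rather than a confirmed diagnosis: it assumes the cause could be the option not taking effect, the imported date column not being a plain numeric SAS date (PROC IMPORT guesses column types), or the rows not being in ascending date order.

```sas
/* 1. Show the value SAS actually holds for the option in this session */
proc options option=intervalds;
run;

/* 2. Check the type and format of BEGIN: it should be a numeric date,
      not a character string or a datetime */
proc contents data=trading_days;
run;

/* 3. Precaution: put the interval data set in ascending date order,
      then set the option again */
proc sort data=trading_days;
    by begin;
run;

options intervalds=(TradingDay=trading_days);
```

Comparing the output of the first two steps between the two systems should at least show whether they see the same option value and the same column definition.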
I am looking to automate a daily report for my company but I have run into a bit of trouble. The report gets updated only on the 2nd working day of each month. I found some code on the SAS website which works out what the 2nd working day of any month is.
data scdwrk;
/* advance date to the first day of the month using the INTNX function */
second=intnx('month',today(),0);
/* determine the day of the week using the WEEKDAY function */
day=weekday(second);
/* if day=Monday then advance by 1 */
if day=2 then second+1;
/* if day=Sunday then advance by 2 */
else if day=1 then second+2;
format second date9.;
run ;
I have also set a flag that compares today's date to the date generated by this piece of code.
I now need to find a way so that, if the code is run on the first working day of the month, it uses a particular set of macro date variables
%let start_date= &prevmnth;
%let end_date= &endprevmnth;
%let month= &prevyearmnth;
and then, when it's run on the 2nd working day of the month, it uses the other set of macro date variables (calendar month)
%let start_date= &currmnth;
%let end_date= &endcurrmnth;
%let month= &curryearmnth;
Any help on this would be greatly appreciated.
I have some recent code that does just this. Here is how I tackled it.
First, create a table of holidays. This can be maintained yearly.
Second, create a table from the first 6 calendar days of the month, keeping only those that are not weekend days.
Third, delete holidays.
Finally, get the second value in the data set.
data holidays;
format holiday_date date9.;
informat holiday_date date9.;
input holiday_date;
datalines;
01JAN2015
19JAN2015
16FEB2015
03APR2015
25MAY2015
03JUL2015
07SEP2015
26NOV2015
25DEC2015
;
data _dates;
firstday = intnx('month',today(),0);
format firstday date date9.;
do date=firstday to firstday+5;
if 1 < weekday(date) < 7 then
output;
end;
run;
proc sql noprint;
delete from _dates
where date in (select holiday_date from holidays);
quit;
data _null_;
set _dates(firstobs=2);
call symput("secondWorkDay",put(date,date9.));
stop;
run;
%put &secondWorkDay;
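From there, one way to pick between the two sets of dates is sketched below. It assumes &secondWorkDay was created by the step above, that any run before that day should use the previous month, and that &month should look like YYYYMM; the question does not show the layout of its own variables, so adjust the formats as needed.

```sas
data _null_;
    /* offset 0 = current month once the second working day is reached,
       otherwise -1 = previous month */
    if today() >= "&secondWorkDay"d then offset = 0;
    else offset = -1;
    month_start = intnx('month', today(), offset, 'b');
    call symputx('start_date', put(month_start, date9.));
    call symputx('end_date', put(intnx('month', today(), offset, 'e'), date9.));
    /* YYYYMM layout is an assumption about what &month should contain */
    call symputx('month', put(month_start, yymmn6.));
run;

%put &start_date &end_date &month;
```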
Background:
I have code that pulls transactional data starting at the beginning of the current calendar quarter, but from a year ago.
For example, if I run the code today (August 16, 2013) it will have to pull all the data from July 1, 2012 onwards.
Problem:
I want to automate the starting date for the data pull with a macro variable.
So far, I'm stuck here:
%let ThisYear = %Sysfunc(Date(), YEAR.);
%let LastYear= %eval(&ThisYear-1); /* I get the starting year */
%let QTR_start_month= %eval(3*%Sysfunc(Date(), qtr.)-2); /* this gives me the current quarter starting month. If I run it in August, it outputs 7 for July */
%let start_date=%str(01/%Sysfunc(month(&QTR_start_month))/&lcy);
The final macro variable outputs the date which I want, but in a format which is not recognized by SAS.
I will greatly appreciate any help.
Many thanks in advance!
You can either read that string with a date informat, construct it as a SAS date literal in DDMONYY(YY) form ('01JUL2013'd), or construct the date value directly.
INTNX is probably your best option here to construct it; you don't need all that work.
%let start_date = %sysfunc(intnx(Quarter,%sysfunc(date()),-4),DATE9.);
%put &start_date;
You can keep DATE9. and use the result as a date literal, written as "&start_date."d, or remove the ,DATE9. to get the numeric date value, which can be used directly.
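To make the two options concrete, here is a small sketch. The macro variable names start_num and start_lit, the table work.transactions, and the column trans_date are all hypothetical.

```sas
/* numeric SAS date value: no format argument */
%let start_num = %sysfunc(intnx(quarter, %sysfunc(date()), -4));
/* formatted text, to be wrapped as a date literal */
%let start_lit = %sysfunc(intnx(quarter, %sysfunc(date()), -4), date9.);

proc sql;
    /* work.transactions and trans_date are hypothetical names */
    create table work.pull as
    select *
    from work.transactions
    /* equivalent filter with the literal: trans_date >= "&start_lit"d */
    where trans_date >= &start_num;
quit;
```

The numeric form is handy inside code, while the DATE9. form is easier to read in the log.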
This should do the job.
data test;
format todays_date starting_qtr date9.;
todays_date=today();
/*this takes today's date and rolls back 4 qtrs and sets that date to the first day of that quarter*/
starting_qtr = intnx('qtr',todays_date,-4,'b');
/*so, running this code today, 16AUG2013 would yield starting_qtr=01JUL2012 */
call symputx('start_date', put(starting_qtr, date9.));
run;
%put &start_date.;