How to get week value based on financial year in SAS? - sas

I have the dataset below and need to find the week number of each date within the financial year (e.g. April 2013 to March 2014). For example, 01AprXXX should fall in week 0 or 1 of the year, and the last week of the following March should be week 52 or 53. I have tried one way of doing this (code below).
I am curious whether there is a better way to do this in SAS.
Thanks in advance. Please let me know if this question is a duplicate; in that case I will delete it as soon as possible. I searched for the concept but didn't find anything.
Also, my apologies for my English; it may not be grammatically correct, but I hope I am able to convey my point.
DATA
data dsn;
format date date9.;
input date date9.;
cards;
01Nov2015
08Sep2013
06Feb2011
09Mar2004
31Mar2009
01Apr2007
;
run;
CODE
data dsn2;
set dsn;
week_normal = week(date);
dat2 = input(compress("01Apr"||year(date)),date9.);
week_temp = week(dat2);
format dat2 date9.;
x1 = month(input(compress('01Jan'||(year(date)+1)),date9.)) ;***lower cutoff;
x2 = month(input(compress("31mar"||year(date)),date9.)); ***upper cutoff;
x3 = week(input(compress("31dec"||(year(date)-1)),date9.)); ***final week value for the previous year;
if month(dat2) <= month(date) <= month(input(compress("31dec"||year(date)),date9.)) then week_f = week(date) - week_temp;
else if x2 >= month(date) >= x1 then week_f = week_normal + x3 - week(input(compress("31mar"||(year(date)+1)),date9.)) ;
run;

INTCK and INTNX are your best bets here. You could use them as I do below, or you could use the advanced functionality with defining your own interval type (fiscal year); that's described more in the documentation.
data dsn2;
set dsn;
week_normal = week(date);
dat2 = intnx('month12.4',date,0); *12 month periods, starting at month 4 (so 01APR), go to the start of the current one;
week_F = intck('week',dat2,date); *Weeks start on Sunday by default. A shifted interval changes the start day, e.g. 'week.2' starts weeks on Monday;
format dat2 date9.;
run;
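
To see how the two functions behave around the financial-year boundary, here is a minimal sketch. The test dates and the variable names fy_start, week0 and week1 are made up for illustration; only INTNX and INTCK come from the answer above.

```sas
data _null_;
  /* Hypothetical test dates around the 01Apr2013 financial-year boundary */
  do date = '31Mar2013'd, '01Apr2013'd, '08Apr2013'd, '31Mar2014'd;
    /* Start of the 12-month period beginning in April that contains date */
    fy_start = intnx('month12.4', date, 0);
    /* Sunday-based week boundaries crossed since fy_start (first week is 0) */
    week0 = intck('week', fy_start, date);
    /* Monday-based weeks, shifted to start counting at 1 */
    week1 = intck('week.2', fy_start, date) + 1;
    put date= date9. fy_start= date9. week0= week1=;
  end;
run;
```

For 01Apr2013 the start of the period is the date itself, so no week boundary is crossed and week0 is 0. Whether you want a 0-based or 1-based week number, and which weekday starts the week, is a reporting choice rather than something SAS decides for you.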

Related

Do loop not pulling in all data- creating dates

We have code that we use to create quarterly reports of projects. One piece of it, a do loop, takes the startdate and enddate of each project in our dataset and creates an observation for each month and year in which the project took place. For example, if we have a project called "Employment Help" with a startdate value of 01JAN2022 and an enddate value of 01APR2022, the do loop will create 4 observations for this project with the month and year values of 1 2022, 2 2022, 3 2022, and 4 2022. We use this to count how many projects happened during our quarters. The issue is that the do loop drops some projects without giving them a month or year value, so we lose those projects in our count. The dates are all in the same format.
Here is an example of some data that is pulled in. EXAMPLE 2 is properly pulled into the do loop; EXAMPLE 1 does not get pulled through.
Here is the code:
data test2;
set users3;
do i = 0 to (year(enddate)-year(startdate));
year = year(startdate)+i;
end;
do i = 0 to (month(enddate)-month(startdate));
month = month(startdate)+i;
drop i;
output;
end;
run;
Consider the following example:
data have;
input project$ startdate:date9. enddate:date9.;
format startdate enddate date9.;
datalines;
A 01JAN2022 01APR2022
B 01MAR2022 01JUN2022
C 01NOV2022 01JAN2023
;
run;
The third row produces no output because the difference between the end month number and the start month number is negative (1 - 11 = -10), so the month loop never executes. Instead of two loops, one for year and one for month, use a single loop over all of the months from the start date. Use intnx() to generate each month with startdate as the reference; i is the offset in months from the start date. For example:
code                                output
intnx('month', '01JAN2022'd, 0)     01JAN2022
intnx('month', '01JAN2022'd, 1)     01FEB2022
intnx('month', '01JAN2022'd, 2)     01MAR2022
Since you're incrementing by exactly one month for each date, you can get the year and month number in a single loop.
data want;
set have;
do i = 0 to intck('month', startdate, enddate);
month = month(intnx('month', startdate, i) );
year = year(intnx('month', startdate, i) );
output;
end;
drop i;
run;
Your code doesn't seem to handle projects that cross years, i.e. a project that started in 2021 and ended in 2022.
This should get you closer.
data have;
input startdate : date9. enddate : date9.;
format startdate enddate date9.;
cards;
01Jan2022 01Apr2022
01Sep2021 01Apr2022
;;;
run;
data want;
set have;
nmonths = intck('month', startdate, enddate) +1 ;
date = startdate;
do i = 1 to nmonths;
month = month(date);
year = year(date);
date = intnx('month', startdate, i, 'b');
output;
end;
run;

Calculate the Number of Patients Being Treated In a Given Hour

I'd like to calculate the number of patients currently within an Emergency Room by hour and I'm having trouble conceptualizing an efficient code.
I have two time variables, 'Check In Time' and 'Release Time'. These date/time variables are obviously arbitrary and the 'release time' variable will come after the 'check in time variable'.
I would like the output for a given day to look something like this:
Hour Midnight 1am 2am 3am 4am.....
# of Pts 34 56 89 23 29
So, for example, at 1am there were 56 patients in the ED, taking both check-in and release times into account.
My initial thought is to:
1) round the time variables
2) write code that looks something like this...
data EDTimesl;
set EDDATA;
if checkin = '1am' and release = '2am' then OneAMToTwoAM = 1;
if checkin = '1am' and release = '3am' then OneAMToTwoAM = 1;
if checkin = '1am' and release = '3am' then TwoAMToThreeAM = 1;
....
run;
This, however, gives me pause because I feel there is a more efficient method!
Thanks in advance!
I found some code online that might answer the question! Please see below:
data have (keep=admitdate disdate);
/* generate some admission and discharge date time variables*/
year=2015; /* for example all of the admits are in 2015*/
format admitdate disdate datetime20.;
do day= 1 to 20;
do month=1 to 12;
hour = floor(24*ranuni(4445));
min = floor(50*ranuni(1234));
date = mdy(month,day,2015);
admitdate=dhms(date,hour,min,0);
/* random duration of stay*/
duration = 60 + floor(3000*ranuni(7777));
disdate = intnx('minute',admitdate,duration);
output;
end;
end;
run;
data occupancy;
set have;
format admitdate disdate datetime20.;
Do Occupanthour = (dhms(datepart(admitdate),hour(admitdate),0,0)) to
dhms(datepart(disdate),hour(disdate),0,0) by 3600;
HourOfDay = hour(OccupantHour);
DayOfWeek = Weekday(datepart(OccupantHour));
output;
end;
format OccupantHour datetime20.;
run;
Proc freq data=occupancy;
Tables HourOfDay;
run;
proc tabulate data=occupancy;
class DayOfWeek;
class HourOfDay;
tables HourOfDay,
(DayOfWeek All)*n;
run;

how to identify the third friday of a particular month using sas

I have a time series data. Data looks like the following:
date variable
01-Dec-2012 0.1
02-Dec-2012 0.1
03-Dec-2012 0.1
04-Dec-2012 0.1
05-Dec-2012 0.1
...
20-Dec-2012 0.1
21-Dec-2012 0.1
22-Dec-2012 0.1
I want to create a dummy variable that equals 1 if the date is in December and on or before the second Thursday, 0 if the date is in December and after the second Thursday, and missing if month(date) ^= 12.
Can anyone show me how to identify the second Thursday of December and solve this problem, please?
NWKDOM
NWKDOM returns the date of the nth occurrence of a weekday in a given month. For example, the third Friday in a month, where the month and year are extracted from a SAS date:
Friday3 = NWKDOM(3, 6, month(sas_date), year( sas_date));
http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#p1kdveu0ry8ltxn1m3um2ntxs7d5.htm
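
Applied to the question as asked (second Thursday of December rather than third Friday), a sketch could look like the following. The have dataset is a hypothetical stand-in for the asker's time series, and it assumes SAS 9.3 or later so that NWKDOM is available.

```sas
/* Hypothetical sample: daily dates from late November to late December 2012 */
data have;
  format date date9.;
  do date = '25nov2012'd to '22dec2012'd;
    variable = 0.1;
    output;
  end;
run;

data want;
  set have;
  /* nwkdom(2, 5, 12, year): 2 = second occurrence, 5 = Thursday (1 = Sunday) */
  if month(date) = 12 then dummy = (date <= nwkdom(2, 5, 12, year(date)));
  else dummy = .;   /* missing outside December */
run;
```

In December 2012 the second Thursday is 13Dec2012, so dummy is 1 from 01Dec2012 through 13Dec2012, 0 for the rest of December, and missing for the November rows.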
Here's another approach for people who don't have SAS 9.3+ and can't use nwkdom to do this:
Dummy = intck('week.5',intnx('month',date,0)-1,date-1) < 2;
How this works, from the inside working outwards:
intnx is used to find the first day of the month.
Subtract 1 to get the last day of the previous month.
Subtract 1 from date to get yesterday's date.
Using intck, count the number of Thursdays (week.5) in between these two dates. N.B. this includes yesterday if it was a Thursday, but not the last day of the previous month if that was a Thursday.
If this number is less than 2, date is currently less than or equal to the second Thursday of the month.
Sample usage:
data _null_;
do date = '01dec2011'd to '30dec2011'd;
Dummy = intck('week.5',intnx('month',date,0)-1,date-1) < 2;
put date weekdate. +1 dummy;
end;
run;
EDIT: now works correctly when the first day of the month is a Thursday.
I think this will solve your problem. I have a feeling there is a nicer solution, but it should work.
data YourData;
format date date9. ;
do i=1 to 100 ;
date=intnx('day', '17oct03'd,i);
var=rand('uniform');
output;
end;
drop i;
run;
Data Find;
set YourData;
Month=month(date);
day=day(date);
Weekday=WEEKDAY(date);
/* weekday=5 this is thursday */
if weekday=5 and month=12 then flag=1;
/* flag2 retains the value */
flag2+flag;
if month=12 and flag2 < 2 then Dummy=1;
else if month=12 and flag2=2 and flag=1 then Dummy=1;
else if month=12 then Dummy=0;
else Dummy=.;
run;

How can I select the first and last week of each month in SAS?

I have monthly data with several observations per day, and I have day, month and year variables. How can I retain data from only the first and the last 5 days of each month? My data contains only weekdays, so the first and last five days change from month to month; e.g. for Jan 2008 the first five days can be the 2nd, 3rd, 4th, 7th and 8th of the month.
Below is an example of the data file. I wasn't sure how to share this so I just copied some lines below. This is from Jan 2, 2008.
Would a variation of first.variable and last.variable work? How can I retain observations from the first 5 days and last 5 days of each month?
Thanks.
1 AA 500 B 36.9800 NH 2 1 2008 9:10:21
2 AA 500 S 36.4500 NN 2 1 2008 9:30:41
3 AA 100 B 36.4700 NH 2 1 2008 9:30:43
4 AA 100 B 36.4700 NH 2 1 2008 9:30:48
5 AA 50 S 36.4500 NN 2 1 2008 9:30:49
If you want to examine the data and determine the minimum 5 and maximum 5 values then you can use PROC SUMMARY. You could then merge the result back with the data to select the records.
So if your data has variables YEAR, MONTH and DAY you can make a new data set that has the top and bottom five days per month using simple steps.
proc sort data=HAVE (keep=year month day) nodupkey
out=ALLDAYS;
by year month day;
run;
proc summary data=ALLDAYS nway;
class year month;
output out=MIDDLE
idgroup(min(day) out[5](day)=min_day)
idgroup(max(day) out[5](day)=max_day)
/ autoname ;
run;
proc transpose data=MIDDLE out=DAYS (rename=(col1=day));
by year month;
var min_day: max_day: ;
run;
proc sql ;
create table WANT as
select a.*
from HAVE a
inner join DAYS b
on a.year=b.year and a.month=b.month and a.day = b.day
;
quit;
/****
get some dates to play with
****/
data dates(keep=i thisdate);
offset = input('01Jan2015',DATE9.);
do i=1 to 100;
thisdate = offset + round(599*ranuni(1)+1); *** within 600 days from offset;
output;
end;
format thisdate date9.;
run;
/****
BTW: intnx('month',thisdate,1) = first day of next month. Deduct 1 to get the last day
of the current month.
intnx('month',thisdate,0,"BEGINNING") = first day of the current month
****/
proc sql;
create table first5_last5 AS
SELECT
*
FROM
dates /* replace with name of your data set */
WHERE
/* replace all occurrences of 'thisdate' with name of your date variable */
( intnx('month',thisdate,1)-5 <= thisdate <= intnx('month',thisdate,1)-1 )
OR
( intnx('month',thisdate,0,"BEGINNING") <= thisdate <= intnx('month',thisdate,0,"BEGINNING")+4 )
ORDER BY
thisdate;
quit;
Create some data with the desired structure;
Data inData (drop=_:); * forget all variables starting with an underscore*;
format date yymmdd10. time time8.;
_instant = datetime();
do _i = 1 to 1E5;
date = datepart(_instant);
time = timepart(_instant);
yy = year(date);
mm = month(date);
dd = day(date);
*just some more random data*;
letter = byte(rank('a') +floor(rand('uniform', 0, 26)));
*select week days*;
if weekday(date) in (2,3,4,5,6) then output;
_instant = _instant + 1E5*rand('exponential');
end;
run;
Count the days per month;
proc sql;
create view dayCounts as
select yy, mm, count(distinct dd) as _countInMonth
from inData
group by yy, mm;
quit;
Select the days;
data first_5(drop=_:) last_5(drop=_:);
merge inData dayCounts;
by yy mm;
_newDay = dif(date) ne 0;
retain _nrInMonth;
if first.mm then _nrInMonth = 1;
else if _newDay then _nrInMonth + 1;
if _nrInMonth le 5 then output first_5;
if _nrInMonth gt _countInMonth - 5 then output last_5;
run;
Use the INTNX() function. You can use INTNX('month',...) to find the beginning and ending days of the month and then use INTNX('weekday',...) to find the first 5 week days and last five week days.
You can convert your month, day, year values into a date using the MDY() function. Let's assume that you do that and create a variable called TODAY. Then to test whether it is within the first 5 weekdays or last 5 weekdays of the month you could do something like this:
first5 = intnx('weekday',intnx('month',today,0,'B'),0) <= today
<= intnx('weekday',intnx('month',today,0,'B'),4) ;
last5 = intnx('weekday',intnx('month',today,0,'E'),-4) <= today
<= intnx('weekday',intnx('month',today,0,'E'),0) ;
Note that those ranges will include the week-ends, but it shouldn't matter if your data doesn't have those dates.
But you might have issues if your data skips holidays.

Which months are included in a date range?

I have a dataset with from and to dates of registration for a group of users. I would like to programmatically find which months lie in between those dates for each user, without having to hard code in any months, etc. I only want a summary of numbers registered in each month, so if that makes it quicker, so much the better.
E.g. I have something like
User-+-From-------+-To-----------------
A + 11JAN2011 + 15MAR2011
A + 16JUN2011 + 17AUG2011
B + 10FEB2011 + 12FEB2011
C + 01AUG2011 + 05AUG2011
And I want something like
Month---+-Registrations
JAN2011 + 1 (A)
FEB2011 + 2 (AB)
MAR2011 + 1 (A)
APR2011 + 0
MAY2011 + 0
JUN2011 + 1 (A)
JUL2011 + 1 (A)
AUG2011 + 2 (AC)
Note I don't need the bit in brackets; that was just to try and clarify my point.
Thanks for any help.
One easy way is to construct an intermediate dataset and then PROC FREQ.
data have;
informat from to DATE9.;
format from to DATE9.;
input user $ from to;
datalines;
A 11JAN2011 15MAR2011
A 16JUN2011 17AUG2011
B 10FEB2011 12FEB2011
C 01AUG2011 05AUG2011
;;;;
run;
data int;
set have;
_mths=intck('month',from,to,'d'); *number of months after the current one (0=current one). 'd'=discrete=count 1st of month as new month;
do _i = 0 to _mths; *start with current month, iterate over months;
month = intnx('month',from,_i,'b');
output;
end;
format month MONYY7.;
run;
proc freq data=int;
tables month/out=want(keep=month count rename=count=registrations);
run;
You can eliminate the _mths step by doing that in the do loop.
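
A minimal sketch of that variant, with the INTCK call moved into the DO statement. The names start_dt and end_dt are illustrative; the variables are renamed here only so that a variable called to does not sit next to the DO statement's own TO keyword.

```sas
/* Same sample data as in the answer above */
data have;
  informat from to date9.;
  format from to date9.;
  input user $ from to;
  datalines;
A 11JAN2011 15MAR2011
A 16JUN2011 17AUG2011
B 10FEB2011 12FEB2011
C 01AUG2011 05AUG2011
;
run;

data int;
  /* start_dt and end_dt are hypothetical names chosen for readability */
  set have(rename=(from=start_dt to=end_dt));
  /* The stop value is evaluated once, before the first iteration */
  do _i = 0 to intck('month', start_dt, end_dt);
    month = intnx('month', start_dt, _i, 'b');   /* first day of each month */
    output;
  end;
  format month monyy7.;
  drop _i;
run;
```

The PROC FREQ step from the answer can then be run on int unchanged. Months in which nobody was registered never appear in int, so they will not show up in the frequency table with a zero count.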