Using a loop while importing data with a DATA step in SAS

“Client_ID_12_months_bill.txt” contains the 12 months of credit card bills for each client_id in a single row. Import the data into SAS such that the bill for each month is recorded as a separate observation and the month number is also specified. (Bonus question: month name instead of month number.)
For this question, I have imported the data successfully. The problem I am facing is that I am not getting the right value in the variable balance: instead of the bill amount, I get 1 in every observation.
Sample Data
108263 $946.00 $903.00 $804.00 $674.00 $663.00 $195.00 $922.00 $595.00 $157.00 $415.00 $868.00 $750.00
103681 $135.00 $573.00 $642.00 $208.00 $922.00 $592.00 $425.00 $658.00 $131.00 $648.00 $750.00 $515.00
116865 $624.00 $679.00 $402.00 $636.00 $358.00 $560.00 $884.00 $514.00 $565.00 $278.00 $117.00 $852.00
102998 $747.00 $505.00 $549.00 $942.00 $884.00 $991.00 $480.00 $326.00 $447.00 $617.00 $721.00 $874.00
115569 $254.00 $792.00 $420.00 $642.00 $851.00 $258.00 $872.00 $828.00 $658.00 $260.00 $499.00 $575.00
data Client_Bill (keep=client_ID balance month_num month_name i );
infile '/folders/myfolders/SAS Assignment/Assignment 8 files
Part-2/Client_ID_12_months_bill.txt' truncover;
informat month1-month12 Dollar6.2;
input client_ID $ month1-month12;
array months(*) month1-month12;
do i=1 to dim(months);
if not missing(months(i)) then do;
balance=month(i);
month_num=i;
month_name=put(mdy(i,1,2017),monname.);
output;
end;
end;
run;

Quite simply, balance is assigned the value returned by the function month (note the singular form), not an element of the array months (plural form).
The MONTH function returns the month number of a given date. i is interpreted as a date, i.e. the number of days after 01JAN1960. You are giving it the values 1-12, which are all dates in January 1960, so it returns month number 1 for all of them.
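A minimal sketch of the corrected step, assuming the same layout as the sample data. To keep it self-contained it reads two of the sample rows from in-stream DATALINES instead of the original file, which is a substitution made here for illustration; it also assumes the session accepts data lines longer than 80 characters (the usual NOCARDIMAGE default).

```sas
data Client_Bill (keep=client_ID balance month_num month_name);
    infile datalines truncover;
    informat month1-month12 dollar6.2;
    input client_ID $ month1-month12;
    array months(*) month1-month12;
    do i = 1 to dim(months);
        if not missing(months(i)) then do;
            balance    = months(i);   /* array element, not the MONTH() function */
            month_num  = i;
            month_name = put(mdy(i, 1, 2017), monname.);
            output;                   /* one observation per client per month */
        end;
    end;
    datalines;
108263 $946.00 $903.00 $804.00 $674.00 $663.00 $195.00 $922.00 $595.00 $157.00 $415.00 $868.00 $750.00
103681 $135.00 $573.00 $642.00 $208.00 $922.00 $592.00 $425.00 $658.00 $131.00 $648.00 $750.00 $515.00
;
run;
```

The only functional change from the question's code is months(i) in place of month(i); the loop counter i is also dropped from the KEEP= list since month_num already holds it.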

Related

How to create a new variable of age based upon an existing numeric date born variable in sas?

I want to create a numeric age variable using an existing numeric born date variable (MMDDYY10) in SAS. This "BORN" variable is numeric with a length of 8, and its format is MMDDYY10. I assume I should use: age = today's date - BORN date. However, the BORN values look like -15226, -8803, ... I don't understand why there is a minus sign in front of these numbers. So what is the code to convert them to an actual age?
How do I subtract the patient's born date from today's date?
SAS stores dates and times as numbers. A date is the number of days between January 1, 1960 and the specified date, so dates before then are negative. To display a date in a human-readable form, you have to apply a format (for example MMDDYY10.).
Similarly, a time is the number of seconds since midnight of the current day, so SAS time values are between 0 and 86400.
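As a small illustration of the numeric storage described above, the following sketch prints a date and a time both raw and formatted. The variable names d and t are chosen here for illustration only.

```sas
data _null_;
    d = '31DEC1959'd;   /* one day before the 01JAN1960 origin: raw value -1 */
    t = '00:01:00't;    /* one minute after midnight: raw value 60 */
    put d= d= mmddyy10.;
    put t= t= time8.;
run;
```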
Your code would look like this:
data have;
input born MMDDYY10.;
format born MMDDYY10.;
datalines;
03/17/2000
11/11/1988
08/11/1923
;
run;
data want;
set have;
age = floor((DATE()-born) / 365.25);
run;
SAS will correctly translate your input (if you use the correct informats) into numbers, which are easy for a program to calculate with.
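Dividing by 365.25 is an approximation and can be off by a day around birthdays. A possible alternative, assuming SAS 9.3 or later where the 'AGE' basis of YRDIF is available, is sketched below with the same sample data.

```sas
data have;
    input born mmddyy10.;
    format born mmddyy10.;
    datalines;
03/17/2000
11/11/1988
08/11/1923
;
run;

data want;
    set have;
    /* YRDIF with the 'AGE' basis returns age in fractional years; FLOOR keeps completed years */
    age = floor(yrdif(born, today(), 'AGE'));
run;
```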

Quickest way to fill in missing dates in a sequence? SAS

Say you download data for stocks or bonds, you have the stock price or yield for every trading day. So you have two variables, stock price (or yield if bond) and date. What is the quickest way to add weekends and holidays to the dates variable while using the previous open day as the values for those missing days?
For example, if it were July 1, 2022, there would be a stock price, let's say $100, corresponding to that date, but during the long weekend (4th of July) there are no observations in the data for July 2nd through 4th. How do you add those dates with the stock price equaling $100 until the next trading day, July 5th?
I used a do loop to create the dates, then merged and retained, but I feel like there's got to be a quicker method.
You could just add an OUTPUT statement in a DO loop. The tricky part is getting the next date. Here is a method using a second SET statement that is offset by one observation.
data want;
set have ;
by date;
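set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
* On the last observation there is no next date, so fill just that one day;
next_date = coalesce(next_date,date+1);
* Stop one day short of the next observation so its date is not duplicated;
do date=date to next_date-1;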
output;
end;
run;
But your real data probably has multiple stocks. So add some BY group processing.
data want;
set have ;
by stock date;
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
* Do not fill across stocks: the last row of each stock covers only its own date;
if last.stock then next_date=date+1;
do date=date to next_date-1;
output;
end;
run;
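The look-ahead idea can be tried on a tiny data set. This is a sketch under assumptions: the data set have, the variable price, and the sample values are made up for illustration, and it uses a variant of the technique (END= plus a conditional SET) rather than the concatenated SET shown above.

```sas
data have;
    input date :date9. price;
    format date date9.;
    datalines;
30JUN2022 98
01JUL2022 100
05JUL2022 101
;
run;

data want;
    set have end=last_row;
    /* Look ahead: read the next row's date, except on the final row */
    if not last_row then
        set have(firstobs=2 keep=date rename=(date=next_date));
    else next_date = date + 1;
    /* Repeat the current row for every calendar day before the next one */
    do date = date to next_date - 1;
        output;
    end;
    drop next_date;
run;
```

With this input, 02JUL2022 through 04JUL2022 are added and carry the 01JUL2022 price of 100, because price stays in the program data vector while only date changes inside the loop.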

How to calculate month number in sas

Hi, I need to calculate the numeric value of a date in SAS, supposing that
01jan1960 is equal to 1
02jan1960 is equal to 2
So I need to calculate it for 01aug2020.
I used the intck function but got no output.
I want to do this in a data step only.
SAS stores dates as the number of days since 1960, with zero representing the first day of 1960. To write a date in a program, use a quoted string followed by the letter D. The string needs to be something the DATE informat can interpret.
Let's run a little test.
data _null_;
do dt=0 to 3,"01-JAN-1960"d,'01AUG2020'd;
put dt= +1 dt date9.;
end;
run;
dt=0 01JAN1960
dt=1 02JAN1960
dt=2 03JAN1960
dt=3 04JAN1960
dt=0 01JAN1960
dt=22128 01AUG2020
So the date value for '01AUG2020'd is 22,128.
Subtraction works
days_interval = '01Aug2020'd - '01Jan1960'd;
Or look at the unformatted value, since SAS counts dates from 01Jan1960:
days_interval = '01Aug2020'd;
format days_interval 8.;
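Note that the question numbers 01jan1960 as 1, while SAS numbers it as 0. A minimal sketch of the adjustment, assuming that 1-based numbering is really what is wanted (the variable names are illustrative):

```sas
data _null_;
    days_since_origin = '01AUG2020'd - '01JAN1960'd;  /* 22128, as in the test above */
    day_number = days_since_origin + 1;               /* 22129 if 01JAN1960 counts as day 1 */
    put days_since_origin= day_number=;
run;
```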

Counting working days in SAS EG

Hi to all, and good time of day!
Here is the case I need to solve; I will be very grateful if you can help me.
I have a data set that contains only one variable, in a date format.
Example:
01JAN2016
06JAN2016
15FEB2016
The second data set lists the holidays for a period of 5 years.
Example:
01JAN2016
02JAN2016
and so on; all these days are non-working days.
I need to count the number of working days from the date in each observation of the first data set until now. It seems that I need to compute
"now" minus the date (from the first data set), minus the number of holidays in the second data set, i.e. count(date) where date (from the first data set) < date < "now".
You can define your own type of interval to use with the SAS functions intck and intnx. Here's how to do it:
First create a table of weekdays for whichever years you have holidays for, up to present (or a future) year.
Here we'll start by including all weekdays from 2014 to 2016. This is assuming you don't want to count weekend days. If that's not the case, just modify the code so that the condition "weekday(date) in (2:6)" is not applied. You'll get the full 365 days of the year.
data mon_fri;
do date = "01JAN2014"d to "31DEC2016"d;
if weekday(date) in (2:6) then output;
end;
format date date9.;
run;
Then we'll create a table having all those dates we just created, minus the holidays we have in the table Holidays. We'll place the table in a library called myLib, and rename the date column to "Begin" for compliance with SAS custom intervals.
libname myLib "some/place/on/your/drive";
data mylib.workdays(RENAME=(date=Begin));
merge mon_fri (in=weekday)
Holidays (in=holiday);
by date;
if weekday and not holiday then output;
run;
Now we set up a custom interval which we'll simply call "workdays".
options intervalds=(workdays=mylib.workdays);
From there, all you have left to do is something like this:
data dateCalculations;
set mydata;
numOfDays = intck("workdays", theDate, today());
run;
SAS will take care of counting the number of dates (lines in the workdays dataset) separating the startdate (column called theDate) from the enddate (today's date).
Et voilà!
This is wonderful and very helpful. I use two different SAS systems (both on remote Unix servers). Setting the intervalds option only seems to work on one of them. I copy/paste the same code and on the other nothing happens - no warning, no error, it simply doesn't work.
Here is how I'm setting it (download the CSV from Yahoo! Finance for the S&P500, daily data, starting January 1950):
PROC IMPORT DATAFILE="sp500_1950_2016.csv"
OUT=sp500_1950_2016
DBMS=DLM
REPLACE;
delimiter=',';
getnames=yes;
RUN;
data trading_days;
set sp500_1950_2016 (keep = date rename=(date=begin));
where year(begin) < 2017;
run;
options intervalds=(TradingDay=trading_days) ;
Then I call it like so to count number of observations I should have from fund inception to Dec 31, 2016 or when the fund closed, whichever is sooner:
data ops2;
set operations_master;
where ~missing(inception);
if missing(enddate) then enddate = '31dec2016'd;
datadays = INTCK('TradingDay',inception,enddate);
run;
proc univariate;
var datadays;
run;
quit;
On system 1, this works just fine. On system 2, I get 0 for the variable datadays. I've already checked to see if there is a sys admin override on setting the intervalds option, and there is not. Is there another reason why this might not work on a given system?

week function giving strange result

I am using the week function to clean some data and will eventually order by week. I used week() on the date 8/26/2011 and got 34, and when I passed the date 01/13/2012 it spit out 2. I thought I was getting the number of weeks since Jan 1, 1960?
As per the WEEK Function documentation, the default U descriptor specifies the number of the week within the year, with Sunday being deemed the 1st day of the week. (You can use V if you want Monday to be considered the 1st day instead.)
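To compare the two descriptors on the dates from the question, something like the following sketch could be run. The variable names are illustrative; per the question, the default U descriptor gives 34 and 2 for these dates.

```sas
data _null_;
    do d = '26AUG2011'd, '13JAN2012'd;
        week_u = week(d, 'U');  /* default: week of the year, weeks start on Sunday */
        week_v = week(d, 'V');  /* week of the year, weeks start on Monday */
        put d= date9. week_u= week_v=;
    end;
run;
```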
The week function calculates the week within the year of the given date. The answer to the implied question, "how do I calculate the number of weeks since 1/1/1960 [or some arbitrary date]," is the intck function.
data have;
input datevar date9.;
datalines;
01JAN1960
02JAN2013
13JAN2012
26AUG2011
;;;;
run;
data want;
set have;
wks = intck('week',0,datevar); *# of weeks from 0 to datevar [0=1/1/1960].
*Can replace 0 with any other date variable.;
run;