I'm facing an interesting task: how can I apply a FIFO scheme to my data set?
We have losses and recoveries.
Losses:
Recoveries:
Using a FIFO scheme, I should get this table (losses):
Do you have any idea how to get this result, using a SAS DATA step or PROC SQL?
data have1;
input id$ date:mmddyy10. loss_amt;
format date date9.;
datalines;
id1 01.01.2019 100000
id1 02.05.2019 200000
id1 03.06.2019 300000
id1 04.12.2019 500000
id2 02.05.2019 200000
;
run;
data have2;
input id$ date:mmddyy10. recovery_amt;
format date date9.;
datalines;
id1 01.01.2019 50000
id1 02.06.2019 150000
id1 03.06.2019 110000
id1 04.05.2020 100000
id2 02.07.2019 200000
;
run;
I've read some articles about the SAS hash object, but I don't understand how exactly to apply it to my data. I'm still trying to work it out.
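One possible way to approach this without a hash object is sketched below. It rests on an assumption that is not stated in the question: recoveries for an id are pooled and applied to that id's losses oldest first, regardless of the recovery dates. The names total_rec, losses, cum_loss, recovered and outstanding are made up for illustration.

```sas
/* Total recoveries per id (recovery dates are ignored in this sketch). */
proc sql;
  create table total_rec as
  select id, sum(recovery_amt) as total_recovery
  from have2
  group by id
  order by id;
quit;

proc sort data=have1 out=losses;
  by id date;
run;

data want;
  merge losses(in=in_loss) total_rec;
  by id;
  if in_loss;
  if first.id then cum_loss = 0;
  cum_loss + loss_amt;  /* running total of losses, oldest first */
  /* Portion of this loss that is covered after older losses are paid off. */
  recovered   = max(0, min(cum_loss, coalesce(total_recovery, 0))
                       - (cum_loss - loss_amt));
  outstanding = loss_amt - recovered;
run;
```

If a recovery may only be applied to losses dated on or before it, this pooled approach is not enough and a row-by-row allocation (for example with a hash object or arrays) would be needed.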
I am new to SAS.
I have multiple datasets with the following variables:
Dataset 1 Subid;visit; flag; date; time
Dataset 2 Subid;visit; flag; date; time
Dataset 3 Subid;visit; date; time
Dataset 4 Subid;visit; date; time
I need to:
When the flag is present in the dataset, compare date and time for the flag across datasets and across visits.
When the flag is not present in the dataset, compare date across the mentioned datasets and across visits.
You have two datasets with the flag and two datasets without the flag. If you simply want a pure comparison of two datasets, proc compare will produce a report for you that compares variables with each other.
Example data:
data dataset1;
input subid visit flag date:date9. time:time.;
format date date9. time time.;
datalines;
1 1 1 01JAN2022 00:00
2 2 0 01JAN2022 01:00
;
run;
data dataset2;
input subid visit flag date:date9. time:time.;
format date date9. time time.;
datalines;
1 1 1 01JAN2022 00:00
2 2 1 03JAN2022 02:00
;
run;
Code:
proc sort data=dataset1;
by subid visit;
run;
proc sort data=dataset2;
by subid visit;
run;
proc compare base=dataset1 compare=dataset2;
id subid visit;
var date time;
run;
You can produce a dataset of only the differences as well.
proc compare base = dataset1
compare = dataset2
out = compare
outnoequal
noprint
;
id subid visit;
var date time;
run;
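To restrict the comparison to flagged records, a WHERE= dataset option can be added to both inputs. The sketch below reuses the sorted dataset1 and dataset2 from above and assumes that flag=1 marks the records of interest; the output name flag_diff is made up for illustration.

```sas
/* Assumption: flag=1 identifies the records to compare. */
proc compare base    = dataset1(where=(flag=1))
             compare = dataset2(where=(flag=1))
             out     = flag_diff
             outnoequal
             noprint;
  id subid visit;
  var date time;
run;
```

For the datasets without a flag, the same step without the WHERE= options and with only date on the VAR statement would compare the dates alone.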
Hi, I am trying to calculate how much a customer paid in a given month by comparing their balance with the next month's balance.
The data looks like this: I want to calculate PaidAmount for A111 in Jun-20 as Balance in Jul-20 - Balance in Jun-20. Can anyone help, please? Thank you.
For this situation there is no need to look ahead as you can create the output you want just by looking back.
data have;
input id date balance ;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1 2020-06-01 10000
1 2020-07-01 8000
1 2020-08-01 5000
2 2020-06-01 10000
2 2020-07-01 8000
3 2020-08-01 5000
;
data want;
set have ;
by id date;
lag_date=lag(date);
format lag_date yymmdd10.;
lag_balance=lag(balance);
payment = lag_balance - balance ;
if not first.id then output;
if last.id then do;
payment=.;
lag_balance=balance;
lag_date=date;
output;
end;
drop date balance;
rename lag_date = date lag_balance=balance;
run;
proc print;
run;
Result:
Obs    id          date    balance    payment
  1     1    2020-06-01      10000       2000
  2     1    2020-07-01       8000       3000
  3     1    2020-08-01       5000          .
  4     2    2020-06-01      10000       2000
  5     2    2020-07-01       8000          .
  6     3    2020-08-01       5000          .
This calls for a LEAD calculation, which is typically done with PROC EXPAND, but that requires a SAS/ETS license, which not many users have. Another option is to merge the data with itself, offsetting the records by one so that the next month's record is on the same line.
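data want;
/* Look-ahead merge: no BY statement, so row n is paired with row n+1. */
/* Assumes have is sorted by clientID and month. */
merge have
      have(firstobs=2 keep=clientID balance
           rename=(clientID=next_clientID balance=next_balance));
if clientID ne next_clientID then next_balance = .; /* do not cross clients */
PaidAmount = Balance - next_balance;
drop next_clientID;
run;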
If months can be missing from your series, this is not a good approach. In that case, do an explicit join in SQL instead. This assumes you have month as a SAS date as well.
proc sql;
create table want as
select t1.*, t1.balance - t2.balance as paidAmount
from have as t1
left join have as t2
on t1.clientID = t2.ClientID
/* joins the current month (t1) with the next month (t2) */
and intnx('month', t1.month, 1, 'b') = intnx('month', t2.month, 0, 'b');
quit;
Code is untested as no test data was provided (I won't type out your data to test code).
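For completeness, the PROC EXPAND route mentioned at the start of this answer might look like the sketch below. It requires SAS/ETS. The sample rows are hypothetical (the question's data was not provided as text); the variable names clientID, month and balance follow the code above, and next_balance and with_lead are made-up names.

```sas
/* Hypothetical sample data. */
data have;
  input clientID $ month :yymmdd10. balance;
  format month yymmdd10.;
  datalines;
A111 2020-06-01 10000
A111 2020-07-01 8000
A111 2020-08-01 5000
B222 2020-06-01 4000
B222 2020-07-01 2500
;
run;

proc sort data=have;
  by clientID month;
run;

/* METHOD=NONE: no interpolation, only the LEAD transformation is applied. */
proc expand data=have out=with_lead method=none;
  by clientID;
  convert balance = next_balance / transformout=(lead 1);
run;

data want;
  set with_lead;
  PaidAmount = balance - next_balance; /* missing on each client's last row */
run;
```

Like the look-ahead merge, this pairs each row with the next row in the BY group, so it also assumes no months are missing.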
Let's say I have the following dates for the observations:
data dates;
input obs date$11.;
cards;
1 06/10/1949
2 01/07/1952
3 02/10/1947
;
run;
Now I want to add another column next to date, called newdate, that is numeric and displayed in the date9. format.
I tried the following:
data newdata;
set dates;
newdate=input(date,date9.);
run;
But when I run this, the newdate column is empty.
Your string values are not using a format that is compatible with the DATE. informat. They appear to be using either MMDDYY. or DDMMYY., but it is not possible to tell which from your example values.
data dates;
input obs datestr :$11.;
date1 = input(datestr,mmddyy10.);
date2 = input(datestr,ddmmyy10.);
format date1 date2 date9. ;
cards;
1 06/10/1949
2 01/07/1952
3 02/10/1947
;
Results:
Obs    obs    datestr          date1        date2
  1      1    06/10/1949    10JUN1949    06OCT1949
  2      2    01/07/1952    07JAN1952    01JUL1952
  3      3    02/10/1947    10FEB1947    02OCT1947
I have the following dataset:
DATA survey;
informat order_date date9. ;
INPUT id order_date ;
DATALINES;
1 11SEP2016
2 12AUG2016
3 14JAN2016
;
RUN;
PROC PRINT data = survey;
format order_date date9.;
RUN;
What I would like to do now is classify the records based on their last visit. So what I want to do is:
Set a date (e.g., 10SEP2016)
Classify all records that have a last visit more than 30 days ago as 1, all records that have a last visit more than 60 days ago as 2, etc.
Any thoughts on how I need to program this?
You could build something like this: count the days between the dates, divide by 30, and round up with ceil. Alternatively, if you want to use months rather than 30-day blocks, you can replace the first intck parameter with 'month' and remove the ceil and the /30:
DATA survey;
informat order_date date9. ;
INPUT id order_date ;
DATALINES;
1 11SEP2016
2 12AUG2016
3 14JAN2016
4 09SEP2016
5 10AUG2016
;
RUN;
%let lastvisit=10SEP2016;
data result;
set survey;
days_30=ceil(intck('day', order_date,"&lastvisit"d)/30)-1;
run;
PROC PRINT data = result;
format order_date date9.;
RUN;
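A minimal sketch of the month-based variant follows. By default INTCK counts the month boundaries crossed; the optional fourth argument 'continuous' counts complete months measured from order_date instead. The dataset name survey_m and the two variable names are made up for illustration, and whether to keep the -1 offset from the code above depends on how the classes should be numbered.

```sas
%let lastvisit = 10SEP2016;

data survey_m;
  input id order_date :date9.;
  /* Discrete (default): number of month boundaries crossed. */
  months_discrete = intck('month', order_date, "&lastvisit"d);
  /* Continuous: number of complete months elapsed since order_date. */
  months_complete = intck('month', order_date, "&lastvisit"d, 'continuous');
  format order_date date9.;
  datalines;
1 12AUG2016
2 10AUG2016
3 14JAN2016
;
run;

proc print data=survey_m;
run;
```

For 12AUG2016 the two methods differ: one month boundary has been crossed by 10SEP2016, but a complete month has not yet elapsed.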
I have 4 columns in my SAS dataset. I need to compare the dates of consecutive rows by ID. For each ID, if Date2 occurs before the next row's Date1 for the same ID, then keep the Bill amount. If Date2 occurs after the Date1 of the next row, delete the Bill amount. So for each ID, only keep the bill where Date2 is less than the next row's Date1. I have placed what the result set should look like at the bottom.
Result set should look like
You'll want to create a new variable that moves the next row's DATE1 up one row to make the comparison. Assuming your date variables are in a date format, use PROC EXPAND and make the comparison ensuring that you're not comparing the last value which will have a missing LEAD value:
DATA TEST;
INPUT ID: $3. DATE1: MMDDYY10. DATE2: MMDDYY10. BILL: 8.;
FORMAT DATE1 DATE2 MMDDYY10.;
DATALINES;
AA 07/23/2015 07/31/2015 34
AA 07/30/2015 08/10/2015 50
AA 08/12/2015 08/15/2015 18
BB 07/23/2015 07/24/2015 20
BB 07/30/2015 08/08/2015 20
BB 08/06/2015 08/08/2015 20
;
RUN;
PROC EXPAND DATA = TEST OUT=TEST1 METHOD=NONE;
BY ID;
CONVERT DATE1 = DATE1_LEAD / TRANSFORMOUT=(LEAD 1);
RUN;
DATA TEST2; SET TEST1;
IF DATE1_LEAD NE . AND DATE2 GT DATE1_LEAD THEN BILL=.;
RUN;
If you sort your data so that the row you need to compare against is the previous observation, you can use the LAG function in a DATA step instead.
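A minimal sketch of that LAG approach, reusing the TEST dataset from above: sorting by descending DATE1 within ID makes the "next" row the previous observation. The names test_desc and next_date1 are made up for illustration.

```sas
proc sort data=test out=test_desc;
  by id descending date1;
run;

data test2;
  set test_desc;
  by id;
  /* Call LAG on every row; calling it conditionally would skip queue updates. */
  next_date1 = lag(date1);
  /* The first row of each ID in descending order has no later row. */
  if first.id then next_date1 = .;
  if next_date1 ne . and date2 gt next_date1 then bill = .;
  format next_date1 mmddyy10.;
run;

/* Restore the original ascending order. */
proc sort data=test2;
  by id date1;
run;
```

This avoids the SAS/ETS dependency of PROC EXPAND at the cost of two extra sorts.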