Example Dataset:
record_id admin_dt_1
1 June 7th 2022
2 August 25th 2022
3 August 23rd 2022
4 July 8th 2022
5 August 5th 2022
I would like my output to show September 1st, 2nd, and so on to the 30th in the first column, which I have done. The second column should show the number of people eligible on each day in September. Eligible means anyone more than 28 days past their admin_dt_1. I also want the column to be cumulative. Since there are 5 data points, the frequency column should add up to 5, something like this:
Date Frequency eligible
September 1st 3
September 30th 5
data dose2eligible;
set request;
/*create September 1st to September 30th date*/
do date= '01sep2022'd to '30sep2022'd;
output;
end;
format
date date9.;
run;
proc freq data=dose2eligible; table date; run;
You were very close. Count the number of days between admin_dt_1 and date, then create a 1/0 flag using the shortcut var = (boolean comparison):
eligible = (date - admin_dt_1 > 28);
data dose2eligible;
set request;
/*create September 1st to September 30th date*/
do date= '01sep2022'd to '30sep2022'd;
eligible = (date - admin_dt_1 > 28);
output;
end;
format date date9.;
run;
You can then count the number of eligible people on each date:
proc sql;
select date
, sum(eligible) as total_eligible
from dose2eligible
group by date;
quit;
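For anyone who wants to try this end to end, here is a minimal sketch. It assumes admin_dt_1 is stored as a numeric SAS date (the question shows it as text such as "June 7th 2022", which would need converting first) and that "eligible" means more than 28 days have passed since admin_dt_1. The datalines are just the five sample records retyped in DATE9 style.

```sas
/* Sample records from the question, retyped so admin_dt_1 is a numeric SAS date (assumption) */
data request;
    input record_id admin_dt_1 :date9.;
    format admin_dt_1 date9.;
    datalines;
1 07JUN2022
2 25AUG2022
3 23AUG2022
4 08JUL2022
5 05AUG2022
;
run;

/* One row per person per September day, flagged 1 once more than 28 days have passed */
data dose2eligible;
    set request;
    do date = '01sep2022'd to '30sep2022'd;
        eligible = (date - admin_dt_1 > 28);
        output;
    end;
    format date date9.;
run;

/* Summing the flag per day gives the running (cumulative) count of eligible people */
proc sql;
    select date
         , sum(eligible) as total_eligible
    from dose2eligible
    group by date;
quit;
```

Because every person contributes a row for every day, the daily sum is already cumulative. All five sample records are more than 28 days past their admin date by September 30, so the last row totals 5.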
I have a column called month as
month
JAN
FEB
...
DEC
I'd like to know how to convert them into 1,2,3,...,12 in SAS. Thanks a lot.
Use an informat to convert it to a date, then use month() to get the month number.
data have;
input month :$3. @@;
datalines;
JAN FEB DEC
;
data want;
set have;
x=month(input(month||'21',??monyy.));
run;
Concatenate the month with a year, make use of the MONYY. informat, use the MONTH. format and finally output as a numeric value using another input().
data have;
input month :$3. @@;
datalines;
JAN FEB DEC
;
data want;
set have;
month_num=input(put(input(catt(month, year(today())), monyy.), month.), 2.);
put month month_num;
run;
Results:
JAN 1
FEB 2
DEC 12
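The two answers can also be combined into a version that does not depend on today's date. This is only a sketch: the year 2000 is an arbitrary placeholder, and the ?? modifier is there so that an unrecognized month abbreviation yields a missing value instead of an error in the log.

```sas
data have;
    input month :$3. @@;
    datalines;
JAN FEB DEC
;
run;

data want;
    set have;
    /* Append an arbitrary fixed year (assumption) so MONYY7. can read the text as a date, */
    /* then take the month number; ?? suppresses errors for values that cannot be read     */
    month_num = month(input(cats(month, '2000'), ?? monyy7.));
run;

proc print data=want noobs;
run;
```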
I've searched but none of the information shows how to plot a line graph from data that is given in a row, rather than column.
I have data in this form:
Firstname Lastname Sep Oct Nov Dec Jan Feb March April May June July
There are 100 rows of data, one per person. I have to plot a graph for each individual from Sep to July, so my output will be 100 individual graphs. I know how to plot if the data is in columns, but that is not what I am given, and restructuring the data by hand would be too much work. I do not have any SAS code for rows:
Proc sgplot data=data1;
series x=??? ( i need mths from Sep to July here)
Series y= ?? (will be the marks from the Sep to July)
Run;
Here is how the output should look:
Your table needs to be in a flat format, e.g.:
FirstName LastName Date
John Smith 01JAN2018
Jane Doe 01JAN2018
This can be done with PROC TRANSPOSE. It is best to align your dates to a specific year/date. This will maintain the correct date order. Assume that your data is for 2018.
Create sample data
data have;
length name $10.;
array months[*] Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec;
retain goal 75;
do name = 'Mark', 'Jane', 'Jake', 'John', 'Jack', 'Jill', 'Bill', 'Jerry', 'Joseph';
do i = 1 to dim(months);
months[i] = round(100*rand('uniform') );
end;
output;
end;
drop i;
run;
Solution
proc sort data=have;
by name goal;
run;
proc transpose data=have
out=have_transposed;
by name goal;
var Jan--Dec;
run;
data want;
set have_transposed;
Month = input(cats('01', _NAME_, 2018), DATE9.);
rename COL1 = Score;
format month monname3.;
drop _NAME_;
run;
proc sgplot data=want;
by name;
series x=month y=goal / name='goal' lineattrs=(color=salmon thickness=2);
series x=month y=score / name='series' lineattrs=(thickness=2);
scatter x=month y=score / markerattrs=(symbol=circlefilled) name='points';
keylegend 'series' 'goal';
run;
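The question's columns run from Sep to July, so a single calendar year would put Sep to Dec after July on the axis. Below is a rough sketch of one way to handle that, using the question's column names. The two rows of marks are made up, and the years 2017/2018 are an assumption chosen only to get the ordering right.

```sas
/* Hypothetical wide data with the question's column layout */
data marks;
    input Firstname $ Lastname $ Sep Oct Nov Dec Jan Feb March April May June July;
    datalines;
John Smith 70 72 68 75 80 78 82 85 88 90 91
Jane Doe 60 65 63 70 72 74 71 77 79 83 85
;
run;

proc sort data=marks;
    by Firstname Lastname;
run;

/* One row per person per month; _NAME_ holds the original column name */
proc transpose data=marks out=marks_long(rename=(col1=Score));
    by Firstname Lastname;
    var Sep--July;
run;

data marks_long;
    set marks_long;
    /* Use the first three letters so names like March and April parse as dates */
    mon3 = substr(_NAME_, 1, 3);
    Month = input(cats('01', mon3, '2018'), date9.);
    /* Assumed school year: Sep-Dec belong to the previous calendar year */
    if month(Month) >= 9 then Month = intnx('year', Month, -1, 'same');
    format Month monyy7.;
    drop _NAME_ mon3;
run;

/* One graph per person */
proc sgplot data=marks_long;
    by Firstname Lastname;
    series x=Month y=Score / markers;
run;
```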
The date in the table is not stored in one column: days are in the day column, months in the month column, and years in the year column.
I concatenated the columns, used the concatenation in the where clause, and compared it with the parameter I created, but I got no result.
I assume you are querying a date dimension table, and you want to extract the record that matches a certain date.
Solution:
I created a dates table to match with,
data dates;
input key day month year ;
datalines;
1 19 2 2018
2 20 2 2018
3 21 2 2018
4 22 2 2018
;;;
run;
In the where clause I split the date '20feb2018'd using the day, month and year functions. In SAS, a date literal must be quoted and followed by d ('...'d).
proc sql;
select * from dates
/*if you want to match todays' date: replace '20feb2018'd with today()*/
where day('20feb2018'd)=day and month('20feb2018'd)=month and year('20feb2018'd)=year;
quit;
If you want to compare against a date built from day, month and year, use the mdy function in the where clause as shown below. It is not totally clear what you are looking for.
proc sql;
select * from dates
where mdy(month,day, year) between '19feb2018'd and '21feb2018'd ;
quit;
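Since the question mentions a parameter, here is a small sketch of the same mdy() comparison driven by a macro variable. The macro variable name target is hypothetical, and it is assumed to hold the date as DATE9-style text so it can be turned into a date literal.

```sas
/* Small dates table in the same layout as above */
data dates;
    input key day month year;
    datalines;
1 19 2 2018
2 20 2 2018
3 21 2 2018
;
run;

/* Hypothetical parameter: the date to look up, written in DATE9 style */
%let target = 20feb2018;

proc sql;
    select *
    from dates
    /* Build a real SAS date from the three columns and compare it to the literal */
    where mdy(month, day, year) = "&target"d;
quit;
```

Comparing dates as dates avoids the formatting mismatches that tend to make a concatenated string comparison return no rows.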
I'm currently struggling with following request in PowerBI:
I have two CSV files as PowerBI queries, one which defines fiscal months, and another one which lists all subscriptions including start and end date:
Fiscal month CSV:
Month Fiscal Start Fiscal End
January 03.01.2016 04.02.2016
February 05.02.2016 03.03.2016
March 04.03.2016 06.04.2016
April 07.04.2016 02.05.2016
May 03.05.2016 06.06.2016
June 07.06.2016 03.07.2016
July 04.07.2016 05.08.2016
August 06.08.2016 02.09.2016
Subscription CSV:
Account-ID Subscription-Start Subscription-End Item Count
101 08.01.2016 07.02.2016 5
102 15.01.2016 14.03.2016 3
103 05.01.2016 04.06.2016 10
101 08.02.2016 07.03.2016 3
104 10.04.2016 09.05.2016 5
105 16.04.2016 15.07.2016 2
My challenge now is to drill down all subscription item counts per fiscal month as a powerBI table.
Note: an Item Count is valid for a fiscal month if its Subscription-Start < Fiscal End and its Subscription-End > Fiscal End. (Example: A subscription from 15.01.2016 - 14.02.2016 should be counted in january, but not in february)
PowerBI table (schematical example):
Month Item Count
January 18
February 16
March 10
April 17
May 12
June 2
July 0
August 0
How can I implement this report in PowerBI?
THX in advance for your help and BR
bdriven
I've found following solution for my problem:
First I've created a new Table and made a crossjoin of the two queries. Then I've filtered for the lines, where my Subscription Start was before the Fiscal Month End and Subscription End was after the Fiscal Month End.
Based on this new table I can create all respective reports.
Example Code see below:
Fiscal Month Report =
FILTER(
CROSSJOIN(
ALL('Fiscal_month');
ALL('Subscription')
);
('Subscription'[Subscription-Start] < 'Fiscal_month'[Fiscal End] && 'Subscription'[Subscription-End] > 'Fiscal_month'[Fiscal End])
)
For the data set below (the actual one is several thousand rows long), I would like SAS to aggregate the income daily (there are many income lines per machine each day), weekly, and monthly by machine (the week starts on Monday, the month on the 1st, in any given year). Is there straightforward code for this? Any help is appreciated.
MachineNo Date income
1 01Jan2012 1500
1 02Jan2012 2000
1 27Aug2012 300
2 02Jan2012 1200
2 15Jun2012 50
3 03Mar2012 1000
4 08Apr2012 500
proc expand and proc timeseries are excellent tools for accumulation and aggregation to different frequencies of series. You can combine both with by-group processing to convert to any time period that you need.
Step 1: Sort by MachineNo and Date
proc sort data=have;
by MachineNo Date;
run;
Step 2: Find the min/max end dates of your series for date alignment
The format=date9. option is important. For whatever reason, some SAS/ETS and HPF procedures require date literals for certain arguments, so the macro variables must hold DATE9-style text rather than raw date numbers.
proc sql noprint;
select min(date) format=date9.,
max(date) format=date9.
into :min_date,
:max_date
from have;
quit;
Step 3: Align each MachineNo by start/end date, and accumulate days per MachineNo
The below code will get you aligned daily accumulation, remove duplicate days per machine, and set Income on any missing days to 0. This step will also guarantee that your series has equal time intervals per by-group, allowing you to run hierarchical time-series analyses without violating the equal-spaced interval assumption.
proc timeseries data=have
out=want_day;
by MachineNo;
id date interval=day
align=both
start="&min_date"d
end="&max_date"d;
var income / accumulate=total setmiss=0;
run;
Step 4: Aggregate aligned Daily to Weekly shifted by 1 day, Monthly
SAS time intervals can be both multiplied and shifted. Since the standard week starts on a Sunday, we want to shift by 1 day to have it start on a Monday.
Standard Week
2 3 4 5 6 7 1
Mon Tue Wed Thu Fri Sat Sun
Shifted
1 2 3 4 5 6 7
Mon Tue Wed Thu Fri Sat Sun
Intervals follow the format:
TimeInterval<Multiplier>.<Shift>
The standard shift interval is 1. For all intents and purposes, consider 1 as 0: 1 means it's unshifted. 2 means it's shifted by 1 period. Thus, for a week to start on a Monday, we want to use the interval Week.2.
proc expand data=want_day
out=want_week
from=day
to=week.2;
by MachineNo; /* want_day holds one series per machine */
id date;
convert income / method=aggregate observed=total;
run;
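To check the shift notation described above without running the procedures, the same interval names can be passed to intnx(). This is just an illustrative sketch; 04JAN2012 is an arbitrary test date that falls on a Wednesday.

```sas
data _null_;
    d = '04jan2012'd;                /* a Wednesday */
    wk_sun = intnx('week',   d, 0);  /* start of the default, Sunday-based week */
    wk_mon = intnx('week.2', d, 0);  /* start of the week shifted to begin on Monday */
    put wk_sun= weekdate. wk_mon= weekdate.;
run;
```

The log should show a Sunday (January 1, 2012) for wk_sun and a Monday (January 2, 2012) for wk_mon.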
Step 5: Aggregate Daily to Monthly
proc expand data=want_day
out=want_month
from=day
to=month;
by MachineNo;
id date; /* start from the daily series: weeks can straddle month boundaries */
convert income / method=aggregate observed=total;
run;
In case you don't have a license for SAS/ETS here's another way.
For the monthly data you can format the date in a proc means output.
I think WeekW. starts on Monday but it may not be in a format you want, so you'll need to create a new variable for week first if you wanted to use this method.
proc means data=have nway noprint;
class machineno date;
format date monyy7.;
var income;
output out=want sum(income)=income;
run;
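For the weekly totals without SAS/ETS, one option is to derive a week-start variable with intnx() and summarize on that. A rough sketch, using the sample rows from the question; the names have_periods, week_start, month_start and want_week_means are made up for illustration.

```sas
/* Sample rows from the question */
data have;
    input MachineNo Date :date9. income;
    format Date date9.;
    datalines;
1 01Jan2012 1500
1 02Jan2012 2000
1 27Aug2012 300
2 02Jan2012 1200
2 15Jun2012 50
3 03Mar2012 1000
4 08Apr2012 500
;
run;

data have_periods;
    set have;
    week_start  = intnx('week.2', Date, 0);  /* Monday on or before Date */
    month_start = intnx('month',  Date, 0);  /* first day of the month */
    format week_start month_start date9.;
run;

/* Weekly totals per machine; swap week_start for month_start to get monthly totals */
proc means data=have_periods nway noprint;
    class MachineNo week_start;
    var income;
    output out=want_week_means sum(income)=income;
run;
```

Unlike the proc timeseries approach, this only produces rows for weeks that actually contain data, so gaps are not filled with zeros.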