I am completely new to SAS, and I'm editing someone else's code.
I have code similar to this:
data tablename1;
set schema.tablename;
if date_column > '08may2017'd;
run;
Instead of a hard coded date (08may2017) I need the date for the previous Monday.
How I would do this in SAS?
The intnx function is your answer:
data tablename1;
set schema.tablename;
if date_column > intnx('week1.2',date(),-1,'B');
run;
For more info about the intnx function, see this
Related
I have a dataset with many dates in them. I want to categorize these dates into a new column that organizes them by decade (1980s, 1990s, etc).
I have a good idea on how to use IF, AND, and ELSE statements to accomplish this, but I don't know how to have SAS extract the year and only the year from the date to apply it to the conditional logic.
You could always use multipliers and the intnx() function as well.
Using Allan's sample data...
data want;
set have;
decade=year(intnx('year10.',dateval,0,'beginning'));
run;
The intnx() part of the code returns the date corresponding to the start of the decade, then we just take the year portion from it.
The year10. parameter tells it we want to work with decades, the 0 parameter means shift the date supplied to the current decade, and the beginning parameter tells it to return the date corresponding to the beginning of the decade.
If you're not familiar with using intnx() to perform date calculations in SAS see here for a quick primer: https://stackoverflow.com/a/11211180/214994
No need for conditional logic - can use a combination of the year() and floor() functions with some simple arithmetic:
data have;
infile cards;
input dateval date9.;
cards;
01JAN2004
08FEB1996
07MAR1987
14SEP1982
;run;
data want;
set have;
decade=floor(year(dateval)/10)*10;
run;
Which gives:
I am calculating around 12 metrics (Say Sales for each month individually for latest 12 months). Every month I need to go manually and change the month everywhere. If there is any way to automate it, it would be very helpful. My code is
proc sql;
create table inter.calls as
select a.district_name,
sum(01JAN2016,01FEB2016,01MAR2016)/terr_count as q1_workingdays,
sum(01APR2016,01JUN2016,01MAY2016)/terr_count as q2_workingdays,
sum(01AUG2016,01SEP2016,01JUL2016)/terr_count as q3_workingdays,
sum(01NOV2016,01DEC2016,01OCT2016)/terr_count as q4_workingdays
from inter.calls_made_bymon_reg3 a left join inter.territory_count b
on a.district_name=b.district_name;
quit;
Now when I refresh for JAN2017, I need to change from FEB2016 to JAN2017 for latest 12months. Every time it is difficult to change the code manually.
I will be very thankful if I get any help!!
It's difficult to understand your problem completely, because as Reeza mentioned the arguments in your sum() functions are invalid (the names begin with a number).
However, I do understand the desire not to manually change dates all over the place. You might find a macro like the below helpful:
%macro prev_month(n);
%let latest_date = %sysfunc(intnx(month,%sysfunc(inputn(&latest_month.,monyy7.)),-&n.));
days_%sysfunc(putn(&latest_date.,monyy7.))
%mend prev_month;
And you would then use it in your query like so:
%let latest_month = JAN2017;
proc sql;
create table inter.calls as
select a.district_name,
sum(%prev_month(11),%prev_month(10),%prev_month(9))/terr_count as q1_workingdays,
sum(%prev_month(8) ,%prev_month(7) ,%prev_month(6))/terr_count as q2_workingdays,
sum(%prev_month(5) ,%prev_month(4) ,%prev_month(3))/terr_count as q3_workingdays,
sum(%prev_month(2) ,%prev_month(1) ,%prev_month(0))/terr_count as q4_workingdays
from inter.calls_made_bymon_reg3 a left join inter.territory_count b
on a.district_name=b.district_name;
quit;
Hope it helps.
Hi I have a time series data table from October 2013 to October 2016. I would like to plot the time series from October 2013 to November 2014, October 2014 to November 2015, and October 2015 to November 2016 on top of each other on the same graph to analyze any seasonal trends.
My idea is to create separate data tables with each subsegment, but is there an easier way to do this in SAS?
This is an example of the data table I want to plot the seasonality of.
The workflow I think here is to add a group variable that indicates, say, year, which has the same value for all rows you want plotted in one plot-grouping.
Then you use the group statement in whatever plot type you want. Something like:
data stocks_years;
set sashelp.stocks;
date_year = intck('YEAR','01AUG1986'd,date,'c')+1986;
date_month= month(date);
run;
proc sgplot data=stocks_years;
vline date_month/response=close group=date_year stat=mean;
run;
This is an example of doing that to see the average close per month of the three stocks in the SASHELP.STOCKS dataset. It is a terrible plot of course but it should give you some idea of what it would look like. Each of those differently colored lines is from a different year (aug->jul being defined as a year, with the number being the year number of aug).
The lead off provided by Joe gave me everything I needed. Here is the completed code for anyone else's reference.
%macro Plot_Seasonal_Worse_TP(tbl_name, tp, cutoff_date);
/*tp = transition probability */
proc sql;
create table &tbl_name._trim as
SELECT *
FROM &tbl_name
WHERE asofdt > &cutoff_date;
run;
data &tbl_name._trim;
set &tbl_name._trim;
date_year = intck('YEAR','01NOV2013'd,asofdt,'c')+2014;
date_month= MOD(month(asofdt)+2, 12); /* move november and december of previous year to front of time series */
run;
proc sgplot data=&tbl_name._trim;
vline date_month/response=&tp group=date_year;
title &tbl_name (&tp);
run;
%mend Plot_Seasonal_Worse_TP;
Output looks like this as well.
Let's say I have 50 years of data for each day and month. I also have a column which lists the max rainfall for each day of that dataset. I want to be able to compute the average monthly rainfall and standard deviation for each of those 50 years. How would I accomplish this task? I've considered using PROC MEANS:
PROC MEANS DATA = WORK.rainfall;
BY DATE;
VAR AVG(max_rainfall);
RUN;
but I'm unfamiliar on how to let SAS understand that I want to be using the MM of the MMDDYY format to indicate where to start and stop calculating those averages for each month. I also do not know how I can tell SAS within this PROC MEANS statement on how to format the data correctly, using MMDDYY10. This is why my code fails.
Update: I've also tried using this statement,
proc sql;
create table new as
select date,count(max_rainfall) as rainfall
from WORK.rainfall
group by date;
create table average as
select year(date) as year,month(date) as month,avg(rainfall) as avg
from new
group by year,month;
quit;
but that doesnt solve the problem either, unfortunately. It gives me the wrong values, although it does create a table. Where in my code could I have gone wrong? Am I telling SAS correctly that add all the rainfall's in 30 days and then divide it by the number of days for each month? Here's a snippet of my table.
You can use a format to group the dates for you. But you should use a CLASS statement instead of a BY statement. Here is an example using the dataset SASHELP.STOCKS.
proc means data=sashelp.stocks nway;
where date between '01JAN2005'd and '31DEC2005'd ;
class date ;
format date yymon. ;
var close ;
run;
I have a sas data set. In it i have some variables following a pattern
-W 51 Sales
-W 52 Sales
-W 53 Sales
and so on.
Now i want to rename all of these variables dynamically such that W 51 is replaced by starting date of that week and the new name becomes - 5/2/2013 Sales?
The reason i want to rename them is that i have sales data of all the 53 weeks in an year and the data set would be eassier for me to understand if i had the starting date of a week instead of W(week_no) Sales as a variable name
Is there any way i can do that in sas?
You really don't want to rename your variables. You may think you do, but it'll just bite you eventually.
What you can do instead is give them descriptive labels. This can be done via proc datasets.
proc datasets library=<lib>;
modify <dataset>;
label <variable> '5/2/2013 sales';
run;
Just for fun lets assume you want to do this anyway -- Safest thing to do is just create a copy of the dataset for your output...
this code assumes your variable names are named like w1_sales and output names are going to be renamed to 03JAN2013_sale or something like that.
data newDataSet;
set oldDataSet;
%MACRO rename_vars(mdataset,year);
data &mdataset.;
set &mdataset.;
%do i = 1 %to 53;
%let weekStartDate = %sysfunc(intnx('week&i','01jan&year.'d,0)); %*returns the starting day of week(i) uses sunday as starting date. If you want monday use 0.1 as last param;
%let weekstartDateFormatted = %sysfunc(putn(&weekStartDate.,DATE.)) %*formats into ddMONyyy. substitute whatever format you want;
rename w&i._Sale = &weekstartDateFormatted ._SALES;
%end;
run;
%MEND rename_vars;
%rename_vars(newDataSet,2013);
I don't have time to test this right now, so sommebody let me know if I screwed it up somewhere. This should at least get you going though. Or you can send me or post some code to read a small sample dataset (obviously if this is possible without having to share some proprietary info. You might have to genericize it a bit) with those vars like that and I'll debug it.