Plotting seasonal data, with years on top of each other in SAS?

Hi, I have a time series data table covering October 2013 to October 2016. I would like to plot the time series from October 2013 to November 2014, October 2014 to November 2015, and October 2015 to November 2016 on top of each other on the same graph to analyze any seasonal trends.
My idea is to create separate data tables with each subsegment, but is there an easier way to do this in SAS?
This is an example of the data table I want to plot the seasonality of.

The workflow I would use here is to add a group variable that indicates, say, the year, and has the same value for all rows you want plotted as one line.
Then you use the group= option in whatever plot statement you want. Something like:
data stocks_years;
set sashelp.stocks;
date_year = intck('YEAR','01AUG1986'd,date,'c')+1986;
date_month= month(date);
run;
proc sgplot data=stocks_years;
vline date_month/response=close group=date_year stat=mean;
run;
This is an example of doing that to see the average close per month of the three stocks in the SASHELP.STOCKS dataset. It is a terrible plot of course, but it should give you some idea of what it would look like. Each of the differently colored lines is a different year (a year being defined as August through July, labelled with the calendar year of its August).

The lead provided by Joe gave me everything I needed. Here is the completed code for anyone else's reference.
%macro Plot_Seasonal_Worse_TP(tbl_name, tp, cutoff_date);
/*tp = transition probability */
proc sql;
create table &tbl_name._trim as
SELECT *
FROM &tbl_name
WHERE asofdt > &cutoff_date;
quit;
data &tbl_name._trim;
set &tbl_name._trim;
date_year = intck('YEAR','01NOV2013'd,asofdt,'c')+2014;
date_month= MOD(month(asofdt)+1, 12)+1; /* renumber months so November=1, December=2, ..., October=12 (MOD(month+2,12) would send October to 0) */
run;
proc sgplot data=&tbl_name._trim;
vline date_month/response=&tp group=date_year;
title "&tbl_name (&tp)";
run;
%mend Plot_Seasonal_Worse_TP;
Output looks like this as well.
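A quick way to check any month renumbering is to print it for all twelve months. This is only a sketch: the dataset name month_shift and the variable shifted are made up for illustration, and asofdt just mirrors the date variable used in the macro. It uses mod(month + 1, 12) + 1, which gives November = 1 through October = 12 (by contrast, mod(month + 2, 12) gives October = 0).

```sas
/* Hypothetical check dataset: one row per calendar month of 2014 */
data month_shift;
  do m = 1 to 12;
    asofdt  = mdy(m, 1, 2014);                /* first day of each month */
    shifted = mod(month(asofdt) + 1, 12) + 1; /* Nov=1, Dec=2, Jan=3, ..., Oct=12 */
    output;
  end;
  format asofdt date9.;
run;

proc print data=month_shift noobs;
  var asofdt shifted;
run;
```

If the season should start in a different month, the same pattern applies with a different offset inside mod().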

Related

SAS Proc Freq Not Displaying Values

I am doing some simple cross tabulations using Proc Freq, but I'm noticing that the output SAS gives me doesn't contain any frequency counts; I'm only getting percents.
Here is an example code that I ran in SAS (I am using SAS 9.4):
data test;
input year 1-5 group $6;
cards;
2018 A
2018 A
2018 B
2018 B
2019 A
2019 A
2019 A
2019 B
;
run;
proc freq data = test;
table year * group / norow nopercent;
run;
I'm expecting a table that has the frequency counts with the column percentage below, but instead, this is what SAS is giving me:
Does anyone know how I can get the frequency values to be shown?
I ran your code and got this. I reckon there is something you are not telling us.
Thank you all for your help. I found the issue: it looks like there was a problem with the cross-tab frequency template that came with SAS. I was able to restore it by using the following code:
proc template;
delete base.freq.crosstabfreqs;
run;
Thank you all for your help!
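Before deleting anything, it may help to see which template stores are searched and whether a modified copy of the PROC FREQ templates is sitting in a user store. The sketch below assumes the altered copy lives in SASUSER.TEMPLAT, which is the usual default writable store but is only an assumption here; ODS PATH SHOW reports the stores actually in use in your session.

```sas
/* Show the item stores ODS searches for templates, in search order */
ods path show;

/* List any PROC FREQ templates saved in the user store.
   SASUSER.TEMPLAT is an assumed location; adjust it to match
   the output of ODS PATH SHOW. */
proc template;
  list base.freq / store=sasuser.templat;
run;
```

If base.freq.crosstabfreqs shows up in a user store, that copy is the one found first, and deleting it lets the SAS-supplied definition take effect again.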
@_null_, your image is NOT the output I get when running the question's code.
The Frequency and Col Pct are NOT in row header cells, and instead are shown in a box offset to the left from the table.

Idiomatic way to create a range of numbers in PROC SQL?

I want to create a list of years. Since I just need a handful, I do it like this:
create table work.year_range
(YEAR num);
insert into work.year_range
values (2006) values (2007) values (2008) values (2009) values (2010) values (2011)
values (2012) values (2013) values (2014) values (2015) values (2016) values (2017);
but that is clearly ugly. What is the idiomatic way to create a range of numbers in PROC SQL?
You have to use a DO loop in a data step; this is hard to emulate in PROC SQL:
data year_range;
do year= 2006 to 2017;
output;
end;
run;
Another crude way (not recommended) to do it in PROC SQL is to use the undocumented monotonic() function with any dataset that has at least 12 records, as shown below:
proc sql;
create table work.year_range as
select 2005 + monotonic() as year
from sashelp.class
where calculated year between 2006 and 2017;
quit;

Boxplot in SAS using proc gchart

First question, is it possible to produce a boxplot using proc gchart in SAS?
If it is possible, please give me a brief idea.
Or else, on the topic of using proc boxplot.
Suppose I have a dataset that has three variables ID score year;
something like,
data aaa;
input id score year;
datalines;
1 50 2008
1 40 2007
2 30 2008
2 20 2007
;
run;
I want to produce a boxplot showing for each ID in each year. (So in this case, 4 boxplots in a single plot)
How can I achieve this?
I have tried using
proc boxplot data=aaa;
plot score*ID;
by year;
run;
However, this is not working because, as we can see, the data is not sorted by year.
Is there a way to get around this?
You need to sort your input dataset first. Run this
proc sort data = aaa;
by year;
run;
and then your proc boxplot should work as written.
This is quite easy to do with sgplot, which is part of the newer ODS Graphics suite available in base SAS.
proc sgplot data=sashelp.cars;
vbox mpg_city/category=type group=origin grouporder=ascending;
run;
You would use category=id and group=year in your example data: you get one tick on the x axis for each category, and within each tick you get a separate box for each group, clustered together.
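Applied to the sample data from the question, that suggestion might look like the sketch below. The dataset and variable names (aaa, id, score, year) come from the question; with only one observation per id and year, each box collapses to a single line, so real data with several scores per combination is needed to see proper boxes.

```sas
/* Sample data copied from the question */
data aaa;
  input id score year;
  datalines;
1 50 2008
1 40 2007
2 30 2008
2 20 2007
;
run;

/* One x-axis tick per id; within each tick, one box per year */
proc sgplot data=aaa;
  vbox score / category=id group=year grouporder=ascending;
run;
```

No prior sorting is required here, because year is used as a group variable rather than in a BY statement.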

Averages in SAS with dates using months

Let's say I have 50 years of data for each day and month. I also have a column which lists the max rainfall for each day of that dataset. I want to be able to compute the average monthly rainfall and standard deviation for each of those 50 years. How would I accomplish this task? I've considered using PROC MEANS:
PROC MEANS DATA = WORK.rainfall;
BY DATE;
VAR AVG(max_rainfall);
RUN;
but I'm unsure how to let SAS understand that I want to use the MM of the MMDDYY format to indicate where to start and stop calculating those averages for each month. I also do not know how to tell SAS within this PROC MEANS step to format the data correctly, using MMDDYY10. This is why my code fails.
Update: I've also tried using this statement,
proc sql;
create table new as
select date,count(max_rainfall) as rainfall
from WORK.rainfall
group by date;
create table average as
select year(date) as year,month(date) as month,avg(rainfall) as avg
from new
group by year,month;
quit;
but that doesn't solve the problem either, unfortunately. It gives me the wrong values, although it does create a table. Where in my code could I have gone wrong? Am I correctly telling SAS to add up all the rainfall values in a month and then divide by the number of days in that month? Here's a snippet of my table.
You can use a format to group the dates for you. But you should use a CLASS statement instead of a BY statement. Here is an example using the dataset SASHELP.STOCKS.
proc means data=sashelp.stocks nway;
where date between '01JAN2005'd and '31DEC2005'd ;
class date ;
format date yymon. ;
var close ;
run;
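For the rainfall case, the same idea might be sketched as follows. The demo dataset work.rainfall_demo is generated here purely for illustration (random values, two years instead of fifty), and the output names monthly_stats, avg_rainfall, std_rainfall and n_days are made up; date and max_rainfall follow the names used in the question.

```sas
/* Hypothetical demo data: one row per day for 2000-2001 */
data work.rainfall_demo;
  call streaminit(1);
  do date = '01JAN2000'd to '31DEC2001'd;
    max_rainfall = round(rand('uniform') * 10, 0.1);
    output;
  end;
  format date mmddyy10.;
run;

/* The YYMON7. format groups the daily dates by year and month,
   so CLASS DATE yields one output row per month of each year */
proc means data=work.rainfall_demo nway noprint;
  class date;
  format date yymon7.;
  var max_rainfall;
  output out=work.monthly_stats (drop=_type_ rename=(_freq_=n_days))
         mean=avg_rainfall std=std_rainfall;
run;

proc print data=work.monthly_stats noobs;
run;
```

Swapping in a format such as MONNAME. would instead pool the same calendar month across all years, which is a different question from a per-year monthly average.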

Stacked bar chart by group and subgroup in SAS

I am unable to create stacked charts by group and subgroup in SAS 9.4. I want charts similar to Excel graphs. Please find the sample data and Excel graph below (first image) and also the SAS graph (second image).
I am unable to put the segments 'ACTUAL' and 'FORECAST' on a common axis (year). Actual means the data up to 2014 and forecast means the data after 2014; both should fall on the same axis.
goptions reset=all ;
goptions colors=(red blue green);
legend1 label=none ;
proc gchart data=NEW;
vbar year/ discrete type=sum sumvar=VALUE
group= segment subgroup=WKSCOPE ;
where year le 2020 AND YEAR ge 2012;
run;
I would solve this with annotation. I know SGPLOT better than GCHART so I'll answer it this way.
data have;
input segment $ year wkscope $ value;
datalines;
ACTUAL 2012 PH 5
ACTUAL 2012 PH 1
ACTUAL 2012 BHS 1
ACTUAL 2012 RES 2
ACTUAL 2013 PH 2
ACTUAL 2013 PH 5
ACTUAL 2013 BHS 1
ACTUAL 2014 RES 2
FORECAST 2015 PH 3
FORECAST 2015 BHS 0
FORECAST 2016 PH 4
FORECAST 2016 RES 1
FORECAST 2017 PH 5
FORECAST 2017 BHS 1
FORECAST 2017 RES 2
;;;;
run;
data sgannods;
x1space='wallpercent';
y1space='wallpercent';
x1=75;
y1=-10;
label="Forecast";
function='text';
output;
x1=25;
label="Actual";
output;
run;
proc sgplot data=have sganno=sgannods;
vbar year/response=value group=wkscope groupdisplay=stack;
run;
Basically, plot everything except segment, then annotate using that value. You can generate the annotation dataset by hand like I do, or (preferably) generate it from the original data if it could change. I use WALLPERCENT since here the first half of the axis is actual and the last half is forecast. If that split could change (say, 2 actual years and 4 forecast years), don't hard-code the positions: either keep WALLPERCENT and work out the proper position from the data (with a proc freq, probably), or use DATAVALUE and put the label under the middle value.
If this isn't close enough, I would go to robslink.com, which has a nice set of examples (and is written by one of the developers of the GCHART set of procs). Sanjay also has a blog, Graphically Speaking, which has some great examples, and both post on SAS Communities.
The image I produce follows here. It's not particularly close in other manners but all of those are easy to fix (color scheme, sizes, location of legends).
Data labels are the one thing you can't really have this way; you can add them if you use VBARPARM, but that requires summarizing the data ahead of time. Sanjay covers this in one of his blog posts about 9.4M2 (if you have the M2 maintenance release); I also cover this in my MWSUG paper, Labelling without the Hassle: How to Produce Labeled Stacked Bar Charts Using SGPLOT and GTL Without Annotate, if you have an older version.