Boxplot in SAS using proc gchart - sas

First question: is it possible to produce a boxplot using proc gchart in SAS?
If it is possible, please give me a brief idea of how.
If not, I have a question about using proc boxplot instead.
Suppose I have a dataset that has three variables: id, score, and year.
Something like:
data aaa;
input id score year;
datalines;
1 50 2008
1 40 2007
2 30 2008
2 20 2007
;
run;
I want to produce a boxplot showing one box for each ID in each year (so in this case, 4 boxes in a single plot).
How can I achieve this?
I have tried using
proc boxplot data=aaa;
plot score*ID;
by year;
run;
However, this does not work because the data are not sorted by year.
Is there a way to get around this?

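You need to sort your input dataset first. PROC BOXPLOT expects the data to be sorted by the BY variable and, within each BY group, by the group variable, so sort by both:
proc sort data = aaa;
by year id;
run;
and then your proc boxplot should work as written. Note that the BY statement produces a separate plot for each year rather than one combined plot.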

This is quite easy to do with sgplot, which is part of the newer ODS Graphics suite available in base SAS.
proc sgplot data=sashelp.cars;
vbox mpg_city/category=type group=origin grouporder=ascending;
run;
You would use category=id and group=year with your example data. You get one tick on the x axis for each category, and within each category a separate box for each group, clustered together.
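Applied to the dataset from the question, that suggestion might look like the sketch below. It assumes the aaa dataset created by the data step above. With only one observation per id and year, each box collapses to a single line, so real data would need several scores per combination.

```sas
/* Sketch: one box per id and year, all in a single plot.        */
/* Assumes the aaa dataset (id, score, year) from the question.  */
proc sgplot data=aaa;
  vbox score / category=id group=year grouporder=ascending;
run;
```

Unlike the BY statement in proc boxplot, this needs no prior sort and keeps all boxes in one plot.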

Related

Is it possible to do a paired t-test in SAS using panel data?

I am working with panel data that looks something like this:
I am going to perform a t-test in SAS 9.4 to find out if there is a significant change in var1 from 2014 to 2016, and I am assuming that I have to use a paired t-test, since I have an observation in both 2014 and 2016 for each individual (ID).
My question is, can this be done in SAS when I am using panel data like the one I have shown? Or do I need to create a wide dataset with one variable containing the data from 2014 and one variable containing the data from 2016? I know that I have to do that in Stata, but maybe I don't have to change my entire dataset to do this in SAS?
You will have to transpose your data to do a paired t-test. You can use PROC TRANSPOSE for that, though.
*sort for transpose;
proc sort data=have; by id year; run;
*reformat from long to wide;
proc transpose data=have out=want prefix=Year_;
by ID;
ID Year;
Var Var1;
run;
*Paired T-Test;
proc ttest data=want;
paired Year_2014*Year_2016;
run;
PS. Please include your data as text not an image in the future. We cannot write code off an image and I'm not typing out your data, so at present this is untested but should work.
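As a cross-check on the paired test above: a paired t-test is equivalent to a one-sample t-test on the within-ID differences. The sketch below assumes the transposed dataset want with the variables Year_2014 and Year_2016 from the code above. The names diff and change are made up for illustration, and like the code above this is untested against the real data.

```sas
/* Hypothetical names: dataset diff, variable change. */
data diff;
  set want;
  /* Within-ID change; missing if either year is missing */
  change = Year_2016 - Year_2014;
run;

/* One-sample t-test of the mean change against zero */
proc ttest data=diff h0=0;
  var change;
run;
```

Because PAIRED Year_2014*Year_2016 tests Year_2014 minus Year_2016, the t statistic here should have the opposite sign but the same magnitude and p-value.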

Proc Freq into a dataset in SAS Studio

I have a dataset (colour) that looks something like
Id haircolour
1 black
2 brown
3 grey
.....
And using proc freq I got a table that looked like
Haircolour Frequency
Black 10
Brown 20
Grey 30
Is there any way for me to save this table as a new sas file with haircolour and frequency as the variables?
Thanks
Tim
There are two main ways to get data out of a proc.
One way is to use the OUT option on the TABLES statement.
Another is the ODS OUTPUT statement; however, the name of the ODS table you need depends on what your TABLES statement requests.
proc freq data=sashelp.class;
table sex*age / out=want;
run;
The ODS approach is outlined here.
https://blogs.sas.com/content/iml/2017/01/09/ods-output-any-statistic.html
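Both approaches, applied to the data in the question, could be sketched as follows. The colour dataset here is a three-row stand-in built from the rows shown in the question, and the output dataset names hair_freq and hair_freq_ods are made up for illustration.

```sas
/* Small stand-in for the asker's colour dataset */
data colour;
  input id haircolour $;
  datalines;
1 black
2 brown
3 grey
;
run;

/* Option 1: OUT= on the TABLES statement.              */
/* hair_freq gets haircolour plus COUNT and PERCENT.    */
proc freq data=colour;
  tables haircolour / out=hair_freq;
run;

/* Option 2: ODS OUTPUT. OneWayFreqs is the ODS table   */
/* that PROC FREQ produces for a one-way table.         */
ods output OneWayFreqs=hair_freq_ods;
proc freq data=colour;
  tables haircolour;
run;
```

Running the procedure with ods trace on; beforehand lists the ODS table names in the log, which helps when the TABLES request is something other than a one-way table.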

Exporting SAS data into SPSS with value labels

I have a simple data table in SAS, where I have the results from a survey I sent to my friends:
DATA Questionnaire;
INPUT make $ Question_Score ;
CARDS;
Ned 1
Shadowmoon 2
Heisenberg 1
Athelstan 4
Arnold 5
;
RUN;
What I want to do, using SAS, is to export this table into SPSS (.sav), and also have the value labels for the Question_Score, like shown in the picture below:
I then proceed to create a format in SAS (in hope this would do it):
PROC FORMAT;
VALUE Question_Score_frmt
1="Totally Agree"
2="Agree"
3="Neutral"
4="Disagree"
5="Totally Disagree"
;
run;
PROC FREQ DATA=Questionnaire;
FORMAT Question_Score Question_Score_frmt.
;
TABLES Question_Score;
RUN;
and finally export the table to a .sav file using the fmtlib option:
proc export data=Questionnaire outfile="D:\Questionnaire.sav"
dbms=spss replace;
fmtlib=work.Q1frmt;
quit;
Only to disappoint myself seeing that it didn't work.
Any ideas on how to do this?
You didn't apply the format to the dataset, unfortunately, you applied it to the proc freq. You would need to use PROC DATASETS or a data step to apply it to the dataset.
proc datasets lib=work;
modify questionnaire;
format Question_Score Question_Score_frmt.;
run;
quit;
Then exporting will include the format, if it's compatible in SAS's opinion with SPSS's value label rules. I will note that SAS's understanding of SPSS's rules is quite old, based I think on SPSS version 9, so it still fails fairly often, unfortunately.
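The data step alternative mentioned above could look like this sketch. It assumes the Questionnaire dataset and the Question_Score_frmt format from the question already exist in the session.

```sas
/* Rewrite the dataset with the format attached permanently */
data questionnaire;
  set questionnaire;
  format Question_Score Question_Score_frmt.;
run;

/* Check that the format is now stored with the variable */
proc contents data=questionnaire;
run;
```

PROC DATASETS changes only the dataset header, while the data step rewrites every row, so PROC DATASETS is usually the cheaper choice for large tables.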

Grouping plots and tables using a BY statement in SAS

I am using a BY statement with both proc boxplot and proc report to create a plot and a table for each level of the BY variable. As is, the code prints all the plots and then prints all of the tables. I would like it to print the plot and then the table for each level of the BY variable (so the output would alternate between a plot and a table). Is there a way to do this?
This is the code I currently have for the plots and tables-
proc boxplot data=study;
plot Lead_Time*Study_ID/ horizontal;
by Project_Name;
format Lead_Time dum.;
run;
proc report data=study nowd;
column ID Title Contact Status Message Audience Priority;
by Project_Name;
run;
Thank You!!
Unfortunately, I don't think the ODS (Output Delivery System) can interleave outputs from procedures. You will need to use a macro to loop over all the by variables and call BOXPLOT and REPORT for each one.
Something like this:
%macro myreport();
%let byvars = A B C D;
%let n=4;
%do i=1 %to &n;
%let var = %scan(&byvars,&i);
proc something data=have(where=(byvar="&var"));
...;
run;
proc report data=have(where=(byvar="&var"));
....
run;
%end;
%mend;
%myreport();
Obviously you need to change this to fit your needs. There are plenty of examples of it on Stack Overflow. Here is one: looping over character values in SAS
This is in principle possible using PROC DOCUMENT and the ODS DOCUMENT output type. It's not exactly easy, per se, but it's possible, and has some advantages over the macro option, although I'm not sure sufficient to recommend its use. However, it's worth exploring nonetheless.
First off, this is largely guided (including, coincidentally, using the same dataset!) by Cynthia Zender's excellent tutorial, Have It Your Way: Rearrange and Replay Your Output with ODS DOCUMENT, presented during the 2009 SAS Global Forum. She initially describes a GUI method of doing this, but then later explains it in code, which would clearly be superior for this sort of thing. Kevin Smith covers similar ground in ODS DOCUMENT From Scratch, from 2012's SGF, though Cynthia's paper is a bit more applicable here (as she covers the exact topic).
First, you need to generate all of your results. Order here doesn't matter too much.
I generate a sample of SASHELP.PRDSALE that is sorted appropriately by country.
proc sort data=sashelp.prdsale out=prdsale;
by country;
run;
Then, we generate some tables; a proc means and a sgplot. Note the title uses #BYVAL1 to make sure the title is included - otherwise we lose the useful labels on the procs!
title "#BYVAL1 Report";
ods _all_ close;
ods document name=work.mydoc(write);
proc means data=prdsale sum;
by country;
class quarter year;
var predict;
run;
proc sgplot data=prdsale;
by country;
vbar quarter/response=predict group=year groupdisplay=cluster;
run;
ods document close;
ods preferences;
Now, we have something that is wrong, but is usable for what you actually want. You can use the techniques in Cynthia or Kevin's papers to look into this in detail; for now I'll just go into what you need for this purpose.
It's now organized like this, imagining a folder tree:
\REPORT\MEANS\COUNTRY\
What we need is:
\REPORT\COUNTRY\MEANS
That's easy enough to do. The code to do so is below. Obviously, for a production process this would be better automated; given the input dataset it should be trivial to generate this code. Note that the BYVALs increment for each by value, so CANADA is 1 and 4, GERMANY is 2 and 5, and USA is 3 and 6.
proc document name=work.mydoc_new(write);
make CANADA, GERMANY, USA; *make the lower level folders;
run;
dir ^^; *Go up one level, think "cd .." in unix/windows;
dir CANADA; *go to Canada folder;
dir; *Notes to the Listing destination where we are, not that important;
copy \work.mydoc\Means#1\ByGroup1#1\Summary#1 to ^; *copy that folder from orig doc to here;
copy \work.mydoc\SGPlot#1\ByGroup4#1\SGPlot#1 to ^; *^ being current directory, like '.' in unix/windows;
*You could also copy \ByGroup1#1 and \Bygroup4#1 without the last level of the tree. That would give a slightly different result (a bit more of the text around the table would be included), so do whichever matches your expectations.;
**Same for Germany and USA here. Note that this is the part that would be easy to automate!;
dir ^^;
dir GERMANY;
dir;
copy \work.mydoc\Means#1\ByGroup2#1\Summary#1 to ^;
copy \work.mydoc\SGPlot#1\ByGroup5#1\SGPlot#1 to ^;
dir ^^;
dir USA;
dir;
copy \work.mydoc\Means#1\ByGroup3#1\Summary#1 to ^;
copy \work.mydoc\SGPlot#1\ByGroup6#1\SGPlot#1 to ^;
run;
quit; *this is one of those run group procedures, need a quit;
Now, you only have to replay the document to get it out the right way.
proc document name=mydoc_new;
replay;
run;
quit;
Tada, you have what you want.
If you're going to run the procs once per by value, that's pretty easy. Create a macro to run just one instance, then use proc sql to create a call for each instance. That is entirely dynamic, and could be easily adjusted to allow for other options such as multiple by variables, levels, etc.
Given a single by value:
*Macro that runs it once;
%macro run_reports(project_name=);
title "Report for &project_name.";
proc boxplot data=study;
plot Lead_Time*Study_ID/ horizontal;
where Project_Name="&project_name.";
format Lead_Time dum.;
run;
proc report data=study nowd;
column ID Title Contact Status Message Audience Priority;
where Project_Name="&project_name.";
run;
%mend run_Reports;
*SQL pull to create a list of macro calls;
proc sql;
select distinct cats('%run_Reports(project_name=',project_name,')')
into :runlist separated by ' '
from study;
quit;
&runlist.;
Turn options symbolgen; on to see what the runlist looks like, or look at your output window (or results window in 9.3+). When you're running this in production, add noprint to proc sql to avoid generating that table.
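A production-style variant of the SQL step might look like the sketch below. It assumes the study dataset and the run_reports macro defined above. The %put line is an addition for inspection only.

```sas
/* Build the list of macro calls without printing the result table */
proc sql noprint;
  select distinct cats('%run_Reports(project_name=', project_name, ')')
    into :runlist separated by ' '
    from study;
quit;

/* Write the generated calls to the log without executing them. */
/* %superq masks the macro triggers stored in runlist.          */
%put %superq(runlist);

/* Execute the generated calls */
&runlist.
```

If study contains no rows, the INTO clause does not create runlist, so a real job would want to check for that case before resolving the macro variable.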

Seasonal Sub Series Plot in SAS

I'm studying a seasonal monthly time series and I would like to draw a Seasonal Sub Series Plot or a Box Plot by Month in SAS.
Something like the ones at the bottom of the page at the following link:
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc443.htm
I'm not sure how exactly to get this done in SAS. I appreciate any help.
SE
Can't speak to the seasonality, but box plots are pretty simple. Assuming you create a SAS dataset from the data in that web site, try this:
proc format ;
value mn_name 1='January'
2='February'
3='March'
4='April'
5='May'
6='June'
7='July'
8='August'
9='September'
10='October'
11='November'
12='December'
other='Invalid';
run;
proc sort data=have;
by month;
run;
proc boxplot data=have;
plot Oscillation*month;
format month mn_name.;
run;