SAS: Box-and-Whisker Plots using multiple datasets

My objective is to create a box-and-whisker plot using data from multiple datasets. Important: the datasets are not all the same size, and I am not sure whether this is an issue. I'm trying the following code:
%macro plot;
%do i=1 %to 10;
ods graphics on;
title 'Box Plot for Durations';
proc boxplot data=d&i; /*where d&i refers to my datasets*/
plot durations / /* HERE I am also having difficulty: PLOT expects y*x (durations*x), but I only have the y variable (durations) that I want to box plot. My x should correspond to the dataset each value comes from. */
boxstyle = schematic
nohlabel;
label durations = 'Durations';
run;
%end;
%mend plot;
%plot;
I want my x values to identify the dataset each duration value comes from. d1, d2, d3, ..., d10 are ten different datasets corresponding to 10 different firms. I therefore want 10 box plots in one graph. Any insights?

I figured that the best approach was simply to take all the data I wish to plot from my datasets and combine it into one dataset. I created a unique id for each dataset before combining the data. Then it's easy to box plot the data:
title 'Box Plot for Durations';
proc boxplot data=ALL_DATA;
plot durations*id /
boxstyle = schematic
nohlabel;
label durations = 'Durations';
run;
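One way to build the combined dataset, as a minimal sketch. It assumes the ten datasets are named d1 to d10 in the WORK library and each contains a durations variable, as in the question; the names all_data, src, and id are only illustrative.

/* Stack the ten firm datasets. INDSNAME= stores the name of the
   dataset each row came from (e.g. WORK.D1) in the temporary variable src. */
data all_data;
    length id $32;
    set d1-d10 indsname=src;
    id = scan(src, 2, '.');   /* keep only the member name, e.g. D1 */
run;

title 'Box Plot for Durations';
proc boxplot data=all_data;
    /* one box per value of id, i.e. one box per source dataset */
    plot durations*id / boxstyle=schematic nohlabel;
    label durations = 'Durations';
run;

Stacking keeps the rows of each dataset adjacent, which is what PROC BOXPLOT needs to treat them as one group, so the differing dataset sizes should not matter. If you reorder the data afterwards, sort it by id before plotting.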

Related

proc transreg not outputting curve fit plot

I am using proc transreg to test different transformations in the sashelp.baseball dataset. I request all plots and sometimes I can see a curve fit graph and sometimes I can't. Is there something I am missing if I want to output the regression fit with the code below?
DATA BASEBALL;
SET SASHELP.BASEBALL;
RUN;
ODS GRAPHICS ON;
ODS OUTPUT
NObs = num_obs
FitStatistics = fitstat
Coef = params
;
PROC TRANSREG
DATA=BASEBALL
PLOTS=ALL
SOLVE
SS2
PREDICTED;
;
MODEL_1:
MODEL POWER(logsalary/parameter=1) = log(nruns);
OUTPUT OUT = fitted_model;
RUN;
For clarity, the regression fit plot is a scatter plot with the estimated regression line fitted through it.
The fit plot is generated only when the dependent variable is not transformed. To get this graph, you can create the transformed variable ahead of time and then model it with identity().
From the documentation:
ODS Graph Name: FitPlot
Plot Description: Simple Regression and Separate Group Regressions
Statement and Option: MODEL, a dependent variable that is not transformed, one non-CLASS independent variable, and at most one CLASS variable
This code works for me:
PROC TRANSREG
DATA=sashelp.BASEBALL
PLOTS=ALL
SOLVE
SS2
PREDICTED;
;
MODEL_1:
MODEL identity(logsalary) = log(nruns);
OUTPUT OUT = fitted_model;
RUN;
And generates the desired graph.
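If you do need a transformed dependent variable, a minimal sketch of creating it ahead of time could look like the following. The square-root transformation and the names baseball_t and sqrt_salary are hypothetical, chosen only for illustration.

ods graphics on;

/* Apply the dependent-variable transformation in a DATA step first.
   sqrt_salary is a hypothetical example variable. */
data baseball_t;
    set sashelp.baseball;
    sqrt_salary = sqrt(salary);
run;

/* The dependent variable now enters the model untransformed (identity),
   which is the condition the documentation lists for the FitPlot. */
proc transreg data=baseball_t plots=all ss2;
    model identity(sqrt_salary) = log(nruns);
run;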

How to do sub plot using sas

I want to make a simple time series line plot without any markers on the line. I can plot var1 against var2 using the following code.
title "Title";
proc gplot data=test;
plot var1 *var2 /overlay grid hminor=0 ;
run;
quit;
However, I want to add another variable to the plot. I tried the following code, but because the scales of var1 and var3 differ greatly, var3 is not properly scaled in the graph. Can anyone show me how to use different scales for var1 and var3?
title "Title";
proc gplot data=Test;
plot var1 *var2 Var3*var2 /overlay grid hminor=0 ;
run;
quit;
Additionally, can SAS do subplots as MATLAB does? Essentially, I want one big graph containing two separate sub-graphs. If possible, please show me how to achieve this. I tried vpercent=50, but it seems there is something wrong in my code.
proc gplot data=Test vpercent=50;
plot VAR1 *VAR2 VAR3*VAR2 /overlay grid hminor=0 ;
run;
quit;
With Thanks
Assuming I understand what you mean, if you have access to SGPLOT you can specify that var3 should be on a different axis. Here's an example with the SASHELP.STOCKS data which plots the open price on one Y axis and the trade volume on the second Y axis.
proc sgplot data=sashelp.stocks;
where stock='IBM';
series x=date y=open;
series x=date y=volume/y2axis;
run;
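If you need to stay with gplot, its PLOT2 statement draws a second plot against a right-hand vertical axis. Below is a minimal sketch using the same SASHELP.STOCKS data. The symbol settings are illustrative, and I am assuming the SYMBOL definitions are picked up in order (symbol1 for the PLOT line, symbol2 for the PLOT2 line).

/* Joined lines without markers; the colours are arbitrary choices */
symbol1 interpol=join value=none color=blue;
symbol2 interpol=join value=none color=red;

proc gplot data=sashelp.stocks;
    where stock='IBM';
    plot  open*date / grid hminor=0;   /* left vertical axis  */
    plot2 volume*date;                 /* right vertical axis */
run;
quit;

For the question's data this would correspond to plot var1*var2 and plot2 var3*var2.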
Here is some SAS code that builds on Reeza's excellent example and suggestion to use SGPANEL. See the PANELBY statement and the options used there.
*** SUBSET DATA AND SORT ***;
proc sort data=sashelp.stocks out=ibm;
where stock='IBM';
by date;
run;
*** TRANSPOSE DATA FROM "SHORT-AND-WIDE" TO "LONG-AND-THIN" ***;
proc transpose data=ibm out=ibm_t;
by date;
var open volume;
run;
proc sgpanel data=ibm_t;
*** ROW LATTICE OPTION STACKS PLOTS ***;
*** UNISCALE=COLUMN SHARES ONLY THE COLUMN (X) AXIS SO EACH PANEL GETS ITS OWN Y SCALE ***;
*** NOVARNAME SUPPRESSES LABEL FOR THE Y-AXIS ON THE RIGHT SIDE ***;
panelby _name_ / layout=rowlattice uniscale=column novarname;
series x=date y=col1;
*** SUPPRESS LABEL FOR THE Y-AXIS ON THE LEFT SIDE ***;
rowaxis display=(nolabel);
run;

SAS-Dynamically Writing Bar Charts to Excel workbook in SAS Macro

I am trying to dynamically generate bar charts and export them to an Excel workbook. My macro pulls certain distinct identifier codes, creates two summary tables (prov_&x and prov_revcd_&x), and writes them to a single Excel sheet for each respective ID. However, I have been unable to generate bar charts for the data and export them to Excel. Below is a condensed version of the code; I have removed the steps that create the prov_&x and prov_revcd_&x tables to keep the post as concise as possible. I have tried using the GOUT= and NAME= options and then explicitly referencing the results, but that does not seem to work. Any suggestions are welcome. I understand my macro code is a little sloppy, but it generates the tables, so I will clean it up once I can get the bar charts to generate.
Also, I can see in my results viewer that the graphs are being generated, so I'm assuming the problem is in how I am trying to reference them in the workbook. Thanks!
%macro runtab(x);
/*Create summary chart for generating graph of codes billed per month*/
proc sql;
CREATE TABLE summary_&x AS
select DISTINCT month, COUNT (CH_ICN) AS ICN_Count, CLI_Revenue_Cd_Category_Cd
FROM corf_data1_sorted
WHERE BP_Billing_Prov_Num_OSCAR=&x
group by month ,CLI_Revenue_Cd_Category_Cd;
quit;
/*Create a graph of Services Per Month and group by the Revenue Code*/
proc sgplot data=summary_&x NAME= 'graph_&x';
title 'Provider Revenue Analysis';
vbar month / response=ICN_count group=CLI_Revenue_Cd_Category_Cd stat=sum
datalabel datalabelattrs=(weight=bold);
yaxis grid label='Month';
run;
%mend runtab;
/*Create a macro variable of all the codes */
proc sql noprint;
select BP_Billing_Prov_Num_OSCAR
into :varlist separated by ' ' /*Each code in the list is sep. by a single space*/
from provider;
quit;
%let cntlist = &sqlobs; /*Store a count of the number of oscar codes*/
%put &varlist; /*Print the codes to the log to be sure our list is accurate*/
/*write a macro to generate the output tables*/
%macro output(x);
ods tagsets.excelxp options(sheet_interval='none' sheet_name="&x");
proc print data=prov_&x;
run;
proc print data=prov_revcd_&x;
run;
proc print data=graph_&x;
run;
%mend;
/*Run a loop for each oscar code. Each code will enter the document generation loop*/
%macro loopit(mylist);
%let else=;
%let n = %sysfunc(countw(&mylist)); /*let n=number of codes in the list*/
data
%do I=0 %to &n;
%let val = %scan(&mylist,&I); /*Let val= the ith code in the list*/
%end;
%do j=0 %to &n;
%let val = %scan(&mylist,&j); /*Let val= the jth code in the list*/
/*Run the macro loop to generate the required tables*/
%runtab(&val);
%output(&val);
%end;
run;
%mend;
/*Run the macro loop over the list of significant procedure code values*/
ods tagsets.excelxp file="W:\user\test_wkbk.xml";
%loopit(&varlist)
ods tagsets.excelxp close;
You can't export charts with ODS TAGSETS.EXCELXP, unfortunately.
You have a few options if you need to export charts.
Use ODS Excel, available in the more recent maintenance releases of SAS 9.4. See Chris H's blog post for more information on that. It is fairly similar to Tagsets.ExcelXP, but not identical. It generates a real Excel file (.xlsx).
Create an HTML file that Excel can read, using TAGSETS.MSOFFICE2K or regular HTML. Chevell Parker, a SAS tech support analyst, has a few papers like this one on the different options.
Use DDE to write your image to the Excel file. Not a preferred option, but included for completeness.
There is also a new proc, proc mschart, that is enabled in SAS 9.4 TS1M3 (due out in a month or two) and that will generate Excel charts in ODS EXCEL (i.e., not an image, but an instruction telling Excel to make a chart here).
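A minimal sketch of the ODS Excel route, assuming a SAS 9.4 maintenance release that includes it. The output path is a placeholder and SASHELP.CLASS stands in for the provider tables; neither comes from the question.

/* Hypothetical output path; point it at a folder you can write to */
ods excel file="W:\user\test_wkbk.xlsx"
          options(sheet_interval='none' sheet_name='Example');

/* A table and a chart written to the same sheet */
proc print data=sashelp.class(obs=5);
run;

proc sgplot data=sashelp.class;
    vbar age / response=height stat=mean datalabel;
run;

ods excel close;

The chart lands in the workbook as an image. Inside the macro loop, the same pattern would apply with one ODS EXCEL OPTIONS(SHEET_NAME=...) statement per provider code.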

drawing histogram and boxplot in SAS

I wrote the following code in SAS, but I did not get the result I expected.
The resulting histogram is grey and the range of the data is not what I specified. What is the problem?
I also got the following warning: WARNING: The MIDPOINTS= list was extended to accommodate the data
What about the color?
axis1 order=(0 to 100000 by 50000);
axis2 order=(0 to 100 by 5);
run;
proc capability data=HW2 noprint;
histogram Mvisits/midpoints=0 to 98000 by 10000
haxis=axis1
cfill=blue;
run;
.......................................
I have the same problem with the box plot. For example, I got the following plot and I want to change the spacing so that I can see the plot better, but I could not.
The code below is for proc univariate rather than proc capability. I do not have access to SAS/QC to test, but the user guide shows very similar syntax for the histogram statements, so hopefully you'll be able to translate it back.
It looks like your problem with the colour is due to your output system. Your graphs are probably delivered via ODS Graphics, in which case the cfill option does not apply (see here, and note the Traditional Graphics tag).
To change the colour of the histogram bars in ODS output you can use proc template:
proc template;
define style styles.testStyle;
parent = styles.htmlblue;
style GraphDataDefault /
color = green;
end;
run;
ods listing style = styles.testStyle;
proc univariate data = sashelp.cars;
histogram mpg_city;
run;
An example explaining this can be found here.
Alternatively you can use proc sgplot to create a histogram with more control of the colour as follows:
proc sgplot data = sashelp.cars;
histogram mpg_city / fillattrs = (color = red);
run;
As to your question about truncating the histogram: it doesn't really make sense to ignore the extreme values, as doing so gives a misleading picture of the distribution, which somewhat defeats the purpose of the histogram. That said, you can achieve what you are asking for with a bit of a hack:
data tempData;
set sashelp.cars;
tempClass = 1;
run;
proc univariate data = tempData noprint;
class tempClass;
histogram mpg_city / maxnbin = 5 endpoints = 0 to 25 by 5;
run;
In the above, a dummy class variable tempClass is created and comparative histograms are then requested using the class statement. The maxnbin option limits the number of bins displayed, but only in a comparative histogram.
Your other option is to exclude (or cap) your extreme points before creating the histogram, but this will lead to slightly erroneous frequency counts/percentages/bar heights.
data tempData;
set sashelp.cars;
mpg_city = min(mpg_city, 20);
run;
proc univariate data = tempData noprint;
histogram mpg_city / endpoints = 0 to 25 by 5;
run;
This is a possible approach to the original question (untested, as I have neither SAS/QC nor the data):
proc capability data = HW2 noprint;
histogram Mvisits /
midpoints = 0 to 300000 by 10000
noplot
outhistogram = histData;
run;
proc sgplot data = histData;
vbar _MIDPT_ /
response = _OBSPCT_
fillattrs = (color = blue);
where _MIDPT_ <= 100000;
run;

How to create by group in proc gplot

I want to create multiple plots by category. Currently my code is as follows:
proc gplot data=data;
plot (a b)*week
*by category;
/vaxis=axis3 haxis=axis3 legend=legend1 overlay skipmiss;
title font='HELVETICA' height=1.2 "Volumes";
run;
But this includes all the categories in one chart. How do I create separate charts for the different categories? Also, the chart here is a scatter plot. How do I create a line chart?
A fellow SAS 9.1.x user? Assuming that you require a gplot-based example:
proc summary data = sashelp.class nway;
var height;
class sex age;
output out = class mean=;
run;
symbol1 interpol = join;
proc gplot data = class;
by sex;
plot height * age;
run;
quit;
Here proc summary conveniently produces a sorted output dataset without any duplicate y-values, allowing gplot to produce a pair of reasonable line charts via the by statement. I'm sure there are much nicer-looking alternatives via proc sgplot if you have a more recent version of SAS, but some of us have to make do with gplot.
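For those who do have a more recent release (SGPLOT, SAS 9.2 or later), a minimal sketch of the same idea with proc sgplot follows. The dataset name class_means is illustrative.

/* Mean height per sex and age. The NWAY output comes back sorted by
   the class variables, so it is ready for BY processing. */
proc summary data = sashelp.class nway;
    var height;
    class sex age;
    output out = class_means mean=;
run;

/* One line chart per BY group */
proc sgplot data = class_means;
    by sex;
    series x = age y = height;
run;

The series statement draws a joined line by default, so no symbol statement is needed.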