SAS PROC PLOT unnecessarily huge chart

This is the code for a homework question in my Multivariate Analysis class:
options nodate nonumber;
TITLE 'Problem #1 ';
DATA IRIS;
INFILE '~\iris.txt';
INPUT SL SW PW PL Species;
PROC GLM;
CLASS Species;
MODEL SL SW PW PL = Species;
MANOVA H=SPECIES/PRINTE PRINTH;
RUN;
proc candisc data=IRIS out=ftstat;
CLASS Species;
VAR SL SW PW PL;
TITLE 'Discriminant Analysis for Problem #1';
RUN;
goptions reset=all;
PROC PLOT DATA=ftstat uniform;
PLOT CAN2*CAN1=Species;
RUN;quit;
PROC PLOT is currently generating a huge chart, maybe 5-10 pages tall, with ridiculous vertical scaling (something like .05 = an inch or more of computer screen). It's too big to put in a Word document to hand in, and it's not informative as is.
Why is SAS doing this, and can I fix it? I'd love it to be scaled down to 5" x 5" or something like that. Can I do this? (I have a working knowledge of SAS, but I'm far from skilled at it.)

Try using ods graphics and PROC SGPLOT.
ods graphics on / width=5in height=5in;
PROC SGPLOT DATA=ftstat;
SCATTER x=can1 y=can2 / group=species;
run;
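If you need to stay with PROC PLOT, here is a minimal sketch. PROC PLOT is a line-printer procedure, so the goptions statement in the question has no effect on it. The VPOS= and HPOS= options on the PLOT statement set the axis lengths in print positions. The values 30 and 60 are arbitrary assumptions, not something taken from the question, and I can't tell from the post alone what is making the original plot so tall.

```sas
/* Sketch only: 30 rows by 60 columns are illustrative values */
PROC PLOT DATA=ftstat;
PLOT CAN2*CAN1=Species / VPOS=30 HPOS=60;
RUN;
QUIT;
```

The UNIFORM option is dropped here because it only matters when a BY statement is present.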

Related

Graphing in SAS

New to SAS so please bear with me. :) I'm trying to graph an output table with three y variables and one x. I've tried gplot and plot, but I'm still getting iffy results, and I can't figure out how to make them all show in one graph either :( I think my table is too large to paste here, so I uploaded it to office.com; hopefully someone smarter than I am can figure this out.
https://1drv.ms/u/s!AnxXzVHJV4pKghj1MoJoWOQxzYTd?e=cJ1J5y
Use three SERIES statements in SGPLOT
Example:
data have;
do x = -10 to 10 by .1;
y1 = x**2 / 10;
y2 = 4 * sin ( x / 5 );
y3 = x;
output;
end;
run;
ods html file='plot.html' style=plateau;
proc sgplot data=have;
series x=x y=y1;
series x=x y=y2;
series x=x y=y3;
run;
ods html close;
The old school Proc GPLOT would use the PLOT / OVERLAY option:
goptions reset=all;
symbol value=none interpol=join;
proc gplot data=have;
plot (y1-y3) * x / overlay; /* ( list of y-variables ) * x-variable */
run;
proc sgplot might be what you are looking for.
https://support.sas.com/resources/papers/proceedings10/154-2010.pdf
It is not clear from the link or question what exactly you are looking for but if I had to guess I would say this would certainly help.

Only output the ROC curve in SAS

I am looking to create a PDF with 4 nice graphs for different analyses. My question is, how do I output only the ROC curve for my logistic regression?
I use the following code
TITLE2 JUSTIFY=CENTER "Rank ordering characteristic curve (ROC)";
ODS GRAPHICS ON;
PROC LOGISTIC
DATA = input
plots(only)=(roc(id=obs))
;
MODEL y
(Event = '1')= x
/
SELECTION=NONE
LINK=LOGIT;
RUN;
QUIT;
ODS GRAPHICS OFF;
and a dummy dataset can be imagined using this
DATA HAVE;
DO I = 1 TO 100;
Y = RAND('integer',0,1);
x = ranuni(i);
output;
end;
run;
Thanks
EDIT: just to be explicit, I'm looking to output just a plot of the ROC curve and nothing else, i.e. without the tables containing Somers' D etc.
ODS SELECT ROCCURVE;
ODS SELECT allows you to control the output and include only the tables and graphs you want.
To find out the name of an output object, you can wrap your code in ODS TRACE ON and ODS TRACE OFF, or check the documentation.
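Put together with the dummy data set from the question, this could look roughly like the sketch below. The file name roc.pdf is a hypothetical placeholder, and the ODS SELECT statement goes immediately before the PROC LOGISTIC step because the selection only applies to the next step.

```sas
ods graphics on;
ods pdf file='roc.pdf';   /* hypothetical output file name */
ods select ROCCurve;      /* keep only the ROC plot from the next step */
proc logistic data=have plots(only)=roc;
model y(event='1') = x;
run;
ods pdf close;
ods graphics off;
```

If the object name differs in your release, running the step once between ODS TRACE ON and ODS TRACE OFF writes the actual names to the log.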

How to do sub plot using sas

I want to make a simple time series line plot without highlighting any dots on the line. I can plot var1 and var2 using the following code.
title "Title";
proc gplot data=test;
plot var1 *var2 /overlay grid hminor=0 ;
run;
quit;
However, I want to add another variable to the plot. I tried the following code. Because the scales of var1 and var3 are very different, var3 is not properly scaled in the graph. Can anyone show me how to use a different scale for var1 and var3, please?
title "Title";
proc gplot data=Test;
plot var1 *var2 Var3*var2 /overlay grid hminor=0 ;
run;
quit;
Additionally, can SAS do subplots the way MATLAB does? Essentially, I want one big graph containing two separate sub-graphs. If possible, please show me how to achieve this. I tried vpercent = 50, but it seems there is something wrong in my code.
proc gplot data=Test vpercent=50;
plot VAR1 *VAR2 VAR3*VAR2 /overlay grid hminor=0 ;
run;
quit;
With Thanks
Assuming I understand what you mean, if you have access to SGPLOT you can specify that var3 should be on a different axis. Here's an example with the SASHELP.STOCKS data which plots the open price on one Y axis and the trade volume on the second Y axis.
proc sgplot data=sashelp.stocks;
where stock='IBM';
series x=date y=open;
series x=date y=volume/y2axis;
run;quit;
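If only GPLOT is available, a rough equivalent is the PLOT2 statement, which draws its variable against a second vertical axis on the right. This is a sketch on the same SASHELP.STOCKS data, not the asker's data set; the colours are arbitrary, and interpol=join with value=none gives a plain line with no markers.

```sas
goptions reset=all;
symbol1 interpol=join value=none color=blue;
symbol2 interpol=join value=none color=red;
proc gplot data=sashelp.stocks;
where stock='IBM';
plot open*date / grid hminor=0;   /* left vertical axis */
plot2 volume*date;                /* right vertical axis */
run;
quit;
```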
Here is some SAS code that builds on Reeza's excellent example and suggestion to use SGPANEL. See the PANELBY statement and the options used there.
*** SUBSET DATA AND SORT ***;
proc sort data=sashelp.stocks out=ibm;
where stock='IBM';
by date;
run;
*** TRANSPOSE DATA FROM "SHORT-AND-WIDE" TO "LONG-AND-THIN" ***;
proc transpose data=ibm out=ibm_t;
by date;
var open volume;
run;
proc sgpanel data=ibm_t;
*** ROW LATTICE OPTION STACKS PLOTS ***;
*** UNISCALE=COLUMN LETS EACH PANEL ROW HAVE ITS OWN Y-AXIS SCALE ***;
*** NOVARNAME DROPS THE VARIABLE NAME FROM THE ROW HEADERS ON THE RIGHT SIDE ***;
panelby _name_ / layout=rowlattice uniscale=column novarname;
series x=date y=col1;
*** SUPPRESS LABEL FOR THE Y-AXIS ON THE LEFT SIDE ***;
rowaxis display=(nolabel);
run;

drawing histogram and boxplot in SAS

I wrote the following code in SAS, but I did not get the result I expected.
The histogram comes out grey, and the range of the data is not what I specified. What is the problem?
I also got the following warning: WARNING: The MIDPOINTS= list was extended to accommodate the data
And what about the colour?
axis1 order=(0 to 100000 by 50000);
axis2 order=(0 to 100 by 5);
run;
proc capability data=HW2 noprint;
histogram Mvisits/midpoints=0 to 98000 by 10000
haxis=axis1
cfill=blue;
run;
.......................................
I have the same problem with the boxplot. For example, I got the following plot and I want to change the distances so that I can see the plot better, but I could not.
The below is for proc univariate rather than proc capability, I do not have access to SAS/QC to test, but the user guide shows very similar syntax for the histogram statements. Hopefully, you'll be able to translate it back.
It looks like you are having problems with the colour because of your output system. Your graphs are probably delivered via ODS, in which case the cfill option does not apply (see here and note the Traditional Graphics tag).
To change the colour of the histogram bars in ODS output you can use proc template:
proc template;
define style styles.testStyle;
parent = styles.htmlblue;
style GraphDataDefault /
color = green;
end;
run;
ods listing style = styles.testStyle;
proc univariate data = sashelp.cars;
histogram mpg_city;
run;
An example explaining this can be found here.
Alternatively you can use proc sgplot to create a histogram with more control of the colour as follows:
proc sgplot data = sashelp.cars;
histogram mpg_city / fillattrs = (color = red);
run;
As to your question about truncating the histogram: it doesn't make a great deal of sense to ignore the extreme values, as that will give you an erroneous image of the distribution, which somewhat defeats the purpose of the histogram. That said, you can achieve what you are asking for with a bit of a hack:
data tempData;
set sashelp.cars;
tempClass = 1;
run;
proc univariate data = tempData noprint;
class tempClass;
histogram mpg_city / maxnbin = 5 endpoints = 0 to 25 by 5;
run;
In the above, a dummy class variable tempClass is created and comparative histograms are then requested using the class statement. The maxnbin option limits the number of bins displayed, but only in a comparative histogram.
Your other option is to exclude (or cap) your extreme points before creating the histogram, but this will lead to slightly erroneous frequency counts/percentages/bar heights.
data tempData;
set sashelp.cars;
mpg_city = min(mpg_city, 20);
run;
proc univariate data = tempData noprint;
histogram mpg_city / endpoints = 0 to 25 by 5;
run;
This is a possible approach to the original question (untested, as I have neither SAS/QC nor the data):
proc capability data = HW2 noprint;
histogram Mvisits /
midpoints = 0 to 300000 by 10000
noplot
outhistogram = histData;
run;
proc sgplot data = histData;
vbar _MIDPT_ /
response = _OBSPCT_
fillattrs = (color = blue);
where _MIDPT_ <= 100000;
run;

How to create by group in proc gplot

I want to create multiple plots by category. Currently my code is as follows:
proc gplot data=data;
plot (a b)*week
*by category;
/vaxis=axis3 haxis=axis3 legend=legend1 overlay skipmiss;
title font='HELVETICA' height=1.2 "Volumes";
run;
but this includes all the categories. How do I create distinct charts for different categories? Also the chart here is a scatter plot. How do I create a line chart?
A fellow SAS 9.1.x user? Assuming that you require a gplot-based example:
proc summary data = sashelp.class nway;
var height;
class sex age;
output out = class mean=;
run;
symbol1 interpol = join;
proc gplot data = class;
by sex;
plot height * age;
run;
quit;
Here proc summary conveniently produces a sorted output dataset without any duplicate y-values, allowing gplot to produce a pair of reasonable line charts via the by statement. I'm sure there are much nicer-looking alternatives via proc sgplot if you have a more recent version of SAS, but some of us have to make do with gplot.
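Applied to the variables named in the question, the same idea might look like the sketch below. It assumes the data set really is called data with variables a, b, week and category; sorted is a hypothetical name for the sorted copy, and the asker's axis3 and legend1 definitions are left out because they are not shown. The BY statement needs the input sorted by category, and interpol=join turns the scatter into lines.

```sas
/* Assumption: data, a, b, week, category come from the question */
proc sort data=data out=sorted;   /* sorted is a hypothetical name */
by category week;
run;
symbol1 interpol=join value=none color=blue;
symbol2 interpol=join value=none color=red;
proc gplot data=sorted;
by category;                      /* one chart per category */
plot (a b)*week / overlay skipmiss;
run;
quit;
```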