How to customize a boxplot in SAS?

The data I am looking at consists of different genes (categorical), the count for each gene (continuous), and the outcome of having cancer (dichotomous: yes or no).
I am trying to create a boxplot of my data in SAS, and I want all the genes on the same boxplot instead of on individual plots. The y axis should be the count, and the x axis should show all of my genes, each split into two groups: cancer or no cancer. How would I do this? I tried Proc Boxplot but couldn't figure out how to add multiple variables to it. Below is the code I tried.
ods graphics on;
ods html;
proc boxplot data=genes2;
plot A6_A8*cancer_status;
by cancer_status;
run;
ods html close;
ods graphics off;
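One possible approach, offered as a sketch rather than a definitive answer: reshape the data so that each row holds one gene name and one count, then let PROC SGPLOT draw grouped boxes with VBOX. The data set below is simulated so the example runs on its own. Only A6_A8, cancer_status and genes2 come from the question; the gene names B2_B4 and C1_C3, the id variable, and the output names gene and count are made up for illustration.

```sas
/* Simulated stand-in for genes2: one row per subject, one column per gene. */
/* B2_B4 and C1_C3 are hypothetical gene names.                             */
data genes2;
  length cancer_status $3;
  call streaminit(2024);
  do id = 1 to 40;
    cancer_status = ifc(mod(id, 2), 'Yes', 'No');
    A6_A8 = rand('poisson', 5);
    B2_B4 = rand('poisson', 8);
    C1_C3 = rand('poisson', 3);
    output;
  end;
run;

/* Wide to long: one row per subject and gene. */
data genes_long;
  set genes2;
  length gene $32;
  array g {*} A6_A8 B2_B4 C1_C3;
  do i = 1 to dim(g);
    gene  = vname(g{i});   /* name of the gene column */
    count = g{i};          /* value of that column    */
    output;
  end;
  keep gene count cancer_status;
run;

/* One plot: genes on the x axis, a pair of boxes per gene by cancer status. */
proc sgplot data=genes_long;
  vbox count / category=gene group=cancer_status;
run;
```

With real data, only the array statement needs to list the actual gene columns.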

Related

Can you specify axis ranges on proc kde plots?

Is it possible to specify axis ranges in the kde procedure itself, rather than in a secondary sgplot? For example, in the following code:
proc kde data=have; bivar y / plots=contour; by z; run;
The plots I get from proc kde are gorgeous, but the axes aren't on the scale I need. I can fix the axes easily in a scatter plot using the data output by proc kde, but that plot is hideous. So my secondary question is: how do I make the sgplot look like the proc kde plot?
From proc kde:
From sgplot:
PROC KDE has a GRIDL= option to set the lower grid limit and a GRIDU= option to set the upper grid limit.
You can use them like this:
data bivnormal;
seed = 1283470;
do i = 1 to 1000;
z1 = rannor(seed);
z2 = rannor(seed);
x = 3*z1+z2;
output;
end;
drop seed;
run;
ods graphics on;
proc kde data=bivnormal;
univar x /gridl=-14 gridu=14 plots=(histdensity);
run;
ods graphics off;
If you want to do more with this graph, write the graph's data to a data set with ODS OUTPUT so that you can plot it with proc sgplot later.
ods output HistogramDensity=HistogramDensity;
ods graphics on;
proc kde data=bivnormal;
univar x /gridl=-14 gridu=14 plots=(histdensity);
run;
ods graphics off;
ods output close;
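Since the question uses the BIVAR statement, here is a sketch of the same idea for a bivariate estimate. It assumes, based on my reading of the PROC KDE documentation, that GRIDL= and GRIDU= on BIVAR accept up to two numbers, one per variable; please check this against your release. The data set bivnormal2 and the variable y are made up for the example.

```sas
/* Simulated bivariate data; bivnormal2 and y are illustrative names. */
data bivnormal2;
  call streaminit(1283470);
  do i = 1 to 1000;
    z1 = rand('normal');
    z2 = rand('normal');
    x = 3*z1 + z2;
    y = 3*z1 + 2*z2;
    output;
  end;
  keep x y;
run;

ods graphics on;
proc kde data=bivnormal2;
  /* First number applies to x, second to y (assumed list form). */
  bivar x y / gridl=-14 -14 gridu=14 14 plots=contour;
run;
ods graphics off;
```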

How to insert the #byval value in the png filename when using sgplot with a by statement in SAS

I'm making several png images using sgplot and the by statement like this:
ods html path="&graphPath" body="index.html"
image_dpi=300 style=sciensano1 device=png;
ods graphics on / reset noborder imagename="boom"
height=10cm width=16cm ;
title;footnote;
options byline;
proc sgplot data=sashelp.class;
histogram height;
by Sex;
run;quit;
ods html close;
This creates 1 html file (index.html)
and 2 png files (boom1.png & boom3.png)
I'm wondering if it's possible to name the png files according to the by values,
similar to adding #byval in the title, for example.
Currently, numbers are added automatically to the imagename from the ods graphics statement. I would like to get rid of the numbers and use the byval instead.
The only solution I have so far is to make all graphs individually, using a %do loop in a macro, so that I can parametrize the imagename with a macro variable. The problem is that this is much more complex to implement and much slower.
EDIT: Using SAS 9.3
In my SAS 9.4M4, there is no such feature in SGPLOT HISTOGRAM options, nor in ODS GRAPHICS IMAGENAME or INDEX.
Ideally, a future release would see ODS honor #BYVAL and #BYVAR substitution options.
ods graphics / imagename="boom#byval1"; * not real;
ods graphics / imagename="boom#byval(sex)"; * not real;
or
ods graphics / imagename="boom" reset=index(#byval1); * not real;
Fallback:
The GCHART procedure statements, such as VBAR, support the name= option, which honors the #BYVAL substitution option.
vbar height / name="basename#byval1"; * creates gfx file whose name contains the by var value;
It looks like this feature was added in SAS 9.4M5; see the article "Keeping Up To Date With ODS Graphics".
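For releases without that feature, the %do-loop workaround mentioned in the question can be sketched as follows. This is a minimal sketch, not the asker's actual macro: the macro name hist_by_sex and the macro variables sexvals, i and val are made up, and the code assumes the BY values contain no spaces or characters that are invalid in file names.

```sas
%macro hist_by_sex;
  %local sexvals i val;

  /* Collect the distinct BY values into a space-separated list. */
  proc sql noprint;
    select distinct sex into :sexvals separated by ' '
    from sashelp.class;
  quit;

  %do i = 1 %to %sysfunc(countw(&sexvals));
    %let val = %scan(&sexvals, &i);

    /* reset=index restarts the numeric suffix, so the file is named boom_<value>. */
    ods graphics / reset=index imagename="boom_&val";

    proc sgplot data=sashelp.class;
      where sex = "&val";
      histogram height;
    run;
  %end;
%mend hist_by_sex;

%hist_by_sex
```

Call the macro between the same ods html and ods html close statements used in the question so that the images land in the intended folder.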

How to have multiple graphs generated by one gplot procedure output to a single PDF file?

The following GPLOT procedure generates many graphs (it gives sales by product). Say my products are 'Sofa', 'Bed' and 'Chairs'; it will produce 3 graphs: one for sofas, one for chairs and one for beds.
I'd like to have all three graphs output to one single PDF file. I tried the following, but it only keeps the last graph generated. Any ideas how I can do this?
ODS PDF FILE= 'OUTPUT.PDF';
PROC GPLOT data = AB.TEMP;
plot sales*Months=Product;
by Region;
run;
ODS PDF CLOSE;
Thanks!
Sandwich your code between ODS PDF and ODS PDF CLOSE statements.
ODS PDF FILE='my_file.pdf' style=meadow;
PROC GPLOT data = AB.TEMP;
plot sales*Months=Product;
by Region;
run;
ODS PDF CLOSE;
Does the following example work for you? If so, then something is wrong in your code; in that case, post your code and log.
proc sort data=sashelp.cars out=cars;
by origin;
run;
ods pdf file="C:\_localdata\temp.pdf" style=meadow;
proc gplot data=cars;
plot mpg_city*msrp=make;
by origin;
run;
ods pdf close;

Output to new data set

Suppose that I have a model and I want to output the studentized residuals, leverages, Cook's Distance and the DFFITS statistics from my regression model to a new data set. How would I do this?
Treating this as the general question of how to get a particular piece of output from a procedure into a data set: look at ODS TRACE.
ods trace on;
proc reg <stuff>;
<stuff>;
run;
ods trace off;
Now, look at the log, and see what different output options you have. All of the different things that go to the screen will be here, plus additional tables sometimes that don't go to any output window by default. Find the name for the tables you want data from, and then direct them to an ods output statement.
ods output <name>=<datasetname>;
proc reg <stuff>;
<stuff>;
run;
ods output close;
You can specify multiple names and multiple output datasets, assuming you want more than one thing.
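As a concrete illustration, here is a sketch using sashelp.class. It captures two PROC REG tables with one ODS OUTPUT statement, and it also uses the OUTPUT statement of PROC REG, which is a different route from ODS OUTPUT and writes the per-observation diagnostics asked about in the question. The data set names pe, fit and diagnostics and the new variable names are arbitrary choices.

```sas
/* Two ODS tables routed to two data sets in one statement. */
ods output ParameterEstimates=pe FitStatistics=fit;

proc reg data=sashelp.class;
  model weight = height age;
  /* Per-observation diagnostics written by PROC REG itself:  */
  /* studentized residual, leverage, Cook's D and DFFITS.     */
  output out=diagnostics
         student=stud_resid
         h=leverage
         cookd=cooks_d
         dffits=dffits;
run;
quit;

proc print data=diagnostics(obs=5);
run;
```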

How to export selected p-values to the table in SAS?

I'm trying to write a program in SAS that supports the decision-making process of selecting the best formula for a linear regression model. I already have one, but in R. Now I have to implement it in SAS. The final result should be a dataset in which each line describes a different regression formula, i.e. names of explanatory variables, R-squared, p-values for different statistical tests, etc.
As an example, one of the tests is Durbin-Watson test for autocorrelation. My goal is to insert a p-value into the table I've mentioned. I use the code:
proc reg data=indata outest=outdata EDF ridge=0 OUTVIF;
model PKB = PK INV SI / DW;
run;
quit;
And as a result I get in the output window:
Durbin-Watson Statistics
Order      DW    Pr < DW    Pr > DW
    1  1.2512     0.0038     0.9962
I want to insert those p-values directly into the SAS table. I tried to find an answer in the SAS OnlineDoc and on the forum but with no success.
ODS OUTPUT is the best way to get information that you can print to the screen into datasets. Use ODS TRACE ON; before your code, run it, then inspect the log; see what table name matches what you're looking for. Then use ODS OUTPUT <tablename>=<datasetname>.
For example, in this PROC FREQ, I see ONEWAYFREQS is the table I want.
ods trace on;
proc freq data=sashelp.class;
table age;
run;
ods trace off;
So I use ODS OUTPUT:
ods output onewayfreqs=ages;
proc freq data=sashelp.class;
table age;
run;
ods output close;
and get a nice dataset. (ODS TRACE is not necessary if you know the name of the table you're looking for.)
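Applied to the Durbin-Watson case in the question, the same pattern might look like the sketch below. It reuses the question's dataset and variable names (indata, PKB, PK, INV, SI). The table name DWStatistic is an assumption on my part, so confirm it with ODS TRACE in your own log first. The DWPROB option is what requests the p-values; the output data set name dw_pvalues is arbitrary.

```sas
/* Assumed ODS table name: DWStatistic (verify with ods trace on). */
ods output DWStatistic=dw_pvalues;

proc reg data=indata;
  model PKB = PK INV SI / dw dwprob;
run;
quit;

/* Inspect the captured statistic and p-values. */
proc print data=dw_pvalues;
run;
```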