I have the following code, which produces a bar graph; however, the first_md_seen variable has over 30 distinct values (names of providers), so the graph's x axis is very cluttered.
Is there a way to expand the graph area so I can clearly see every value of first_md_seen, and to add some space between the values so they are discernible?
As you can see in my code I changed the display positioning of the observations to vertical, but this did not help much.
Thanks!
proc sgplot data = day1;
vbar first_md_seen / response = median fillattrs=(color=lightblue)
dataskin=gloss datalabel
categoryorder=respdesc nostatlabel;
xaxis grid display=(nolabel)
fitpolicy=rotate
valuesrotate=vertical;
yaxis grid discreteorder=data display=(nolabel);
run;
I've got a panel of three histograms and I've been able to figure out how to tweak all of the formatting except for one thing: getting the ticks to be the endpoints for the bins, instead of the midpoints.
I know that in 'proc univariate,' one can use an 'endpoints=' option in the histogram statement.
However, I cannot find a similar option in the documentation for 'proc sgpanel'.
Here is my code:
ods graphics on;
title "Baseline";
proc sgpanel data=baseline;
panelby scrp_cohort2 / rows=3 layout=rowlattice;
histogram pt_eq5d3l_health_state / boundary=lower group=scrp_cohort2;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10);
run;
ods graphics off;
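Specify COLAXIS offsetmin and offsetmax values that are half the bin width, expressed as a fraction of the axis length. With bins of width 10 on a 0 to 100 axis, that is roughly 5/100 = 0.05.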
Example:
Three SGPANEL runs to compare and contrast. The final one is the one you want.
data have;
call streaminit(2021);
do panel = 1 to 3;
do _n_ = 1 to 100 + rand('integer',50);
id + 1;
group = rand('integer',3);
do time = 0 to 10;
status = rand('integer',0,100);
output;
end;
end;
end;
stop;
run;
ods html file='gfx.html';
ods graphics on/ height=400 width=500;
title "Baseline";
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
run;
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 20);
run;
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
colaxis grid values=(0 to 100 by 10)
offsetmin=0.05
offsetmax=0.05
;
run;
ods graphics off;
ods html close;
The issue here is that you're trying to manipulate a histogram, which is not a discrete-values chart, even though it looks like one. VBAR, for example, offers a discreteoffset option that would let you do exactly what you ask.
A histogram, however, graphs continuous (non-discrete) values on an x/y axis, just in a particular way that ends up looking like a bar chart. So it won't let you move the labels around, because they're not just labels: they're fixed positions on the axis, around which the histogram collapses the data points.
Unfortunately, the endpoints option isn't available for PROC SGPANEL, which of course would be how you'd ideally solve this issue. You have a couple of options for what would work, depending on what you want to do exactly and what your data look like.
First, you can simply summarize your data using proc univariate or whatever works best, and then use vbar to graph the (now discrete) data. You can get a histogram dataset out of proc univariate easily enough (with ODS OUTPUT or the OUTHISTOGRAM= option), using a BY statement for your group/panel values, and then graph that with VBAR in SGPANEL.
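A minimal sketch of this first approach, assuming the have dataset from Richard's example (variables panel, status, time). The dataset names have_sorted and hist_out are made up for illustration, the endpoint list is my own choice, and the sketch ignores the group variable for brevity. With ENDPOINTS= specified, the OUTHISTOGRAM= dataset should hold each bin's left endpoint in _MINPT_ and its count in _COUNT_, so the bars are labelled by endpoint rather than midpoint:

```sas
/* BY-group processing needs sorted input; keep only the baseline rows */
proc sort data=have out=have_sorted;
  by panel;
  where time=0;
run;

/* Bin the data per panel. NOPLOT skips the graph; we only want the bin dataset. */
proc univariate data=have_sorted noprint;
  by panel;
  histogram status / endpoints=0 to 110 by 10   /* assumed bin edges */
                     outhistogram=hist_out
                     noplot;
run;

/* Plot the precomputed (now discrete) bins as bars, one row per panel */
proc sgpanel data=hist_out;
  panelby panel / rows=3 layout=rowlattice;
  vbar _minpt_ / response=_count_;
  colaxis label='Bin lower endpoint';
run;
```

Because the bars are now discrete categories, options such as discreteoffset become available, at the cost of computing the bins yourself.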
Second, you can make some adjustments to how things are done in SGPANEL, which might be enough for your needs. Look at the following graph, using Richard's example data.
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group binstart=-5 binwidth=10;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10) ;
run;
What it does is start the bins at -5, instead of at 0, but the colaxis is still starting at zero. That's now accurately doing what you want, I think - except that 0 itself ends up in the -5 bar, which you might not want. The bins are now centered at 5/15/25/35/etc., which is hopefully what you do want. If you do have 0 in your data, you may be able to use options to move where 0 is bucketed (but it would affect all of the other exact endpoints also).
This is what that looks like with the 0's removed. If there are actual 0's, then you would have a bar to the left of the plot area, though.
Here is the same thing but with 0's in it, which you'll note means a bar to the left of 0.
This is a similar plot but with 0's allowed, and with boundary=upper, which moves every value that falls exactly on a bin boundary into the upper bin (so 0 goes to the 0-10 bin). Note the other changes: there is now a 100-110 bar, which contains the 100 values.
Code for the latter chart (earlier chart is same but boundary=lower):
title "Baseline";
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / group=group binstart=-5 binwidth=10 boundary=upper;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10) ;
run;
I have data that look like the following:
data have;
format date date9.;
input date:mmddyy10. Intervention _24hrPtVolumeESI_1_5;
datalines;
9/17/2018 0 204
9/24/2018 0 139
10/17/2018 0 527
10/23/2018 1 430
11/01/2018 1 231
;
run;
I would like to create a bar chart where the x axis contains ranges of median wait time (e.g. 100-125, 126-150 etc.) while displaying those times comparatively based on intervention (0 or 1). Thus, each range would have two bars: one for pre-intervention (0) and one for post-intervention (1). The y axis would simply show the count of how many median scores fell within each x axis range.
I've tried toying around with PROC SGPLOT, but that produces sloppy results.
proc sgplot data=WORK.FelaCombo;
vbar _24hrPtVolumeESI_1_5 / response=_24hrPtVolumeESI_1_5 stat=sum
group=intervention nostatlabel
groupdisplay=cluster;
xaxis display=(nolabel);
yaxis grid;
run;
Try using a histogram instead. vbar is more for discrete categories, whereas histogram will automatically create bins.
proc sgplot data=WORK.have;
histogram _24hrPtVolumeESI_1_5 /
scale=count
binstart=100
binwidth=25
group=intervention
transparency=0.5
showbins
;
xaxis display=(nolabel);
yaxis grid;
run;
In SAS, is there a way to display the variable label instead of the variable name in a stacked correlation matrix? Specifically in the row that goes across at the top of the matrix? I'm applying a template that modifies base.corr.stackedmatrix, changing the color of significant p-values to red, and I know using RowLabel for the column displays the variable label. I can't figure out how to display the label for the row of variable names so only the variable labels are displayed.
proc format;
value pvalsig low-.05 ="red" .05-high="black";
run;
proc template;
edit base.corr.stackedmatrix;
column (RowLabel) (Matrix) * (Matrix2) * (Matrix3) * (Matrix4);
edit matrix2;
style={foreground=pvalsig.};
end;
end;
run;
I want to plot Y by X, grouped by year, but color-code each year based on a different variable (dry). So each year shows as a separate line, but dry=1 years plot in one color and dry=0 years plot in a different color. I actually figured out one option (yeah!), which is below. But this doesn't give me much control.
Is there a way to put a where clause in the series statement to select specific categories so that I can specifically assign a color (or other format)? Or is there another way? This would be analogous to R where one can use multiple line statements for different subsets of data.
Thanks!!
This code works.
proc sgplot data = tmp;
where microsite_id = "&msit";
by microsite_id ;
yaxis label= "Pct. Stakes" values = (0 to 100 by 20);
xaxis label= 'Date' values = (121 to 288 by 15);
series y=tpctwett x=jday / markers markerattrs=(symbol=plus) group = year grouplc=dry groupmc=dry;
format jday tadjday metajday jdyfmt.;
label tpctwett='%surface water' tadval1='breed' metaval1='meta';
run;
Use an attribute map; see the documentation.
You can use the DRY variable to set the specific colours. For each year, assign the colour using the DRY variable in a data step.
proc sort data=tmp out=attr_data; by year; run;
data attrs;
set attr_data;
id='year';
if dry=0 then linecolor='green';
if dry=1 then linecolor='red';
keep id linecolor;
run;
Then add dattrmap=attrs to the PROC SGPLOT statement and attrid=year to the SERIES statement options.
ods graphics / attrpriority=none;
proc sgplot data = tmp dattrmap=attrs;
where microsite_id = "&msit";
by microsite_id ;
yaxis label= "Pct. Stakes" values = (0 to 100 by 20);
xaxis label= 'Date' values = (121 to 288 by 15);
series y=tpctwett x=jday / markers markerattrs=(symbol=plus) group = year grouplc=dry groupmc=dry attrid=year;
format jday tadjday metajday jdyfmt.;
label tpctwett='%surface water' tadval1='breed' metaval1='meta';
run;
Note that I tested and edited this post so it should work now.
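For reference, here is a hedged variant of the attribute-map approach. It is a sketch, not the tested code from the answer. It assumes that a discrete attribute map needs a VALUE column holding the formatted group value, that dry is constant within a year, and that one row per year is enough. The dataset name year_dry is made up for illustration:

```sas
/* One row per year, carrying that year's dry flag */
proc sort data=tmp(keep=year dry) out=year_dry nodupkey;
  by year;
run;

/* Build the map: ID names the map, VALUE matches the formatted group value */
data attrs;
  length id $8 value $32 linecolor markercolor $12;
  set year_dry;
  id = 'year';
  value = strip(vvalue(year));   /* formatted value of year, as text */
  if dry = 1 then linecolor = 'red';
  else linecolor = 'green';
  markercolor = linecolor;       /* keep markers and lines in step */
  keep id value linecolor markercolor;
run;

proc sgplot data=tmp dattrmap=attrs;
  where microsite_id = "&msit";
  series y=tpctwett x=jday / markers markerattrs=(symbol=plus)
                            group=year attrid=year;
run;
```

With the colors coming from the map, the grouplc= and groupmc= options are left out here; each year's line picks up the color assigned to its VALUE row.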
Is there any way to specify formats directly for axis values and data labels? As far as I can tell, it uses whatever format is applied to the dependent variable.
Example:
data sample;
input group $ number;
format number dollar6.1;
cards;
A 55.2
B 20.3
C 47.1
D 43.2
;
run;
axis1 minor=none order=0 to 60 by 10;
proc gchart data=sample;
vbar group/ type=sum sumvar=number sum levels=all raxis=axis1;
run;
If I set the format to dollar6.1, then the axis labels have an unnecessary decimal (0.0, 10.0, 20.0, etc.).
But if I set the format to dollar6.0, then the labels on the tops of the bars are missing the decimal that I would like to show.
Is there any way to specify formats independently for each of these?
I don't believe you can control the formats separately; you have limited kinds of control for time axes, log axes, etc., but otherwise no control over the numeric format.
What you can do is one of two things. First, at least in SGPLOT, you can create a secondary variable with a different format, and produce an empty graph (or an identical copy of your bar chart but with no label) using the variable formatted how you want the axis formatted; then produce the chart with the other, differently formatted variable.
Second, you can assign explicit values to the axis. Rather than using the automatic values arising from your data, you can use VALUE= to overwrite the labels placed on the tick marks. This isn't optimal if you have a varying axis (i.e., you produce twenty of these with different axis ranges), but if it's a fixed axis then you can probably get away with this. Look at the AXIS statement in GCHART for more information.
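As a rough sketch of the second option, assuming the sample dataset from the question (with number formatted as dollar6.1) and a fixed 0 to 60 axis, the tick labels can be written out by hand, one quoted string per major tick. The label strings are my own choice:

```sas
/* Explicit tick labels for a fixed axis: one string per major tick mark */
axis1 minor=none
      order=(0 to 60 by 10)
      value=('$0' '$10' '$20' '$30' '$40' '$50' '$60');

proc gchart data=sample;
  /* number keeps its dollar6.1 format, so the bar labels show the decimal */
  vbar group / type=sum sumvar=number sum levels=all raxis=axis1;
run;
quit;
```

The number of strings has to match the number of major ticks, which is why this only suits an axis whose range does not change between charts.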
How you'd do the first option:
data sample;
input group $ number;
format number dollar6.1;
axis_number = number;
format axis_number dollar6.0;
cards;
A 55.2
B 20.3
C 47.1
D 43.2
;
run;
proc sgplot data=sample;
vbar group /response=axis_number;
vbar group /response=number datalabel;
yaxis label='Number (sum)';
run;
That creates the bar chart twice, once with axis_number which then defines the axis, and once with number which defines the labels.
You can do this sort of thing using an annotate dataset. I'd give a better explanation of this if I had a solid understanding of how it works, but I use it so rarely that it's usually more of a trial-and-error process:
data sample;
input group $ number;
format number dollar6.0;
cards;
A 55.2
B 20.3
C 47.1
D 43.2
;
run;
Create the anno dataset. I pulled this from the link above and got rid of extraneous stuff. Set function = 'label' and position = '2' to place the labels above the bars, and xsys = '2' and ysys = '2' to base the coordinates on the data values. size and style control the font.
midpoint=group puts the labels on the bars, y=number makes the y coordinate of the label equal the height of the bars, and text is where you specify the value and format of your label.
SAS Annotate Dictionary
data anno;
length function style $12;
retain function 'label' size 1 position '2'
xsys '2' ysys '2' style 'Albany AMT';
set sample;
midpoint=group;
y=number;
text=put(number,dollar6.1);
run;
Make your chart using your current code, but removing the sum and inserting annotate=anno.
axis1 minor=none order=0 to 60 by 10;
proc gchart data=sample;
vbar group/ type=sum sumvar=number annotate=anno levels=all raxis=axis1;
run;
If you're running 9.2 or later, and are happy to use the Graphics Template Language (GTL) then you can do it like this:
Add a new column to your data that rounds the value:
data sample;
input group $ number;
format number dollar6.1;
axisval=round(number,1);
cards;
A 55.2
B 20.3
C 47.1
D 43.2
;
run;
Define the chart:
proc template;
define statgraph mychart;
begingraph;
layout overlay;
barchartparm x=group y=axisval / datalabel=number;
endlayout;
endgraph;
end;
run;
Render the chart using the data we created earlier:
proc sgrender data=sample template=mychart;
run;
The trick here is using the datalabel= option of the barchartparm statement to specify which column contains the values for the labels. There may be some other ways to do this using the GTL and specifying formats but this seemed pretty straightforward to me.
The GTL is included in Base SAS 9.2 onwards I believe.
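One of those "other ways" might be to leave the data alone and set the axis tick format in the template. The sketch below assumes the TICKVALUEFORMAT= suboption of LINEAROPTS= is available in your release, and the template name mychart_fmt is made up for illustration. It uses the original sample data, where number carries the dollar6.1 format:

```sas
proc template;
  define statgraph mychart_fmt;
    begingraph;
      /* Axis ticks use dollar6.0; the data labels keep number's own format */
      layout overlay / yaxisopts=(linearopts=(tickvalueformat=dollar6.0));
        barchartparm x=group y=number / datalabel=number;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sample template=mychart_fmt;
run;
```

This avoids the extra rounded column, since the bar heights and the labels both come from number and only the tick labels are reformatted.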