How do I change colours of different bars in SAS?

proc sgplot data=WORK.CUSTOMERDATA;
title height=14pt "Bar Chart of Gender";
vbar Gender / fillattrs= (color=CX024ae6) datalabel;
yaxis grid;
run;
ods graphics / reset;
title;
How do I change the colour of individual bars? I have tried fill= and styleattrs datacolors=, but neither seems to work.

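If you want one colour for those who graduated and another for those who did not, the following should produce the desired output. Note that styleattrs datacolors= only takes effect when the bars are grouped, which is why group=Graduated is specified: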
proc sgplot data=customerdata;
title height=14pt "Bar Chart of Graduated";
styleattrs datacolors=(blue red) ;
vbar Graduated / group=Graduated filltype=solid datalabel;
yaxis grid;
run;
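If the colours need to stay tied to specific values regardless of the order in which they appear in the data, a discrete attribute map is another option. The following is a minimal sketch, assuming sashelp.class as stand-in data; the data set name barcolors, the map ID gender and the two colour codes are made up for illustration.

```sas
/* Attribute map: one row per bar value, with the fill colour to use.  */
/* The data set name, the ID "gender" and the colours are hypothetical. */
data barcolors;
  length id $8 value $8 fillcolor $10;
  id = "gender"; value = "F"; fillcolor = "CXE64A19"; output;
  id = "gender"; value = "M"; fillcolor = "CX024AE6"; output;
run;

proc sgplot data=sashelp.class dattrmap=barcolors;
  title height=14pt "Bar Chart of Sex";
  /* group= is still required; attrid= selects the rows of the map */
  vbar sex / group=sex attrid=gender datalabel;
  yaxis grid;
run;
title;
```

The entries in value must match the formatted values of the group variable exactly, including case.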

Related

Overlay the average trend on group by trends using Proc sgplot

I want to create a line graph that includes the overall trend of a disease rate and the specific trends for males and females. I use the following code to create the trends by group. How can I add the average trend to this line graph? Thanks for your help.
proc sgplot data=have ;
vline year/response=disease_rate group=sex stat=mean datalabel=disease_rate ;
yaxis values=(0,1) label="Percentage";
run;
Here's an example of summarizing the data and then displaying the summary on the graph. There is more than one way to do this; this is just one.
data have;
set sashelp.heart(in=a);
year=round(2021-ageAtStart, 10);
disease_rate= status="Dead";
run;
proc means data=have mean noprint;
class sex year;
types sex sex*year;
var disease_rate;
output out=summary_stats mean=average_value;
run;
proc sort data=summary_stats;
by sex year;
run;
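data graph_data;
* _type_=2 is the sex-only summary, _type_=3 is the sex*year summary;
* the RENAME= data set option needs its list in parentheses;
merge summary_stats(where=(_type_=2) rename=(average_value=mean_sex))
summary_stats(where=(_type_=3) rename=(average_value=mean_sex_year));
by sex;
format mean_sex: percent12.1;
run;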
proc sgplot data=graph_data ;
*where year > 1990;
vline year/response=mean_sex_year group=sex stat=mean datalabel=mean_sex_year ;
vline year/response=mean_sex group=sex stat=mean datalabel=mean_sex ;
run;
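The question also asks for a single overall trend across both sexes. One possible way to get it is to request the year-only summary from PROC MEANS and plot it as an extra series. This is a sketch under the assumption that the data set have from the step above is used; the names trend_stats, trend_plot, rate and series_label are invented for the example.

```sas
/* Summarize by year (all sexes combined) and by sex*year */
proc means data=have noprint;
  class sex year;
  types year sex*year;
  var disease_rate;
  output out=trend_stats mean=rate;
run;

/* With CLASS sex year, _type_=1 is year only and _type_=3 is sex*year */
data trend_plot;
  set trend_stats;
  length series_label $12;
  if _type_ = 1 then series_label = "Overall";
  else series_label = sex;
  format rate percent12.1;
run;

/* Sort so that each line is drawn in year order */
proc sort data=trend_plot;
  by series_label year;
run;

proc sgplot data=trend_plot;
  series x=year y=rate / group=series_label markers;
  yaxis label="Disease rate";
run;
```

The "Overall" line here is the mean over all observations in a year, not the average of the two sex-specific means, so it is weighted by the number of observations in each group.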
Use series instead of vline so that you can overlay a regression on top of it to get an average trend line. For example:
proc sql;
create table have as
select date
, region
, sum(sale) as sale
from sashelp.pricedata
group by region, date
order by region, date
;
quit;
proc sgplot data=have;
series x=date y=sale / group=region;
reg x=date y=sale / group=region;
xaxis fitpolicy=rotatethin;
run;

How do I adjust bins to endpoint instead of midpoint in proc sgpanel

I've got a panel of three histograms and I've been able to figure out how to tweak all of the formatting except for one thing: getting the ticks to be the endpoints for the bins, instead of the midpoints.
I know that in 'proc univariate,' one can use an 'endpoints=' option in the histogram statement.
However, I cannot find a similar statement in the documentation for 'proc sgpanel'
Here is my code:
ods graphics on;
title "Baseline";
proc sgpanel data=baseline;
panelby scrp_cohort2 / rows=3 layout=rowlattice;
histogram pt_eq5d3l_health_state / boundary=lower group=scrp_cohort2;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10);
run;
ods graphics off;
Specify a colaxis offsetmin and offsetmax that are 1/2 the bin width (as fraction).
Example:
Three SGPANEL runs to compare and contrast. The final one is the one you want.
data have;
call streaminit(2021);
do panel = 1 to 3;
do _n_ = 1 to 100 + rand('integer',50);
id + 1;
group = rand('integer',3);
do time = 0 to 10;
status = rand('integer',0,100);
output;
end;
end;
end;
stop;
run;
ods html file='gfx.html';
ods graphics on/ height=400 width=500;
title "Baseline";
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
run;
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 20);
run;
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group;
where time=0;
colaxis grid values=(0 to 100 by 10)
offsetmin=0.05
offsetmax=0.05
;
run;
ods graphics off;
ods html close;
The issue here is that you're trying to manipulate a histogram, which is not a discrete-value chart even though it looks like one. VBAR, for example, offers a discreteoffset option that would let you do exactly what you ask.
A histogram, however, plots continuous (non-discrete) values on an x/y axis, in a particular way that ends up looking rather like a bar chart. So it won't let you move the labels around, because they're not just labels: they're fixed positions on the axis, around which the histogram collapses the data points.
Unfortunately, the endpoints option isn't available for PROC SGPANEL, which of course would be how you'd ideally solve this issue. You have a couple of options for what would work, depending on what you want to do exactly and what your data look like.
First, you can simply summarize your data using proc univariate or whatever works best, and then use vbar to graph the (now discrete) data. You can get a histogram dataset out of proc univariate easily enough (with ODS OUTPUT or OUTHISTOGRAM= option) with by statement for your group/panel values, and then you can graph that with VBAR in SGPANEL.
Second, you can make some adjustments to how things are done in SGPANEL, which might be enough for your needs. Look at the following graph, using Richard's example data.
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / boundary=lower group=group binstart=-5 binwidth=10;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10) ;
run;
What it does is start the bins at -5, instead of at 0, but the colaxis is still starting at zero. That's now accurately doing what you want, I think - except that 0 itself ends up in the -5 bar, which you might not want. The bins are now centered at 5/15/25/35/etc., which is hopefully what you do want. If you do have 0 in your data, you may be able to use options to move where 0 is bucketed (but it would affect all of the other exact endpoints also).
This is what that looks like with the 0's removed. If there are actual 0's, then you would have a bar to the left of the plot area, though.
Here is the same thing but with 0's in it, which you'll note means a bar to the left of 0.
This is a similar plot but with 0's allowed, and with boundary=upper which moves all of the exactly-on-bin-boundaries to the upper bin (so 0 goes to the 0-10 bin). Note the other changes - and there is now a 100-110 bar which contains the 100 values.
Code for the latter chart (earlier chart is same but boundary=lower):
title "Baseline";
proc sgpanel data=have;
panelby panel / rows=3 layout=rowlattice;
histogram status / group=group binstart=-5 binwidth=10 boundary=upper;
where time=0;
colaxis min=0 max=100 grid values=(0 to 100 by 10) ;
run;
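The first option mentioned above (summarize with PROC UNIVARIATE, then draw the bins with VBAR) could look roughly like the sketch below. It assumes the example data set have from the earlier answer and ignores the group variable for brevity; the data set names baseline_sorted and bins are made up. The endpoint list runs to 110 so that values of exactly 100 land in a bin of their own.

```sas
/* Baseline rows only, sorted for BY processing */
proc sort data=have out=baseline_sorted;
  by panel;
  where time = 0;
run;

/* Bin with explicit endpoints; write the bins to a data set, no plot */
proc univariate data=baseline_sorted noprint;
  by panel;
  var status;
  histogram status / endpoints=0 to 110 by 10
                     outhistogram=bins noplot;
run;

/* _MINPT_ holds the lower endpoint of each bin, _OBSPCT_ the percent. */
/* Shifting each bar half a slot to the right puts the tick at the     */
/* bar's left edge.                                                    */
proc sgpanel data=bins;
  panelby panel / rows=3 layout=rowlattice;
  vbar _minpt_ / response=_obspct_ barwidth=1 discreteoffset=0.5;
  colaxis label="Status (tick marks the lower endpoint of each bin)";
  rowaxis label="Percent";
run;
```

Because the bars are now discrete categories, the axis is no longer a continuous scale, so options such as min= and max= on COLAXIS no longer apply.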

Organize ODS output by an ID

I need to create 3 graphs for each facility and output these onto 1 page. I have 600 facilities to do this for so I will have a 600 page document. I have created my graphs using the code below. If I specify a "where ID=X" in the proc sgplot statement, it outputs everything fine, but only for facility X. If I don't, it prints Graph 1 for every facility before going to the next graph. I'm guessing I need a macro... does anyone have any advice?
OPTIONS orientation=portrait nodate;
ods rtf file="C:\Users\filename.rtf" STYLE=Styles.rtf;
ods listing close;
ods noproctitle ;
ODS ESCAPECHAR='^';
title ; footnote;
*First graph;
ods graphics on / height=2.7in width=8in;
ods rtf startpage=NOW;
ods rtf text= "^{style[fontweight=bold fontsize=11pt textalign=c] Employees}";
ods graphics/noborder;
proc sort data=clean4; by ID warehouse county; run;
proc sgplot data=clean4;
by ID warehouse county;
title2 "ID= #byval(ID) ";
title3 "Name: #byval(warehouse) ";
title4 "County: #byval(county) ";
series x=date y=emp / markers markerattrs=(symbol=CircleFilled color=blue) lineattrs=(color=blue thickness=2 pattern=1 ) legendlabel='Number of Employees' dataskin=pressed;
yaxis label='Count' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) offsetmin=0 integer;
xaxis label='Date' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) ;
option NOBYLINE;
run;
*Second graph;
ods graphics on / height=2.7in width=8in;
ods rtf startpage=NO;
ods rtf text=' ';
ods rtf text= "^{style[fontweight=bold fontsize=12pt textalign=c] Hats used daily}";
ods graphics/noborder;
proc sort data=clean4; by ID; run;
proc sgplot data=clean4;
by ID;
title2; title3;
series x=date y=hats / markers markerattrs=(symbol=CircleFilled color=red)
lineattrs=(color=red thickness=2 pattern=1 ) legendlabel='Number of hats used' dataskin=pressed;
yaxis label='Count' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) fitpolicy=thin
offsetmin=0 integer;
xaxis label='Date' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) ;
run;
*Third graph;
ods graphics on / height=2.7in width=8in;
ods rtf startpage=NO;
ods rtf text=' ';
ods rtf text= "^{style[fontweight=bold fontsize=11pt textalign=c] LOESS}";
ods graphics/noborder;
proc sort data=clean4; by ID; run;
proc sgplot data=clean4;
by ID;
loess y=var1 x=date/ legendlabel="LOESS" lineattrs=(color=blue)
FILLEDOUTLINEDMARKERS MARKERFILLATTRS=(color=black);
yaxis label='LOESS Plot' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) offsetmin=0;
xaxis label='Date' valueattrs=(size=11pt) labelattrs=(size=11pt weight=bold) THRESHOLDMIN=0
THRESHOLDMAX=0 ;
option NOBYLINE;
run;
ods rtf close;
ods listing ;
Because all three graphs for each ID are produced from the same data set, clean4, converting the existing code to a macro (macroizing it) requires only minimal changes.
Two steps:
1. Macroize the existing code to operate on a single ID value (the do'er).
2. Run the macro for each ID (the run'er).
Do'er
%macro doReport(ID=);
* move the sort to the top;
* only need to sort the data once (for your situation);
proc sort data=clean4 out=clean4_oneID;
by ID warehouse county;
where ID = "&ID";
run;
* Place all the graphing code here;
* change all the 'clean4' data set references to 'clean4_oneID';
%mend;
Run'er
* Place ODS RTF and settings here;
* obtain list of each id;
proc sort nodupkey data=clean4 out=id_list; by id; run;
* 'stackingly' invoke macro for each id;
data _null_;
set id_list;
call execute (cats('%nrstr(%doReport(ID=',id,'))');
run;
* stacked execute code will now be submitted by SAS supervisor;
* close the RTF here;
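Put together, the do'er and run'er might look like the sketch below. It is deliberately stripped down: the plot options from the question are omitted, the output file name facility_report.rtf is hypothetical, and ID is assumed to be a character variable (for a numeric ID the quotes around &ID would be dropped).

```sas
%macro doReport(ID=);
  /* Keep only the rows for one facility */
  data clean4_oneID;
    set clean4;
    where ID = "&ID";
  run;

  /* Start a new page for this facility, then draw the first graph */
  ods rtf startpage=now;
  title "ID= &ID";
  proc sgplot data=clean4_oneID;
    series x=date y=emp / markers;
  run;

  /* Keep the remaining two graphs on the same page */
  ods rtf startpage=no;
  title;
  proc sgplot data=clean4_oneID;
    series x=date y=hats / markers;
  run;

  proc sgplot data=clean4_oneID;
    loess x=date y=var1;
  run;
%mend doReport;

ods rtf file="facility_report.rtf";
ods graphics / height=2.7in width=8in noborder;

/* One row per facility */
proc sort data=clean4(keep=ID) out=id_list nodupkey;
  by ID;
run;

/* Queue one macro call per facility; %nrstr delays macro execution */
/* until the data step has finished                                 */
data _null_;
  set id_list;
  call execute(cats('%nrstr(%doReport(ID=', ID, '))'));
run;

ods rtf close;
```

Wrapping the call in %nrstr matters here: without it, the macro would start executing while the DATA _NULL_ step is still running, and the timing of macro statements relative to the generated steps becomes hard to reason about.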

proc sgplot multiple line title

How can I give a graph a title with multiple lines? I would like to have the title on the first line and then a paragraph underneath it to explain the graph.
My attempt is:
proc sgplot data= maindata.small_medium_big_firms;
title "Number of big, medium and small firms";
title1 " this is to explain the graph .........";
series x=year y=group_1/lineattrs=(color=red) legendlabel= "small";
series x=year y=group_2/lineattrs=(color=blue) legendlabel= "medium";
series x=year y=group_3/lineattrs=(color=black) legendlabel= "big";
YAXIS LABEL = 'Number of firms';
XAXIS LABEL = 'Year';
run;
Title and Title1 are the same command. By design, if you submit a new TITLE statement it overwrites any other TITLE statements of the same number and higher numbers.
http://support.sas.com/documentation/cdl/en/grstatproc/69716/HTML/default/viewer.htm#n1ukd9sqgqiwwhn1mrx4c1rbse1j.htm
This example uses a SASHELP data set, so anyone with SAS should be able to run the code as is.
proc sgplot data= sashelp.stocks;
title1 "My Title - Title1" ;
title2 "Other Text - title2";
where stock='IBM';
series x=date y=open/lineattrs=(color=red) legendlabel= "Open";
series x=date y=close/lineattrs=(color=blue) legendlabel= "Close";
series x=date y=high/lineattrs=(color=black) legendlabel= "High";
YAXIS LABEL = 'Stock Price';
XAXIS LABEL = 'Date';
run;
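For a heading followed by a short explanatory paragraph, the lower title lines can be made smaller and left-aligned with the height= and justify= options of the TITLE statement. A small sketch, again on sashelp.stocks; the wording of the titles is placeholder text.

```sas
title1 height=14pt "IBM stock price";
/* Each TITLEn statement produces one line; use as many as needed */
title2 height=10pt justify=left "Closing price per month.";
title3 height=10pt justify=left "Source: SASHELP.STOCKS sample data.";

proc sgplot data=sashelp.stocks;
  where stock='IBM';
  series x=date y=close / lineattrs=(color=blue) legendlabel="Close";
  yaxis label='Stock Price';
  xaxis label='Date';
run;

/* Clear all titles so they do not carry over to later output */
title;
```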

Binned Bar chart using SAS

I'm trying to make a bar chart using SAS. I have data on many salaries and I'd like to show a bar chart of the frequencies of these salaries. I've made this:
ODS GRAPHICS ON;
PROC FREQ DATA=WORKERS.SORT ORDER=INTERNAL;
TABLES salaries / NOCUM SCORES=TABLE plots(only)=freq;
RUN;
ODS GRAPHICS OFF;
It works, but the problem is that I now see all (hundreds) of the individual salaries on the x-axis. I'd like to have just intervals of these salaries (like 20) so that I get a more readable chart. I just can't find out how to do it. I've also tried this:
PROC CHART DATA=WORK.SORT;
vbar salaries;
RUN;
but that's a text representation of the chart, so I can't use it.
You can create a format and apply the format to the variable you want to group into buckets. Here's an example:
proc format ;
value myfmt
low - 13 = '13 and Under'
14 - high = '14 and Above';
run;
ODS GRAPHICS ON;
PROC FREQ DATA=sashelp.class ORDER=INTERNAL;
format age myfmt.;
TABLES age / NOCUM SCORES=TABLE plots(only)=freq;
RUN;
ODS GRAPHICS OFF;
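The same idea extends to several numeric intervals, and the formatted variable can also be passed straight to VBAR in PROC SGPLOT. The sketch below uses msrp from sashelp.cars as a stand-in for the salary variable; the format name pricefmt and the cut-off values are arbitrary choices for illustration.

```sas
proc format;
  /* -< excludes the upper bound, so each value falls in exactly one range */
  value pricefmt
    low   -< 20000 = 'Under 20k'
    20000 -< 40000 = '20k to <40k'
    40000 -< 60000 = '40k to <60k'
    60000 -  high  = '60k and over';
run;

proc sgplot data=sashelp.cars;
  format msrp pricefmt.;
  /* VBAR counts observations per formatted value of the category variable */
  vbar msrp / datalabel;
  xaxis label="MSRP";
  yaxis grid label="Frequency";
run;
```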
Use PROC UNIVARIATE with the HISTOGRAM statement. http://support.sas.com/documentation/cdl/en/procstat/66703/HTML/default/viewer.htm#procstat_univariate_toc.htm
ods html;
proc univariate data=sashelp.cars noprint;
var msrp;
histogram;
run;
There are options for specifying bin size:
ods html;
proc univariate data=sashelp.cars noprint;
var msrp;
histogram / midpoints=30000 to 180000 by 30000;
run;
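If ODS Graphics is available, PROC SGPLOT has a HISTOGRAM statement with comparable controls. A minimal sketch, assuming the same sashelp.cars data; binstart= gives the midpoint of the first bin, so the values below should reproduce bins centred at 30000, 60000 and so on.

```sas
proc sgplot data=sashelp.cars;
  /* binwidth= sets the bin size, binstart= the midpoint of the first bin */
  histogram msrp / binwidth=30000 binstart=30000 scale=count;
  xaxis label="MSRP";
run;
```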
And just for completeness, I'll add another solution in case you want more control over the chart's appearance. Using the Graphics Template Language you can create some very nice looking charts.
The proc template statement defines how the chart will look. The sgrender runs the chart against the specified dataset. There's all kinds of options that are best explored in the online doc: http://support.sas.com/documentation/cdl/en/grstatgraph/65377/HTML/default/viewer.htm#p1sxw5gidyzrygn1ibkzfmc5c93m.htm
I've just taken the sample they provided and added the / nbins=20 option to have it automatically group the data into 20 bins. It also has options for the start and end bin, bin size, etc.
proc template;
define statgraph histogram;
begingraph;
entrytitle "Histogram of Vehicle Weights";
layout overlay /
xaxisopts=(label="Vehicle Weight (LBS)")
yaxisopts=(griddisplay=on);
histogram weight / nbins=20;
endlayout;
endgraph;
end;
run;
proc sgrender data=sashelp.cars template=histogram;
run;