SAS - SGPLOT - adding XAXIS group label

I am looking for an option to add a group label to the X axis (a label for each grouping of X-axis values). I tried the X2AXIS and REFLINE options, but neither does exactly what I want. (See the attached image for reference: I would like to add G1, G2 and G3 with brackets.)

As a starting point, consider
data have;
  call streaminit(2021);
  do x = 1 to 6;
    do _n_ = 1 to 2 + rand('integer', 5);
      y = 5 + rand('integer', 10);
      group = cats('G', int((x+1)/2));
      output;
    end;
  end;
run;
ods html file='plot.html';
proc sgplot data=have;
  vbox y / group=group category=x;
run;
ods html close;
which produces a grouped box plot (image not included).


Paired bar charts in SAS

I need help creating a single bar chart that pairs bars together (by two levels of Group) for four time periods. Here's what my table of data looks like, sorted by 'Group':
I've figured out how to plot the means for both groups, but only for one time period at a time:
proc sgplot data=Testdata;
  vbar Group /
    response=Baseline
    stat=mean
    GROUPDISPLAY = CLUSTER;
run;
Which gets me this:
However, I'd like to "smoosh" these two bars together, so that they're touching, and then add the means, for each level of group, for the other three time periods, all in one plot. I've tried just adding the other time periods to the 'response=' line (both with, and without commas) but that doesn't work.
Please help!
(And I know this is kind of greedy, but it would be great if anyone could tell me how to change the bar color based on Group level)
TIA for any help.
You will want to transpose the data so that it has the columns id, group, period, and result.
The VBAR statement would change from
VBAR GROUP
to
VBAR PERIOD
and you can then use the VBAR options
group = GROUP
datalabel = result
statlabel
Example:
data have;
  call streaminit(123);
  do group = 'A' , 'B';
    do _n_ = 1 to ifn(group='A',6,11);
      id + 1;
      baseline = ifn(group='A', 2125, 4400) + rand('integer', 400) - 200;
      period1 = ifn(group='A', 2425, 4100) + rand('integer', 600) - 300;
      period2 = ifn(group='A', 1800, 3600) + rand('integer', 500) - 250;
      period3 = ifn(group='A', 1600, 2800) + rand('integer', 500) - 250;
      output;
    end;
  end;
  label
    baseline = 'Baseline'
    period1 = '14 Day Average'
    period2 = '30 Day Average'
    period3 = '60 Day Average'
  ;
run;
/* One row per id and period: columns id, group, period, period_label, result */
proc transpose data=have
  out=plotdata (
    rename=(
      _name_ = period
      _label_ = period_label
      col1 = result
  ))
  ;
  by id group notsorted;
  var baseline period1-period3;
  label period = ' ';
  label period_label = ' ';
run;
ods html file='plot.html' style=plateau;
proc sgplot data=plotdata;
  vbar period_label /
    response = result
    stat = mean
    groupdisplay = cluster
    group = group
    datalabel = result statlabel
  ;
  xaxis display=(nolabel);
run;
ods html close;

how to vertically sum a range of dynamic variables in sas?

I have a dataset in SAS in which the months would be dynamically updated each month. I need to calculate the sum vertically each month and paste the sum below, as shown in the image.
Proc means/ proc summary and proc print are not doing the trick for me.
I was given the following code before:
%let month = month name;
%put &month.;
data new_totals;
  set Final_&month. end=end;
  &month._sum + &month._final;
  /*feb_sum + &month._final;*/
  output;
  if end then do;
    measure = 'Total';
    &month._final = &month._sum;
    /*Feb_final = feb_sum;*/
    output;
  end;
  drop &month._sum;
run;
The problem is that this has all the months hardcoded, which I don't want. I am not too familiar with loops or arrays, so I need a solution for this, please.
It may be better to use a reporting procedure such as PRINT or REPORT to produce the desired output.
data have;
  length group $20;
  do group = 'A', 'B', 'C';
    array month_totals jan2020 jan2019 feb2020 feb2019 mar2019 apr2019 may2019 jun2019 jul2019 aug2019 sep2019 oct2019 nov2019 dec2019;
    do over month_totals;
      month_totals = 10 + floor(rand('uniform', 60));
    end;
    output;
  end;
run;
ods excel file='data_with_total_row.xlsx';
/* PRINT: the SUM statement adds a totals row below the listed columns */
proc print noobs data=have;
  var group ;
  sum jan2020--dec2019;
run;
/* REPORT: RBREAK AFTER / SUMMARIZE adds the totals row; label it 'Total' */
proc report data=have;
  columns group jan2020--dec2019;
  define group / width=20;
  rbreak after / summarize;
  compute after;
    group = 'Total';
  endcomp;
run;
ods excel close;
Data structure
The data sets you are working with are 'difficult' because the date aspect of the data is actually in the metadata, i.e. the column name. A better approach, in SAS, is to have categorical data with columns
group (categorical role)
month (categorical role)
total (continuous role)
Such data can be easily filtered with a where clause, and reporting procedures such as REPORT and TABULATE can use the month variable in a class statement.
Example:
data have;
  length group $20;
  do group = 'A', 'B', 'C';
    do _n_ = 0 by 1 until (month >= '01feb2020'd);
      month = intnx('month', '01jan2018'd, _n_);
      total = 10 + floor(rand('uniform', 60));
      output;
    end;
  end;
  format month monyy5.;
run;
proc tabulate data=have;
  class group month;
  var total;
  table
    group all='Total'
    ,
    month='' * total='' * sum=''*f=comma9.
  ;
  where intck('month', month, '01feb2020'd) between 0 and 13;
run;
proc report data=have;
  column group (month,total);
  define group / group;
  define month / '' across order=data ;
  define total / '' ;
  where intck('month', month, '01feb2020'd) between 0 and 13;
run;
Here is a basic way. Borrowed sample data from Richard.
data have;
  length group $20;
  do group = 'A', 'B';
    array months jan2020 jan2019 feb2020 feb2019 mar2019 apr2019 may2019 jun2019 jul2019 aug2019 sep2019 oct2019 nov2019 dec2019;
    do over months;
      months = 10 + floor(rand('uniform', 60, 1));
    end;
    output;
  end;
run;
/* Sum every numeric column into a one-row data set */
proc summary data=have;
  var _numeric_;
  output out=temp(drop=_:) sum=;
run;
/* Append the totals row and label it */
data want;
  set have temp (in=t);
  if t then group='Total';
run;

Proc Format for traffic lighting not working

I am attempting to create traffic lighting in a report using proc format. But even though the values can be greater than 1 or below 1 the colors are always the lowest color. In this case, they are all red. Why is SAS not seeing the values?
proc format;
  value forecast
    low - < 0.70 = 'red'
    0.70 - <0.90 = 'yellow'
    0.90 - high = 'green';
run;
%macro perform_target (cc, year, career_id, cc_name) ;
  data Performance_&career_id._&cc._&year. ;
    set post_target_comparisons1;
    where institution = "&cc." and year = "&year." and career_id = "&career_id.";
  run;
  ods excel file = "Y:\General - CTE\2019 CTE Accountability\2020\Need Assessment Performance Tables\Performance_&career_id._&cc._&year..xlsx" style = sasdocprinter;
  ods excel options(autofilter="2-39" sheet_name = "Performance &year." embedded_titles = 'yes');
  run;
  title j= C "Actual to Target Comparisons";
  title2 j = C "Academic Year - &year.";
  run;
  proc report data = Performance_&career_id._&cc._&year.;
    column community_college cluster_label measure_label year total_students total_male percent_male
      target forecast_male forecast_female forecast_AI forecast_AS forecast_AA forecast_HI forecast_PI
      forecast_W forecast_MU forecast_disabled forecast_farms forecast_single forecast_displaced
      forecast_ell forecast_nontrad;
    define forecast_male/display 'Percent to Forecast Male' style(column) = [cellwidth=1in
      tagattr="format:####.##\%" fontweight = bold foreground = forecast.];
  run;
  ods excel close;
%mend perform_target;
%perform_target (010042, 2016, 01);
This works, so I expect your data is the issue.
proc format;
  value forecast
    low - < 0.70 = 'red'
    0.70 - <0.90 = 'yellow'
    0.90 - high = 'green';
run;
data have;
  do x = 0 to 1 by .05;
    output;
  end;
run;
ods excel file='test.xlsx';
proc report data=have list;
  columns x;
  define x / display /*format=percent12.2*/
    style(column)=[cellwidth=1in tagattr="format:####.##\%" fontweight = bold foreground = forecast.];
run;
ods excel close;
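One way to test that hypothesis is to look at the values that actually reach PROC REPORT. The following is only a sketch: it assumes the data set name produced by the macro call in the question (Performance_01_010042_2016) and that forecast_male is meant to be a numeric ratio.

```sas
/* Assumed data set name, derived from %perform_target (010042, 2016, 01) */

/* Is forecast_male numeric, and does it already carry a format? */
proc contents data=Performance_01_010042_2016 (keep=forecast_male);
run;

/* What range do the stored values span? */
proc means data=Performance_01_010042_2016 n nmiss min max;
  var forecast_male;
run;

/* How many rows land in each color when the format is applied? */
proc freq data=Performance_01_010042_2016;
  tables forecast_male / missing;
  format forecast_male forecast.;
run;
```

If every non-missing value turns out to be below 0.70, the format is behaving as defined and the scale of the data is what needs attention.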

SAS: Replace rare levels in variable with new level "Other"

I've got a pretty big table in which I want to replace rare values (for this example, those with fewer than 10 occurrences, but the real case is more complicated: a variable might have 1000 levels while I want to keep only 15). The list of possible levels might change, so I don't want to hardcode anything.
My code is like:
%let var = Make;
proc sql;
  create table stage1_ as
  select &var.,
         count(*) as count
  from sashelp.cars
  group by &var.
  having count >= 10
  order by count desc
  ;
quit;
/* Join table with table including only top obs to replace rare
values with "other" category */
proc sql;
  create table stage2_ as
  select t1.*,
         case when t2.&var. is missing then "Other_&var." else t1.&var. end as &var._new
  from sashelp.cars t1 left join
       stage1_ t2 on t1.&var. = t2.&var.
  ;
quit;
/* Drop old variable and rename the new as old */
data result;
  set stage2_(drop= &var.);
  rename &var._new=&var.;
run;
It works, but unfortunately it is not very efficient, as it needs a join for each variable (in the real case I am doing it in a loop).
Is there a better way to do it? Maybe some smart replace function?
Thanks!!
You probably don't want to change the actual data values. Instead consider creating a custom format for each variable that will map the rare values to an 'Other' category.
The ODS output of the FREQ procedure can be used to capture the counts and percentages of every listed variable in a single table. NOTE: the OUT= option of the TABLE statement captures only the last listed variable. Those counts can then be used to construct the format according to the 'othering' rules you want to implement.
data have;
  do row = 1 to 1000;
    array x x1-x10;
    do over x;
      if row < 600
        then x = ceil(100*ranuni(123));
        else x = ceil(150*ranuni(123));
    end;
    output;
  end;
run;
/* Capture the one-way frequency tables of all ten variables */
ods output onewayfreqs=counts;
proc freq data=have ;
  table x1-x10;
run;
/* Stack the counts: one row per variable name and value */
data count_stack;
  length name $32;
  set counts;
  array x x1-x10;
  do over x;
    name = vname(x);
    value = x;
    if value then output;
  end;
  keep name value frequency;
run;
proc sort data=count_stack;
  by name descending frequency ;
run;
/* Per variable: keep the 10 most frequent values, map the rest to 'Other' */
data cntlin;
  do _n_ = 1 by 1 until (last.name);
    set count_stack;
    by name;
    length fmtname $32;
    fmtname = trim(name)||'top';
    start = value;
    label = cats(value);
    if _n_ < 11 then output;
  end;
  hlo = 'O';
  label = 'Other';
  output;
run;
proc format cntlin=cntlin;
run;
ods html;
proc freq data=have;
  table x1-x10;
  format
    x1 x1top.
    x2 x2top.
    x3 x3top.
    x4 x4top.
    x5 x5top.
    x6 x6top.
    x7 x7top.
    x8 x8top.
    x9 x9top.
    x10 x10top.
  ;
run;
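If a downstream step needs the grouped level as a real value rather than a formatted one, the same format can be applied with PUT. This is a minimal sketch that assumes the x1top. format created by the CNTLIN step above is available in the session; the names want and x1_top are hypothetical.

```sas
/* Materialize the formatted level of x1 as a character variable */
data want;
  set have;
  length x1_top $12;            /* hypothetical new variable */
  x1_top = put(x1, x1top.);     /* top values keep their label, the rest become 'Other' */
run;

proc freq data=want;
  table x1_top;
run;
```

The original x1 stays untouched, so the 'othering' rule can be changed later by rebuilding the format only.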

Transparent line in a line plot?

Hello intelligent people.
Can anyone tell me how to make a line in a line plot transparent in SAS EG or in PROC GPLOT? (Two lines are on top of each other, so I cannot see the underlying line.)
Just add the TRANSPARENCY= option (shown here with PROC SGPLOT):
ods graphics on / width=66%;
data test;
  do x = 1 to 10 by .1;
    up = 11 - x;
    down = 9 - x;
    y = x*x/10;
    output;
  end;
run;
proc sgplot data=test;
  series x=x y=y / lineattrs=(color=orange thickness=3);
  /* The band is drawn on top of the line; transparency lets the line show through */
  band x=x upper=up lower=down / transparency=.3;
run;
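The same option can be put on a SERIES statement, which is closer to the two-overlapping-lines case in the question. The sketch below assumes PROC SGPLOT is acceptable in place of GPLOT; the data set overlap and the variables y1 and y2 are made up for illustration.

```sas
/* Hypothetical data: two series that coincide for x < 5 and then diverge */
data overlap;
  do x = 1 to 10 by 0.1;
    y1 = x*x/10;
    y2 = ifn(x < 5, y1, y1 + 0.2*(x - 5));
    output;
  end;
run;

proc sgplot data=overlap;
  /* Bottom line: drawn first, thicker, fully opaque */
  series x=x y=y1 / lineattrs=(color=blue thickness=5);
  /* Top line: semi-transparent so the line underneath stays visible */
  series x=x y=y2 / lineattrs=(color=orange thickness=3) transparency=0.5;
run;
```

Drawing the underlying line slightly thicker is a second, independent way to keep it visible where the two coincide.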