Conditionally change the column headers in Proc report in SAS - sas

Is it possible to conditionally change the headers of columns in a report by using proc report in SAS? I would like to change the names of the columns based on groups of data in a dataset. Ex: Let's consider there are 3 groups of data in my dataset. For the first group, I would like the names of 3 variables to be A, B and C. For the second group, I would like the names of 3 variables to be D, E and F. For the third group, I would like the names of 3 variables to be X, Y and Z.
Thank you
I have tried to set a temporary variable in a compute block in proc report and then use it in the following compute blocks to accomplish this task. However, a define statement cannot be used inside a compute block, so this approach failed. Below is the SAS code that I tried.
proc report data=x1 nowd headline headskip missing split="|" formchar(2)='_';
column ("&undl." pg grp subgrp aedecod _65 _6574 _7584);
define pg/order noprint;
define grp/order noprint;
define subgrp/order noprint;
define aedecod/display "Preferred Term" width=55 left spacing=0 flow;
compute before subgrp;
temp=subgrp;
endcomp;
compute _65;
if temp = "AB" then
define _65/ display "Age <65| (N=&le65_1.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "CD" then
define _65/ display "Age <65| (N=&le65_2.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "EF" then
define _65/ display "Age <65| (N=&le65_3.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
endcomp;
compute _6574;
if temp = "AB" then
define _6574/ display "Age 65-74| (N=&bt6574_1.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "CD" then
define _6574/ display "Age 65-74| (N=&bt6574_2.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "EF" then
define _6574/ display "Age 65-74| (N=&bt6574_3.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
endcomp;
compute _7584;
if temp = "AB" then
define _7584/ display "Age 75-84| (N=&bt7584_1.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "CD" then
define _7584/ display "Age 75-84| (N=&bt7584_2.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
if temp = "EF" then
define _7584/ display "Age 75-84| (N=&bt7584_3.)| __________________| n (%) [Events]" width=20 style(column) = [just=D] flow;
endcomp;
break after pg/page;
break after grp/skip;
break after subgrp/page;
compute before _page_;
line #1 'Subgroup:' subgrp $122.;
endcomp;
compute after _page_;
line #1 134*'_';
endcomp;
run;
Please see the above code that I am using. As you can see, there are three parent subgroups, AB, CD and EF. Within these three parent subgroups, there are 3 other child age subgroups, which are Age <65, Age 65-74 and Age 75-84. The number of subjects (N) within each age subgroup (child) will change depending on the number of subjects in their parent subgroups, which are AB, CD and EF. I am interested in knowing if I can conditionally reflect this change in N in the output by using proc report.
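The thread above records no answer. One possible workaround, offered only as a sketch, is to run PROC REPORT once per parent subgroup from a macro loop, so that each run gets its own header text. The macro name report_by_group, its parameters, and the use of SASHELP.CLASS (with sex standing in for subgrp) are assumptions made for illustration; they are not part of the original code.

```sas
/* Hypothetical macro: one PROC REPORT per subgroup, each with its own headers */
%macro report_by_group(data=, groupvar=, groups=);
  %local i grp n;
  %do i = 1 %to %sysfunc(countw(&groups., %str( )));
    %let grp = %scan(&groups., &i., %str( ));

    /* count the rows in this subgroup for the (N=...) header line */
    proc sql noprint;
      select count(*) into :n trimmed
      from &data.
      where &groupvar. = "&grp.";
    quit;

    title "Subgroup: &grp.";
    proc report data=&data. nowd split='|';
      where &groupvar. = "&grp.";
      column name height weight;
      define name   / display "Name";
      /* the header is fixed for this run, so it can carry the subgroup N */
      define height / display "Height|(N=&n.)";
      define weight / display "Weight|(N=&n.)";
    run;
  %end;
  title;
%mend report_by_group;

/* illustrative call: sex plays the role of the parent subgroup */
%report_by_group(data=sashelp.class, groupvar=sex, groups=F M);
```

Because each subgroup is a separate PROC REPORT step, the headers are ordinary DEFINE labels and no compute block is needed to switch them. The trade-off is that the output consists of several tables rather than one.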

Related

how to average a computed column in SAS `proc report`

In SAS proc report, a computed column is calculated row by row.
This applies to the summary lines too, but that is not always what you want.
As an example, take this study of the Body Mass Index in SASHELP.CLASS:
title Study Body Mass Index (BMI) by sex in class;
title2 Erroneously calculate average BMI from the average weight and height;
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean noprint;
define weight / analysis mean noprint;
define kg / computed format = 6.2 'Weight*(kg)';
define m / computed format = 6.2 'Height*(meter)';
define BMI / computed format = 6.2 'BMI*(kg/m²)';
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
BMI = kg/m/m;
if name eq '' then name = 'mean';
endcomp;
break after sex /summarize;
run;
This is wrong, because the BMI in the summary line (the 'mean' row) is not the mean of the BMIs above it; it is calculated from the mean height and mean weight to its left.
This is a correct calculation, summing BMI's and counting students manually.
title2 manually : summing BMI and counting students;
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean noprint;
define weight / analysis mean noprint;
define kg / computed format = 6.2 'weight*(kg)';
define m / computed format = 6.2 'height*(meter)';
define BMI / computed format = 6.2 'body mass*(kg/m²)';
* initialize the sum and counter *;
compute before sex;
sumBMI = 0;
count = 0;
endcomp;
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
if name eq '' then do;
name = 'mean';
* use the sum and counter *;
BMI = sumBMI / count;
end;
else do;
BMI = kg/m/m;
* increase the sum and counter *;
sumBMI = sumBMI + BMI;
count = count + 1;
end;
endcomp;
break after sex /summarize;
run;
Is there a way to let proc report itself do the averaging correctly?
You could say I want to do analysis on a computed column, but a column can only be defined as an analysis column if it exists in the input data set.
Create an alias column of an existing data set column.
Redo the BMI computation for the alias column.
In the summary line, apply the alias column mean to the BMI column.
In this example the column alias weight=bmiX is used.
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg weight=bmiX BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean ;
define weight / analysis mean ;
define kg / computed format = 6.2 'Weight*(kg)';
define m / computed format = 6.2 'Height*(meter)';
define BMI / computed format = 6.2 'BMI*(kg/m²)';
* define bmiX / noprint;
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
BMI = kg/m/m;
if name eq '' then do;
name = 'mean';
BMI = bmiX;
end;
endcomp;
compute bmiX;
bmiX = kg/m/m;
endcomp;
break after sex /summarize;
run;
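A different route, not taken in the answer above, is to move the BMI calculation out of PROC REPORT so that BMI exists in the input and can be defined as an analysis column. The following is a minimal sketch under that assumption; the view name work.class_bmi is hypothetical.

```sas
/* Assumption: a DATA step view (hypothetical name) adds BMI per student */
data work.class_bmi / view=work.class_bmi;
  set sashelp.class;
  m   = height * 0.0254;       /* inches to meters */
  kg  = weight * 0.45359237;   /* pounds to kilograms */
  BMI = kg / m**2;
run;

title2 'BMI precomputed per student, averaged by proc report';
proc report data=work.class_bmi nowindows split='*'
    style(summary) = {font_style=italic foreground=blue};
  where Name contains 'J';
  columns sex name age m kg BMI;
  define sex  / order;
  define name / display;
  define age  / analysis mean format = 6.2;
  define m    / analysis mean format = 6.2 'Height*(meter)';
  define kg   / analysis mean format = 6.2 'Weight*(kg)';
  define BMI  / analysis mean format = 6.2 'BMI*(kg/m²)';
  compute BMI;
    /* name is blank only on the summary line */
    if name eq '' then name = 'mean';
  endcomp;
  break after sex / summarize;
run;
```

Here the summary line shows the mean of the individual BMI values, because BMI is an ordinary input variable with the MEAN statistic. The cost is an extra step before the report.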

Proc tabulate: my column has more than 50 variables; how can I reduce it to only 5?

proc tabulate data=D.Arena out=work.Arena ;
class Row1 Column1/ order=freq ;
table Row1,Column1 ;
run;
After running this I received these results, and now I want to restrict the columns to only 5 variables.
Use a 'where' statement to restrict the col1 values being tabulated.
You can restrict based on a value property such as starts with the letter A
where col1 =: 'A';
You can restrict based on a value list:
where col1 in ('Apples', 'Lentils', 'Oranges', 'Sardines', 'Cucumber');
Sample data:
data have;
call streaminit(123);
array col1s[50] $20 _temporary_ (
"Apples" "Avocados" "Bananas" "Blueberries" "Oranges" "Strawberries" "Eggs" "Lean beef" "Chicken breasts" "Lamb" "Almonds" "Chia seeds" "Coconuts" "Macadamia nuts" "Walnuts" "Asparagus" "Bell peppers" "Broccoli" "Carrots" "Cauliflower" "Cucumber" "Garlic" "Kale" "Onions" "Tomatoes" "Salmon" "Sardines" "Shellfish" "Shrimp" "Trout" "Tuna" "Brown rice" "Oats" "Quinoa" "Ezekiel bread" "Green beans" "Kidney beans" "Lentils" "Peanuts" "Cheese" "Whole milk" "Yogurt" "Butter" "Coconut oil" "Olive oil" "Potatoes" "Sweet potatoes" "Vinegar" "Dark chocolate"
);
do row1 = 1 to 20;
do _n_ = 1 to 1000;
col1 = col1s[ceil(rand('uniform',50))];
x = ceil(rand('uniform',250));
output;
end;
end;
run;
Frequency tabulation, also showing ALL counts
* col1 values shown in order by value;
proc tabulate data=have;
class row1 col1;
table ALL row1,col1;
run;
* col1 values shown in order by ALL frequency;
proc tabulate data=have;
class row1;
class col1 / order=freq;
table ALL row1,col1;
run;
* Letter T col1 values shown in order by ALL frequency;
proc tabulate data=have;
where col1 =: 'T';
class row1;
class col1 / order=freq;
table ALL row1,col1;
run;
A list of only the top 5 col1s requires a step that determines which col1 values meet that criterion. The resulting list can then be used in a where ... in (...) clause.
* determine the 5 col1s with highest frequency count;
proc sql noprint outobs=5;
select
quote(col1) into :top5_col1_list separated by ' '
from
( select col1, count(*) as N from have
group by col1
)
order by N descending;
quit;
proc tabulate data=have;
where col1 in (&top5_col1_list);
class row1;
class col1 / order=freq;
table ALL row1,col1;
run;
Col1s in order of value
Col1s in order of frequency
T Col1s
Top 5 Col1s

Proc Report-Assigning a different format for different rows

I have a table that is currently laid out the way I want. The only issue is that when I assigned the format, it was applied to all values. I have a row that should be a total, but I'm unsure how to strip the formatting on this row only in proc report:
Output
Want: the total line should show no decimals, since its values are counts, while the rest of the table keeps the same format.
%let gray=CXBFBFBF;
%let blue=CX13478C;
%let purple=CXDEDDED;
title j=c h=10pt f='Calibri' color=black "Table 1-Distribution, CY2016-CY2018";
options orientation = landscape nonumber nodate leftmargin=0.05in rightmargin=0.05in;
ods noproctitle noresults escapechar='^';
ods rtf file = "path.rtf";
proc report data= work.temp nowd spanrows style(report)={width=100%}
style(header)=[vjust=b font_face = Calibri fontsize=9pt font_weight=bold background=&blue. foreground=white borderrightcolor=black];
/*List variables in order to select order of columns in table*/
col ( m_type
('^S={borderbottomcolor=&blue. vjust=b borderbottomwidth=0.02 }'('^S={borderbottomcolor=&blue. vjust=b borderbottomwidth=0.01 cellheight=0.20in}Age in Years' d_char_desc))
('^S={cellheight=0.20in}Missing Information'
('^S={borderbottomcolor=&blue. borderbottomwidth=0.02 cellheight=0.18in}' percentage16_1)
('^S={borderbottomcolor=&blue. borderbottomwidth=0.02 cellheight=0.18in}' percentage17_1)
('^S={borderbottomcolor=&blue. borderbottomwidth=0.02 cellheight=0.18in}' percentage18_1))
);
define m_type /order=data group noprint style = [vjust=b just=left cellwidth=0.60in font_face='Times New Roman' fontsize=9pt];
define d_char_desc / order=data display style = [vjust=b just=left cellwidth=0.60in font_face='Times New Roman' fontsize=9pt]
'' style(header)=[vjust=b just=left cellheight=0.18in] style(column)=[vjust=b just=left cellheight=0.35in cellwidth=0.60in];
define percentage16_1 /display style = [vjust=b just=center cellwidth=0.60in cellheight=0.05in font_face='Times New Roman' fontsize=9pt]
'CY2016' style(header)=[vjust=b just=center cellheight=0.18in] style(column)=[vjust=b just=center cellheight=0.20in cellwidth=0.40in];
define percentage17_1 /display style = [vjust=b just=center cellwidth=0.45in cellheight=0.05in font_face='Times New Roman' fontsize=9pt]
'CY2017' style(header)=[vjust=b just=center cellheight=0.18in] style(column)=[vjust=b just=center cellheight=0.20in cellwidth=0.40in];
define percentage18_1 /display style = [vjust=b just=center cellwidth=0.45in cellheight=0.05in font_face='Times New Roman' fontsize=9pt]
'CY2018' style(header)=[vjust=b just=center cellheight=0.18in] style(column)=[vjust=b just=center cellheight=0.20in cellwidth=0.40in];
compute m_type;
if m_type = 'm_tot' then
call define (_row_, 'style', 'style=[fontweight=bold background=&gray. font_face=Times]');
endcomp;
run;
ods rtf close;
You will have to explicitly format the numeric values in respective compute blocks. The numeric value referenced will depend on the analysis statistic and the syntax will be variable.statistic.
Your data (not shown) appears to be some form of pre-computed aggregation, based on the m_type = 'm_tot' source code. In that case the reference would be something like percentage16_1.sum (sum is the default analysis for numeric variables when there is a grouping specified)
Example:
Summarize some SASHELP.CARS variables and change the format for the Ford make.
proc report data=sashelp.cars;
column (
make
horsepower mpg_city mpg_highway
horsepower_custom
mpg_city_custom
mpg_highway_custom
);
define make / group ;*noprint;
define horsepower / analysis noprint mean ;
define mpg_city / analysis noprint mean ;
define mpg_highway / analysis noprint mean ;
define horsepower_custom / computed style=[textalign=right];
define mpg_city_custom / computed style=[textalign=right];
define mpg_highway_custom / computed style=[textalign=right];
compute horsepower_custom / character length=10;
if make = 'Ford'
then horsepower_custom = put (horsepower.mean, 10.4);
else horsepower_custom = put (horsepower.mean, 8.1);
endcomp;
compute mpg_city_custom / character length=10;
if make = 'Ford'
then mpg_city_custom = put (mpg_city.mean, 10.5);
else mpg_city_custom = put (mpg_city.mean, 8.2);
endcomp;
compute mpg_highway_custom / character length=10;
if make = 'Ford'
then mpg_highway_custom = put (mpg_highway.mean, 10.6);
else mpg_highway_custom = put (mpg_highway.mean, 8.3);
endcomp;
run;
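When the values stay numeric, another option that the answer above does not show is CALL DEFINE with the 'format' attribute, which changes the format of a single cell. The sketch below assumes the percentage columns are numeric display variables; the data set work.demo and its values are invented purely for illustration.

```sas
/* Hypothetical data: two detail rows of percentages and one total row of counts */
data work.demo;
  length m_type $8;
  input m_type $ pct16 pct17;
  datalines;
m_a 12.345 23.456
m_b 34.567 45.678
m_tot 120 135
;
run;

proc report data=work.demo nowd;
  column m_type pct16 pct17;
  define m_type / display 'Type';
  define pct16  / display format=8.2 'CY2016';
  define pct17  / display format=8.2 'CY2017';
  /* override the column format only on the total row */
  compute pct16;
    if m_type = 'm_tot' then call define(_col_, 'format', '8.');
  endcomp;
  compute pct17;
    if m_type = 'm_tot' then call define(_col_, 'format', '8.');
  endcomp;
run;
```

Each compute block runs for its own column on every row, and m_type is available because it sits to the left in the column statement. If the columns were analysis variables instead, the same call should apply but any value reference would use the variable.statistic form described above.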

Merge cells horizontally in RTF output using proc report

I am trying to create a summary row above each group in my data. I have 2 questions:
How do I merge the first 2 cells horizontally (the ones in red below) in the summary rows.
How do I remove the duplicated F and M in the Sex column? (At the moment I can work around this by changing only those cells' text colour to white, but hopefully there's a better way.)
The output is an RTF file, and I'm using SAS 9.4 - the desktop version.
Is this possible using proc report?
Code:
options missing=' ';
proc report data=sashelp.class nowd;
columns sex name age weight;
define sex / order;
break before sex / summarize;
run;
I don't think you can merge cells in the summarize line.
Some trickery with compute blocks and call define can alter the cell values and appearances.
For example (Just J names for smaller image):
proc report data=sashelp.class nowd;
where name =: 'J';
columns sex name age weight;
define sex / order;
define age / sum;
define weight / sum;
break before sex / summarize style=[verticalalign=bottom];
compute name;
* the specification of / order for sex sets up conditions in the name value
* that can be leveraged in the compute block;
if name = ' ' then do;
* a blank name means the current row the compute is acting on
* is the summarization row;
* uncomment if stat is not obvious or stated in title;
* name = 'SUM';
* 'hide' border for appearance of merged cell;
call define (1, 'style', 'style=[fontsize=18pt borderrightcolor=white]');
end;
else do;
* a non-blank name means one of the detail rows is being processed;
* blank out the value in the sex column of the detail rows;
* the value assignment can only be applied to current column or those
* to the left;
sex = ' ';
end;
endcomp;
compute after sex;
* if you want more visual separation add a blank line;
* line ' ';
endcomp;
run;

How to indent or center a PROC REPORT column output to RTF?

This is the output that I need in RTF format:
**DEMOGRAPHICS A-B**
Age
    n 18
    Mean 30.4
    SD 6.29
    Min 18
    Median 30.5
    Max 39
but I am getting this result:
**DEMOGRAPHICS A-B**
Age
n 18
Mean 30.4
SD 6.29
Min 18
Median 30.5
Max 39
How do I left align age and center the remaining variables?
Here is my code:
proc report data = FINAL2 split = "#"
STYLE(REPORT)=[BACKGROUND=WHITE BORDERCOLOR=BLACK BORDERWIDTH=0.1 ASIS=on FRAME=HSIDES RULES=GROUPS]
STYLE(HEADER)=[BACKGROUND=WHITE];
COLUMN DESC STAT1;
define DESC / "Demographic Characteristics" style(column)=[cellwidth=30%] style(header)=[just=left asis = on] ;
define STAT1 /"A - B#(N=18)" style(column header)=[cellwidth = 20%] style(header)=[just = left asis = yes];
You can use a compute block to do this. This would be executed per row but you could conditionally apply a column-specific style from there based on the variable's value being 'Age' or something else.
For example (you can add this after the define statements in your report step):
compute desc;
if desc ^= 'Age' then
call define(_COL_, "style", "style=[paddingleft=3em]");
endcomp;
This would apply a 3em left padding to every cell in the desc column whose value is not 'Age'.