In my Proc Tabulate output, the class headings are above the class levels. Is there a way to move the class headings into a column of their own that sits next to the class levels? In the desired output in the image, the class heading 'Education' is in its own cell next to the class levels. How can I accomplish this?
Class Headings example
PROC FORMAT;
PICTURE PCTF (ROUND) OTHER='009.9%';
RUN;
ODS HTML PATH="%SYSFUNC(GETOPTION(WORK) )" STYLE=JOURNAL1A;
TITLE "Question 21x";
PROC TABULATE DATA = 208s;
CLASS EDUC
AREA
AGE
SEX
CENRACE
POVERTY
EDUC
INSURE
HEALTH
Q21x;
CLASSLEV EDUC AREA AGE SEX CENRACE POVERTY EDUC INSURE HEALTH Q21x ;
TABLE AREA = 'Area in Region' * (ROWPCTN=' '*f=PCTF.)
AGE = 'Age' * (ROWPCTN=' '*f=PCTF.)
SEX * (ROWPCTN=' '*f=PCTF.)
CENRACE = 'Race' * (ROWPCTN=' '*f=PCTF.)
POVERTY = 'Poverty Status' * (ROWPCTN=' '*f=PCTF.)
EDUC * (ROWPCTN=' '*f=PCTF.)
INSURE * (ROWPCTN=' '*f=PCTF.)
HEALTH * (ROWPCTN=' '*f=PCTF.) , Q21x = ' ';
RUN;
You can transpose existing data into a categorical form, which will give you greater control over the row dimension layout. Move the ROWPCTN into the column dimension to eliminate the blank column (in the row header) that would otherwise appear if ROWPCTN was in the row dimension. Use NOCELLMERGE to prevent merged cells in the first data row.
For example, start with
data have;
do personid = 1 to 1000;
area = cats('area_',0 + floor(5 * ranuni(123)));
age = cats('age_',13 + floor(7 * ranuni(123)));
sex = cats('sex_',1 + floor( 2 * ranuni(123)));
q21x = byte(65 + floor(5 * ranuni(123)));
output;
end;
label area = 'Area Label';
run;
proc tabulate data=have;
class area age sex q21x;
table
( area age sex ) * (rowpctn=' '), q21x
/ nocellmerge;
run;
And the transposed data version
proc transpose data=have out=have_for_table;
by personid q21x notsorted;
var area age sex;
run;
proc tabulate data=have_for_table missing;
class _name_ _label_ col1 q21x;
table
_name_='' * _label_='' * col1=''
,
q21x * (rowpctn='')
/
nocellmerge
;
run;
Related
I am a beginner so my knowledge is lacking. I have a data set consisting of the following columns:
Subject
Age
Height
Weight
I wish to create a table such that for every person I have three rows called Age, Height, and Weight.
I have tried to use Proc Tabulate :
proc tabulate data=new;
class person;
var NEWCOLUMN;
table person,NEWCOLUMN;
run;
However, I am getting an error because the new column is not the correct type.
You can pivot the data per person and REPORT or TABULATE the cell values.
Example:
proc transpose data=sashelp.class out=pivoted;
by name;
var age height weight;
where name <= 'C';
run;
ods html file='output.html' style=plateau;
options nodate nonumber nocenter;
title; footnote;
proc report data=pivoted;
column name _name_ col1;
define name / ' ' order order=data;
define _name_ / ' ';
define col1 / ' ';
run;
proc tabulate data=pivoted;
class name _name_ / order=data;
var col1;
table name*_name_='', col1=''*min='' / nocellmerge;
run;
ods html close;
In SAS proc report, a computed column is calculated row by row.
This applies to the summary lines too, but that is not always what you want.
As an example, take this study of the Body Mass Index in SASHELP.CLASS:
title Study Body Mass Index (BMI) by sex in class;
title2 Erroneously calculate average BMI from the average weight and height;
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean noprint;
define weight / analysis mean noprint;
define kg / computed format = 6.2 'Weight*(kg)';
define m / computed format = 6.2 'Height*(meter)';
define BMI / computed format = 6.2 'BMI*(kg/m²)';
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
BMI = kg/m/m;
if name eq '' then name = 'mean';
endcomp;
break after sex /summarize;
run;
This is wrong because the BMI on the summary line (the 'mean' row) is not the mean of the BMIs above it; it is calculated from the mean height and mean weight to its left.
The following is a correct calculation that sums the BMIs and counts the students manually.
title2 manually : summing BMI and counting students;
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean noprint;
define weight / analysis mean noprint;
define kg / computed format = 6.2 'weight*(kg)';
define m / computed format = 6.2 'height*(meter)';
define BMI / computed format = 6.2 'body mass*(kg/m²)';
* initialize the sum and counter *;
compute before sex;
sumBMI = 0;
count = 0;
endcomp;
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
if name eq '' then do;
name = 'mean';
* use the sum and counter *;
BMI = sumBMI / count;
end;
else do;
BMI = kg/m/m;
* increase the sum and counter *;
sumBMI = sumBMI + BMI;
count = count + 1;
end;
endcomp;
break after sex /summarize;
run;
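To cross-check the summary values outside PROC REPORT, a minimal sketch like the following computes each student's BMI in a DATA step and lets PROC MEANS average it by sex. The data set name bmi_check and the variable names m, kg and bmi are chosen here for illustration only; the conversion factors are the ones used above.

```sas
/* Compute BMI per student, using the same subset as the report */
data bmi_check;
  set sashelp.class;
  where name contains 'J';
  m   = height * 0.0254;       /* inches to meters */
  kg  = weight * 0.45359237;   /* pounds to kilograms */
  bmi = kg / m**2;
run;

/* Mean of the individual BMIs per sex, next to the mean height and weight */
proc means data=bmi_check mean maxdec=2;
  class sex;
  var bmi m kg;
run;
```

The mean of bmi from this step is the value the summary line is supposed to show; dividing the mean kg by the squared mean m reproduces the value from the first report instead.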
Is there a way to let proc report itself do the averaging correctly?
You could say I want to do analysis on a computed column, but you can only define a column as an analysis column if it is in the input data set.
Create an alias column of an existing data set column.
Redo the BMI computation for the alias column.
In the summary line, apply the alias column's mean to the BMI column.
In this example, the column alias weight=bmiX is used.
proc report data=sasHelp.class nowindows headline headskip split='*'
style(summary) = {font_style=italic foreground=blue};
where Name contains 'J'; * reduce the size to facilitate manual calculation ;
columns sex name age height m weight kg weight=bmiX BMI ;
define sex / group ;
define age / analysis mean format = 6.2;
define height / analysis mean ;
define weight / analysis mean ;
define kg / computed format = 6.2 'Weight*(kg)';
define m / computed format = 6.2 'Height*(meter)';
define BMI / computed format = 6.2 'BMI*(kg/m²)';
* define bmiX / noprint;
compute m;
m = height.mean * .02540;
endcomp;
compute kg;
kg = weight.mean * 0.45359237;
endcomp;
compute BMI;
BMI = kg/m/m;
if name eq '' then do;
name = 'mean';
BMI = bmiX;
end;
endcomp;
compute bmiX;
bmiX = kg/m/m;
endcomp;
break after sex /summarize;
run;
I have been trying to create a demographic table like the one below, but I can't seem to append the different tables. Please advise on where I can make adjustments in the code.
            Group A                                     Group B
            cohort 1  cohort 2  cohort 3  subtotal      cohort 4  cohort 5  cohort 6  subtotal
Age
  n
  mean
  sd
  median
  min
Gender
  n
  female
  male
Race
  n
  white
  asian
  hispanic
  black
My Code:
PROC FORMAT;
value content
1=' '
2='Age'
3='Gender'
4='Race';
value sex
1=' n'
2=' female'
3=' male';
value race
1=' n'
2=' white'
3=' asian'
4=' hispanic'
5=' black';
value stat
1=' n'
2=' Mean'
3=' Std. Dev.'
4=' Median'
5=' Minimum';
RUN;
DATA testtest;
SET test.test(keep = id group cohort age gender race);
RUN;
data tottest;
set testtest;
output;
if prxmatch('m/COHORT 1|COHORT 2|COHORT 3/oi', cohort) then do;
cohort='Subtotal';
output;
end;
if prxmatch('m/COHORT 4|COHORT 5|COHORT 6/oi', cohort) then do;
cohort='Subtotal';
output;
end;
run;
data count;
if 0 then set testtest nobs=npats;
call symput('npats',put(npats,1.));
stop;
run;
proc freq data=tottest;
tables cohort /out=patk0 noprint;
tables cohort*sex /out=sex0 noprint;
tables cohort*race /out=race0 noprint;
run;
PROC MEANS DATA = testtest n mean std min median;
class cohort;
VAR age;
RUN;
I know that I would have to transpose it and put it in a report. But before I do that, how do I get the statistics out of my proc means, proc freq, etc. and into data sets?
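For the PROC MEANS part, one common approach is the OUTPUT statement, which writes the requested statistics to a data set instead of the listing. The sketch below assumes the testtest data set and the cohort and age variables from the code above; the output data set name age_stats and the statistic variable names are made up for illustration. The PROC FREQ step already does the equivalent through OUT= on each TABLES statement.

```sas
/* Write the age statistics per cohort to a data set.       */
/* NWAY keeps only the rows for the full CLASS combination, */
/* NOPRINT suppresses the printed report.                   */
proc means data=testtest noprint nway;
  class cohort;
  var age;
  output out=age_stats(drop=_type_ _freq_)
         n=n mean=mean std=sd median=median min=min;
run;

/* Inspect the result before transposing it for the report */
proc print data=age_stats noobs;
run;
```

Running the same step on tottest rather than testtest would also include the 'Subtotal' rows created earlier.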
I have a Proc Tabulate output with row percentages, and I want a total count of all respondents for each summary variable. The closest I get is that it adds a row below each variable for the count, but I really only need an additional column at the end that represents the total count.
PROC TABULATE DATA = CHSS2017 f=10.2 S=[foreground=black just=c cellwidth=75];
CLASS EDUC
AREA
AGE
SEX
CENRACE
POVERTY
EDUC
INSURE
HEALTH
Q21;
CLASSLEV EDUC / style=[font_weight=medium background=colfmt.];
CLASSLEV AREA / style=[font_weight=medium background=colfmt.];
CLASSLEV AGE / style=[font_weight=medium background=colfmt.];
CLASSLEV SEX / style=[font_weight=medium background=colfmt.];
CLASSLEV CENRACE / style=[font_weight=medium background=colfmt.];
CLASSLEV POVERTY / style=[font_weight=medium background=colfmt.];
CLASSLEV INSURE / style=[font_weight=medium background=colfmt.];
CLASSLEV HEALTH / style=[font_weight=medium background=colfmt.];
CLASSLEV Q21;
TABLE AREA = 'Area in Region' * (ROWPCTN=' '*f=PCTF.)
AGE = 'Age' * (ROWPCTN=' '*f=PCTF.)
SEX * (ROWPCTN=' '*f=PCTF.)
CENRACE = 'Race' * (ROWPCTN=' '*f=PCTF.)
POVERTY = 'Poverty Status' * (ROWPCTN=' '*f=PCTF.)
EDUC * (ROWPCTN=' '*f=PCTF.)
INSURE * (ROWPCTN=' '*f=PCTF.)
HEALTH * (ROWPCTN=' '*f=PCTF.), Q21 = ' ' ALL*f=8;
RUN;
I keep trying to play around with adding "*n" or "*all" to the summary variables (Area, Sex, Age), but only get errors. My desired output should look like the image, except that the "Count" column should show the total count rather than 100 or 100%.
Full table picture
data WORK.CLASS(label='Survey Data');
infile datalines truncover;
input age:3. sex:3. cenrace:3. q21:3. regionwt:16.;
datalines;
5 4 2 2 0.1214634338
5 3 2 2 1.1946976229
7 4 2 2 0.6734857715
7 4 2 2 2.5191297921
5 3 2 1 0.2390983852
;;;;
Without data it's hard to answer, but try this:
TABLE AREA = 'Area in Region' * (ROWPCTN=' '*f=PCTF.)
AGE = 'Age' * (ROWPCTN=' '*f=PCTF.)
SEX * (ROWPCTN=' '*f=PCTF.)
CENRACE = 'Race' * (ROWPCTN=' '*f=PCTF.)
POVERTY = 'Poverty Status' * (ROWPCTN=' '*f=PCTF.)
EDUC * (ROWPCTN=' '*f=PCTF.)
INSURE * (ROWPCTN=' '*f=PCTF.)
HEALTH * (ROWPCTN=' '*f=PCTF.)
ALL , Q21 = ' ';
EDIT: Re-reading your question, I think this is what you want. Add N after ROWPCTN inside the same parentheses, separated by a space rather than an asterisk, so that both statistics are produced. You can add a format if needed. You will likely need to do this for each variable as well.
AREA = 'Area in Region' * (ROWPCTN=' '*f=PCTF. N*f=8.)
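If the count should appear as one extra column at the end rather than as an extra row, an alternative sketch is to move the statistics into the column dimension and concatenate ALL*N after the Q21 percentages. This assumes the CHSS2017 data set, the class variables, and the PCTF. picture format from the question; only three of the row variables are shown, and the 'Count' label is a placeholder.

```sas
proc tabulate data=chss2017;
  class area age sex q21;
  table
    /* row dimension: class variables only, no statistics */
    (area='Area in Region' age='Age' sex),
    /* column dimension: row percentages by Q21, then one count column */
    q21=' ' * rowpctn=' ' * f=pctf.
    all='Count' * n=' ' * f=8.
  ;
run;
```

PROC TABULATE allows statistics in only one dimension of a table, which is why ROWPCTN is removed from the row expression here.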
I have the following proc report
proc report data=sashelp.class;
col
sex
age
weight
;
define sex / group;
define age / group;
define weight / analysis sum;
run;
However, I do not want to show the sum of weight. Instead I would like to show each row's share of the summed weight for its sex. So the first row should be 6.23%. How can I achieve this?
Now I have found a workaround:
proc sql noprint;
CREATE TABLE class AS
SELECT a.*
,b.sumweight
FROM sashelp.class a
LEFT JOIN (SELECT sex, sum(weight) as sumweight
FROM sashelp.class
GROUP BY sex
) b
ON a.sex=b.sex
;
quit;
proc report data=class;
col
sex
age
weight
sumweight
perc
;
define sex / group;
define age / group;
define weight / analysis sum;
define sumweight / analysis mean noprint;
define perc / computed format=percent6.2;
compute perc;
perc = weight.sum/sumweight.mean;
endcomp;
run;
But maybe there is a solution without additional proc sql step...
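One possible way to avoid the extra step, sketched here under the assumption that the share is wanted within each sex, is to capture the group total in a COMPUTE BEFORE block and divide by it in the computed column. The temporary variable name sex_total is made up for this example.

```sas
proc report data=sashelp.class;
  column sex age weight perc;
  define sex    / group;
  define age    / group;
  define weight / analysis sum;
  define perc   / computed format=percent8.2;

  /* At the start of each sex group, weight.sum holds the group total. */
  /* sex_total is a temporary variable and keeps its value across rows. */
  compute before sex;
    sex_total = weight.sum;
  endcomp;

  /* On each detail row, weight.sum is the sum for that sex/age cell */
  compute perc;
    perc = weight.sum / sex_total;
  endcomp;
run;
```

Because perc is listed to the right of weight in the COLUMN statement, weight.sum is already available when the perc compute block runs.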