change the order of bars on sas graph - sas

I have a macro that produces a graph as in image graph. I want to change the order of the bars to Dev 201601 201602 201603 201604 201605 instead of
201601 201602 201603 201604 201605 Dev. Here is my code:
goptions reset=global gunit=pct border cback=white
colors=(black red green blue) ftitle=swissb
ftext=ZAPFB htitle=5 htext=1.7
offshadow=(1.5,1.5);
axis1 label=none;
axis2 label=none value=none
order=(0 to 110 by 10)
major=none
minor=none;
legend1 cborder=black cblock=CXF0E68C label=none
shape=bar(3,3)
POSITION=(bottom center);
pattern1 color=vpapb;
pattern2 color=bigy;
pattern3 color=paoy;
pattern4 color=vligb;
pattern5 color=lipk;
pattern6 color=vpap;
pattern7 color=pab;
pattern8 color=lio;
proc gchart data= Recent.char_&value;
title ;
vbar Period / sumvar=Percent
subgroup=&value
inside=subpct
width=12
space=8
maxis=axis1
raxis=axis2
legend=legend1
cframe=BWH
coutline=black;
What should I do to solve the problem?

The easiest way to do this is to make the bar variable a numeric ordering value and use a format to display the label you actually want. So here I have 0 = Dev, 1 = 201601, 2 = 201602, etc. I use a picture format to get there because the mapping is so regular, but you could use a regular value format if there's not a simple logical mapping like this.
data test_data;
input period value; /* you may need to convert from '201601' to '1' if your data already has 201601 type values; use an informat (opposite of format). */
datalines;
1 10
2 20
3 25
4 25
5 30
0 20
;;;;
run;
proc format;
picture periodF
0 = 'Dev' /* map 0 to Dev */
1-12 = '99' (prefix='2016'); /* for values 1-12, display in 2 digit zero padded format, prefixed with 2016 */
run;
proc sgplot data=test_data;
format period periodf.; /*apply format here or in an earlier datastep */
vbar period/response=value;
run;
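The regular value format mentioned above can be sketched as follows, and it is a reasonable fallback if the picture format does not render the 2016 prefix on your system. This is a minimal sketch under assumptions: the data set names (have_chart, chart_ordered), the variable period_char, and the format names (period_in, periodv) are all hypothetical, and the existing data is assumed to hold the period as a character value such as 'Dev' or '201601'. An informat turns the labels into ordering numbers, and a value format turns them back into labels for display.

```sas
/* Sketch only: all data set, variable and format names are assumptions. */
data have_chart;
  input period_char $ value;
  datalines;
201601 10
201602 20
201603 25
201604 25
201605 30
Dev 20
;
run;

proc format;
  /* informat: label -> ordering number */
  invalue period_in
    'Dev'    = 0
    '201601' = 1
    '201602' = 2
    '201603' = 3
    '201604' = 4
    '201605' = 5;
  /* format: ordering number -> label */
  value periodv
    0 = 'Dev'
    1 = '201601'
    2 = '201602'
    3 = '201603'
    4 = '201604'
    5 = '201605';
run;

data chart_ordered;
  set have_chart;
  period = input(period_char, period_in.); /* numeric value that drives the bar order */
  format period periodv.;                  /* label shown on the axis */
run;

proc sgplot data=chart_ordered;
  vbar period / response=value;
run;
```

The same formatted variable should also work with the original PROC GCHART call if the DISCRETE option is added to the VBAR statement, so that each numeric value gets its own bar.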

Related

Order that variables appear in plot output of proc freq

I have created a frequency plot using the plots option in proc freq. However, I am not able to get the order that I want. I have categories such as '5 to 10 weeks', 'Greater than 25 weeks', '10 to 15 weeks', '15 to 20 weeks'. I want them to appear in the logical order of increasing weeks, but I'm not sure how to do that. I tried using the order option but nothing seemed to fix that.
A possible solution would be to code the order I want as values of 1-5, order them using the order= option and then have a label for 1-5. But I'm not sure if that's possible.
I tried the order= option; however, that didn't fix the issue.
I want the bins to show up as 'less than 5 weeks' '5 to 10 weeks' '10 to 15 weeks' '15 to 20 weeks' '20 to 25 weeks' 'greater than 25 weeks'
When the Proc FREQ plot displays the tabled variable's values in alphabetical order and the option order= is not specified, you have the following scenario:
variable is character
display order is default (INTERNAL)
Note: Other frequency plotting techniques, such as SGPLOT VBAR, accept an axis specification that can control the explicit order in which the character values appear. Proc FREQ does not have a plot option for specifying the axis values.
You are correct in presuming that an inverse map (or remap, or unmap) from label to a desired ordered value is essential. There are two main ways to remap:
custom format to map label to a character value (via PUT)
custom informat to map label to a numeric value (via INPUT)
Once you have remapped the labels to a value, you need a second custom format to map the values back to the original labels.
Example:
* format to map unmapped labels back to original labels;
proc format;
value category
1 = 'Less than 5 weeks'
2 = '5 to 10 weeks'
3 = '10 to 15 weeks'
4 = '15 to 20 weeks'
5 = '20 to 25 weeks'
6 = 'Greater than 25 weeks'
;
* informat to unmap labels to numeric with desired freq plot order;
invalue category_to_num
'Less than 5 weeks' = 1
'5 to 10 weeks' = 2
'10 to 15 weeks' = 3
'15 to 20 weeks' = 4
'20 to 25 weeks' = 5
'Greater than 25 weeks' = 6
;
* generate sample data;
data have;
do itemid = 1 to 500;
cat_num = rantbl(123,0.05,0.35,0.25,0.15,0.07); * for demonstration purposes;
cat_char = put(cat_num, category.); * your actual category values;
output;
end;
run;
* demonstration: numeric category (unformatted) goes ascending internal order;
proc freq data=have;
table cat_num / plots=freqplot(scale=percent) ;
run;
* demonstration: numeric category (formatted) in desired order with desired category text;
proc freq data=have;
table cat_num / plots=freqplot(scale=percent) ;
format cat_num category.;
run;
* your original plot showing character values being ordered alphabetically
* (as is expected from default order=internal);
proc freq data=have;
table cat_char / plots=freqplot(scale=percent) ;
run;
* unmap the category texts to numeric values that are ordered as desired;
data have_remap;
set have;
cat_numX = input(cat_char, category_to_num.);
run;
* table the numeric values computed during unmap, using format to display
* the desired category texts;
proc freq data=have_remap;
table cat_numX / plots=freqplot(scale=percent) ; * <-- cat_numX ;
format cat_numX category.; * <-- format ;
run;

Multiple transactions lines to base table SAS

I am new to SAS and am trying to handle some customer data, and I'm not really sure how to do this.
What I have:
data transactions;
input ID $ Week Segment $ Average Freq;
datalines;
1 1 Sports 500 2
1 1 PC 400 3
1 2 Sports 350 3
1 2 PC 550 3
2 1 Sports 650 2
2 1 PC 700 3
2 2 Sports 720 3
2 2 PC 250 3
;
run;
What I want:
data transactions2;
input ID Week1_Sports_Average Week1_PC_Average Week1_Sports_Freq
Week1_PC_Freq
Week2_Sports_Average Week2_PC_Average Week2_Sports_Freq Week2_PC_Freq;
datalines;
1 500 400 2 3 350 550 3 3
2 650 700 2 3 720 250 3 3
;
run;
The only thing I got so far is this:
Data transactions3;
SET transactions;
if week=1 and Segment="Sports" then DO;
Week1_Sports_Freq=Freq;
Week1_Sports_Average=Average;
END;
else DO;
Week1_Sports_Freq=0;
Week1_Sports_Average=0;
END;
run;
This will be way too much work, as I have a lot of weeks and more variables than just freq/avg.
I'm really hoping for some tips, as I'm stuck.
You can use PROC TRANSPOSE to create that structure. But you need to use it twice since your original dataset is not fully normalized.
The first PROC TRANSPOSE will get the AVERAGE and FREQ readings onto separate rows.
proc transpose data=transactions out=tall ;
by id week segment notsorted;
var average freq ;
run;
If you don't mind having the variables named slightly differently than in your proposed solution you can just use another proc transpose to create one observation per ID.
proc transpose data=tall out=want delim=_;
by id;
id segment _name_ week ;
var col1 ;
run;
If you want the exact names you had before, you could add a data step that first creates a variable you can use in the ID statement of PROC TRANSPOSE.
data tall ;
set tall ;
length new_name $32 ;
new_name = catx('_',cats('WEEK',week),segment,_name_);
run;
proc transpose data=tall out=want ;
by id;
id new_name;
var col1 ;
run;
Note that a numbered series of variables is easier to work with in SAS when the number appears at the end of the name, because you can then use a variable list. So instead of WEEK1_AVERAGE, WEEK2_AVERAGE, ... you would use WEEK_AVERAGE_1, WEEK_AVERAGE_2, ..., which lets you write a variable list like WEEK_AVERAGE_1 - WEEK_AVERAGE_5 in your SAS code.
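As a small illustration of why the trailing number helps, here is a sketch with made-up data. The data set name demo and the variables week_average_1 to week_average_3 are hypothetical and only serve to show a numbered range list in use.

```sas
/* Sketch only: data set, variable names and values are made up for illustration. */
data demo;
  /* a numbered range list works directly in the INPUT statement */
  input id week_average_1-week_average_3;
  /* ...and in functions through the OF keyword */
  total_average = sum(of week_average_1-week_average_3);
  datalines;
1 500 350 410
2 650 720 300
;
run;

proc print data=demo;
  var id week_average_1-week_average_3 total_average;
run;
```

With names like WEEK1_AVERAGE, WEEK2_AVERAGE the number sits in the middle, so this kind of range list is not available and each variable has to be listed by name.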

2x2 table with frequency and overall percentage in SAS - proc tabulate?

I have a set of pre and post scores, with values that can be 1 or 2, e.g.:
Pre Post
1 2
1 1
2 2
2 1
1 2
2 1
etc.
I need to create a 2x2 table that lists the frequencies, with percentages ONLY in the total row/column:
1 2 Total
1 14 60 74 / 30%
2 38 12 50 / 20%
Total 52 / 21% 72 / 29% 248
It doesn't need to be formatted specifically with the / between the n and percent, they can be on different lines. I just need to make sure the total percentages (no cumulative percentages) are in the table.
I think that I should use proc tabulate to get this, but I'm new to SAS and haven't been able to figure it out. Any help would be greatly appreciated.
Code I've tried:
proc tabulate data=.bilirubin order=data;
class pre ;
var post ;
table pre , post*( n colpctsum);
run;
You could make your own report. For example you could use PROC SUMMARY to get frequencies. Add a data step to calculate the percent and generate a character variable with the text you want to display. Then use PROC REPORT to display it.
proc summary data=have ;
class pre post ;
output out=summary ;
run;
proc format ;
value total .='Total (%)';
run;
data results ;
set summary ;
length display $20 ;
if _type_=0 then n=_freq_;
retain n;
if _type_ in (0,3) then display = put(_freq_,comma9.);
else display = catx(' ',put(_freq_,comma9.),cats('(',put(_freq_/n,percent8.2),')'));
run;
proc report missing data=results ;
column pre display,post n ;
define pre / group ;
define post / across ;
define n / noprint ;
define display / display ' ';
format pre post total.;
run;
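To see what the DATA step above is working with, it can help to print the PROC SUMMARY output for a small sample first. The sketch below builds a have data set from the six example rows in the question (an assumption, since the answer does not show its input) and lists the _TYPE_ values that the DATA step tests.

```sas
/* Sketch only: the sample rows are taken from the question, not from real data. */
data have;
  input pre post;
  datalines;
1 2
1 1
2 2
2 1
1 2
2 1
;
run;

proc summary data=have;
  class pre post;
  output out=summary;
run;

/* With CLASS pre post, PROC SUMMARY writes these row types:
   _TYPE_ = 0 : grand total (pre and post both missing)
   _TYPE_ = 1 : totals by post only (pre missing)
   _TYPE_ = 2 : totals by pre only (post missing)
   _TYPE_ = 3 : one row per pre*post combination */
proc print data=summary noobs;
  var _type_ pre post _freq_;
run;
```

This is why the DATA step shows only the count for _TYPE_ 0 and 3, and adds the percentage of the grand total for the marginal rows (_TYPE_ 1 and 2).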

Two Way Transpose SAS Table

I am trying to create a two way transposed table. The original table I have looks like
id cc
1 2
1 5
1 40
2 55
2 2
2 130
2 177
3 20
3 55
3 40
4 30
4 100
I am trying to create a table that looks like
CC CC1 CC2… …CC177
1 264 5 0
2 0 132 6
…
…
177 2 1 692
In other words, how many IDs that have cc1 also have cc2, ..., cc177, etc.
The values under ID are not counts; an ID can be 3 to 5 digits long, or a mix of digits and letters such as 122345ab78.
Is it possible to display the percentages next to the counts?
CC CC1 % CC2 %… …CC177
1 264 100% 5 1.9% 0
2 0 132 6
…
…
177 2 1 692
If I want to change CC1, CC2, ... to character names, how do I modify the arrays?
Eventually, I would like my table to look like
CC Dell Lenovo HP Sony
Dell
Lenovo
HP
Sony
The order of the names must match the CC numbers I provided above: CC1=Dell, CC2=Lenovo, etc. I would also like to add percentages to the matrix. If Dell X Dell = 100 and Dell X Lenovo = 25, then Dell X Lenovo = 25%.
This changes your data structure to a wide format with an indicator for each value of CC and then uses proc corr (correlation) to create the summary table.
Proc Corr will generate the SSCP, which is the uncorrected sum of squares and crossproducts. It's related to correlation, but the gist is that it creates the table you're looking for. The table is output in the SAS results window, and the ODS OUTPUT statement captures it in a dataset called coocs.
data temp;
set have;
by ID;
retain CC1-CC177;
array CC_List(177) CC1-CC177;
if first.ID then do i=1 to 177;
CC_LIST(i)=0;
end;
CC_List(CC)=1;
if last.ID then output;
run;
ods output sscp=coocs;
ods select sscp;
proc corr data=temp sscp;
var CC1-CC177;
run;
proc print data=coocs;
run;
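For the percentage part of the question, one possible follow-up is to divide each row of coocs by its diagonal count. This is only a sketch under assumptions: it assumes coocs has one row per variable in the same order as the VAR list (so row n belongs to CCn) and that the count columns are named CC1-CC177. The names coocs_pct and PCT1-PCT177 are hypothetical.

```sas
/* Sketch only: assumes row _N_ of coocs corresponds to CC_N_. */
data coocs_pct;
  set coocs;
  array cnt(177) CC1-CC177;   /* co-occurrence counts from PROC CORR */
  array pct(177) PCT1-PCT177; /* share of IDs with this row's CC that also have CCj */
  diag = cnt(_n_);            /* diagonal = number of IDs that have this row's CC */
  do j = 1 to 177;
    if diag > 0 then pct(j) = cnt(j) / diag;
  end;
  format PCT1-PCT177 percent8.1;
  drop j diag;
run;
```

Because the indicators are 0/1, the diagonal of the SSCP table is the number of IDs with that CC, so the diagonal percentage comes out as 100% whenever the CC occurs at all.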
Here's another answer, but it's inefficient and has its own issues. For one, if a value is not anywhere in the list it will not show up in the results, i.e. if there is no 20 in the dataset there will be no 20 in the final data. Also, the variables are out of order in the final dataset.
proc sql;
create table bigger as
select a.id, catt("CC", a.cc) as cc1, catt("CC", b.cc) as cc2
from have as a
cross join have as b
where a.id=b.id;
quit;
proc freq data=bigger noprint;
table cc1*cc2/ list out=bigger2;
run;
proc transpose data=bigger2 out=want2;
by cc1;
var count;
id cc2;
run;

How to create nice tables using PROC REPORT and ODS RTF output

I want to create a 'nice looking table' using the SAS ODS RTF output and the PROC REPORT procedure. After spending the whole day on Google I've managed to produce the following:
The dataset
DATA survey;
INPUT id var1 var2 var3 var4 var5 var6 ;
DATALINES;
1 1 35 17 7 2 2
17 1 50 14 5 5 3
33 1 45 6 7 2 7
49 1 24 14 7 5 7
65 2 52 9 4 7 7
81 2 44 11 7 7 7
2 2 34 17 6 5 3
18 2 40 14 7 5 2
34 2 47 6 6 5 6
50 2 35 17 5 7 5
;
RUN;
DATA survey;
SET survey;
LABEL var1 ='Variable 1';
LABEL var2 ='Fancy variable 2';
LABEL var3 ='Another variable no 3';
RUN;
LIBNAME mylib 'C:\my_libs';
RUN;
PROC FORMAT LIBRARY = mylib.survey;
VALUE groups 1 = 'Group A'
2 = 'Group B'
;
OPTIONS FMTSEARCH = (mylib.survey);
DATA survey;
SET survey;
FORMAT var1 groups.;
RUN;
** The code for creating the rtf-file **
ods listing close;
ods escapechar = '^';
ods noproctitle;
options nodate number;
footnote;
ODS RTF FILE = 'C:\my_workdir\output.rtf'
author = 'NN'
title = 'Table 1 name'
bodytitle
startpage = no
style = journal;
options papersize = A4
orientation = landscape;
title1 /*bold*/ /*italic*/ font = 'Times New Roman' height = 12pt justify = center underlin = 0 color = black bcolor = white 'Table 1 name';
footnote1 /*bold*/ /*italic*/ font = 'Times New Roman' height = 9pt justify = center underlin = 0 color = black bcolor = white 'Note: Created on January 2012';
PROC REPORT DATA = survey nowindows headline headskip MISSING
style(header) = {/*font_weight = bold*/ font_face = 'Times New Roman' font_size = 12pt just = left}
style(column) = {font_face = 'Times New Roman' font_size = 12pt just = left /*asis = on*/};
COLUMN var1 var1=var1_n var1=var1_pctn;
DEFINE var1 / GROUP ORDER=FREQ DESCENDING 'Variable';
DEFINE var1_n / ANALYSIS N 'Data/(N=)';
DEFINE var1_pctn / ANALYSIS PCTN format = percent8. '';
RUN;
ODS RTF CLOSE;
This generates an RTF table in Word something like the following (a little simplified):
However, I want to add a variable label 'Variable 1, n (%)' above the groups in the variable name column as a separate row (NOT in the header row). I also want to add additional variables and statistics in an aggregated table.
In the end, I want something that looks like this:
I have tried "everything" - is there anyone who knows how to do this?
I know this has been open for awhile, but I too was struggling with this for awhile, and this is what I figured out. So...
In short, SAS has trouble outputting nicely formatted tables that contain more than one type of table "format" in them. For instance, a table where the columns change midway through (like you commonly find in the "Table 1" of a research study describing the study population).
In this case, you're trying to use PROC REPORT, but I don't think it's going to work here. What you want to do is stack two different reports on top of each other, really. You're changing the column value midway through and SAS doesn't natively support that.
Some alternative approaches are:
Perform all your calculations and carefully output them to a data set in SAS, in the positions you want. Then, use PROC PRINT to print them. This is what I can only describe as a tremendous effort.
Create a new TAGSET that allows you to output multiple files, but removes the spacing between each one and aligns them to the same width, effectively creating a single table. This is also quite time consuming; I attempted it using HTML with a custom CSS file and tagset, and it wasn't terribly easy.
Use a different procedure (in this case, PROC TABULATE) and then manually delete the spacing between each table and fiddle with the width to get a final table. This isn't fully automated, but it's probably the quickest option.
PROC TABULATE is cool because you can use multiple table statements in a single example. Below, I put some code in that shows what I'm talking about.
DATA survey;
INPUT id grp var1 var2 var3 var4 var5;
DATALINES;
1 1 35 17 7 2 2
17 1 50 14 5 5 3
33 1 45 6 7 2 7
49 1 24 14 7 5 7
65 2 52 9 4 7 7
81 2 44 11 7 7 7
2 2 34 17 6 5 3
18 2 40 14 7 5 2
34 2 47 6 6 5 6
50 2 35 17 5 7 5
;
RUN;
I found your example code to be a little confusing; var1 looked like a grouping variable, and var2 looked like the first actual analysis variable, so I slightly changed the code. Next, I quickly created the same format you were using before.
PROC FORMAT;
VALUE groupft 1 = 'Group A' 2 = 'Group B';
RUN;
DATA survey;
SET survey;
LABEL var1 ='Variable 1';
LABEL var2 ='Fancy variable 2';
LABEL var3 ='Another variable no 3';
FORMAT grp groupft.; /* the grouping variable is now grp, not var1 */
RUN;
Now, the meat of the PROC TABULATE statement.
PROC TABULATE DATA=survey;
CLASS grp;
VAR var1--var5;
TABLE MEDIAN QRANGE,var1;
TABLE grp,var2*(N PCTN);
RUN;
TABULATE basically uses commas and asterisks to separate things. The default for something like grp*var1 is an output where the first variable forms the columns and the second is nested beneath it as subcolumns. To add rows, you use a comma; to specify which statistics you want, you add a keyword.
The code above gets you something close to what you had in your first example (not ODS formatted, but I figure you can add that back in); it's just in two different tables.
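To make the comma and asterisk rules concrete, here is a small sketch that reuses the survey data set and the grp and var1 variables defined earlier in this answer. The choice of MEAN and N as statistics is only for illustration.

```sas
/* Sketch only: contrasts one-dimensional and two-dimensional TABLE statements. */
PROC TABULATE DATA=survey;
CLASS grp;
VAR var1;
/* no comma: a single (column) dimension, var1 mean nested under each group */
TABLE grp*var1*MEAN;
/* one comma: grp moves to the rows, var1 statistics stay in the columns */
TABLE grp, var1*(N MEAN);
RUN;
```

Each TABLE statement produces its own table, which is what makes TABULATE convenient for stacking differently shaped tables in one step.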
I found the following papers useful when I was tackling this problem:
http://www.lexjansen.com/pharmasug/2005/applicationsdevelopment/ad16.pdf
http://www2.sas.com/proceedings/sugi31/089-31.pdf
1. ODS has some interesting formatting features (like aligning the numbers so a decimal point goes at the same column) but their usefulness is limited for more complex cases. The most flexible solution is to create a formatted string yourself and bypass PROC REPORT's formatting facility completely, like:
data out;
length str $25;
set statistics;
varnum = 1;
group = 1;
str = put( median, 3. );
output;
group = 2;
str = put( q1, 3. ) || " - " || put( q3, 3. );
output;
run;
You can set varnum and group as ORDER variables in PROC REPORT and add headings like "Variable 1" or "Fancy variable 2" via a COMPUTE BEFORE block with a LINE statement.
2. To further keep PROC REPORT from messing up the layout in ODS RTF output, consider re-enabling the ASIS style option:
define str / "..." style( column ) = { asis= on };
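Putting the two points together, a PROC REPORT step along these lines might look like the sketch below. It assumes the out data set built above, with varnum, group and str; the heading variable, its texts and the column header 'Statistic' are hypothetical.

```sas
/* Sketch only: assumes the OUT data set from the DATA step above. */
proc report data=out nowd;
  column varnum group str;
  define varnum / order noprint;  /* controls which variable block comes first */
  define group  / order noprint;  /* controls the row order within a block */
  define str    / display 'Statistic' style(column)={asis=on};
  compute before varnum;
    /* build the heading text for this block, then write it as its own row */
    length heading $40;
    if varnum = 1 then heading = 'Variable 1';
    else if varnum = 2 then heading = 'Fancy variable 2';
    line @1 heading $40.;
  endcomp;
run;
```

The LINE statement always executes when the compute block runs, so the conditional logic only sets the text and the LINE itself stays outside the IF.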