I have the following data
DATA HAVE;
input year dz $8. area;
cards;
2000 stroke 08
2000 stroke 06
2000 stroke 06
;
run;
After running PROC FREQ:
proc freq data=have;
table area*dz/ list nocum ;
run;
I get the output below.
I want to replace any value below 5 in the "frequency" column with "<5", and any 0 in the "percent" column with "0".
I have the following PROC FORMAT code:
proc format;
picture count (round)
0-4 = ' <5' (NOEDIT);
picture pcnt (round)
0 = ' - '
other = '009.9%';
run;
But I don't understand how to use it in a data step to get the desired result. Please advise.
Thanks!
You can save the PROC FREQ result as a dataset and then report it however you like.
proc freq data=have noprint;
table area*dz/ list nocum out = want;
run;
proc print;
format COUNT count. PERCENT pcnt.;
run;
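Since the question asks about the data step specifically, here is a minimal sketch of that route. It assumes the count. and pcnt. formats from the question have already been created and that want is the OUT= dataset from the PROC FREQ step above. The names want_fmt, count_c and percent_c are made up for illustration.

```sas
/* Assumes PROC FORMAT from the question has run (count. and pcnt. exist) */
/* and that WANT was created by PROC FREQ with OUT=want.                  */
data want_fmt;                         /* hypothetical output name */
  set want;
  /* PUT applies a format and returns the formatted text */
  count_c   = put(count, count.);     /* hypothetical variable name */
  percent_c = put(percent, pcnt.);    /* hypothetical variable name */
  drop count percent;
run;

proc print data=want_fmt noobs;
run;
```

The new variables are character, so this is only suitable for display. Keep the numeric COUNT and PERCENT if further calculations are needed.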
If you want the PROC FREQ output itself to look the way you want, without an extra report or print procedure, you need a little more code. The answer lies in ODS styles, and there is a very good article on the topic: Using Styles and Templates to Customize SAS® ODS Output
I have the following data
DATA HAVE;
input year dz area;
cards;
2000 1 08
2000 1 06
2000 1 06
;
run;
proc freq data=have;
table area*dz / norow nocol;
run;
I get the following output
I would like to format it to put frequency in one column and percent in another column and I don't want the total column. Is there a way to do it?
Thank you!
Try adding the LIST option to get a different layout:
proc freq data=have;
table area*dz / norow nocol LIST;
run;
Pipe it to a data set and format as desired:
proc freq data=have;
table area*dz / norow nocol LIST out=want;
run;
proc print data=want;run;
Use PROC TABULATE instead (not shown), which allows you more control over your layout and formats.
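For the PROC TABULATE option, a minimal sketch could look like the following. It assumes the have dataset from the question; the column labels 'Frequency' and 'Percent' are illustrative choices, not something the original answer specified.

```sas
/* Sketch only: one row per area*dz combination,            */
/* with the count and the percent of total as two columns.  */
proc tabulate data=have;
  class area dz;
  /* row dimension before the comma, column dimension after it */
  table area*dz, n='Frequency' pctn='Percent';
run;
```

No total row or column appears because the ALL keyword is not used in the TABLE statement.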
I have the following dataset :
data test;
input business_ID $;
datalines;
'busi1'
'busi1'
'busi1'
'busi2'
'busi3'
'busi3'
;
run;
proc freq data = test ;
table business_ID;
run;
I would like the average number of lines per business, that is, the total number of observations divided by the number of distinct businesses.
In my example: 6 observations, 3 businesses -> 6/3 = 2 lines per business.
I was thinking about using a PROC FREQ or a PROC MEANS step, but so far I only get the number of lines (the frequency) per business and do not know how to reach my goal.
Any idea?
You could use PROC FREQ to get the counts and then run PROC MEANS on the output.
proc freq data=test ;
tables business_id / noprint out=counts ;
run;
proc means data=counts;
var count;
run;
Or you could count them directly with PROC SQL code.
proc sql ;
select count(*)/count(distinct business_id) as mean_count
from test
;
quit;
I have the following statement
Proc Freq data =test;
tables gender;
run;
I want this to generate output based on a condition applied to the gender variable. For example: output a gender value only if its count is greater than 2.
How can I do this in SAS?
Thanks
If you mean an output dataset, you can put a where clause directly in the output dataset options.
Proc Freq data =sashelp.class;
tables sex/out=sex_freq(where=(count>9));
run;
I'm not aware of how you can accomplish this only using proc freq but you can redirect the output to a data set and then print the results.
proc freq data=test;
tables gender / noprint out=tmp;
run;
proc print data=tmp;
where count > 2;
run;
Alternatively you could use proc summary, but this still requires two steps.
proc summary data=test nway;
class gender;
output out=tmp(where=(_freq_ > 2));
run;
proc print data=tmp;
run;
I am running a FASTCLUS (fast clustering) procedure in SAS.
The frequency values in the Excel output appear in scientific notation, which I don't want. Which part of the code should I change? Below are a snippet of my code and the output I am getting.
%MACRO LOOP(Start,End);
%DO MAXC=&Start %TO &End;
ODS HTML PATH="J:\DIAC-CITI\Client_Data\CARDS\03 Analysis\Clustering\Pre-Optimization\Output\Cluster Outputs"
BODY="MaxCluster&MAXC..xls"
STYLE=DEFAULT;
ODS LISTING CLOSE;
OPTIONS NOLABEL;
PROC FASTCLUS
DATA=Pre_Modeling
OUT=test
MAXCLUSTERS=&MAXC
MAXITER=100
OUTSTAT=stat&maxc;
FREQ FREQUENCY;
WEIGHT REGIONAL_WTS;
VAR RISK_SCORE;
TITLE ' ';
RUN;
ODS HTML CLOSE;
ODS LISTING;
data stat&maxc;
set stat&maxc(rename=( _type_=type));
where type in('RSQ','PSEUDO_F','CCC');
run;
proc sort data=stat&maxc;
by type;
run;
proc transpose data=stat&maxc out=stat&maxc prefix=value&maxc.;
by type;
var over_all;
run;
%END;
%MEND LOOP;
%LOOP(1,30)
Output
Cluster Frequency
1 69
2 2295564
4 172098
6 6
9 6941
12 32
18 872126
8 4.56E+07
16 34347
17 1.98E+07
15 9568079
10 8824842
7 9669026
3 5855012
5 3353213
11 876159
13 313310
14 202065
19 33736
The frequency is displayed in scientific notation, which I don't want. If I change the cell to a number format in Excel, the value is rounded and the original number is lost.
Can anyone help me?
I tried adding a format to the code in two ways:
PROC FASTCLUS
DATA=Pre_Modeling
OUT=test
MAXCLUSTERS=22
MAXITER=100
OUTSTAT=stat22;
FREQ frequency format best16. ;
WEIGHT REGIONAL_WTS;
VAR RISK_SCORE;
TITLE ' ';
RUN;
PROC FASTCLUS
DATA=Pre_Modeling
OUT=test
MAXCLUSTERS=22
MAXITER=100
OUTSTAT=stat22;
FREQ format frequency best16. ;
WEIGHT REGIONAL_WTS;
VAR RISK_SCORE;
TITLE ' ';
RUN;
Neither version ran; both gave an error.
The FORMAT statement should go in the last step of your code, PROC TRANSPOSE:
proc transpose data=stat&maxc out=stat&maxc prefix=value&maxc.;
by type;
var over_all;
format over_all best16.;
run;
I have a SAS dataset similar to the one created here.
data have;
input date :date. count;
cards;
20APR2012 10
20APR2012 20
20APR2012 20
27APR2012 15
27APR2012 5
;
run;
proc sort data=have;
by date;
run;
I want to create a column containing the sum for each date, so it would look like
date total
20APR2012 50
27APR2012 20
I have tried using first. but I think my syntax is off. Thanks.
This is what proc means is for.
/* NWAY keeps only the per-date rows in the output dataset (no overall row) */
proc means data=have nway;
class date;
var count;
output out=want sum=total;
run;
The code below works to give you your desired result.
proc sql;
create table wanted_tab as
select
date format date9.,
sum(count) as Total
from have
group by date;
quit;
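Since the question mentions trying first., here is a sketch of how that data step approach could look. It assumes have is already sorted by date, as in the question; the output name want_ds is made up for illustration.

```sas
/* Sketch only: BY-group processing with FIRST. and LAST. */
/* HAVE must be sorted by date (done in the question).     */
data want_ds;                       /* hypothetical output name */
  set have;
  by date;
  if first.date then total = 0;     /* reset at the start of each date */
  total + count;                    /* sum statement: TOTAL is retained across rows */
  if last.date then output;         /* write one row per date */
  keep date total;
  format date date9.;
run;
```

With the sample data this yields one row per date: 20APR2012 with a total of 50 and 27APR2012 with a total of 20.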