A user here gave me the following code (SAS: PROC UNIVARIATE: Output trimmed mean to dataset) to calculate and output a winsorized mean to a dataset:
proc sort data=sashelp.class out=have;
by sex;
run;
ods trace on;
PROC UNIVARIATE DATA=have trimmed=0.05;
VAR age;
by sex;
ods output TrimmedMeans=trimmedMeans;
run;
ods trace off;
How can I output a new version of the sashelp.class dataset with ALL the observations for age winsorized, rather than calculating a winsorized mean by sex? I don't want to winsorize at the category level, because that would censor values that are outliers within a category but not necessarily outliers in the entire dataset.
Could you add a variable with a constant value and group by it?
This should solve the grouping issue.
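A minimal sketch of that suggestion, assuming a hypothetical constant variable named all (the variable name and the dataset name have_all are made up for illustration). Note that this produces one overall trimmed mean; it does not by itself cap the individual observations.

```sas
/* Add a constant so every observation falls into a single BY group */
data have_all;
  set sashelp.class;
  all = 1;  /* hypothetical grouping variable, same value on every row */
run;

/* A constant BY variable is trivially sorted, so no PROC SORT is needed */
proc univariate data=have_all trimmed=0.05;
  var age;
  by all;
  ods output TrimmedMeans=trimmedMeans;
run;
```

Dropping the BY statement entirely should give the same single-group result; the constant is only useful if downstream code expects a BY variable.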
I'm trying to export quartile information on a grouped dataset as a dataset in SAS. When I run this code, the printed table shows the correct information, but the dataset WORK.TOP_10_PERC contains only the default summary statistics (no quartiles). Does anyone know how I can export the CLASS variable (PDX) with its 25th and 75th percentiles? Thanks!
PROC MEANS DATA=WORK.TOP_10_DX P25 P75;
CLASS PDX;
VAR AmtPaid;
OUTPUT OUT = WORK.TOP_10_PERC;
RUN;
I like the STACKODS option, which makes the ODS output data set look like the default printed output.
proc means data=sashelp.class n p25 p75 stackods;
ods output summary=summary;
run;
proc print;
run;
You can use the OUTPUT statement with <statistics>= options.
PROC MEANS DATA=WORK.TOP_10_DX NOPRINT;
CLASS PDX;
VAR AmtPaid;
OUTPUT OUT = WORK.TOP_10_PERC P25=P25 P75=P75;
RUN;
Compared with ODS OUTPUT, the OUTPUT statement is much faster, but it is less flexible when there are multiple analysis variables or a BY statement.
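For the multiple-variable case, one option that may help is AUTONAME on the OUTPUT statement, which lets PROC MEANS generate the output variable names itself. This is a sketch on sashelp.class rather than the asker's data; the output dataset name pctl is made up for illustration.

```sas
/* Percentiles for two analysis variables, one row per class level */
proc means data=sashelp.class noprint nway;
  class sex;
  var height weight;
  /* AUTONAME builds names from variable and statistic, e.g. Height_P25 */
  output out=pctl p25= p75= / autoname;
run;

proc print data=pctl;
run;
```

NWAY keeps only the rows for the full CLASS combination, so the overall (_TYPE_=0) row is left out.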
I have the following statement
Proc Freq data =test;
tables gender;
run;
I want this to generate an output based on a condition applied to the gender variable. For example - if count of gender greater than 2 then output.
How can I do this in SAS?
Thanks
If you mean an output dataset, you can put a where clause directly in the output dataset options.
Proc Freq data =sashelp.class;
tables sex/out=sex_freq(where=(count>9));
run;
I'm not aware of how you can accomplish this only using proc freq but you can redirect the output to a data set and then print the results.
proc freq data=test;
tables gender / noprint out=tmp;
run;
proc print data=tmp;
where count > 2;
run;
Alternatively you could use proc summary, but this still requires two steps.
proc summary data=test nway;
class gender;
output out=tmp(where=(_freq_ > 2));
run;
proc print data=tmp;
run;
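If a single step is preferred, PROC SQL with a HAVING clause is another possibility. This is a different technique from the PROC FREQ and PROC SUMMARY answers above, shown here on sashelp.class with the same threshold as the earlier example.

```sas
/* Count rows per level and keep only levels above the threshold */
proc sql;
  select sex, count(*) as count
  from sashelp.class
  group by sex
  having count(*) > 9;
quit;
```

Adding a create table clause before the select would store the result as a dataset instead of printing it.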
I'm trying to make a bar chart using SAS. I have data containing many salaries, and I'd like to show a bar chart of their frequencies. I've made this:
ODS GRAPHICS ON;
PROC FREQ DATA=WORKERS.SORT ORDER=INTERNAL;
TABLES salaries / NOCUM SCORES=TABLE plots(only)=freq;
RUN;
ODS GRAPHICS OFF;
It works, but the problem is that I now see all (hundreds) of the salaries on the x-axis. I'd like to have just intervals of these salaries (say, 20) so that the chart is more readable. I can't find out how to do it. I've also tried this:
PROC CHART DATA=WORK.SORT;
vbar salaries;
RUN;
but that's a text representation of the chart, so I can't use it.
You can create a format and apply the format to the variable you want to group into buckets. Here's an example:
proc format ;
value myfmt
low - 13 = '13 and Under'
14 - high = '14 and Above';
run;
ODS GRAPHICS ON;
PROC FREQ DATA=sashelp.class ORDER=INTERNAL;
format age myfmt.;
TABLES age / NOCUM SCORES=TABLE plots(only)=freq;
RUN;
ODS GRAPHICS OFF;
Use PROC UNIVARIATE with the HISTOGRAM statement. http://support.sas.com/documentation/cdl/en/procstat/66703/HTML/default/viewer.htm#procstat_univariate_toc.htm
ods html;
proc univariate data=sashelp.cars noprint;
var msrp;
histogram;
run;
There are options for specifying bin size:
ods html;
proc univariate data=sashelp.cars noprint;
var msrp;
histogram / midpoints=30000 to 180000 by 30000;
run;
And just for completeness, I'll add another solution in case you want more control over the chart's appearance. Using the Graphics Template Language you can create some very nice looking charts.
The proc template statement defines how the chart will look. The sgrender runs the chart against the specified dataset. There's all kinds of options that are best explored in the online doc: http://support.sas.com/documentation/cdl/en/grstatgraph/65377/HTML/default/viewer.htm#p1sxw5gidyzrygn1ibkzfmc5c93m.htm
I've just taken the sample they provided and added the / nbins=20 option to have it automatically group into 20 bins. It also has options for start and end bin, bin size, etc.
proc template;
define statgraph histogram;
begingraph;
entrytitle "Histogram of Vehicle Weights";
layout overlay /
xaxisopts=(label="Vehicle Weight (LBS)")
yaxisopts=(griddisplay=on);
histogram weight / nbins=20;
endlayout;
endgraph;
end;
run;
proc sgrender data=sashelp.cars template=histogram;
run;
I want to turn the entire row red for people whose names begin with 'J'. Is this possible using proc print?
ods html file=odsout style=htmlblue ;
proc print data=sashelp.class noobs label;
var name age;
run;
ods html close;
I don't believe it's possible with PROC PRINT. PROC REPORT can generate the identical output but with the rows red, however.
Identical:
proc report data=sashelp.class nowd;
columns name age;
run;
With red:
proc report data=sashelp.class nowd;
columns name age;
compute name;
if substr(name,1,1)='J' then
call define(_row_, "style", "style=[backgroundcolor=red]");
endcomp;
run;
I would consider it somewhat cleaner to use a style definition, of course, but for a one-off sort of thing this is easy.
Is it usual that the Univariate Frequencies does not display the BY-variable while Univariate BasicMeasures does show the BY-variable?
In the example below I load in some data and want to show gas prices by zipcode. The output for PROC FREQ shows the BY-variable (zipcode) in the output as does the UNIVARIATE BasicMeasures. But the UNIVARIATE Frequencies is not showing the BY-variable in the output.
Am I doing something wrong? I've even set the templates to default, with the ODS PATH statement, in case the templates got messed up by other code (or other coders using same account).
DATA prices;
INPUT zipcode price;
DATALINES;
90066 3.10
90066 3.17
90066 3.26
98101 2.99
98101 3.06
98101 3.16
;
run;
proc sort;
by zipcode;
run;
ods path sashelp.tmplmst(read) ;
ods pdf file = "gasprices.pdf";
PROC FREQ data = prices;
tables price;
by zipcode;
run;
ods select Frequencies;
PROC UNIVARIATE data = prices freq;
var price;
by zipcode;
run;
ods select BasicMeasures;
PROC UNIVARIATE data = prices;
var price;
by zipcode;
run;
ods pdf close;
You can specify more than one object in the ODS SELECT statement, so you could pull both tables out of the same PROC UNIVARIATE like this:
ods pdf file = "gasprices.pdf";
ods select BasicMeasures Frequencies;
PROC UNIVARIATE data = prices freq;
var price;
by zipcode;
run;
ods pdf close;
I know that doesn't exactly solve your problem, but it looks to me like the BY-variable display just isn't properly linked to the PROC UNIVARIATE frequency tables (e.g., try ODS SELECT Moments - works fine). Might be worth reporting to SAS.