Weighted mean in SAS

I have a problem with some SAS code. After computing a first weighted mean grouped by "date", I want to compute a second weighted mean using "group" as the BY variable and "w2" as the weight. How can I do this?
proc univariate data=set_out;
by date;
weight w1;
VAR price;
run;

The weight statement accepts only one variable, so you will need to use UNIVARIATE twice:
proc sort data=have;
by date;
proc univariate data=have;
by date;
weight w1;
VAR price;
output out=want_by_date mean=mean_price;
run;
and
proc sort data=have;
by group;
proc univariate data=have;
by group;
weight w2;
VAR price;
output out=want_by_group mean=mean_price;
run;
If you don't want to sort the data, use a CLASS statement instead of BY.
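As a rough sketch of the CLASS variant, assuming the same have dataset with date, w1 and price as above (want_by_date is just an illustrative output name), the first step could look like this with no PROC SORT:

```sas
/* Weighted mean of price per date, without sorting first. */
/* CLASS replaces BY, so the input order does not matter.  */
proc univariate data=have noprint;
  class date;
  weight w1;
  var price;
  output out=want_by_date mean=mean_price;
run;
```

The second step would be the same with class group and weight w2.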

Related

How to create nested rows in SAS?

I am a beginner so my knowledge is lacking. I have a data set consisting of the following columns:
Subject
Age
Height
Weight
I wish to create a table such that for every person I have three rows called Age, Height, and Weight.
I have tried to use PROC TABULATE:
proc tabulate data=new;
class person;
var NEWCOLUMN;
table person,NEWCOLUMN;
run;
However, I am getting an error because the new column is not the correct type.
You can pivot the data per person and REPORT or TABULATE the cell values.
Example:
proc transpose data=sashelp.class out=pivoted;
by name;
var age height weight;
where name <= 'C';
run;
ods html file='output.html' style=plateau;
options nodate nonumber nocenter;
title; footnote;
proc report data=pivoted;
column name _name_ col1;
define name / ' ' order order=data;
define _name_ / ' ';
define col1 / ' ';
run;
proc tabulate data=pivoted;
class name _name_ / order=data;
var col1;
table name*_name_='', col1=''*min='' / nocellmerge;
run;
ods html close;

How to filter a dataset according to its quantile

In the following code, how can I keep only the observations above the 95th percentile?
data test;
input business_ID $ count;
datalines;
'busi1' 2
'busi1' 10
'busi1' 4
'busi2' 1
'busi3' 2
'busi3' 1
;
run;
proc sort data = test;
by descending count;
run;
I don't know how to cleanly store the quantile and then reuse it in an IF condition.
Thanks
Edit: I can determine the quantile with this code:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
But how can I refer to it in the test dataset so that I can select every observation above that quantile?
You could either read the value of the quantile into a macro variable to use in a subsequent IF or WHERE condition:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
data _null_;
set quantile;
call symputx('quantile', count_p75); /* symputx converts the number and trims blanks without a log note */
run;
data test;
set test;
where count > &quantile.;
run;
Or you could use an SQL subquery:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
proc sql undo_policy=none;
create table test as
select *
from test
where count > (select count_p75 from quantile)
;
quit;
(Note that your question mentions the 95th quantile, whereas your sample code computes the 75th.)
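For the 95th percentile that the question actually asks about, a minimal sketch could look like the following. It assumes the test dataset from the question. The macro variable p95 and the output dataset above_p95 are illustrative names, not something from the original post.

```sas
/* Compute the 95th percentile of count; AUTONAME names it count_P95. */
proc means data=test noprint;
  var count;
  output out=quantile p95= / autoname;
run;

/* Read the single value into a macro variable (hypothetical name p95). */
proc sql noprint;
  select count_p95 into :p95 trimmed
  from quantile;
quit;

/* Keep only the rows strictly above the 95th percentile. */
data above_p95;
  set test;
  where count > &p95;
run;
```

Writing to a new dataset instead of overwriting test keeps the original data available if you want to try a different cut-off.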
User2877959's solution is solid. Recently I did this with Proc Rank. The solution is a bit 'work around-y', but saves a lot of typing.
proc rank data=Input groups=1000 out=rank_out;
var var_to_rank;
ranks Rank_val;
run;
data seventy_five;
set rank_out;
if rank_val>750;
run;
More on Rank: http://documentation.sas.com/?docsetId=proc&docsetTarget=p0le3p5ngj1zlbn1mh3tistq9t76.htm&docsetVersion=9.4&locale=en

How to PROC PRINT only the sum of a column of data

In SAS, you can use PROC PRINT to sum a column and display the sum:
proc print data = dataset.project_out;
sum variable;
run;
How can I get this function to only print the sum line and not the rest of the data?
I don't think you can do it with proc print. The closest you can come is the empty var statement:
proc print data=sashelp.class;
var ;
sum age;
run;
But sum adds the sum variable to the var list.
You can certainly accomplish this a number of other ways.
PROC SQL is the one I'd use:
proc sql;
select sum(Age) from sashelp.class;
quit;
PROC REPORT, often called "pretty PROC PRINT", can do it also:
proc report data=sashelp.class;
columns age;
define age/analysis sum;
run;
PROC TABULATE can do it:
proc tabulate data=sashelp.class;
var age;
tables age*sum;
run;
PROC MEANS:
proc means data=sashelp.class sum;
var age;
run;
Etc., plenty of ways to do the same thing.
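If the output really has to come from PROC PRINT, one possible workaround is to compute the sum into a small dataset first and print that. This is only a sketch; age_sum and total_age are names made up for the example.

```sas
/* Store the sum of age in a one-row dataset. */
proc means data=sashelp.class noprint;
  var age;
  output out=age_sum(drop=_type_ _freq_) sum=total_age;
run;

/* Print just that row, without the Obs column. */
proc print data=age_sum noobs;
run;
```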

Using proc means to get the average fish weight by species in sashelp table fish

Here's what I'm doing, but it throws the error below. It will print the first two but fails after that.
ERROR: Data set WORK.FISH is not sorted in ascending sequence. The current BY group has Species = Whitefish and the next BY group
has Species = Parkki.
Any idea what I'm doing wrong?
data fish;
set sashelp.fish;
run;
proc means data = fish;
var weight;
by species;
run;
The BY statement in proc means assumes that the dataset is sorted by the BY variable. Just add a proc sort before proc means.
data fish;
set sashelp.fish;
run;
proc sort data=fish;
by species;
run;
proc means data = fish;
var weight;
by species;
run;
Another way of doing it without the proc sort is to use a class statement:
proc means data=sashelp.fish mean;
class species;
var weight;
run;
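If you want the averages in a dataset rather than only in the printed output, a sketch along these lines should work with the CLASS approach as well. The names fish_avg and avg_weight are assumptions for the example.

```sas
/* NWAY keeps only the per-species rows and drops the overall row. */
proc means data=sashelp.fish noprint nway;
  class species;
  var weight;
  output out=fish_avg(drop=_type_ _freq_) mean=avg_weight;
run;

proc print data=fish_avg noobs;
run;
```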

Transpose output of PROC UNIVARIATE:TestsForLocation

I want to format the result of the following call to PROC UNIVARIATE (TestsForLocation).
/* SASHELP is usually read-only, so sort into a WORK copy */
proc sort data=sashelp.class out=class; by sex; run;
proc univariate data = class mu0 = 1;
ods select TestsForLocation;
var age;
by sex;
ods output TestsForLocation=ttest;
run;
data ttest; set ttest; keep sex test stat pvalue;run;
proc print data=ttest;run;
How can I transpose the output into a dataset with the following columns?
Obs, Sex, StudentsT_Stat, StudentsT_pValue, SignedRank_Stat, SignedRank_Pvalue
You need to double transpose here. Make a dataset with 12 observations, with four columns: Obs, Sex, ID being the combination of Test and (Stat|pValue) that you want as your eventual variable name, and Value being the value you want transposed into the variable. Then,
proc transpose data=ttest_double out=ttest_transposed;
by obs sex;
id ID;
var Value;
run;
(ID and Value can be any variable name you like.)
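One way the intermediate dataset might be built is sketched below. It assumes the ODS table has the columns Test, Stat and pValue, as the KEEP statement in the question suggests. The WORK copy class and the helper variable prefix are illustrative, and the sketch uses only sex as the BY variable, since Obs in the question looks like the PROC PRINT row number.

```sas
/* Sort into WORK, since SASHELP is usually read-only. */
proc sort data=sashelp.class out=class;
  by sex;
run;

ods output TestsForLocation=ttest;
proc univariate data=class mu0=1;
  by sex;
  var age;
run;

/* First step: two rows per test, one for Stat and one for pValue. */
data ttest_double;
  set ttest;
  length ID $32;
  /* Keep letters only: "Student's t" becomes "Studentst". */
  prefix = compress(Test, , 'ka');
  ID = cats(prefix, '_Stat');   Value = Stat;   output;
  ID = cats(prefix, '_pValue'); Value = pValue; output;
  keep sex ID Value;
run;

/* Second step: turn the ID values into columns, one row per sex. */
proc transpose data=ttest_double out=ttest_transposed(drop=_name_);
  by sex;
  id ID;
  var Value;
run;
```

The resulting column names differ from the requested ones only in letter case (Studentst_Stat rather than StudentsT_Stat), which SAS treats as the same variable name.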