I have a problem with some SAS code. Having computed a first weighted mean grouped by "date", I want to compute a weighted mean again, this time using "group" in the BY statement and "w2" as the weight. How can I do this?
proc univariate data=set_out;
by date;
weight w1;
VAR price;
run;
The weight statement accepts only one variable, so you will need to use UNIVARIATE twice:
proc sort data=have;
by date;
proc univariate data=have;
by date;
weight w1;
VAR price;
output out=want_by_date mean=mean_price;
run;
and
proc sort data=have;
by group;
proc univariate data=have;
by group;
weight w2;
VAR price;
output out=want_by_group mean=mean_price;
run;
If you don't want to sort the data, use a CLASS statement instead of BY.
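As a rough sketch of the CLASS variant for the first step: the dataset and variable names are the ones used above, while the output name want_by_date and the NOPRINT option are my own additions. The second step would look the same with "group" and "w2".

```sas
/* No PROC SORT needed: CLASS groups the rows itself */
proc univariate data=have noprint;
  class date;                               /* one result row per date */
  weight w1;                                /* weight for the mean */
  var price;
  output out=want_by_date mean=mean_price;  /* hypothetical output name */
run;
```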
I am a beginner so my knowledge is lacking. I have a data set consisting of the following columns:
Subject
Age
Height
Weight
I wish to create a table such that each person has three rows, called Age, Height, and Weight.
I have tried to use PROC TABULATE:
proc tabulate data=new;
class person;
var NEWCOLUMN;
table person,NEWCOLUMN;
run;
However, I am getting an error because the new column is not the correct type.
You can pivot the data per person and REPORT or TABULATE the cell values.
Example:
proc transpose data=sashelp.class out=pivoted;
by name;
var age height weight;
where name <= 'C';
run;
ods html file='output.html' style=plateau;
options nodate nonumber nocenter;
title; footnote;
proc report data=pivoted;
column name _name_ col1;
define name / ' ' order order=data;
define _name_ / ' ';
define col1 / ' ';
run;
proc tabulate data=pivoted;
class name _name_ / order=data;
var col1;
table name*_name_='', col1=''*min='' / nocellmerge;
run;
ods html close;
In the following code, how can I keep only the observations above the 95th percentile?
data test;
input business_ID $ count;
datalines;
'busi1' 2
'busi1' 10
'busi1' 4
'busi2' 1
'busi3' 2
'busi3' 1
;
run;
proc sort data = test;
by descending count;
run;
I don't know how to cleanly store the quantile and then reuse it in an IF condition.
Thanks
Edit: I can determine the quantile with this code:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
But how can I refer to it from the test dataset so that I can select every observation above that quantile?
You could either read the value of the quantile into a macro variable to use in a subsequent IF or WHERE condition:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
data _null_;
set quantile;
call symputx('quantile',count_p75); /* SYMPUTX avoids the numeric-to-character conversion note and trims blanks */
run;
data test;
set test;
where count > &quantile.;
run;
Or you could use an SQL subquery:
proc means data=test noprint;
var count;
output out=quantile P75= / autoname;
run;
proc sql undo_policy=none;
create table test as
select *
from test
where count > (select count_p75 from quantile)
;
quit;
(Note that your question mentions the 95th quantile, whereas your sample code computes the 75th.)
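For the 95th percentile that the question actually asks about, the macro-variable approach might look like the sketch below. The macro variable name p95 and the output dataset above_p95 are made up for illustration; the variable name count_P95 relies on how AUTONAME builds names.

```sas
proc means data=test noprint;
  var count;
  output out=quantile p95= / autoname;  /* AUTONAME should yield count_P95 */
run;

data _null_;
  set quantile;
  call symputx('p95', count_P95);       /* store the percentile in &p95 */
run;

data above_p95;                         /* hypothetical output dataset */
  set test;
  where count > &p95.;                  /* keep rows strictly above the percentile */
run;
```

Writing to a new dataset rather than overwriting test keeps the original data available for checking the cut-off.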
User2877959's solution is solid. Recently I did this with Proc Rank. The solution is a bit 'work around-y', but saves a lot of typing.
proc rank data=Input groups=1000 out=rank_out;
var var_to_rank;
ranks Rank_val;
run;
data seventy_five;
set rank_out;
if rank_val>750;
run;
More on Rank: http://documentation.sas.com/?docsetId=proc&docsetTarget=p0le3p5ngj1zlbn1mh3tistq9t76.htm&docsetVersion=9.4&locale=en
In SAS, you can use PROC PRINT to sum a column and display the sum:
proc print data = dataset.project_out;
sum variable;
run;
How can I get this procedure to print only the sum line and not the rest of the data?
I don't think you can do it with proc print. The closest you can come is the empty var statement:
proc print data=sashelp.class;
var ;
sum age;
run;
But sum adds the sum variable to the var list.
You can certainly accomplish this a number of other ways.
PROC SQL is the one I'd use:
proc sql;
select sum(Age) from sashelp.class;
quit;
PROC REPORT, often called "pretty PROC PRINT", can do it also:
proc report data=sashelp.class;
columns age;
define age/analysis sum;
run;
PROC TABULATE can do it:
proc tabulate data=sashelp.class;
var age;
tables age*sum;
run;
PROC MEANS:
proc means data=sashelp.class sum;
var age;
run;
Etc., plenty of ways to do the same thing.
Here's what I'm doing, but it throws the error below. It prints the first two but fails after that.
ERROR: Data set WORK.FISH is not sorted in ascending sequence. The current BY group has Species = Whitefish and the next BY group has Species = Parkki.
Any idea what I'm doing wrong?
data fish;
set sashelp.fish;
run;
proc means data = fish;
var weight;
by species;
run;
The BY statement in proc means assumes that the dataset is sorted by the BY variable. Just add a proc sort before proc means.
data fish;
set sashelp.fish;
run;
proc sort data=fish;
by species;
run;
proc means data = fish;
var weight;
by species;
run;
Another way of doing it without the proc sort is to use a class statement:
proc means data=sashelp.fish mean;
class species;
var weight;
run;
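A third option, sketched here under the assumption that all rows for a given species sit next to each other in the dataset (the error message suggests the data is grouped but not in alphabetical order), is the NOTSORTED option of the BY statement:

```sas
proc means data=sashelp.fish mean;
  by species notsorted;   /* BY groups are taken in the order they appear */
  var weight;
run;
```

If a species turned up in two separate blocks, each block would be treated as its own BY group, so this is only safe when the data really is grouped.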
I want to format the result of the following call to PROC UNIVARIATE (the TestsForLocation table).
/* SASHELP is normally read-only, so sort into a WORK copy */
proc sort data=sashelp.class out=class; by sex; run;
proc univariate data = class mu0 = 1;
ods select TestsForLocation;
var age;
by sex;
ods output TestsForLocation=ttest;
run;
data ttest; set ttest; keep sex test stat pvalue;run;
proc print data=ttest;run;
How can I transpose the output into a dataset with the following columns?
Obs, Sex, StudentsT_Stat, StudentsT_pValue, SignedRank_Stat, SignedRank_Pvalue
You need to double transpose here. Make a dataset with 12 observations, with four columns: Obs, Sex, ID being the combination of Test and (Stat|pValue) that you want as your eventual variable name, and Value being the value you want transposed into the variable. Then,
proc transpose data=ttest_double out=ttest_transposed;
by obs sex;
id ID;
var Value;
run;
(ID and Value can be any variable name you like.)
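The long dataset that the transpose step above reads could be built roughly as follows. This is a sketch, not a tested solution: it assumes ttest still holds Sex, Test, Stat and pValue in BY order, and the names ttest_double, prefix, Obs, ID and Value are only illustrative.

```sas
data ttest_double;
  set ttest;
  by Sex;
  length ID $32;
  if first.Sex then Obs + 1;             /* one Obs number per final output row */
  /* keep letters only, so "Student's t" becomes Studentst */
  prefix = compress(Test, , 'ka');
  ID = cats(prefix, '_Stat');   Value = Stat;   output;
  ID = cats(prefix, '_pValue'); Value = pValue; output;
  keep Obs Sex ID Value;
run;
```

Because the Sign test is also in TestsForLocation, the transposed result would carry Sign_Stat and Sign_pValue as well; those can be dropped afterwards, or the Sign rows filtered out before this step.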