I have a table with four variables and I want a table with every combination of their values. Here is a smaller table, with only two value columns, as an example.
NAME AMOUNT COUNT
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
The output below shows the combinations for RAJ only; the real output should cover all names.
NAME AMOUNT COUNT
RAJ 90 1
RAJ 90 4
RAJ 90 5
RAJ 90 3
RAJ 20 1
RAJ 20 4
RAJ 20 5
RAJ 20 3
RAJ 30 1
RAJ 30 4
RAJ 30 5
RAJ 30 3
RAJ 40 1
RAJ 40 4
RAJ 40 5
RAJ 40 3
...
There are a couple of useful options in SAS to do this; both create a table with all possible combinations of variables, and then you can just drop the summary data that you don't need. Given your initial dataset:
data have;
input NAME $ AMOUNT COUNT;
datalines;
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
;;;;
run;
There is PROC FREQ with SPARSE.
proc freq data=have noprint;
tables name*amount*count/sparse out=want(drop=percent);
run;
There is also PROC TABULATE.
proc tabulate data=have out=want(keep=name amount count);
class name amount count;
tables name*amount,count /printmiss;
run;
Unlike PROC FREQ, whose output data set adds its own COUNT (frequency) variable, this has the advantage of not conflicting with your variable named COUNT.
Try
PROC SQL;
CREATE TABLE tbl_out AS
SELECT a.name AS name
,b.amount AS amount
,c.count AS count
FROM tbl_in AS a, tbl_in AS b, tbl_in AS c
;
QUIT;
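This joins the table to itself twice with no join condition (a Cartesian product), so every name is paired with every amount and every count: 64 rows for the four-row example.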
Here's a variation on @JustinJDavies's answer, using an explicit CROSS JOIN clause:
data have;
input NAME $ AMOUNT COUNT;
datalines;
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
;
run;
PROC SQL;
create table combs as
select *
from have(keep=NAME)
cross join have(keep=AMOUNT)
cross join have(keep=COUNT)
order by name, amount, count;
QUIT;
Results:
NAME AMOUNT COUNT
JOHN 20 1
JOHN 20 3
JOHN 20 4
JOHN 20 5
JOHN 30 1
JOHN 30 3
JOHN 30 4
JOHN 30 5
...
This is my code:
DATA sales;
INFILE 'D:\Users\...\Desktop\Onions.dat';
INPUT VisitingTeam $ 1-20 ConcessionSales 21-24 BleacherSales 25-28
OurHits 29-31 TheirHits 32-34 OurRuns 35-37 TheirRuns 38-40;
PROC PRINT DATA = sales;
TITLE 'SAS Data Set Sales';
RUN;
This is the data, but the spacing may be incorrect.
Columbia Peaches 35 67 1 10 2 1
Plains Peanuts 210 . 2 5 0 2
Gilroy Garlics 151035 12 11 7 6
Sacramento Tomatoes 124 85 15 4 9 1
;
I need to add or delete a blank column at the 19th column. Can someone help?
Just open the dataset and then look at what the variable name is. Then do:
Data Want (drop=variable_name_you_are_dropping); /* This is your output dataset */
Set have; /*this is your dataset you have*/
Run;
What's the SAS code to stack data?
For the purpose of example, let's say I have this dataset:
DATA test.one;
INPUT Name $ Y1996 Y1997 Y1998 Y1999;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
;
run;
proc print data = test.one;
run;
Running this set would give me an output like this:
Name Y1996 Y1997 Y1998 Y1999
Dan 5 10 40 20
Derek 10 12 10 10
However, I would want my data to look like this:
Name Year Income
Dan 1996 5
Dan 1997 10
Dan 1998 40
Dan 1999 20
Derek 1996 10
Derek 1997 12
Derek 1998 10
Derek 1999 10
That is, the year columns would be stacked into a new Year variable, with a new Income variable holding the corresponding values, as shown above.
Are you asking how to read the raw data directly into that form?
DATA want;
INPUT Name $ @; /* trailing @ holds the line for the next INPUT */
do year=1996 to 1999;
input income @;
output;
end;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
;
PROC TRANSPOSE can solve this:
DATA test.one;
INPUT Name $ y1996 y1997 y1998 y1999;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
;
run;
proc print data = test.one;
run;
proc transpose data=test.one out=long1;
by name;
run;
data test2;
set long1 (rename=(col1=Income));
Year = input(substr(_NAME_, 2), 4.); /* y1996 -> 1996 */
drop _NAME_;
RUN;
This transforms the dataset into a stacked version, with one row per name and year.
I am trying to convert a long table into wide table in SAS. I have the following data
Team ID Player Score
1 Andy 12
2 Andy 32
1 Andy 2
3 Andy 0
1 Andy 43
3 Bin 33
1 Bin 23
3 Bin 34
3 Bin 3
1 Bin 12
2 Ray 34
3 Ray 52
2 Ray 11
Now I want to add up the scores of each player by team, like this:
Team ID Player Score
1 Andy 46
1 Ray 34
1 Bin 33
2 Andy 43
2 Ray 52
3 Bin 72
3 Ray 11
Then I want to transpose the players so that I get one row for each team, like this:
Team ID Andy Ray Bin
1 46 34 33
2 43 52 .
3 . 11 72
I tried proc transpose and proc means but couldn't find a suitable solution. I would appreciate your help with the SAS code.
Regards
This might work. First sort your data by "Team":
Proc SORT Data=Have Out=Want1;
by Team;
Run;
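Then transpose the data: "By Team" gives one output row per team, and "Id Player" turns each player into a column. Note that the LET option keeps only the last Score when a player appears more than once within a team, so sum the scores first if you need totals: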
PROC TRANSPOSE Data=Want1 Out=Want2 let;
Id Player;
By Team;
Var Score;
Run;
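Since the question asks for totals per team and player, one possible approach is to aggregate before transposing. The following is only a sketch: it assumes the data set is called Have with variables Team, Player, and Score, as in the answer above, while Sums and Wide are illustrative names.

```sas
/* Sum Score within each Team/Player pair.            */
/* Have, Team, Player, Score: names assumed from the  */
/* answer above; Sums and Wide are illustrative.      */
proc sql;
  create table Sums as
  select Team, Player, sum(Score) as Score
  from Have
  group by Team, Player
  order by Team, Player;
quit;

/* One row per Team, one column per Player.           */
/* No LET needed: each Player now occurs once per Team. */
proc transpose data=Sums out=Wide(drop=_name_);
  by Team;
  id Player;
  var Score;
run;
```

Teams in which a player has no rows get a missing value (.) in that player's column.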
Sorry, I'm new to a lot of the features of SAS. I've only been using it for a couple of months, mostly for survey data analysis, but now I'm working with a dataset that has individual-level data for a cross-over study. It's in the form: ID treatment period measure1 measure2 ....
What I want to do is group these individuals by their treatment group and then output a variable with the group average for measure 1, another for measure 2, and another with the count of observations in each group.
i.e.
ID trt per m1 m2
1 1 1 101 75
1 2 2 135 89
2 1 1 103 77
2 2 2 140 87
3 2 1 134 79
3 1 2 140 80
4 2 1 156 98
4 1 2 104 78
what I want is the data in the form:
group a = where trt=1 & per=1
group b = where trt=2 & per=2
group c = where trt=2 & per=1
group d = where trt=1 & per=2
trtgrp avg_m1 avg_m2 n
A 102 76 2
B ... ... ...
C
D
Thank you for the help.
/* Creating sample dataset */
data test;
infile datalines dlm=" ";
input ID : 8.
trt : 8.
per : 8.
m1 : 8.
m2 : 8.;
put ID=;
datalines;
1 1 1 101 75
1 2 2 135 89
2 1 1 103 77
2 2 2 140 87
3 2 1 134 79
3 1 2 140 80
4 2 1 156 98
4 1 2 104 78
;
run;
/* Using proc summary to summarize by trt and per */
The variables (dimensions) you want to summarize by go into the class statement.
The variables (measures) you want averaged go into the var statement.
Since you want averages, request mean as the statistic.
Read more about proc summary here
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473735.htm
and here
http://web.utk.edu/sas/OnlineTutor/1.2/en/60476/m41/m41_19.htm
proc summary data=test nway;
class trt per;
var m1 m2;
output out=final(drop= _type_)
mean=;
run;
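If you want the column names from the question, one possible variation names the means explicitly and renames the automatic _FREQ_ count. This is a sketch: avg_m1, avg_m2, and n are simply the names the question asked for, and test is the sample data set created above.

```sas
/* Same summary as above, with explicit output names.    */
/* _FREQ_ is the per-group observation count that        */
/* PROC SUMMARY adds automatically; it is renamed to n.  */
proc summary data=test nway;
  class trt per;
  var m1 m2;
  output out=final(drop=_type_ rename=(_freq_=n))
         mean=avg_m1 avg_m2;
run;
```

The result has one row per trt/per combination with the columns trt, per, n, avg_m1, and avg_m2.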
The alternative method uses PROC SQL, the advantage being that it makes use of plain-English syntax, so the concept of a group in your question is maintained in the syntax:
PROC SQL;
CREATE TABLE final AS
SELECT
trt,
per,
avg(m1) AS avg_m1,
avg(m2) AS avg_m2,
count(*) AS n
FROM
test
GROUP BY trt, per;
QUIT;
You can even add your own group headings by applying conditional CASE logic as you did in your question:
PROC SQL;
CREATE TABLE final AS
SELECT
CASE
WHEN trt=1 AND per=1 THEN 'A'
WHEN trt=2 AND per=2 THEN 'B'
WHEN trt=2 AND per=1 THEN 'C'
WHEN trt=1 AND per=2 THEN 'D'
END AS trtgrp, /* comma was missing; renamed to avoid confusion with the GROUP keyword */
avg(m1) AS avg_m1,
avg(m2) AS avg_m2,
count(*) AS n
FROM
test
GROUP BY trtgrp;
QUIT;
COUNT(*) simply counts the number of rows found within the group. The AVG function calculates the average for the given column.
In each example, you can replace the explicitly named columns in the GROUP BY clause with a number representing column position in the SELECT clause.
GROUP BY 1,2
However, take care with this method, as adding columns to the SELECT clause later can cause problems.
I am wondering about the best way to transpose data in SAS when I have multiple occurrences of my id variable. I know I can use the let option in the proc transpose statement to do this, but I do not want to get rid of any data, as I intend to compute averages.
Here is an example of my data and my code:
data grades;
input student testnum grade;
cards;
1 1 30
1 1 25
1 2 45
1 3 67
2 1 22
2 2 63
2 2 12
2 2 77
3 1 22
3 1 17
3 2 14
3 4 17
;
run;
proc sort data=grades;
by student testnum;
run;
proc transpose data=grades out=trgrades;
by student;
id testnum;
var grade;
run;
Here is how I would like my resulting dataset to look:
student testnum1 testnum2 testnum3 testnum4 avg12 avg34
1 30 45 67 . 33.33 67
1 25 . . . 33.33 67
2 22 63 . . 43.5 .
2 . 12 . . 43.5 .
2 . 77 . . 43.5 .
3 22 14 . 17 53 17
3 17 . . . 53 17
I want to use this new dataset (not sure how yet) to create new columns holding the average score of all testnum1's and testnum2's for a student (avg12) and the average of all testnum3's and testnum4's for a student (avg34).
There may be a much more efficient way to do this but I am stumped.
Any advice is appreciated.
If all you really need is the average of all test 1's and 2's, and 3's and 4's for each student, then you don't need to transpose at all. All you need is a simple data step:
data grouped;
set grades;
if testnum In (1,2) then group=1;
else if testnum in (3,4) then group=2;
run;
Then a basic proc means:
proc means data=grouped;
by student group;
var grade;
output out=averages mean=groupaverage;
run;
If you need the averages in a single observation, you can easily transpose the averages dataset.
proc transpose data=averages out=trgrades; /* transpose the averages, not the raw grades */
by student;
id group;
var groupaverage;
run;
Update:
As mentioned by @Keith, using a format to group the tests is an excellent choice as well. Skip the data step and create the format like so:
proc format;
value TestGroup
1,2 = 'Tests 1 and 2'
3,4 = 'Tests 3 and 4'
;
run;
Then the proc means becomes:
proc means data=grades; /* the grouped data set is not needed when the format does the grouping */
by student testnum;
var grade;
format testnum TestGroup.;
output out=averages mean=groupaverage;
run;
End Update
If, for some reason, you really need to have all the test scores in one observation, then I would recommend using a data step to make them uniquely identifiable. Use by, first.testnum, retain, and a simple counter to assign each score a retake number. Now your transpose uses retake and testnum as id variables. You should be able to figure it out from there.
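As a rough sketch of that retake counter, the steps below number repeated tests within each student and then transpose. Note that this is a variation, not exactly what is described above: retake is used as an extra BY variable rather than a second ID variable, which gives one row per student and retake, as in the layout shown in the question. The names numbered, wide, and retake are illustrative, and grades is assumed to be sorted by student and testnum already.

```sas
/* Number each score within a student/test pair.          */
/* Assumes grades is sorted by student testnum.           */
data numbered;
  set grades;
  by student testnum;
  if first.testnum then retake = 0; /* restart for each test */
  retake + 1;                       /* sum statement: retained automatically */
run;

/* Re-sort so each student/retake pair forms one BY group */
proc sort data=numbered;
  by student retake;
run;

/* One row per student and retake, one column per test    */
proc transpose data=numbered out=wide(drop=_name_) prefix=testnum;
  by student retake;
  id testnum;
  var grade;
run;
```

Because each testnum occurs at most once per student/retake pair, no scores are dropped and the LET option is not needed.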
Really hoping right now that I didn't just do your SAS homework assignment for you.