I hope that you can help me out with a question.
After creating a summary table (using proc summary, proc means, etc.), I would like to add a title to the dataset. Sample restrictions and the like are easy to forget, so it would help a lot to be able to add a title, for example: "Mean income (note: incomes < $1000 have been excluded)".
An obvious way of doing this is to create another dataset...
data title;
length title $100;
title = "Mean income (note: incomes < $1000 have been excluded)";
run;
...and then combine this with the summarizing table. But is there a standard procedure to simply add a title while creating the table?
If I understood correctly, what you want is called a dataset label in SAS.
You can add a label to a dataset when you create it by using the LABEL= dataset option.
You should be able to use it anywhere dataset options are accepted on a dataset you are creating, e.g.:
data title (label="Mean income (note: incomes < $1000 have been excluded)");
length var1 8;
run;
proc sql;
create table title2 (label="Title in SQL") as select * from title
;
quit;
proc sort data=title out = title_sorted (label="Title Sorted");
by var1;
run;
Or add or modify the label later via PROC DATASETS:
proc datasets lib=WORK nodetails nolist;
modify title (label="New title");
quit;
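Since the question is about summary tables specifically, here is a minimal sketch of the same LABEL= option applied to the OUT= dataset of PROC MEANS. The dataset have, the variable income, and the sample values are hypothetical and only there to make the example self-contained.

```sas
/* Hypothetical input data: one numeric variable, income */
data have;
  input income;
  datalines;
500
1500
2500
4000
;
run;

/* LABEL= is a dataset option, so it can sit on the OUT= dataset */
proc means data=have(where=(income >= 1000)) noprint;
  var income;
  output out=income_summary(label="Mean income (note: incomes < $1000 have been excluded)")
         mean=mean_income;
run;

/* PROC CONTENTS lists the dataset label among the dataset attributes */
proc contents data=income_summary;
run;
```

The label travels with the dataset, so it is still there when the table is opened later or copied to another library.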
I'm a beginner in SAS and I'm having trouble with this exercise:
I have a very simple table with two columns and three rows.
I'm trying to write a query that returns the name of the shortest person (so it should return titi).
All I have managed so far is to return the smallest height (157), but that's not what I want: I want the name that goes with the smallest value!
Could you help me, please?
Larapa
A SQL having clause is a good fit for this. SAS will automatically summarize the data and merge the result back onto the original table, giving you a one-line table with the name that has the smallest value of taille. (If several rows tie for the minimum, all of them are returned.)
proc sql noprint;
create table want as
select nom
from have
having taille = min(taille)
;
quit;
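To try the query above, here is a small self-contained sketch. Only titi and 157 come from the question; the other two names and heights are made up for illustration.

```sas
/* Hypothetical sample table: titi/157 is from the question, the rest is invented */
data have;
  input nom $ taille;
  datalines;
toto 180
titi 157
tata 165
;
run;

proc sql;
  create table want as
  select nom
  from have
  having taille = min(taille)   /* min() is remerged onto every row, then filtered */
  ;
quit;

/* Print the result: a single row with nom = titi */
proc print data=want noobs;
run;
```

The log will show a note that the query requires remerging summary statistics back with the original data; that is expected here and is exactly what makes the having clause work without a group by.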
Here are some other ways you can do it:
Using PROC MEANS:
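proc means data=have noprint;
var taille;
/* An ID statement would return the maximum value of nom, not the nom on the */
/* row with the smallest taille, so MINID is used to pick that row's nom.    */
output out=want (drop=_type_ _freq_)
min(taille) = min_taille
minid(taille(nom)) = smallest_nom;
run;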
Using sort and a data step to keep only the first observation:
proc sort data=have;
by taille;
run;
data want;
set have;
if(_N_ = 1);
run;
Noob SAS user here.
I have a hospital data set with patientID and a variable that counts the days between admission and discharge.
Those patients who had more than one hospital admission show up with the same patientID and with a record of how many days they were in hospital each time.
I want to sum the total days in hospital per patient, and then only have one patientID record with the sum of all hospital days across all stays. Does anyone know how I would go about this?
Sum days_in_hospital and group by patientID. This will get what you want:
proc sql;
create table want as
select distinct
patientID,
sum(days_in_hospital) as sum_of_days
from have
group by patientID;
quit;
Alternatively you can use proc summary.
proc summary data= hospital_data nway;
class patientID;
var days;
output out=summarized_data (drop = _type_ _freq_) sum=;
run;
This creates a new dataset called summarized_data which has the summed days for each patientID. (The nway option removes the overall summary row, and the drop= dataset option removes the default _type_ and _freq_ columns, which you don't need.)
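A third option, not mentioned in the answers above, is a data step with BY-group processing. This is only a sketch: patientID and days_in_hospital are the names used in the SQL answer, and the sample rows are hypothetical.

```sas
/* Hypothetical sample data: patient 1 and patient 3 have repeat admissions */
data have;
  input patientID days_in_hospital;
  datalines;
1 3
1 5
2 2
3 1
3 4
3 7
;
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=have out=have_sorted;
  by patientID;
run;

data want;
  set have_sorted;
  by patientID;
  if first.patientID then sum_of_days = 0;   /* reset at the start of each patient */
  sum_of_days + days_in_hospital;            /* sum statement: value is retained across rows */
  if last.patientID then output;             /* write one row per patient */
  keep patientID sum_of_days;
run;
```

With the sample rows this leaves one record per patient, with totals of 8, 2 and 12 days. The data step is more verbose than proc sql or proc summary, but it makes it easy to carry along other per-patient logic in the same pass.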
I've looked at a handful of other similar questions (here, here, and here), but have not had success with the accepted answers I found. I'm trying to transform a wide data set into a long data set, turning column names into rows with matching records adjacent to the old column names. I can't seem to get the original column names to appear using my current code.
I have a wide dataset that looks like this:
I need it to look like this:
I've tried to do this with an array:
data want;
set have;
array d ImprovementPlan -- AssessmentPlan;
do i = 1 to dim(d);
Section = d{i};
Text = d{i};
output;
end;
keep DBN Emp_ID FiscalYear Section Text Meeting1 Meeting2 Meeting3 Meeting4 Meeting5;
run;
But end up with this:
I appreciate any advice you have for me.
A union in PROC SQL should do the trick:
proc sql;
create table want as
select DBN, Emp_ID, FiscalYear, 'Action_Plan' as Section, Action_Plan as Text, Meeting1, Meeting2, Meeting3, Meeting4, Meeting5
from have
union
select DBN, Emp_ID, FiscalYear, 'Timeline' as Section, Timeline as Text, Meeting1, Meeting2, Meeting3, Meeting4, Meeting5
from have
union
select DBN, Emp_ID, FiscalYear, 'Support_Plan' as Section, Support_Plan as Text, Meeting1, Meeting2, Meeting3, Meeting4, Meeting5
from have
union
select DBN, Emp_ID, FiscalYear, 'Assessment_Plan' as Section, Assessment_Plan as Text, Meeting1, Meeting2, Meeting3, Meeting4, Meeting5
from have
;
quit;
SAS also has proc transpose to do that kind of operation.
EDIT: something along the lines of
proc sort data=have;
by DBN Emp_ID FiscalYear Meeting1 Meeting2 Meeting3 Meeting4 Meeting5;
run;
proc transpose data=have out=want(rename=(column1=Text)) name=Section prefix=column;
by DBN Emp_ID FiscalYear Meeting1 Meeting2 Meeting3 Meeting4 Meeting5;
var action_plan timeline support_plan assessment_plan;
run;
I was able to make PROC TRANSPOSE work using the following code:
PROC TRANSPOSE DATA=WORK.t_yoy
OUT=flash.TTRANSPOSED_yoy(LABEL="Transposed WORK.T2017")
PREFIX=Text
NAME=Section
LABEL=Label
;
BY Emp_ID FiscalYear DBN;
VAR ImprovementPlan ActionPlan TimeLinePlan SupportPlan AssessmentPlan;
COPY DBN Emp_ID FiscalYear Meeting1 Meeting2 Meeting3 Meeting4 Meeting5;
RUN; QUIT;
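For completeness, the array approach from the question can also be made to return the column names by using the VNAME function. The sketch below assumes that the five plan columns are all character and sit next to each other in the dataset; the sample row, the variable lengths, and the Text length of 200 are hypothetical.

```sas
/* Hypothetical one-row wide dataset using the column names from the thread */
data have;
  length DBN $6 Emp_ID $8 FiscalYear 8
         ImprovementPlan ActionPlan TimeLinePlan SupportPlan AssessmentPlan $200
         Meeting1-Meeting5 $10;
  DBN = "01M001"; Emp_ID = "E100"; FiscalYear = 2017;
  ImprovementPlan = "Improve attendance";
  ActionPlan      = "Weekly check-ins";
  TimeLinePlan    = "By June";
  SupportPlan     = "Mentor assigned";
  AssessmentPlan  = "Quarterly review";
  Meeting1 = "2017-01-10";
  output;
run;

data want;
  set have;
  array d {*} ImprovementPlan -- AssessmentPlan;  /* all variables between the two, in dataset order */
  length Section $32 Text $200;
  do i = 1 to dim(d);
    Section = vname(d{i});   /* name of the variable behind this array element */
    Text    = d{i};          /* its value */
    output;                  /* one output row per plan column */
  end;
  keep DBN Emp_ID FiscalYear Section Text Meeting1-Meeting5;
run;
```

Each input row becomes five output rows, with Section holding the old column name and Text holding its value. Unlike PROC TRANSPOSE, this needs no sort, but it only works when the listed columns share a type.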
I am trying to output a three way frequency table. I am able to do this (roughly) with proc freq, but would like the control for variable to be joined. I thought proc tabulate would be a good way to customize the output. Basically I want to fill in the cells with frequency, and then customize the percents at a later time. So, have count and column percent in each cell. Is that doable with proc tabulate?
Right now I have:
proc freq data=have;
table group*age*level / norow nopercent;
run;
that gives me e.g.:
What I want:
Here is the code I am using:
proc tabulate data=ex1;
class age level group;
var age;
table age='Age Category',
mean=' '*group=''*level=''*F=10./ RTS=13.;
run;
Thanks!
You can certainly get close to that. You can't really get both values in 'one' cell; each statistic has to be written to its own cell, but theoretically, with some complex formatting (probably using CSS), you could remove the borders.
You can't list the same variable in both VAR and CLASS, but since you're just doing counts and percents, you don't need MEAN; just use N and COLPCTN. If you're dealing with already summarized data, you may need to do this differently. If so, post an example of your dataset (but that wouldn't work in PROC FREQ either without a WEIGHT statement).
data have;
do _t = 1 to 100;
age = ceil(3*rand('Uniform'));
group = floor(2*rand('Uniform'));
level = floor(5*rand('Uniform'));
output;
end;
drop _t;
run;
proc tabulate data=have;
class age level group;
table age='Age Category',
group=''*level=''*(n='n' colpctn='p')*F=10./ RTS=13.;
run;
This puts N and P (n and column %) in separate adjacent cells inside a single level.
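If the percent column should carry its own format, one possible sketch is to attach a separate format to each statistic. The picture format name pctfmt and the seed value are assumptions made for this example; the data step otherwise mirrors the simulated data above.

```sas
/* Picture format that rounds to one decimal and appends a percent sign */
proc format;
  picture pctfmt (round) low-high = '009.9%';
run;

/* Same simulated data as above, with a fixed seed so reruns match */
data have;
  call streaminit(2024);
  do _t = 1 to 100;
    age   = ceil(3*rand('Uniform'));
    group = floor(2*rand('Uniform'));
    level = floor(5*rand('Uniform'));
    output;
  end;
  drop _t;
run;

proc tabulate data=have;
  class age level group;
  table age='Age Category',
        group=''*level=''*(n='n'*f=8. colpctn='%'*f=pctfmt.) / rts=13;
run;
```

Putting the format inside the parentheses lets the count stay an integer while the percent gets one decimal place and a trailing percent sign.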
I have a dataset which has multiple obs per person. I want to have each single record showing the sum of a variable per person ID. However I do not want to group the data into single personal IDs. I hope the example below explains my question
I want to create the SUM column. How can I do this in SAS EG (or SAS if necessary)?
ID    Var1    SUM
X     10      30
X     20      30
Y     20      80
Y     20      80
Y     40      80
Z     30      30
You can do this using either proc sql or proc means.
proc sql:
proc sql noprint;
create table new_table as
/* no distinct here: it would collapse identical rows such as the two Y/20 records */
select id, var1, sum(var_to_sum) as summed_var_name
from old_table
group by id
;
quit;
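As a quick check against the example in the question, here is a self-contained sketch that uses the six rows from the question's table and sums Var1 itself. The output column name sum_var1 is made up for the example. Note that the sketch does not use distinct, so the two identical Y rows both survive.

```sas
/* The six ID/Var1 rows from the question */
data old_table;
  input id $ var1;
  datalines;
X 10
X 20
Y 20
Y 20
Y 40
Z 30
;
run;

proc sql;
  create table new_table as
  select id, var1, sum(var1) as sum_var1   /* group total is remerged onto every row */
  from old_table
  group by id
  order by id, var1                        /* remerging does not guarantee row order */
  ;
quit;

proc print data=new_table noobs;
run;
```

The result keeps all six rows, with sum_var1 equal to 30 for X, 80 for Y and 30 for Z, matching the table in the question.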
After rereading your question: with proc means you will need to merge the per-id sum back onto the original rows, so you are better off using proc sql above.
proc means:
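proc means data = old_table noprint;
by id; /* old_table must already be sorted by id */
var var_to_sum;
output out = id_sums (drop = _type_ _freq_) sum = summed_var_name;
run;
/* merge the per-id sum back onto every original row */
data new_table;
merge old_table id_sums;
by id;
run;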