I have this dataset and need to calculate the difference in days between consecutive dose dates within each period. How do I label each period's study date so that I can use INTCK to calculate the difference in days per subject (ptno)?
Just use the DIF() function to calculate the change in value for your date variable. SAS stores dates as number of days so the difference will be the number of days between the two observations. You could then test if the difference is 7 days or not.
data want;
set have;
by ptno period;
interval = dif(ex_stadt);
if first.ptno then interval=0;
seven_days = (interval = 7) ;
run;
Tom's code works very well. I simulated a data set with a few rows based on the sample shown above and the results were correct.
The only thing missing is PROC SORT. If the data set is not already sorted by ptno and period, the BY statement will raise an error in the log.
proc sort data=have;
by ptno period;
run;
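Putting the sort and the DIF() step together, a minimal sketch might look like the following. The sample rows are made up for illustration, and adding ex_stadt to the sort key is an assumption (it makes sure the dates are in order within each period).

```sas
/* Hypothetical sample data: two subjects, some doses 7 days apart */
data have;
  input ptno period ex_stadt :date9.;
  format ex_stadt date9.;
  datalines;
1 1 01JAN2020
1 1 08JAN2020
1 2 15JAN2020
2 1 03FEB2020
2 1 12FEB2020
;
run;

/* BY-group processing needs sorted input */
proc sort data=have;
  by ptno period ex_stadt;
run;

data want;
  set have;
  by ptno period;
  interval = dif(ex_stadt);          /* days since the previous row */
  if first.ptno then interval = 0;   /* no previous dose for a new subject */
  seven_days = (interval = 7);       /* 1 if exactly 7 days, else 0 */
run;
```

DIF() is called on every row, before the FIRST.ptno check, so its internal queue always holds the previous observation's date.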
I have a date time variable 'chg_date_of_svc' and would like to turn it into a month_year variable. To do this, I simply wrote the following code:
data combined1;
set combined;
MONTH_YEAR=chg_date_of_svc;
format MONTH_YEAR monyy7.;
run;
I would then like to use the month_year variable in a proc freq statement; however, the month_years do not appear in chronological order when using the following code. For example, January 2019 appears before December 2018 in the tables the proc freq statement produces.
This may not be the easiest solution, but I suspect I have to relabel the specific month_years so they appear in the correct chronological order?
proc freq data = combined1 order=data;
table EM_Charge*MONTH_YEAR;
run;
Thank you for the help.
You requested that it list the columns in the order in which they first appear in the input dataset. If you want them in chronological order, remove the ORDER=DATA option. If you must use ORDER=DATA, sort the data first.
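Both options can be sketched as below. The sample rows are invented, and the sketch assumes MONTH_YEAR holds a SAS date value (if chg_date_of_svc is a true datetime, it would probably need converting with DATEPART() first, which is a guess about the data rather than something shown in the question).

```sas
/* Hypothetical sample data, deliberately out of chronological order */
data combined1;
  input EM_Charge $ MONTH_YEAR :date9.;
  format MONTH_YEAR monyy7.;
  datalines;
A 15JAN2019
B 03DEC2018
A 20DEC2018
B 28JAN2019
;
run;

/* Option 1: default ordering sorts by the underlying date value */
proc freq data=combined1;
  table EM_Charge*MONTH_YEAR;
run;

/* Option 2: keep ORDER=DATA, but sort chronologically first */
proc sort data=combined1;
  by MONTH_YEAR;
run;

proc freq data=combined1 order=data;
  table EM_Charge*MONTH_YEAR;
run;
```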
I have a dataset with many dates in them. I want to categorize these dates into a new column that organizes them by decade (1980s, 1990s, etc).
I have a good idea on how to use IF, AND, and ELSE statements to accomplish this, but I don't know how to have SAS extract the year and only the year from the date to apply it to the conditional logic.
You could always use multipliers and the intnx() function as well.
Using Allan's sample data...
data want;
set have;
decade=year(intnx('year10.',dateval,0,'beginning'));
run;
The intnx() part of the code returns the date corresponding to the start of the decade, then we just take the year portion from it.
The year10. parameter tells it we want to work with decades, the 0 parameter means shift the date supplied to the current decade, and the beginning parameter tells it to return the date corresponding to the beginning of the decade.
If you're not familiar with using intnx() to perform date calculations in SAS see here for a quick primer: https://stackoverflow.com/a/11211180/214994
No need for conditional logic - can use a combination of the year() and floor() functions with some simple arithmetic:
data have;
infile cards;
input dateval date9.;
cards;
01JAN2004
08FEB1996
07MAR1987
14SEP1982
;
run;
data want;
set have;
decade=floor(year(dateval)/10)*10;
run;
Which gives decade values of 2000, 1990, 1980 and 1980 for the four sample dates.
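If a text label such as 1980s is wanted, as in the question, one possible extension is to append an "s" to the computed decade. The variable name decade_label is an assumption made for illustration.

```sas
/* Same sample dates as above */
data have;
  input dateval date9.;
  format dateval date9.;
  datalines;
01JAN2004
08FEB1996
07MAR1987
14SEP1982
;
run;

data want;
  set have;
  decade = floor(year(dateval)/10)*10;   /* e.g. 1987 becomes 1980 */
  length decade_label $5;
  decade_label = cats(decade, 's');      /* e.g. '1980s' */
run;
```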
I have a target population with some characteristics and I have been asked to select an appropriate control group based on these characteristics. I am trying to do a stratified sample using Base SAS, but I need to be able to define the percentages for my 4 strata from my target and apply these to my sample. Is there any way I can do that? Thank you!
To do stratified sampling you can use PROC SURVEYSELECT. Here is an example:
/*Dataset creation*/
data data_dummy;
input revenue rev_tag $ Premiership_level; /* rev_tag is character, hence the $ */
datalines;
1000 High 1
90 Low 2
500 Medium 3
1200 High 4
;
run;
/*Now you need to Sort by rev_tag, Premiership_level (say these are the
variables you need to do stratified sampling on)*/
proc sort data = data_dummy;
by rev_tag Premiership_level;
run;
/*Now use SURVEYSELECT to do stratified sampling using 10% samprate (You can
change this 10% as per your requirement)*/
/*Surveyselect is used to pick entries for groups such that , both the
groups created are similar in terms of variables specified under strata*/
proc surveyselect data=data_dummy method = srs samprate=0.10
seed=12345 out=data_control;
strata rev_tag Premiership_level;
run;
/*Finally tag (if you want for more clarity) your 10% data as control
group*/
Data data_control;
Set data_control;
Group = "Control";
Run;
Hope this helps:-)
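To take the stratum sizes from the target group rather than using one fixed rate, one possible approach is to count the target per stratum and pass those counts to SURVEYSELECT through a SAMPSIZE= data set. This is only a sketch: the data sets target, pool, alloc and control and the variable stratum are hypothetical, and it assumes every stratum in the pool has at least as many candidates as the target.

```sas
/* Hypothetical data: a small target group and a larger pool of candidates */
data target;
  do id = 1 to 20;
    stratum = mod(id, 4) + 1;    /* four strata, 5 targets in each */
    output;
  end;
run;

data pool;
  do id = 101 to 300;
    stratum = mod(id, 4) + 1;    /* 50 candidates per stratum */
    output;
  end;
run;

/* Count targets per stratum; SURVEYSELECT reads the sizes from _NSIZE_ */
proc freq data=target noprint;
  tables stratum / out=alloc(keep=stratum count rename=(count=_nsize_));
run;

proc sort data=pool;
  by stratum;
run;

/* Draw as many controls per stratum as there are targets */
proc surveyselect data=pool method=srs sampsize=alloc
                  seed=12345 out=control;
  strata stratum;
run;
```

A SAMPRATE= data set with a _RATE_ variable could be used in the same way if stratum-specific rates are preferred over counts.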
Let's say I have 50 years of data for each day and month. I also have a column which lists the max rainfall for each day of that dataset. I want to be able to compute the average monthly rainfall and standard deviation for each of those 50 years. How would I accomplish this task? I've considered using PROC MEANS:
PROC MEANS DATA = WORK.rainfall;
BY DATE;
VAR AVG(max_rainfall);
RUN;
but I'm unfamiliar on how to let SAS understand that I want to be using the MM of the MMDDYY format to indicate where to start and stop calculating those averages for each month. I also do not know how I can tell SAS within this PROC MEANS statement on how to format the data correctly, using MMDDYY10. This is why my code fails.
Update: I've also tried using this statement,
proc sql;
create table new as
select date,count(max_rainfall) as rainfall
from WORK.rainfall
group by date;
create table average as
select year(date) as year,month(date) as month,avg(rainfall) as avg
from new
group by year,month;
quit;
but that doesn't solve the problem either, unfortunately. It does create a table, but the values are wrong. Where in my code could I have gone wrong? Am I correctly telling SAS to add up all the rainfall values over the 30 days and then divide by the number of days in each month? Here's a snippet of my table.
You can use a format to group the dates for you. But you should use a CLASS statement instead of a BY statement. Here is an example using the dataset SASHELP.STOCKS.
proc means data=sashelp.stocks nway;
where date between '01JAN2005'd and '31DEC2005'd ;
class date ;
format date yymon. ;
var close ;
run;
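Applied to the rainfall data in the question, the same idea might look like this. It assumes the date column is a numeric SAS date variable named date and that the measurement is max_rainfall, as in the question's code. The sample rows and the output names monthly, avg_rainfall and sd_rainfall are made up.

```sas
/* Hypothetical sample of the daily rainfall table */
data rainfall;
  input date :mmddyy10. max_rainfall;
  format date mmddyy10.;
  datalines;
01/01/1970 1.0
01/15/1970 3.0
02/01/1970 0.5
02/20/1970 1.5
;
run;

proc means data=rainfall nway mean std;
  class date;
  format date yymon7.;     /* group the daily dates by year and month */
  var max_rainfall;
  output out=monthly mean=avg_rainfall std=sd_rainfall;
run;
```

Because the format includes the year, each month of each year gets its own row, so 50 years of data would produce up to 600 groups.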
I am currently running a macro code in SAS and I want to do a calculation with regards to max and min. Right now the line of code I have is :
hhincscaled = 100*(hhinc - min(hhinc) )/ (max(hhinc) - min(hhinc));
hhvaluescaled = 100*(hhvalue - min(hhvalue))/ (max(hhvalue) - min(hhvalue));
What I am trying to do is rescale the household income and value variables with the calculations above: subtract each variable's minimum, divide by its range (maximum minus minimum), and multiply by 100. I'm not sure if this is the right way, or if SAS is interpreting the code the way I intend.
I assume you are in a Data Step. A Data Step has an implicit loop over the records in the data set. You only have access to the record of the current loop (with some exceptions).
The "SAS" way to do this is to calculate the Min and Max values and then add them to your data set.
Proc sql noprint;
create table want as
select *,
min(hhinc) as min_hhinc,
max(hhinc) as max_hhinc,
min(hhvalue) as min_hhvalue,
max(hhvalue) as max_hhvalue
from have;
quit;
data want;
set want;
hhincscaled = 100*(hhinc - min_hhinc )/ (max_hhinc - min_hhinc);
hhvaluescaled = 100*(hhvalue - min_hhvalue)/ (max_hhvalue - min_hhvalue);
/*Delete this if you want to keep the min max*/
drop min_: max_:;
run;
Another SAS way of doing this is to create the max/min table with PROC MEANS (or PROC SUMMARY or your choice of alternatives) and merge it on. Doesn't require SQL knowledge to do, and probably about the same speed.
proc means data=have;
*use a class value if you have one;
var hhinc hhvalue;
output out=minmax min= max= /autoname;
run;
data want;
if _n_=1 then set minmax; *get the min/max values- they will be retained automatically and available on every row;
set have;
*do your calculations, using the new variables hhinc_max hhinc_min etc.;
run;
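Filled in, the calculation step might look like the sketch below. The sample values are invented. The names hhinc_Min, hhinc_Max and so on are what the AUTONAME option generates from the variable and statistic names.

```sas
/* Hypothetical sample data */
data have;
  input hhinc hhvalue;
  datalines;
20000 100000
50000 250000
80000 400000
;
run;

proc means data=have noprint;
  var hhinc hhvalue;
  output out=minmax min= max= / autoname;
run;

data want;
  if _n_ = 1 then set minmax;   /* read once, values are retained */
  set have;
  hhincscaled   = 100*(hhinc   - hhinc_Min)  /(hhinc_Max   - hhinc_Min);
  hhvaluescaled = 100*(hhvalue - hhvalue_Min)/(hhvalue_Max - hhvalue_Min);
  drop _type_ _freq_ hhinc_Min hhinc_Max hhvalue_Min hhvalue_Max;
run;
```

If a variable is constant (maximum equals minimum), the division produces a missing value, so that case may need separate handling.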
If you have a class statement, i.e. a grouping like 'by state' or similar, add that in PROC MEANS and then do a merge by your class variable instead of the second SET in want. The merge requires the initial dataset to be sorted by that variable.
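A grouped version could be sketched as follows. The grouping variable state and the sample rows are assumptions made for illustration.

```sas
/* Hypothetical sample data with a grouping variable */
data have;
  input state $ hhinc;
  datalines;
NY 20000
NY 60000
CA 30000
CA 90000
CA 60000
;
run;

/* MERGE with BY needs the detail data sorted by the group */
proc sort data=have;
  by state;
run;

proc means data=have noprint nway;
  class state;
  var hhinc;
  output out=minmax(drop=_type_ _freq_) min= max= / autoname;
run;

data want;
  merge have minmax;   /* one min/max row per state, spread over its rows */
  by state;
  hhincscaled = 100*(hhinc - hhinc_Min)/(hhinc_Max - hhinc_Min);
  drop hhinc_Min hhinc_Max;
run;
```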
You also have the option of doing this in SAS/IML, which works more like what you have in mind above. IML is the SAS Interactive Matrix Language, and it is closer to R or MATLAB than to the Base SAS language.