How to define Current Year in SAS - sas

I am trying to do a very simple SAS calculation: Age = current_year - birth_year.
How do I set the current year? I tried current_year=2018; but SAS rejects it.
I need to know how to set the current year to 2018.
data merged_bike;
merge all_clean_data (in=a) boroughs(in=b);
by station_id;
if a and b;
run;
Current_year =2018;
Age= Current_year - birth_year;
if Age <29 then age_group ='0-29';
else if Age > 30 and age<40 then age_group= '30-39';
else if Age>40 and age<50 then age_group='40-49';
else if Age>50 and age<60 then age_group='50-59';
else if Age>60 and age<70 then age_group='60-69';
run;

You cannot have assignment statements outside of a data step. Remove the extra run; statement that is ending the data step definition too soon.
Or create a second data step to read the data back in and run your age calculations.
If you want today's year you can use the date() function (or its alias today()) and then use the year() function to extract the year.
current_year = year(date());
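Putting both suggestions together, a corrected version of the step might look like the sketch below. The dataset and variable names come from the question. The LENGTH statement and the age-band boundaries (which close the gaps at exactly 29, 30, 40, 50 and 60 left by the original conditions) are assumptions about what was intended.

```sas
data merged_bike;
  merge all_clean_data(in=a) boroughs(in=b);
  by station_id;
  if a and b;                      /* keep stations present in both inputs */

  length age_group $5;             /* assumed: fixed length so no band is truncated */
  current_year = year(today());    /* or hard-code it: current_year = 2018; */
  age = current_year - birth_year;

  /* Assumed band edges; ages of 70 and over stay blank, as in the question */
  if missing(age) then age_group = ' ';
  else if age < 30 then age_group = '0-29';
  else if age < 40 then age_group = '30-39';
  else if age < 50 then age_group = '40-49';
  else if age < 60 then age_group = '50-59';
  else if age < 70 then age_group = '60-69';
run;
```

Because every assignment now sits between the DATA statement and a single RUN, SAS treats them as part of the step.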

Quickest way to fill in missing dates in a sequence? SAS

Say you download data for stocks or bonds, you have the stock price or yield for every trading day. So you have two variables, stock price (or yield if bond) and date. What is the quickest way to add weekends and holidays to the dates variable while using the previous open day as the values for those missing days?
For example, if it were July 1, 2022 there would be a stock price, let's say $100, corresponding to that date, but during the long weekend (4th of July) there are no observations in the data for July 2nd through 4th. How do you add those dates with the stock price equal to $100 until the next trading day, July 5th?
I used a DO loop to create the dates, then merged and used RETAIN, but I feel like there's got to be a quicker method.
You could just add an OUTPUT statement in a DO loop. The tricky part is getting the next date. Here is a method using a second SET statement that is offset by one observation.
data want;
set have ;
by date;
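/* Look-ahead: read the next observation's date. obs=1 adds one padding row so this SET does not hit end-of-file before the last observation is processed. */
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
/* The last observation has no next date, so fill only the current day */
next_date = coalesce(next_date,date+1);
/* Stop one day before the next trading day so that date is not output twice */
do date=date to next_date-1;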
output;
end;
run;
But your real data probably has multiple stocks. So add some BY group processing.
data want;
set have ;
by stock date;
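/* Look-ahead with one padding row (obs=1) so the last observation is still processed */
set have(firstobs=2 keep=date rename=(date=next_date)) have(obs=1 drop=_all_);
/* Do not fill across stocks: the last row of each stock covers only its own date */
if last.stock then next_date=date+1;
/* Stop one day before the next trading day so that date is not output twice */
do date=date to next_date-1;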
output;
end;
run;

SAS combining range of class values for PROC MEANS

I am taking a scripting class and I have no idea what I'm doing!
For my assignment, I am supposed to print min/max/mean/std for each year. The .csv file I was given to use has a year column with the years as
1949.083
1949.167
1949.25
1949.333
1949.417
1949.5
1949.583
1949.667
1949.75
1949.833
1949.917
1950
1950.083
1950.167
and so on, all the way to 1960.
Assuming I am using PROC MEANS, is there a way to maybe combine the years so I can print a single set of calculations (min/max/mean/std) for each year? As in one set of calculations for the year 1949 (data values from 1949-1949.917), another one for 1950 (data values from 1950-1950.917), etc. Not sure if I'm making sense! I've been looking everywhere for hours and I can't figure it out! :(
If you want PROC MEANS to calculate separate statistics per year you can use a CLASS statement. With a CLASS statement it will define the groups based on the formatted value. So if you just use the format 4. with the variable YEAR then each value will be mapped to a simple 4 digit value.
proc means data=have min max mean std ;
class year;
format year 4.;
var analysis_var ;
run;
But that will round values like 1949.667 to 1950 and not 1949. If you want to ignore the fractional part of the year you can use the INT() function. So first create a new variable and then use that new variable in the CLASS statement.
data step1;
set have;
yrnum = int(year);
run;
proc means data=step1 min max mean std ;
class yrnum ;
var analysis_var ;
run;
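To see the difference between the two approaches on a single value, a throwaway step like this one can help. It is only an illustration: the variable names `rounded` and `truncated` are made up for the example.

```sas
data _null_;
  year = 1949.667;
  rounded   = put(year, 4.);  /* the 4. format rounds to the nearest integer */
  truncated = int(year);      /* INT() drops the fractional part */
  put rounded= truncated=;    /* writes both values to the log */
run;
```

The log should show rounded=1950 and truncated=1949, which is why the INT() version groups the observations by the calendar year they fall in.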

Extremely New to SAS

I am new to SAS and I am struggling with my code. I would love some help. Am I thinking about this the right way? I have a huge table and I want to extract the data for certain dates. My two dates are 1969-12-01 and 1948-01-01. My sample code:
data _null_;
call symput ('timenow',put (time(),time.));
call symput ('datenow',put (date(),date9.));
run;
title "The current time is &timenow and the date is &datenow";
proc print data=sashelp.buy;
run;
First learn about your dataset. For example, run PROC CONTENTS.
proc contents data=sashelp.buy; run;
That will show you that there is a variable named DATE that has date values (number of days since 1960).
So to reference a specific date use a date literal. That is a quoted string in the style that the DATE informat can read followed by the letter D. You can then use a WHERE statement to filter the data.
data want;
set sashelp.buy;
where date = '31dec1969'd ;
run;
That will not find any observations, since that date does not appear in that dataset.
If you want to select multiple dates you can add more conditions using OR.
where (date = '31dec1969'd) or (date = '01jan1948'd);
You can also use the IN operator:
where date in ('31dec1969'd '01jan1948'd);
Note that if your variable contains datetime values (number of seconds) then to pick a specific date you would either need to use a range of datetime literals:
where datetime between '31dec1969:00:00'dt and '31dec1969:23:59:59'dt;
Or convert the number of seconds into number of days and compare to the date literal.
where datepart(datetime) = '31dec1969'd ;
Welcome to StackOverflow Sportsguy3090.
Here I make a dataset called sample with some sample dates. That dataset has a variable called name and another variable called date. Internally, SAS stores dates as the number of days before or after January 1st, 1960. That is rough to look at, so I use the format statement to have the dates appear as a 10-character string with month/day/year.
data sample;
name = "Abe "; date = "01Dec1969"d; output;
name = "Betty"; date = "01Jan1948"d; output;
name = "Carl"; date = "06Jun1960"d; output;
name = "Doug"; date = "06Dec1969"d; output;
name = "Ed"; date = "01Jan1947"d; output;
format date mmddyy10.;
run;
The code below subsets the data and puts the good records into a new dataset called keepers. It only keeps the records that are in the date range (including the limit dates).
data keepers;
set sample;
where date between "01jan1948"d and "01Dec1969"d;
run;
I hope that helps.... if not send up another flare.

How can I use different sets of date values depending on the date

I am looking to automate a daily report for my company but I have run into a bit of trouble. The report gets updated only on the 2nd working day of each month. I found some code on the SAS website which works out what the 2nd working day of any month is.
data scdwrk;
/* advance date to the first day of the month using the INTNX function */
second=intnx('month',today(),0);
/* determine the day of the week using the WEEKDAY function */
day=weekday(second);
/* if day=Monday then advance by 1 */
if day=2 then second+1;
/* if day=Sunday then advance by 2 */
else if day=1 then second+2;
format second date9.;
run ;
I have also set a flag that compares today's date to the date generated by this piece of code.
I now need a way to make the code, when it is run on the first working day of the month, use a particular set of macro date variables
%let start_date="&prevmnth;
%let end_date= &endprevmnth;
%let month= &prevyearmnth;
and then, when it's run on the 2nd working day of the month, use the other set of macro date variables (calendar month)
%let start_date="&currmnth;
%let end_date= &endcurrmnth;
%let month= &curryearmnth;
Any help on this would be greatly appreciated.
I have some recent code that does just this. Here is how I tackled it.
First, create a table of holidays. This can be maintained yearly.
Second, create a table with the weekdays (non-weekend days) among the first six days of the month.
Third, delete holidays.
Finally, get the second value in the data set.
data holidays;
format holiday_date date9.;
informat holiday_date date9.;
input holiday_date;
datalines;
01JAN2015
19JAN2015
16FEB2015
03APR2015
25MAY2015
03JUL2015
07SEP2015
26NOV2015
25DEC2015
;
data _dates;
firstday = intnx('month',today(),0);
format firstday date date9.;
do date=firstday to firstday+5;
if 1 < weekday(date) < 7 then
output;
end;
run;
proc sql noprint;
delete from _dates
where date in (select holiday_date from holidays);
quit;
data _null_;
set _dates(firstobs=2);
call symput("secondWorkDay",put(date,date9.));
stop;
run;
%put &secondWorkDay;
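The steps above stop at finding the date. To switch between the two sets of macro variables from the question, one possibility is a small macro like the sketch below. It assumes that &secondWorkDay was created as shown (in DATE9 form) and that the question's &prevmnth, &endprevmnth, &prevyearmnth, &currmnth, &endcurrmnth and &curryearmnth already exist. The macro name set_report_dates is hypothetical, and treating every day before the second working day as the "first working day" case is also an assumption.

```sas
%macro set_report_dates;
  /* Make the results visible outside the macro */
  %global start_date end_date month;

  /* %sysevalf turns the date literal into a SAS date number for the comparison */
  %if %sysfunc(today()) < %sysevalf("&secondWorkDay"d) %then %do;
    /* Before the second working day: report on the previous month */
    %let start_date = &prevmnth;
    %let end_date   = &endprevmnth;
    %let month      = &prevyearmnth;
  %end;
  %else %do;
    /* On or after the second working day: report on the current month */
    %let start_date = &currmnth;
    %let end_date   = &endcurrmnth;
    %let month      = &curryearmnth;
  %end;
%mend set_report_dates;

%set_report_dates
%put start_date=&start_date end_date=&end_date month=&month;
```

Wrapping the logic in a macro keeps the %IF/%THEN/%DO blocks valid in any SAS 9 release, since older releases do not allow %IF in open code.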

Extract month and year for new variable

I have a dataset with a variable that has the date of orders (MMDDYY10.). I need to extract the month and year to look at orders by month--year because there are multiple years. I am vaguely aware of the month and year functions, but how can I use them together to create a new month bin?
Ideally I would create a bin for each month/year so my output can look something like:
Date
Item    2011OCT  2011NOV  2011DEC  2012JAN  ...
a            50       40       30       20
b            15       20       25       30
c             1        2        3        4
total
Here is a sample code I created:
data dsstabl;
set dsstabl;
order_month = month(AC_DATE_OF_IMAGE_ORDER);
order_year = year(AC_DATE_OF_IMAGE_ORDER);
order = compress(order_month||order_year);
run;
proc freq data=dsstabl;
table item * order;
run;
Thanks in advance!
When you are doing your analysis, use an appropriate format. MONYY. sounds like the right one. That will sort properly and will group values accordingly.
Something like:
proc means data=yourdata;
class datevar;
format datevar MONYY7.;
var whatever;
run;
So your table:
proc tabulate data=dsstabl;
class item datevar;
format datevar MONYY7.;
tables item,datevar*n;
run;
Or you can do it with a nice trick.
data my_data_with_months;
set my_data;
MONTH = INTNX('month', NORMAL_DATE, 0, 'B');
run;
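This maps every date to the first day of its month, so all dates in the same month share one value that you can group on. I always use it.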