I have an 8-digit date of birth (e.g., 19860710). How can I extract just the 4 digits of the year portion and store them in a new variable (e.g., 1986) in SAS?
When dealing with dates, I suggest (nearly) always converting them to SAS dates. You will save yourself a lot of headaches that way.
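data wanted;
  sas_date = input('19860710', yymmdd8.);  /* read the text as a SAS date */
  sas_year = year(sas_date);               /* numeric year */
  /* SYMPUTX converts the number to text and strips blanks, avoiding a conversion note */
  call symputx('Year_mac_var', sas_year);
run;
%put &Year_mac_var.;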
1986
You could also do this with SUBSTR, with PROC SQL, or arithmetically. This is just what I prefer.
Using substring:
num=19860710; /*Date as Numeric*/
txt=put(num,8.); /*Change Numeric to Char*/
year=substr(txt,1,4);
Or
year=substr("19860710",1,4);
data _null_;
date=19860710;
year=year(input(put(date,8.),yymmdd8.));
put year;
run;
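The arithmetic route mentioned above can be sketched as follows, assuming the date is stored as a plain 8-digit number in yyyymmdd order. The variable names are only illustrative.

```sas
data _null_;
  date = 19860710;                     /* numeric yyyymmdd value, not a SAS date */
  year  = int(date / 10000);           /* drop the last four digits */
  month = mod(int(date / 100), 100);   /* middle two digits */
  day   = mod(date, 100);              /* last two digits */
  put year= month= day=;               /* writes year=1986 month=7 day=10 to the log */
run;
```

This avoids any character conversion, but unlike the INPUT approach it does not validate that the number is a real calendar date.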
Related
I am taking a scripting class and I have no idea what I'm doing!
For my assignment, I am supposed to print min/max/mean/std for each year. The .csv file I was given to use has a year column with the years as
1949.083
1949.167
1949.25
1949.333
1949.417
1949.5
1949.583
1949.667
1949.75
1949.833
1949.917
1950
1950.083
1950.167
and so on, all the way to 1960.
Assuming I am using PROC MEANS, is there a way to maybe combine the years so I can print a single set of calculations (min/max/mean/std) for each year? As in one set of calculations for the year 1949 (data values from 1949-1949.917), another one for 1950 (data values from 1950-1950.917), etc. Not sure if I'm making sense! I've been looking everywhere for hours and I can't figure it out! :(
If you want PROC MEANS to calculate separate statistics per year you can use a CLASS statement. With a CLASS statement it will define the groups based on the formatted value. So if you just use the format 4. with the variable YEAR then each value will be mapped to a simple 4 digit value.
proc means data=have min max mean std ;
class year;
format year 4.;
var analysis_var ;
run;
But that will round values like 1949.667 to 1950 instead of 1949. If you want to ignore the fractional part of the year, you can use the INT() function: first create a new variable, then use that new variable in the CLASS statement.
data step1;
set have;
yrnum = int(year);
run;
proc means data=step1 min max mean std ;
class yrnum ;
var analysis_var ;
run;
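To see the difference between the two approaches before running PROC MEANS, a small check along these lines may help. It is only a sketch; the sample values are taken from the question and the variable names are made up.

```sas
data _null_;
  do year = 1949.417, 1949.667, 1950;
    formatted = put(year, 4.);   /* what the 4. format displays (rounded) */
    truncated = int(year);       /* fractional part dropped */
    put year= formatted= truncated=;
  end;
run;
```

For 1949.667 the formatted value is 1950 while the truncated value is 1949, which is why the INT() version groups all twelve monthly rows under the same year.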
Hi, I need to calculate the numeric value of a date in SAS. For example:
01jan1960 is equal to 1
02jan1960 is equal to 2
So I need to calculate the value for 01aug2020.
I used the INTCK function but got no output.
I want to do this in a DATA step only.
SAS stores dates as the number of days since January 1, 1960, with zero representing that first day. To represent a date in a program, use a quoted string followed by the letter D. The string needs to be something the DATE informat can interpret.
Let's run a little test.
data _null_;
  do dt=0 to 3,"01-JAN-1960"d,'01AUG2020'd;
    put dt= +1 dt date9.;
  end;
run;
dt=0 01JAN1960
dt=1 02JAN1960
dt=2 03JAN1960
dt=3 04JAN1960
dt=0 01JAN1960
dt=22128 01AUG2020
So the date value for '01AUG2020'd is 22,128.
Subtraction works:
days_interval = '01Aug2020'd - '01Jan1960'd;
Or look at the unformatted value, since SAS counts dates from 01Jan1960:
days_interval = '01Aug2020'd;
format days_interval 8.;
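Since the question mentions INTCK, here is a minimal DATA step sketch of that route. The variable names are illustrative. Note that SAS counts 01JAN1960 as 0, so the last line adds 1 only on the assumption that a 1-based count (01jan1960 = 1) is really what is wanted.

```sas
data _null_;
  start  = '01JAN1960'd;
  target = '01AUG2020'd;
  days   = intck('day', start, target);    /* same as target - start, 22128 */
  months = intck('month', start, target);  /* number of month boundaries crossed */
  days_one_based = days + 1;               /* assumption: count 01JAN1960 as day 1 */
  put days= months= days_one_based=;
run;
```

The PUT statement matters here: without it (or an output data set), the step computes the values but writes nothing, which is one possible reason for seeing no output.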
I need to convert SAS character dates without imputing to numeric dates in yymmdd10. format. I tried several formats but it comes out as blank. Is it possible to change character SAS dates to numeric SAS dates?
data check;
length date $10;
date="2013-04-17"; output;
date="2012-11"; output;
date="2011-12-13"; output;
date="2015-03-24"; output;
date="2014"; output;
run;
If I understand "without imputing" correctly, the answer is "you can't".
SAS numeric dates are the number of days from January 1, 1960, until the date you specify. This only has meaning for a specific day. For example, "2014-11" (November 2014) doesn't have a specific number of days from January 1, 1960, because there's a 30-day span there, and "2014" is even worse.
When you have only part of the date, you can impute the day, month, or year values (1st, 15th, 30th of the month, or a random day, etc.); but without imputing you cannot have a SAS numeric date value.
Use the INPUT() function to convert a string to a number. SAS stores dates as numbers, so this function can be used for the conversion.
data want;
set check;
format date2 date9.;
date2 = input(date,anydtdte10.);
run;
Here anydtdte10. is an INFORMAT that tells the function how to interpret the string. It is a generic INFORMAT for most date strings.
Note the 2014 string is not going to work with this. You can check for a missing value in date2 and attempt a different conversion.
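One way to handle the partial strings without imputing anything is to convert only the complete dates and keep whatever parts are known in separate variables. The following is a sketch under that assumption; it reads the CHECK data set from the question, and date2, yr and mo are made-up names.

```sas
data want;
  set check;
  format date2 yymmdd10.;
  /* Only complete yyyy-mm-dd strings become SAS dates;
     ?? suppresses invalid-data messages in the log */
  if length(date) = 10 then date2 = input(date, ?? yymmdd10.);
  /* Keep the known parts as numbers instead of imputing the rest */
  yr = input(scan(date, 1, '-'), ?? 4.);
  mo = input(scan(date, 2, '-'), ?? 2.);
run;
```

Rows such as "2012-11" and "2014" end up with a missing date2 but still carry their year (and month, where present).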
I have a column that contains date values. When imported as numeric, it shows 20668, 20669, etc. If I format it as yymmddn8., it shows 20160802, etc. However, what I really want is a numeric variable whose value is 20160802. I have tried creating separate day, month, and year variables and then concatenating them. Unfortunately, if the month and day are 1 digit each, the result is only 201682. What would be the quickest way to achieve my goal? I guess I can turn the day and month variables into text and add a 0 if the day or month is less than 10, but this is neither elegant nor efficient. Please help.
Thanks
You can just wrap an input around that format:
data test;
date = 20668;
full_date = input(put(date,yymmddn8.),best12.);
run;
The put converts the date to a character string in the format you want it to appear, and the input with the best12. informat converts that string back to a number.
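If you would rather avoid the round trip through a character string, the same number can be built arithmetically from the date parts. This is only a sketch of that alternative, using the same sample value.

```sas
data test;
  date = 20668;   /* SAS date value for 02AUG2016 */
  /* Multiplying by powers of ten keeps the leading zeros of month and day in place */
  full_date = year(date) * 10000 + month(date) * 100 + day(date);
  put full_date=;   /* writes full_date=20160802 to the log */
run;
```

Because the month and day are placed by multiplication rather than concatenation, single-digit values do not collapse into 201682.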
It sounds like you just need to attach a format to your variable.
format date yymmddn8. ;
Try running this program to see a few of the different formats that are available for displaying dates.
data _null_;
do date = 20668, 20669 ;
put (6*date) (=10. =date9. =yymmddn8. =mmddyy10. =ddmmyy10. =yymmdd10.) ;
end;
run;
I have this dataset here which looks like this:
Basically I want to manipulate the data set so that each row has a unique GVKEY1 (such as 1004), then a unique year (such as 1996), then several gvkey2 values after that. However, the number of gvkey2 values for each year is not the same. Does anyone know how to get around this problem? This means I will have 12 lines of data for gvkey1 1004, since I have years from 1996 to 2008. Then for each year I will have many columns, where each column holds one gvkey2.
Best Regards,
Naz
Can you not just use PROC TRANSPOSE?
proc sort data=your_data_set out=temp1;
by gvkey1 year;
run;
proc transpose data=temp1 out=temp2;
by gvkey1 year;
var gvkey2;
run;
This will give you a series of variables COL1 - COLx. Use the PREFIX option for different variable names.
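As an illustration of the PREFIX option, here is a self-contained sketch. The three data rows are made up purely so the step can run, and the prefix gvkey2_ is an arbitrary choice.

```sas
/* Hypothetical sample rows, not taken from the original data */
data your_data_set;
  input gvkey1 year gvkey2;
  datalines;
1004 1996 1013
1004 1996 1050
1004 1997 1013
;
run;

proc sort data=your_data_set out=temp1;
  by gvkey1 year;
run;

/* One row per gvkey1/year, with columns gvkey2_1, gvkey2_2, ... */
proc transpose data=temp1 out=temp2(drop=_name_) prefix=gvkey2_;
  by gvkey1 year;
  var gvkey2;
run;
```

Years with fewer gvkey2 values simply get missing values in the trailing columns, so the unequal counts per year are not a problem.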
I'm not sure I've understood your question, but if you're looking for unique gvkey1/year pairs, you could do either of these:
proc sql;
create table results as
select distinct gvkey1, year
from _your_data_set;
quit;
or
proc sort data=_your_data_set(keep=gvkey1 year) out=results nodupkey;
by gvkey1 year;
run;
If that's not what you're looking for, I suggest posting an example of the results you want.