How to convert 201711 into 13th Nov 2017? - sas

Hi, I am a beginner in SAS and I need help with this question.
I want to convert 201711 to 13th Nov 2017. I cannot work out how to do this.
Please help, and thanks in advance.

If this is for display purposes, and assuming all date values are in the same format as the one in your question, then this should work.
First, create a format to display the months:
proc format lib=work;
value mon
1 = "Jan"
2 = "Feb"
3 = "Mar"
4 = "Apr"
5 = "May"
6 = "Jun"
7 = "Jul"
8 = "Aug"
9 = "Sep"
10 = "Oct"
11 = "Nov"
12 = "Dec"
;
run;
Then extract the month and the year from your date variable with substrn and apply the formats. Note that this produces "Nov 2017" only; the day is not part of the input.
data have;
length full_date $20;
date = 201711;
mon = input(substrn(date,5,2),best.);
yr = input(substrn(date,1,4),best.);
full_date = compbl(put(mon,mon.)||put(yr,best.));
run;
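A possible alternative, sketched here under assumptions, is to turn the value into a real SAS date first and let the built-in formats do the work. The day of month (13) does not exist in 201711, so it is supplied by hand, and the variable names month_start, full_date and label_txt are made up for illustration.

```sas
data want;
  date = 201711;                                 /* numeric yyyymm value, as in the question */
  /* read the six digits with the YYMMN6. informat: first day of that month */
  month_start = input(put(date, 6.), yymmn6.);
  /* assumption: the day of month (13) is supplied by hand */
  full_date = mdy(month(month_start), 13, year(month_start));
  /* text label built from day, abbreviated month name and year, e.g. 13 Nov 2017 */
  label_txt = catx(' ', day(full_date), put(full_date, monname3.), year(full_date));
  format month_start full_date date9.;
run;
```

Keeping full_date as a numeric SAS date means it can still be sorted, compared and reformatted later; label_txt is only for display.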

If '201711' is just some text you have to convert, then the day number is missing, so it will have to be added. SAS stores dates as numbers, so it is useful to convert text dates to a SAS date value. The date can then be formatted as needed:
data want;
have = '201711'; /* given partial date */
add_day = '13'; /* day of month to add */
full_dt = cats(have,add_day); /* join day to partial date */
num_dt = input(full_dt,yymmdd8.); /* convert to a SAS date */
text_dt = put(num_dt,date9.); /* format as desired */
run;
As you are new to SAS I have commented what each part is doing, but it would be more useful for you to understand date handling / processing in SAS, e.g. the following is a useful start:
http://support.sas.com/documentation/cdl/en/lrcon/65287/HTML/default/viewer.htm#p1wj0wt2ebe2a0n1lv4lem9hdc0v.htm

Not getting the results from the suggestions trying to subset data

DATA proj4.gasQTR;
SET proj4.gasQTR;
INPUT Q1 Q2 Q3 Q4;
IF MONTH = 1 or 2 or 3 THEN Q1 = 1;
ELSE IF MONTH = 4 or 5 or 6 THEN Q2 = 2;
ELSE IF MONTH = 7 or 8 or 9 THEN Q3 = 3;
ELSE IF MONTH = 10 or 11 or 12 THEN Q4 = 4;
quarter = MONTH; FORMAT Quarter qtrw.;
RUN;
I am trying to get a 1-4 value for each quarter of each year. My error comes from Quarter qtrw.: 'ERROR 388-185 Expecting an arithmetic operator'
*Data is already in 1-4 format for the month variable
What am I doing wrong?
Any help would be appreciated!
Thank you!
You normally do not use both a SET statement to retrieve data from an existing dataset and an INPUT statement to read values from a text file in the same data step. And if you do want to INPUT values from a text file you must tell SAS where to find the text by including either an INFILE statement or add the text in-line with the code by using a DATALINES (or CARDS) statement.
SAS will consider any number that is not zero or missing as TRUE. So a condition such as MONTH = 1 or 2 or 3 is always TRUE, because 2 and 3 are evaluated as conditions on their own. So Q1 will always be set to 1, and Q2, Q3 and Q4 will always be missing (or, if they existed already, unchanged). If you want to test whether a variable has any of a number of values, use the IN operator instead of the equality operator, for example: month in (1 2 3 4)
You also should not be reading and writing the same dataset. If there are logic issues in your code you might destroy the original dataset. So hopefully you have a backup copy of proj4.gasQTR, or a program that can recreate it.
What is the format QTRW ? Is that something you created? Show its definition.
Assuming you have a variable named MONTH with integer values in the range 1 to 12 you can calculate QUARTER with integer values in the range 1 to 4 with a simple arithmetic function instead of coding a series of IF conditions.
data want;
set have;
quarter = ceil(month/3) ;
run;
If you actually have a DATE variable then perhaps all you were supposed to do was use the MONTH or QTR format to display the dates as the month number or quarter number that they fall into.
Try this program to see the impact of applying different formats to the same values.
data test;
do month=1 to 12;
date1=mdy(month,1,2022);
date2=date1;
date3=date1;
output;
end;
format date1 date9. date2 month. date3 qtr.;
run;
proc print;
run;
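If a stored quarter number is wanted rather than just a different display of the same date, the QTR() function returns 1 to 4 directly from a SAS date. A minimal sketch, reusing the same made-up 2022 dates as above (the dataset name test_qtr is only illustrative):

```sas
data test_qtr;
  do month = 1 to 12;
    date = mdy(month, 1, 2022);   /* first day of each month in 2022 */
    quarter = qtr(date);          /* numeric quarter, 1 to 4 */
    output;
  end;
  format date date9.;
run;

proc print data=test_qtr;
run;
```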
Use the in operator or repeat the equality for every case.
Example from the doc:
You can use the IN operator with character strings to determine whether a variable's value is among a list of character values. The following statements produce the same results:
if state in ('NY','NJ','PA') then region+1;
if state='NY' or state='NJ' or state='PA' then region+1;
Therefore
DATA proj4.gasQTR;
SET proj4.gasQTR;
IF MONTH = 1 or MONTH = 2 or MONTH = 3 THEN Q1 = 1;
ELSE IF MONTH = 4 or MONTH = 5 or MONTH = 6 THEN Q2 = 2;
ELSE IF MONTH = 7 or MONTH = 8 or MONTH = 9 THEN Q3 = 3;
ELSE IF MONTH = 10 or MONTH = 11 or MONTH = 12 THEN Q4 = 4;
quarter = MONTH; FORMAT Quarter qtrw.;
RUN;
is equivalent to
DATA proj4.gasQTR;
SET proj4.gasQTR;
IF MONTH in (1,2,3) THEN Q1 = 1;
ELSE IF MONTH in (4,5,6) THEN Q2 = 2;
ELSE IF MONTH in (7,8,9) THEN Q3 = 3;
ELSE IF MONTH in (10,11,12) THEN Q4 = 4;
quarter = MONTH; FORMAT Quarter qtrw.;
RUN;

How can I select the first and last week of each month in SAS?

I have monthly data with several observations per day. I have day, month and year variables. How can I retain data from only the first and the last 5 days of each month? I have only weekdays in my data so the first and last five days of the month changes from month to month, ie for Jan 2008 the first five days can be 2nd, 3rd, 4th, 7th and 8th of the month.
Below is an example of the data file. I wasn't sure how to share this so I just copied some lines below. This is from Jan 2, 2008.
Would a variation of first.variable and last.variable work? How can I retain observations from the first 5 days and last 5 days of each month?
Thanks.
1 AA 500 B 36.9800 NH 2 1 2008 9:10:21
2 AA 500 S 36.4500 NN 2 1 2008 9:30:41
3 AA 100 B 36.4700 NH 2 1 2008 9:30:43
4 AA 100 B 36.4700 NH 2 1 2008 9:30:48
5 AA 50 S 36.4500 NN 2 1 2008 9:30:49
If you want to examine the data and determine the minimum 5 and maximum 5 values then you can use PROC SUMMARY. You could then merge the result back with the data to select the records.
So if your data has variables YEAR, MONTH and DAY you can make a new data set that has the top and bottom five days per month using simple steps.
proc sort data=HAVE (keep=year month day) nodupkey
out=ALLDAYS;
by year month day;
run;
proc summary data=ALLDAYS nway;
class year month;
output out=MIDDLE
idgroup(min(day) out[5](day)=min_day)
idgroup(max(day) out[5](day)=max_day)
/ autoname ;
run;
proc transpose data=MIDDLE out=DAYS (rename=(col1=day));
by year month;
var min_day: max_day: ;
run;
proc sql ;
create table WANT as
select a.*
from HAVE a
inner join DAYS b
on a.year=b.year and a.month=b.month and a.day = b.day
;
quit;
/****
get some dates to play with
****/
data dates(keep=i thisdate);
offset = input('01Jan2015',DATE9.);
do i=1 to 100;
thisdate = offset + round(599*ranuni(1)+1); *** within 600 days from offset;
output;
end;
format thisdate date9.;
run;
/****
BTW: intnx('month',thisdate,1) = first day of next month. Deduct 1 to get the last day
of the current month.
intnx('month',thisdate,0,"BEGINNING") = first day of the current month
****/
proc sql;
create table first5_last5 AS
SELECT
*
FROM
dates /* replace with name of your data set */
WHERE
/* replace all occurrences of 'thisdate' with name of your date variable */
( intnx('month',thisdate,1)-5 <= thisdate <= intnx('month',thisdate,1)-1 )
OR
( intnx('month',thisdate,0,"BEGINNING") <= thisdate <= intnx('month',thisdate,0,"BEGINNING")+4 )
ORDER BY
thisdate;
quit;
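As a side note, INTNX also accepts an 'END' alignment, which gives the last day of the month without the "next month minus one" step. A small sketch with a hard-coded date chosen only for illustration:

```sas
data _null_;
  thisdate    = '15FEB2016'd;                             /* illustrative date only */
  month_start = intnx('month', thisdate, 0, 'BEGINNING'); /* 01FEB2016 */
  month_end   = intnx('month', thisdate, 0, 'END');       /* 29FEB2016, 2016 is a leap year */
  put month_start= date9. month_end= date9.;
run;
```

Like the query above, this works on calendar days, not on the weekdays actually present in the data.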
Create some data with the desired structure:
Data inData (drop=_:); * forget all variables starting with an underscore*;
format date yymmdd10. time time8.;
_instant = datetime();
do _i = 1 to 1E5;
date = datepart(_instant);
time = timepart(_instant);
yy = year(date);
mm = month(date);
dd = day(date);
*just some more random data*;
letter = byte(rank('a') +floor(rand('uniform', 0, 26)));
*select week days*;
if weekday(date) in (2,3,4,5,6) then output;
_instant = _instant + 1E5*rand('exponential');
end;
run;
Count the days per month:
proc sql;
create view dayCounts as
select yy, mm, count(distinct dd) as _countInMonth
from inData
group by yy, mm;
quit;
Select the days:
data first_5(drop=_:) last_5(drop=_:);
merge inData dayCounts;
by yy mm;
_newDay = dif(date) ne 0;
retain _nrInMonth;
if first.mm then _nrInMonth = 1;
else if _newDay then _nrInMonth + 1;
if _nrInMonth le 5 then output first_5;
if _nrInMonth gt _countInMonth - 5 then output last_5;
run;
Use the INTNX() function. You can use INTNX('month',...) to find the beginning and ending days of the month and then use INTNX('weekday',...) to find the first 5 week days and last five week days.
You can convert your month, day, year values into a date using the MDY() function. Let's assume that you do that and create a variable called TODAY. Then to test if it is within the first 5 weekdays or last 5 weekdays of the month you could do something like this:
first5 = intnx('weekday',intnx('month',today,0,'B'),0) <= today
<= intnx('weekday',intnx('month',today,0,'B'),4) ;
last5 = intnx('weekday',intnx('month',today,0,'E'),-4) <= today
<= intnx('weekday',intnx('month',today,0,'E'),0) ;
Note that those ranges will include the week-ends, but it shouldn't matter if your data doesn't have those dates.
But you might have issues if your data skips holidays.
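Putting those pieces together, a minimal sketch might look like this. It assumes a dataset named HAVE with numeric YEAR, MONTH and DAY variables (names taken from the question's description, not confirmed), and it inherits the caveats above about week-ends and holidays, so the result is worth checking for months that start or end on a week-end.

```sas
data want;
  set have;                                    /* assumed input: year, month, day */
  today = mdy(month, day, year);               /* build a SAS date from the parts */
  month_start = intnx('month', today, 0, 'B'); /* first calendar day of the month */
  month_end   = intnx('month', today, 0, 'E'); /* last calendar day of the month */
  /* 1 when TODAY falls in the first / last five weekday intervals, else 0 */
  first5 = intnx('weekday', month_start, 0) <= today <= intnx('weekday', month_start, 4);
  last5  = intnx('weekday', month_end, -4) <= today <= intnx('weekday', month_end, 0);
  if first5 or last5;                          /* keep only the rows in either range */
  format today date9.;
  drop month_start month_end;
run;
```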

How to get week value based on financial year in SAS?

I have the dataset below, and I need to find the week number of each date based on the financial year (e.g. April 2013 to March 2014). For example, 01AprXXX should be the 0th or 1st week of the year, and the last week of the following March should be week 52 or 53. I have tried one way of doing this (the code is below as well).
I am just curious to know if there is a better way to do this in SAS.
Thanks in advance. Please let me know if this question is redundant; in that case I will delete it as soon as possible, although I searched for the concept and did not find anything.
Also, my apologies for my English; it may not be grammatically correct. But I hope I am able to convey my point.
DATA
data dsn;
format date date9.;
input date date9.;
cards;
01Nov2015
08Sep2013
06Feb2011
09Mar2004
31Mar2009
01Apr2007
;
run;
CODE
data dsn2;
set dsn;
week_normal = week(date);
dat2 = input(compress("01Apr"||year(date)),date9.);
week_temp = week(dat2);
format dat2 date9.;
x1 = month(input(compress('01Jan'||(year(date)+1)),date9.)) ;***lower cutoff;
x2 = month(input(compress("31mar"||year(date)),date9.)); ***upper cutoff;
x3 = week(input(compress("31dec"||(year(date)-1)),date9.)); ***final week value for the previous year;
if month(dat2) <= month(date) <= month(input(compress("31dec"||year(date)),date9.)) then week_f = week(date) - week_temp;
else if x2 >= month(date) >= x1 then week_f = week_normal + x3 - week(input(compress("31mar"||(year(date)+1)),date9.)) ;
run;
INTCK and INTNX are your best bets here. You could use them as I do below, or you could use the advanced functionality with defining your own interval type (fiscal year); that's described more in the documentation.
data dsn2;
set dsn;
week_normal = week(date);
dat2 = intnx('month12.4',date,0); *12 month periods, starting at month 4 (so 01APR), go to the start of the current one;
week_F = intck('week',dat2,date); *Can adjust what day this starts on by adding numbers to week, so 'week.1' etc. shifts the start of week day by one;
format dat2 date9.;
run;
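The same idea can also be written with the shifted yearly interval 'year.4', which likewise starts each period on 1 April. The sketch below is a variation, not the answer's own code: the names fy_start and week_fy are invented, and adding 1 to get a 1-based week number is an assumption about what the asker wants.

```sas
data dsn3;
  set dsn;
  /* start of the April-to-March financial year containing DATE */
  fy_start = intnx('year.4', date, 0, 'B');
  /* INTCK counts week boundaries crossed (weeks start on Sunday by default),
     so 01Apr gives 0; the + 1 makes the numbering start at 1 (assumption) */
  week_fy = intck('week', fy_start, date) + 1;
  format fy_start date9.;
run;
```

Because week boundaries fall on Sundays, the first financial week can be shorter than seven days.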

How to update previous retained rows in SAS / if condition?

I have a database like this. This corresponds to a single person and I have this type of data for multiple persons.
data test;
input date YYMMDD10. real_length min_length;
format date YYMMDD10.;
cards;
2000-02-23 1 7
2000-02-24 12 15
2000-03-07 15 7
2000-03-22 7 15
2000-03-29 13 7
2000-04-11 17 7
2000-04-28 . 7
run;
What I am looking for is this: if the interval between 2 dates in consecutive lines (real_length) is shorter than a certain length (min_length), I want to replace the date in the next line by the previous date + min_length. So far, this is not a problem and here is the code I used to achieve it:
data test2;
set test;
format lagdate min_date YYMMDD10.;
retain lagmin lagdate;
if lag(real_length) < lag(min_length) and lag(real_length) ~= . then min_date = lagdate + lagmin;
else min_date = date;
lagdate = min_date;
lagmin = min_length;
run;
Which gives :
date min_date min_length
2000-02-23 2000-02-23 7
2000-02-24 2000-03-01 15
2000-03-07 2000-03-16 7
2000-03-22 2000-03-22 15
...
The problem is that now the interval between 2 consecutive dates could become less than the minimal length, e.g.: 2000-03-22 - 2000-03-16 = 6 days < min_length = 7. And I would like to have 2000-03-23 = 2000-03-16 + 7 (=min_length) instead of 2000-03-22, like this:
date min_date min_length
2000-02-23 2000-02-23 7
2000-02-24 2000-03-01 15
2000-03-07 2000-03-16 7
2000-03-22 2000-03-23 15
...
So I've tried this code, but it does not work... I believe the problem could be in the if condition.
data test2;
set test;
format lagdate min_date YYMMDD10.;
retain lagmin lagdate;
if (lag(real_length) < lag(min_length) and lag(real_length) ~= .) or (adjust_length < lag(min_length) and adjust_length ~=.) then min_date = lagdate + lagmin;
else min_date = date;
adjust_length = min_date - lagdate;
lagdate = min_date;
lagmin = min_length;
run;
Does anybody see why this isn't working, or do you have another way of doing this?
Thank you!
The problem is that each time you adjust one date, you have to move all the subsequent dates as well if they're bunched up together. I think you can do this by keeping a running total of how many days you've added on to all the previous rows and then adding on only what's needed after that to get to the min_length between dates:
data want;
set test;
format t_min_date min_date yymmdd10.;
if _n_ = 1 then total_adj = 0;
t_min_date = date + min_length + total_adj;
min_date = lag1(t_min_date);
total_adj + max(0,min_length - real_length);
run;
Is that what you were aiming for?
N.B. you'll need to replace the if _n_ = 1 with some first.id and last.id logic to make this work for multiple individuals in the same dataset.
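One way that BY-group version could look is sketched below. It assumes the data carries a person identifier called id (a hypothetical name, not in the sample data) and is sorted by id and date. Setting min_date to the original date on each person's first row is also an assumption, chosen to match the desired output shown in the question.

```sas
data want;
  set test;                       /* assumed to contain id, sorted by id and date */
  by id;
  format t_min_date min_date yymmdd10.;
  if first.id then total_adj = 0; /* restart the running adjustment per person */
  t_min_date = date + min_length + total_adj;
  min_date = lag1(t_min_date);    /* LAG is called on every row so its queue stays in step */
  if first.id then min_date = date; /* do not carry the previous person's value over */
  total_adj + max(0, min_length - real_length); /* sum statement: total_adj is retained */
run;
```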

How to subtract date by days and keep the format

This may be a simple question, but I want to subtract 6 days from 01/09/2012 and keep the
format of DD/MM/YYYY. How would I do this? Also, if I compare this with another date in the same format, does SAS actually compare the dates? So if I said
If (Date1<Date2) /*Does this work in SAS */
SAS dates are simply stored as the number of days since 01JAN1960 - so just subtract six :-)
See my log:
44 data _null_;
45 date1 = '01SEP2012'd;
46 date2 = date1 - 6;
47 put date2= ddmmyys10.; /* the format you need */
48 if (date1 < date2) then put 'false'; /* this DOES work in SAS */
49 else put date1= date2=; /* unformatted - num of days*/
50 run;
date2=26/08/2012
date1=19237 date2=19231
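If the dates are held as character strings such as '01/09/2012' rather than as SAS date values (an assumption, since the question does not say), comparing them with < would compare text, not dates. A small sketch of converting first, with illustrative variable names:

```sas
data _null_;
  text1 = '01/09/2012';               /* date held as DD/MM/YYYY text */
  date1 = input(text1, ddmmyy10.);    /* convert the text to a SAS date value */
  date2 = date1 - 6;                  /* subtract six days */
  text2 = put(date2, ddmmyy10.);      /* back to DD/MM/YYYY text: 26/08/2012 */
  put text2=;
  if date2 < date1 then put 'date2 is earlier than date1'; /* numeric comparison */
run;
```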