I need to filter on a date in the WHERE clause of a PROC SQL query.
The date is stored like this: "SAT Mar 17 01:29:17 IST 2018" (string column, length 28).
I have tried input(Date,datetime18.) and some other date functions, but they all give me errors. Below is my query:
proc sql;
Select input(Date,datetime18.) from table;
quit;
How can I convert this date into a simple date like "17-03-2018" so that I can use it in a PROC SQL query?
A SAS date is numeric, so you should not compare it with a string value. Convert your string to a date as well and compare it with a date literal. Greater-than and less-than comparisons on strings generally serve no purpose here and can give wrong results; they only make sense when you compare numeric values.
data have;
b = "SAT Mar 17 01:29:17 IST 2018";
output;
b= "SAT Mar 19 01:29:17 IST 2018";
output;
b= "SAT Jun 20 01:29:17 IST 2018";
output;
b= "SAT Mar 25 01:29:17 IST 2018";
output;
run;
proc sql;
select * from have
where input(cats(scan(b,3),scan(b,2), scan(b, -1)),date9.) > "19Mar2018"d;
quit;
ANYDTDTM is a good go-to informat in many cases; however, as you point out (in comments), it does not handle the datetime format presented in the question.
The string can be rearranged into a datetime representation that SAS can read, using cats and scan:
data have;
date_string = "SAT Mar 17 01:29:17 IST 2018";
run;
data want;
set have;
dt_wrong = input(date_string, anydtdtm.);
dt_right = input ( cats
( scan(date_string,3),
scan(date_string,2),
scan(date_string,6),':',
scan(date_string,4)
), datetime20.);
put date_string= /;
put dt_right= datetime20. " from input of cats of string parts";
put dt_wrong= datetime20. " from anydtdtm ";
run;
* sample macro that can be applied in data step or sql;
%macro dts_to_datetime(dts);
input ( cats
( scan( &dts , 3),
scan( &dts , 2),
scan( &dts , 6), ':',
scan( &dts , 4)
)
, datetime20.)
%mend;
data want2;
set have;
dt_righton = %dts_to_datetime(date_string);
format dt_righton datetime20.;
put dt_righton=;
run;
The macro can also be used in where statements such as
where '18-Mar-2018:0:0'DT <= %dts_to_datetime (date_string)
Or SQL
, %dts_to_datetime (date_string) as event_dtm format=datetime20.
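For the original goal of a plain date displayed as 17-03-2018, here is a minimal sketch. It assumes the column always has the six space-separated parts shown in the question; the second sample row and the column name event_date are made up for illustration. It builds a SAS date (not a datetime) from the day, month, and year parts, displays it with the ddmmyyd10. format, and filters against a date literal.

```sas
/* Sample data: the second row is hypothetical, added so the filter has something to keep */
data have;
  length date_string $28;
  date_string = "SAT Mar 17 01:29:17 IST 2018"; output;
  date_string = "TUE Mar 20 08:15:00 IST 2018"; output;
run;

proc sql;
  select date_string,
         /* day (3rd word), month (2nd word), year (6th word) -> e.g. 17Mar2018 */
         input(cats(scan(date_string,3),
                    scan(date_string,2),
                    scan(date_string,6)), date9.)
           as event_date format=ddmmyyd10.
  from have
  /* CALCULATED reuses the column computed in the SELECT list */
  where calculated event_date >= '18Mar2018'd;
quit;
```

With these two sample rows, only the 20 March row should be returned, shown as 20-03-2018. The stored value stays numeric; the format only controls how it is displayed.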
I used substr. I created a new column, NEW_DATE, in the dataset. As mentioned above, the date format is Sat Mar 17 01:01:01 IST 2018. Below is the data step that extracts the date in DD-MMM-YYYY format:
data new;
set old;
new_date = substr(date,9,2)||"-"||substr(date,5,3)||"-"||substr(date,25,4);
new_date_updated = input(new_date,date11.);
format new_date_updated date11.;
run;
Then I used the new column in PROC SQL:
proc sql;
select * from new where new_date_updated>'17-Mar-2018'd;
quit;
and it worked for me.
Thanks
I checked the above approach against all scenarios and it works fine in all of them.
I need help retaining values in a SAS dataset and filling in the datetime column (to the level of seconds) where a second is missing.
My dataset looks like:
data HAVE;
input type$ DATE:datetime18. value;
format date datetime18.;
cards;
A 19JUN01:21:06:55 534
A 19JUN01:21:06:58 590
A 19JUN01:21:07:02 600
A 19JUN01:21:07:04 602
B 18JUN01:22:06:58 105
B 18JUN01:22:07:03 110
;
run;
I need to fill the missing datetime and repeat the value when needed.
My result dataset should be:
data WANT;
input type$ DATE:datetime18. value;
format date datetime18.;
cards;
A 19JUN01:21:06:55 534
A 19JUN01:21:06:56 534
A 19JUN01:21:06:57 534
A 19JUN01:21:06:58 590
A 19JUN01:21:06:59 590
A 19JUN01:21:07:00 590
A 19JUN01:21:07:01 590
A 19JUN01:21:07:02 600
A 19JUN01:21:07:03 600
A 19JUN01:21:07:04 602
B 18JUN01:22:06:58 105
B 18JUN01:22:06:59 105
B 18JUN01:22:07:00 105
B 18JUN01:22:07:01 105
B 18JUN01:22:07:02 105
B 18JUN01:22:07:03 110
;
run;
Thanks for your suggestions.
Regards
If you have SAS/ETS, proc expand will do the entire conversion for you with the step method.
proc expand data=have out=want to=second;
by type;
id date;
convert value / method=step;
run;
If you don't, you can do this using a few DATA Steps.
First, create a template of all datetimes that you want for each by-group. We want something that looks like this:
type date_start date_end
A 19JUN01:21:06:55 19JUN01:21:07:04
B 18JUN01:22:06:58 18JUN01:22:07:03
The following code will do this:
data date_start_end;
set have;
by type date;
retain date_start;
if(first.type) then date_start = date;
if(last.type) then do;
date_end = date;
output;
end;
format date_start date_end datetime.;
keep type date_start date_end;
run;
Next, we need to create a template that fills in all the possible seconds between start/end for each type. We want something that looks like this:
type date
A 19JUN01:21:06:55
A 19JUN01:21:06:56
A 19JUN01:21:06:57
...
B 18JUN01:22:07:01
B 18JUN01:22:07:02
B 18JUN01:22:07:03
The following code does this:
data date_template;
set date_start_end;
do date = date_start to date_end;
output;
end;
format date datetime.;
keep type date;
run;
Now we just need to merge this template with our original data and retain the last non-missing value.
data want;
merge have(rename=(value = _value_) )
date_template
;
by type date;
retain value;
if(NOT missing(_value_) ) then value = _value_;
drop _value_;
run;
Note that we rename value to _value_ in the original dataset since retain will not work the way we expect after we merge. We need to create a new variable in order for it to retain properly.
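As an alternative sketch (a different technique from the template-and-merge steps above), the gaps can also be filled in a single DATA step with a look-ahead merge. This assumes have is already sorted by type and date and has no duplicate datetimes within a type; next_type and next_date are helper names introduced here for illustration.

```sas
data want;
  /* Pair each row with the row that follows it.
     There is no BY statement, so this is a one-to-one merge. */
  merge have
        have(firstobs=2 keep=type date
             rename=(type=next_type date=next_date));

  output;  /* the original observation */

  /* Within the same type, write one row per missing second,
     carrying the current value forward */
  if type = next_type then
    do date = date + 1 to next_date - 1;
      output;
    end;

  drop next_type next_date;
run;
```

On the last row of have the look-ahead variables are missing, so the type comparison fails and no extra rows are written. When two consecutive rows are one second apart, the loop bounds are empty and nothing is added.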
I have a data set which contains four monthly observations in each row.
1Sep11 389.00 1Oct11 491.00 1Nov11 370.00 1Dec11 335.00
2Sep11 423.00 2Oct11 478.00 2Nov11 407.00 2Dec11 442.00
3Sep11 482.00 3Oct11 300.00 3Nov11 303.00 3Dec11 372.00
I need to have a data set, which would contain the months (Sep, Oct, Nov, Dec) as four columns, and the readings against each month placed on the right column. Example.
Day|Sep|Oct|Nov|Dec
1|389.00|491.00|370.00|335.00
2|423.00|478.00|407.00|442.00
3|482.00|300.00|303.00|372.00
How can I do this in SAS? I have tried the @@ option, but that only lets me read the four readings in the row and creates one observation for each reading.
Your original data is in categorical form. That is good!
You are asking for a transformation that changes data (month part of date) into meta data (month name as column). This means down the road you will be dealing with arrays or variable name lists.
I would recommend keeping your data in categorical form. Categorical form means you can use CLASS and BY statements for efficient processing. Use Proc TABULATE to arrange your data items for output or delivery consumption (such as ODS EXCEL).
data have;
attrib date informat=date9. format=date9.;
input date value @@;
date_again = date;
datalines;
1Sep11 389.00 1Oct11 491.00 1Nov11 370.00 1Dec11 335.00
2Sep11 423.00 2Oct11 478.00 2Nov11 407.00 2Dec11 442.00
3Sep11 482.00 3Oct11 300.00 3Nov11 303.00 3Dec11 372.00
run;
proc tabulate data=have;
class date date_again;
var value;
format date monname.;
format date_again day.;
table date_again='', date=''*value=''*max='' / nocellmerge;
run;
ODS LISTING output
-------------------------------------------------------------
| | September | October | November | December |
|-------+------------+------------+------------+------------|
|1 | 389.00| 491.00| 370.00| 335.00|
|-------+------------+------------+------------+------------|
|2 | 423.00| 478.00| 407.00| 442.00|
|-------+------------+------------+------------+------------|
|3 | 482.00| 300.00| 303.00| 372.00|
-------------------------------------------------------------
If you feel you must transpose the data, split out the day and month for use as by and id
data have2(keep=day month value);
attrib date informat=date9. format=date9.;
input date value @@;
day = day(date);
month = put(date,monname3.);
datalines;
1Sep11 389.00 1Oct11 491.00 1Nov11 370.00 1Dec11 335.00
2Sep11 423.00 2Oct11 478.00 2Nov11 407.00 2Dec11 442.00
3Sep11 482.00 3Oct11 300.00 3Nov11 303.00 3Dec11 372.00
run;
proc transpose data=have2 out=want2(drop=_name_);
by day;
var value;
id month;
run;
You will also run into problems when the overall date range exceeds one year, or if the raw data rows are not grouped by day of month or are out of order.
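One way to guard against those problems, sketched here under assumptions: carry the year as an extra BY variable and sort before transposing. The names have3 and want3 and the added year variable are illustrative and not part of the answer above.

```sas
data have3(keep=year day month value);
  attrib date informat=date9. format=date9.;
  input date value @@;           /* @@ holds the line so all four pairs are read */
  year  = year(date);
  day   = day(date);
  month = put(date, monname3.);  /* Sep, Oct, Nov, Dec */
datalines;
1Sep11 389.00 1Oct11 491.00 1Nov11 370.00 1Dec11 335.00
2Sep11 423.00 2Oct11 478.00 2Nov11 407.00 2Dec11 442.00
3Sep11 482.00 3Oct11 300.00 3Nov11 303.00 3Dec11 372.00
;
run;

/* Sorting makes the BY groups valid even if the raw rows arrive out of order */
proc sort data=have3;
  by year day;
run;

proc transpose data=have3 out=want3(drop=_name_);
  by year day;   /* one output row per year and day of month */
  var value;
  id month;      /* month names become the column names */
run;
```

PROC TRANSPOSE will still complain if the same month appears twice within one year and day, so duplicates in the raw data would need handling first.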
Code:
/* Step 1: Read each line in a string*/
data raw;
input line $ 1-70;
cards;
1Sep11 389.00 1Oct11 491.00 1Nov11 370.00 1Dec11 335.00
2Sep11 423.00 2Oct11 478.00 2Nov11 407.00 2Dec11 442.00
3Sep11 482.00 3Oct11 300.00 3Nov11 303.00 3Dec11 372.00
;;;
run;
/* Step 2: Extract the individual values separated by space */
data input;
set raw;
September= input(scan(line,1,' '),date7.);
S_Value= scan(line,2,' ');
October= input(scan(line,3,' '),date7.);
O_Value= scan(line,4,' ');
November= input(scan(line,5,' '),date7.);
N_Value= scan(line,6,' ');
December= input(scan(line,7,' '),date7.);
D_Value= scan(line,8,' ');
format September October November December date7. ;
drop line;
put _ALL_;
run;
Output:
September=01SEP11 S_Value=389.00 October=01OCT11 O_Value=491.00
November=01NOV11 N_Value=370.00 December=01DEC11 D_Value=335.00 _ERROR_=0 _N_=1
September=02SEP11 S_Value=423.00 October=02OCT11 O_Value=478.00
November=02NOV11 N_Value=407.00 December=02DEC11 D_Value=442.00 _ERROR_=0 _N_=2
September=03SEP11 S_Value=482.00 October=03OCT11 O_Value=300.00
November=03NOV11 N_Value=303.00 December=03DEC11 D_Value=372.00 _ERROR_=0 _N_=3
I have a variable in SAS with a lot of numbers, for example 11000, 30129, 11111, 30999. I want to group these by the first two digits, so that "11000 and 11111" and "30129 and 30999" each end up in their own table.
It's quite simple.
You have to create a second column that holds the first two digits.
Then sort the dataset by this second column.
data test;
infile datalines dsd ;
input a : 15. ;
datalines;
11000,
30129,
11111,
309999,
;
run;
data test_a;
length val_a $2;
set test;
val_a= SUBSTRN(a,1,2);
run;
proc sort data=test_a out=test_b;
by val_a;
run;
Result will be :
val_a a
11 11000
11 11111
30 30129
30 309999
And then you can create two datasets by selecting on val_a like this:
data data_11 data_30;
set test_b;
if val_a = '11' then output data_11;
if val_a = '30' then output data_30;
run;
Regards,
I think I did the same as you, but my new column shows only "." in every row. Still, I think your answer can help me anyway, thank you!
data books;
infile "&path\Boken.csv" dlm=';' missover dsd firstobs=2;
input ISBN: $12.
Book: $quote150.;
run;
data test_a;
format val_ISBN 15.;
set books;
val_ISBN= SUBSTRN(ISBN,1,2);
run;
proc sort data=test_a out=test_b;
by val_ISBN;
run;
proc print data=test_b (obs=10) noobs ;
run;
I'm trying to manipulate my Dispensing_Date to give me the weeknum of the year ending on last Friday for each Date, can this be done? Here is what I have so far...
%let 1= 01012016;
%let 53 = 12302016;
**01 import whiteoak file;
proc import
datafile = "E:\Horizon\Adhoc\AH\whiteoak.xlsx"
out = whiteoak
dbms = XLSX
replace;
run;
** 02 remove dupes to ensure unique rx and fill;
proc sort nodup data=whiteoak;
by Rx_ Refill;
run;
** 03 Filter out holds;
data whiteoak;
set whiteoak;
where (Filled_Status="YES");
run;
** 04 create weekday variable;
data dates;
set whiteoak;
format Dispensing_Date MMDDYY8.;
run;
This is my best guess as to what you are asking.
data _null_;
x = today();
d = intnx('week.7',x,-1,'end');
put (_all_)(=weekdate.);
run;
x=Wednesday, January 4, 2017 d=Friday, December 30, 2016
Does this do what you want?
data weeks;
do date = '22DEC2016'd to '15JAN2017'd;
format date first_friday weekdate.;
sas_week=week(date);
first_friday= intnx('week.7',intnx('year',date,0,'b'),0,'e');
friday_week=1+int((7+date-first_friday)/7) ;
output;
end;
run;
If it does then apply it to your data:
data dates;
set whiteoak;
week = 1 + int((7+Dispensing_Date
- intnx('week.7',intnx('year',Dispensing_Date,0,'b'),0,'e'))/7);
run;
I have a data set with daily data in SAS. I would like to convert this to monthly form by taking differences from the previous month's value by id. For example:
thedate, id, val
2012-01-01, 1, 10
2012-01-01, 2, 14
2012-01-02, 1, 11
2012-01-02, 2, 12
...
2012-02-01, 1, 20
2012-02-01, 2, 15
I would like to output:
thedate, id, val
2012-02-01, 1, 10
2012-02-01, 2, 1
Here is one way. If you license SAS/ETS, there might be a better way to do it with PROC EXPAND.
*Setting up the dataset initially;
data have;
informat thedate YYMMDD10.;
input thedate id val;
datalines;
2012-01-01 1 10
2012-01-01 2 14
2012-01-02 1 11
2012-01-02 2 12
2012-02-01 1 20
2012-02-01 2 15
;;;;
run;
*Sorting by ID and DATE so it is in the right order;
proc sort data=have;
by id thedate;
run;
data want;
set have;
retain lastval; *This is retained from record to record, so the value carries down;
by id thedate;
if (first.id) or (last.id) or (day(thedate)=1); *The only records of interest - the first record, the last record, and any record that is the first of a month.;
* To do END: if (first.id) or (last.id) or (thedate=intnx('MONTH',thedate,0,'E'));
if first.id then call missing(lastval); *Each time ID changes, reset lastval to missing;
if missing(lastval) then output; *This will be true for the first record of each ID only - put that record out without changes;
else do;
val = val-lastval; *set val to the new value (current value minus retained value);
output; *put the record out;
end;
lastval=sum(val,lastval); *this value is for the next record;
run;
You could achieve this using PROC SQL and the intnx function to bring last month's date forward a month...
proc sql ;
create table lag as
select b.thedate, b.id, (b.val - a.val) as val
from mydata b
left join
mydata a on b.thedate = intnx('month',a.thedate,1,'s')
and b.id = a.id
order by b.thedate, b.id ;
quit ;
This may need tweaking to handle scenarios where the previous month doesn't exist or months which have a different number of days to the previous month.