I need to convert dates in the following format:
30-giu-18
30-nov-20
......
into:
30JUN2018
30NOV2020
.......
I tried:
data Test;
set input;
mydates = input(myolddates, ddmmyy10.);
format mydates ddmmyy10.;
run;
It doesn't work. The variable myolddates is character $9.
Can anyone help me please?
Try this:
data have;
input myolddates $9.;
datalines;
30-giu-18
30-nov-20
;
options dflang = Italian ;
data want;
set have;
date = input(myolddates, EURDFDE9.);
format date date9.;
run;
I would like to add a day to a SAS date and save that value as a new variable. I have these data:
data have;
input ID $ date;
datalines;
A 14610
B 13229
C 15644
D 14278
;
run;
I would like to end up with these data, with the new variables named for each iteration as below:
data want;
input ID $ date date1 date2 date3;
datalines;
A 14610 14611 14612 14613
B 13229 13230 13231 13232
C 15644 15645 15646 15647
D 14278 14279 14280 14281
;
run;
How can I accomplish this?
Try this:
data have;
input ID $ date;
datalines;
A 14610
B 13229
C 15644
D 14278
;
run;
data want;
set have;
array d date1 - date3;
do over d;
d = date + _i_;
end;
run;
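DO OVER with an implicit array is legacy syntax. As a rough alternative, the same step can be sketched with an explicit array and an ordinary index variable. The macro variable n and the DATE9. format below are assumptions added for illustration; they are not part of the original answer.

```sas
%let n = 3;  /* hypothetical: number of extra days to generate */

data want;
  set have;
  /* explicit array over date1-date&n */
  array d{&n} date1-date&n;
  do i = 1 to &n;
    d{i} = date + i;   /* SAS dates are day counts, so adding i adds i days */
  end;
  /* assumption: display the values as dates; drop this line to keep raw numbers */
  format date date1-date&n date9.;
  drop i;
run;
```

Changing n should be enough to produce more or fewer dateN variables, since the array bounds, the loop, and the FORMAT list all use the same value.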
I have a large data file with data in the following format: country, datatype, year1month1 to year2018month7.
Reading the data using proc import did not work for all data fields. I ended up modifying the SAS datastep code to ensure data format was correct.
However, I am having trouble simplifying the code. I would like a DO loop to go through all the years and months, so that I can use the current date to determine the range of dates in the file and the code that creates each Year/Month variable does not have to be repeated 100 times.
data test;
infile 'abc.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Country_Name $34. ;
do i = 1940 to 2018;
do j = 1 to 12;
informat _(i)M(j) best32.;
end;
end;
informat Base_Year $1. ;
format Country_Name $34. ;
do i = 1940 to 2018;
do j = 1 to 12;
format _(i)M(j) best12.;
end;
end;
format Base_Year $1. ;
input
Country_Name $
do i = 1940 to 2018;
do j = 1 to 12;
_(i)M(j) $;
end;
end;
Base_Year $;
run;
There are a few approaches here that could work. The most directly translatable to your approach is to use the macro language.
You need to translate those two loops to something like this:
%do i = 1940 %to 2018;
%do j = 1 %to 12;
informat _&i.M&j. best32.;
%end;
%end;
Notice the % there. %DO loops are only valid inside a macro definition; you can't use them in ordinary data step code.
I would rewrite it to use a macro like so:
%macro make_ym(startyear=, endyear=, separator=);
%local i j;
%do i = &startyear. %to &endyear.;
%do j = 1 %to 12;
_&i.&separator.&j.
%end;
%end;
%mend make_ym;
data test;
infile 'abc.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Country_Name $34. ;
informat %make_ym(startyear=1940,endyear=2018,separator=M) best32.;
informat Base_Year $1. ;
format %make_ym(startyear=1940,endyear=2018,separator=M) best12.;
format Base_Year $1. ;
input
Country_Name $
%make_ym(startyear=1940,endyear=2018,separator=M)
Base_Year $;
run;
I took out the $ after the yMm bits in the input since you declared them as numeric.
Don't model your data step after the code generated by PROC IMPORT. It does a lot of useless things, like attaching formats and informats to variables that don't need them.
For your problem you just need a simple program like this:
data test;
infile 'abc.csv' dsd dlm= ',' truncover firstobs=2 ;
input Country_Name :$34. Y1940M01 .... Y2018M08 Base_Year :$1. ;
run;
Now the only tricky part is building that list of numeric variable names. If the list is small enough, you can just put it into a macro variable. That is not a problem here: with 8-character names (YyyyyMmm) plus a separating space, each name takes 9 bytes, so a year of names takes 108 bytes and a data step character variable (maximum 32,767 bytes) has room for over 300 years' worth. A variable of length 10,800 bytes has room for 100 years of month names.
So just run this data step first.
data _null_;
length names $10800 ;
basedate = mdy(1,1,1940);
lastdate = today();
do i=0 to intck('month',basedate,lastdate);
date=intnx('month',basedate,i);
names=catx(' ',names,cats('Y',year(date),'M',put(month(date),Z2.)));
end;
call symputx('names',names);
run;
Now you can use the macro variable in your INPUT statement.
data test;
infile 'abc.csv' dsd dlm= ',' truncover firstobs=2 ;
input Country_Name :$34. &names Base_Year :$1. ;
run;
I know that on the INFILE statement you can add the MISSOVER or TRUNCOVER option to handle missing values in the input, but if I'm using DATALINES for the data instead of an external file, is there an equivalent option I can use?
You can still use an INFILE statement: specify DATALINES as the file and then add MISSOVER, TRUNCOVER, etc.
data _null_;
infile datalines missover;
input a b c;
put _all_;
datalines;
1 2
3 4
;
run;
I am confused about what DSD actually does in terms of "moving the pointer" and reading in data. To better explain, look at the following code:
data one;
infile cards dlm=',' TRUNCOVER ; /*using dlm','*/
input cust_id date ddmmyy10. A $ B $ C $;
cards;
1,10/01/2015,5000,dr
;
run;
data two;
infile cards dsd TRUNCOVER ;
input cust_id date ddmmyy10. A $ B $ C $;
cards;
1,10/01/2015,5000,dr
;
run;
The dataset one contains values for A and B of 5000 and dr but the dataset two contains values of A as missing whereas B and C are 5000 and dr. I don't get why the dsd sets A to missing.
Thanks!
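Your problem is not DLM or DSD. It is "date ddmmyy10.", which is formatted input, and formatted input does not mix with delimited (list) input, with or without DSD. Formatted input reads exactly 10 columns and leaves the pointer on the comma that follows the date. Without DSD, that leading comma is simply skipped when A is read; with DSD, it is treated as marking an empty field, so A is missing and the remaining values shift to B and C.
You need either an INFORMAT statement or the colon (:) modifier in front of the informat: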
date :DDMMYY10.
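Put together, the corrected step might look like the sketch below. The data set name three and the FORMAT statement are assumptions added for illustration; only the colon modifier comes from the answer above.

```sas
data three;
  infile cards dsd truncover;
  /* the colon makes SAS read up to the next delimiter, then apply the informat */
  input cust_id date :ddmmyy10. A $ B $ C $;
  /* assumption: show the date in a readable form */
  format date ddmmyy10.;
  cards;
1,10/01/2015,5000,dr
;
run;
```

With the colon modifier every field is read as list input, so A should come out as 5000, B as dr, and C as blank, because the record has no fifth value and TRUNCOVER leaves it missing.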