Deletion of 3 year old files using sas - sas

I am trying to delete the files which are having 3 year old files from a folder. But when I am running the code it is also deleting the other files which are not in the file name frmat which I tried to delete. The file name is like SFRE_BIL_SIT_20160812_134317_PAM_FILES1.zip I attached the code also with this
options mlogic;
%macro delete_year_files_in_folder(folder);
filename filelist "&folder";
data _null_;
dir_id = dopen('filelist');
total_members = dnum(dir_id);
do i = 1 to total_members;
member_name = dread(dir_id,i);
datestring = scan(member_name,4,'_');
month = input(substr(datestring,5,2),best.);
day = input(substr(datestring,5,2),best.);
year = input(substr(datestring,1,4),best.);
date = mdy(month, day, year);
if intnx('year', today(),-3,'S') > date %put _all_;
then do;
file_id = mopen(dir_id,member_name,'i',0);
if file_id > 0 then do;
freadrc = fread(file_id);
rc = fclose(file_id);
rc = filename('delete',member_name,,,'filelist');
rc = fdelete('delete');
end; %put _all_;
rc = fclose(file_id);
end;
end;
rc = dclose(dir_id);
run;
%mend;

I can see at least one bug in your code that might be causing unexpected behaviour:
month = input(substr(datestring,5,2),best.);
day = input(substr(datestring,5,2),best.);
I think you meant to type:
day = input(substr(datestring,7,2),best.);
I wouldn't do this, though - it's quicker to use date informats to do this:
date = input(datestring,yymmdd8.);
However, I think the bigger problem is with this line:
if intnx('year', today(),-3,'S') > date then do; /*Deletion logic follows*/
If you have a file that you don't want to delete that isn't in the same format, it's likely that date will have a missing value, as it won't have a date in the place where you're looking and the input functions earlier on will return missing values. In SAS, numeric missing values are less than any non-missing numeric value, so this condition will evaluate to true except for files with names in the format that you want to delete that are less than 3 years old.
You can avoid missing values fairly easily by tweaking your code like so:
if intnx('year', today(),-3,'S') > date and not(missing(date)) then do;

Related

SAS outputting new variable missing full name

I am rearranging some data to run in a linear regression and used the code below to change. When I look at the new dataset however the New Names are incomplete. For example Day11 shows are Day1, Day14 as Day1, Day20 as Day2 ect. Where is my code causing this problem?
Data EMU116Linear;
Set EMU116;
TumorVolume = Day0; Day = "Day0"; output;
TumorVolume = Day3; Day = "Day3"; output;
TumorVolume = Day7; Day = "Day7"; output;
TumorVolume = Day9; Day = "Day9"; output;
TumorVolume = Day11; Day = "Day11"; output;
TumorVolume = Day14; Day = "Day14"; output;
TumorVolume = Day16; Day = "Day16"; output;
TumorVolume = Day18; Day = "Day18"; output;
TumorVolume = Day21; Day = "Day21"; output;
drop Day0 Day3 Day7 Day9 Day11 Day14 Day16 Day18 Day21 TumorWeightDay21;
Run;
Original Data Set
New Data Set
SAS will truncate character values to the variable's designated length. Since you do not explicitly specify length, the very first assignment of Day0 designates the length of the new Day variable to 4 characters. Afterwards, any supplied, longer value to Day variable will truncate to 4 (i.e., Day11 to Day1). To accommodate 5 characters and to initialize new variables use length directive:
Data EMU116Linear;
Set EMU116;
length TumorVolume Day $5;
...
run
However, from your code and desired result a better less repetitive solution may be proc_transpose to reshape your wide data to long format:
proc sort data=EMU116Linear;
by Treatment Treatment2 TreatCode Mouse;
run;
proc transpose
data=EMU116Linear
out=EMU116;
by Treatment Treatment2 TreatCode Mouse;
run;
data EMU116;
set EMU116;
rename col1=TumorVolume
_name_=Day;
run;

SAS Macro error with call execute

I have the following code that is being used generate running totals of features for the past 1 day, 7 days, 1 month, 3 months, and 6 months.
LIBNAME A "C:\Users\James\Desktop\data\Base Data";
LIBNAME DATA "C:\Users\James\Desktop\data\Data1";
%MACRO HELPER(P);
data a1;
set data.final_master_&P. ;
QUERY = '%TEST('||STRIP(DATETIME)||','||STRIP(PARTICIPANT)||');';
CALL EXECUTE(QUERY);
run;
%MEND;
%MACRO TEST(TIME,PAR);
proc sql;
select SUM(APP_1), SUM(APP_2), sum(APP_3), SUM(APP_4), SUM(APP_5) INTO :APP_1_24, :APP_2_24, :APP_3_24, :APP_4_24, :APP_5_24
FROM A1
WHERE DATETIME BETWEEN INTNX('SECONDS',&TIME.,-60*60*24) AND &TIME.;
/* 7 Days */
select SUM(APP_1), SUM(APP_2), sum(APP_3), SUM(APP_4), SUM(APP_5) INTO :APP_1_7DAY, :APP_2_7DAY, :APP_3_7DAY, :APP_4_7DAY, :APP_5_7DAY
FROM A1
WHERE DATETIME BETWEEN INTNX('SECONDS',&TIME.,-60*60*24*7) AND &TIME.;
/* One Month */
select SUM(APP_1), SUM(APP_2), sum(APP_3), SUM(APP_4), SUM(APP_5) INTO :APP_1_1MONTH, :APP_2_1MONTH, :APP_3_1MONTH, :APP_4_1MONTH, :APP_5_1MONTH
FROM A1
WHERE DATETIME BETWEEN INTNX('SECONDS',&TIME.,-60*60*24*7*4) AND &TIME.;
/* Three Months */
select SUM(APP_1), SUM(APP_2), sum(APP_3), SUM(APP_4), SUM(APP_5) INTO :APP_1_3MONTH, :APP_2_3MONTH, :APP_3_3MONTH, :APP_4_3MONTH, :APP_5_3MONTH
FROM A1
WHERE DATETIME BETWEEN INTNX('SECONDS',&TIME.,-60*60*24*7*4*3) AND &TIME.;
/* Six Months */
select SUM(APP_1), SUM(APP_2), sum(APP_3), SUM(APP_4), SUM(APP_5) INTO :APP_1_6MONTH, :APP_2_6MONTH, :APP_3_6MONTH, :APP_4_6MONTH, :APP_5_6MONTH
FROM A1
WHERE DATETIME BETWEEN INTNX('SECONDS',&TIME.,-60*60*24*7*4*6) AND &TIME.;
quit;
DATA T;
PARTICIPANT = &PAR.;
DATETIME = &TIME;
APP_1_24 = &APP_1_24.;
APP_2_24 = &APP_2_24.;
APP_3_24 = &APP_3_24.;
APP_4_24 = &APP_4_24.;
APP_5_24 = &APP_5_24.;
APP_1_7DAY = &APP_1_7DAY.;
APP_2_7DAY = &APP_2_7DAY.;
APP_3_7DAY = &APP_3_7DAY.;
APP_4_7DAY = &APP_4_7DAY.;
APP_5_7DAY = &APP_5_7DAY.;
APP_1_1MONTH = &APP_1_1MONTH.;
APP_2_1MONTH = &APP_2_1MONTH.;
APP_3_1MONTH = &APP_3_1MONTH.;
APP_4_1MONTH = &APP_4_1MONTH.;
APP_5_1MONTH = &APP_5_1MONTH.;
APP_1_3MONTH = &APP_1_3MONTH.;
APP_2_3MONTH = &APP_2_3MONTH.;
APP_3_3MONTH = &APP_3_3MONTH.;
APP_4_3MONTH = &APP_4_3MONTH.;
APP_5_3MONTH = &APP_5_3MONTH.;
APP_1_6MONTH = &APP_1_6MONTH.;
APP_2_6MONTH = &APP_2_6MONTH.;
APP_3_6MONTH = &APP_3_6MONTH.;
APP_4_6MONTH = &APP_4_6MONTH.;
APP_5_6MONTH = &APP_5_6MONTH.;
FORMAT DATETIME DATETIME.;
RUN;
PROC APPEND BASE=DATA.FLAGS_&par. DATA=T;
RUN;
%MEND;
%helper(1);
This code runs perfectly if I limit the number of observations in the %helper macro, using an (obs=) in the creation of the a1 dataset. However, when I put no limit on the obs number, i.e. execute the %test macro for every row in the dataset a1, I get errors. In SAS EG, I get a "server disconnected" popup after the status bar hangs at "running data step", and on Base SAS 9.4 I get the error that none of the macro variables have been resolved that are created in the proc sql into.
I'm confused as the code works fine for a limited amount of observations, but when trying on the whole dataset it hangs or gives errors. The dataset I'm doing this for has around 130,000 observations.
The answer to your actual question is that you're simply generating too much macro code and perhaps even simply taking too much time. The way you are doing this is going to operate on an O=n^2 level, as you're basically doing a cartesian join of every record to every record, and then some. 130,000 * 130,000 is a pretty decent sized number, and on top of that you're actually opening the SQL environment several times for each 130,000 rows. Ouch.
The solution is to do it either in a way that isn't too slow, or if it is, in a way that won't have too much overhead at least.
The fast solution is to not do the cartesian join, or to limit how much needs to be joined. One good solution would be to restructure the problem, not require every record to be compared, but instead consider each calendar day, say, a period, especially in the over-24h periods (24h you might do the way you do above, but not the other four). 1 month, 3 month, etc., do you really need to figure out time of day? Probably won't make much difference. If you can get rid of that, then you can use built in PROCs to precompile all possible 1 month periods, all possible 3 month periods, etc., and then join on the appropriate one. But that won't work with 130,000 of them; it would only work if you could limit it to one per day, probably.
If you must do it at the second level (or worse), what you'll want to do is avoid the cartesian join, and instead keep track of the various records you've seen already, and the sums. The short explanation of the algorithm is:
For each row:
Add this row's values to the rolling sums (at the end of the queue)
Check if the current item of the queue is outside of the period; if it is, subtract it from the rolling sums, and check the next item (repeat until not outside of the period), updating the current queue position
Return the sum at this point
This requires checking each row typically twice (except at odd boundaries where you have no rows popped off for several iterations, due to months having different numbers of days). This operates on O=n time, much faster than the cartesian join, and on top of that has far less memory/space required (the cartesian join might need to hit disk space).
The hash version of this solution is below. This will be the fastest solution I think that compares every row. Note that I intentionally make the test data have 1 for every row and same number of rows for every day; that lets you see how it works on a row-wise manner very easily. (For example, every 24h period has 481 rows, because I made 480 rows per day exactly, and 481 includes the same time yesterday - if you change lt to le it will be 480, if you prefer not to include same time yesterday). You can see that the 'month' based periods will have slightly odd results at the boundaries where months change because the '01FEB20xx' to '01MAY20xx' period has far fewer days (and thus rows) than the '01JUL20xx' to '01OCT20xx' period, for example; better would be 30/90/180 day periods.
data test_data;
array app[5] app_1-app_5;
do _i = 1 to 130000;
dt_var = datetime() - _i*180;
do _j = 1 to dim(app);
*app[_j] = floor(rand('Uniform')*6); *generate 0 to 5 integer;
app[_j]=1;
end;
output;
end;
format dt_var datetime17.;
run;
proc sort data=test_data;
by dt_var;
run;
%macro add(array=);
do _i = 1 to dim(app);
&array.[_i] + app[_i];
end;
%mend add;
%macro subtract(array=);
do _i = 1 to dim(app);
&array.[_i] + (-1*app[_i]);
end;
%mend subtract;
%macro process_array_add(array=);
array app_&array. app_&array._1-app_&array._5;
%add(array=app_&array.);
%mend process_array_add;
%macro process_array_subtract(array=, period=, number=);
if _n_ eq 1 then do;
declare hiter hi_&array.('td');
rc_&array. = hi_&array..first();
end;
else do;
rc_&array. = hi_&array..setcur(key:firstval_&array.);
end;
do while (intnx("&period.",dt_var,&number.,'s') lt curr_dt_var and rc_&array.=0);
%subtract(array=app_&array.);
rc_&array. = hi_&array..next();
end;
retain firstval_&array.;
firstval_&array. = dt_var;
%mend process_array_subtract;
data want;
set test_data;
* if _n_ > 10000 then stop;
curr_dt_var = dt_var;
array app[5] app_1-app_5;
if _n_ eq 1 then do;
declare hash td(ordered:'a');
td.defineKey('dt_var');
td.defineData('dt_var','app_1','app_2','app_3','app_4','app_5');
td.defineDone();
end;
rc_a = td.add();
*start macro territory;
%process_array_add(array=24h);
%process_array_add(array=1wk);
%process_array_add(array=1mo);
%process_array_add(array=3mo);
%process_array_add(array=6mo);
%process_array_subtract(array=24h,period=DTDay, number=1);
%process_array_subtract(array=1wk,period=DTDay, number=7);
%process_array_subtract(array=1mo,period=DTMonth, number=1);
%process_array_subtract(array=3mo,period=DTMonth, number=3);
%process_array_subtract(array=6mo,period=DTMonth, number=6);
*end macro territory;
rename curr_dt_var=dt_var;
format curr_dt_var datetime21.3;
drop dt_var rc: _:;
output;
run;
Here's a pure data step non-hash version. On my machine it's actually faster than the hash solution; I suspect it's not actually faster on a machine with a HDD (I have an SSD, so point access is not substantially slower than hash access, and I avoid having to load the hash). I would recommend using it if you don't know hashes very well or at all, as it'll be easier to troubleshoot, and it scales similarly. For most rows it accesses 11 rows, the current row and five other rows twice (one row, subtract it, then another row) for a total of around a million and a half total reads for 130k rows. (Compare that to about 17 billion reads for the cartesian...)
I suffix the macros with "_2" to differentiate them from the macros in the hash solution.
data test_data;
array app[5] app_1-app_5;
do _i = 1 to 130000;
dt_var = datetime() - _i*180;
do _j = 1 to dim(app);
*app[_j] = floor(rand('Uniform')*6); *generate 0 to 5 integer;
app[_j]=1;
end;
output;
end;
format dt_var datetime17.;
run;
proc sort data=test_data;
by dt_var;
run;
%macro add_2(array=);
do _i = 1 to dim(app);
&array.[_i] + app[_i];
end;
%mend add;
%macro subtract_2(array=);
do _i = 1 to dim(app);
&array.[_i] + (-1*app[_i]);
end;
%mend subtract;
%macro process_array_add_2(array=);
array app_&array. app_&array._1-app_&array._5; *define array;
%add_2(array=app_&array.); *add current row to array;
%mend process_array_add_2;
%macro process_array_sub_2(array=, period=, number=);
if _n_ eq 1 then do; *initialize point variable;
point_&array. = 1;
end;
else do; *do not have to do this _n_=1 as we only have that row;
set test_data point=point_&array.; *set the row that we may be subtracting;
end;
do while (intnx("&period.",dt_var,&number.,'s') lt curr_dt_var and point_&array. < _N_); *until we hit a row that is within the period...;
%subtract_2(array=app_&array.); *subtract the rows values;
point_&array. + 1; *increment the point to look at;
set test_data point=point_&array.; *set the new row;
end;
%mend process_array_sub_2;
data want;
set test_data;
*if _n_ > 10000 then stop; *useful for testing if you want to check time to execute;
curr_dt_var = dt_var; *save dt_var value from originally set record;
array app[5] app_1-app_5; *base array;
*start macro territory;
%process_array_add_2(array=24h); *have to do all of these adds before we start subtracting;
%process_array_add_2(array=1wk); *otherwise we have the wrong record values;
%process_array_add_2(array=1mo);
%process_array_add_2(array=3mo);
%process_array_add_2(array=6mo);
%process_array_sub_2(array=24h,period=DTDay, number=1); *now start checking to subtract what we need to;
%process_array_sub_2(array=1wk,period=DTDay, number=7);
%process_array_sub_2(array=1mo,period=DTMonth, number=1);
%process_array_sub_2(array=3mo,period=DTMonth, number=3);
%process_array_sub_2(array=6mo,period=DTMonth, number=6);
*end macro territory;
rename curr_dt_var=dt_var;
format curr_dt_var datetime21.3;
drop dt_var _:;
output; *unneeded in this version but left for comparison to hash;
run;

How do I pass values through variables to a macro in sas

Here is my code
%macro redemptions1(startdate, enddate, sd, ed, sunday1, sunday2);
data _null_;
%put &startdate;
run;
%mend redemptions1;
data _null_;
format tday date9.;
format sd date9.;
format ed date9.;
tday=today();
if weekday(tday) = 1 then do; ed = intnx('day',tday,-9); sd = intnx('day',tday,-15);end;
if weekday(tday) = 2 then do; ed = intnx('day',tday,-3); sd = intnx('day',tday,-9);end;
if weekday(tday) = 3 then do; ed = intnx('day',tday,-4); sd = intnx('day',tday,-10);end;
if weekday(tday) = 4 then do; ed = intnx('day',tday,-5); sd = intnx('day',tday,-11);end;
if weekday(tday) = 5 then do; ed = intnx('day',tday,-6); sd = intnx('day',tday,-12);end;
if weekday(tday) = 6 then do; ed = intnx('day',tday,-7); sd = intnx('day',tday,-13);end;
if weekday(tday) = 7 then do; ed = intnx('day',tday,-8); sd = intnx('day',tday,-14);end;
startdate = (year(sd) - 1900) * 10000 + month(sd) * 100 + day(sd);
enddate = (year(ed) - 1900) * 10000 + month(ed) * 100 + day(ed);
sunday1 = year(intnx('day',sd,-6))*10000+month(intnx('day',sd,-6))*100+day(intnx('day',sd,-6));
sunday2 = year(intnx('day',sd,1))*10000+month(intnx('day',sd,1))*100+day(intnx('day',sd,1));
%redemptions1(startdate,enddate,sd,ed,sunday1,sunday2);
run;
If i pass values through the variables startdate,enddate etc, The redemeptions1 macro just prints 'startdate' instead of actually printing the value of startdate. How do I get it to print the value contained in the variable(s)?
Thanks!
You need to construct a call to the macro as a text string and then tell SAS to execute it using either CALL EXECUTE or the DOSUBL function.
The parameters on a macro call need to be literal text giving the values you want. Your call in the data step starts %redemptions1(startdate,... so the first parameter is the literal text startdate and that's what the macro prints. Instead, you could do something like:
myCall = '%redemptions1(' || startdate || ')';
call execute(myCall);
This construct the necessary call - something like %redemptions1(09MAR2017) - and then executes it. You could of course do this in one line:
call execute('%redemptions1(' || startdate || ')');
You'll need to fill in the values of the other parameters, of course.
Your date calculations look a bit sketchy, by the way - startdate and enddate may not contain the values you think they do. Please look up the dhms function to see if that might help. You're creating a number like '1170309' for today's date - 1 million, 170 thousand, 3 hundred and 9. The year value is very odd - do you really want 117 (2017 - 1900)?. If you ask SAS to handle that value as a date, it will treat it as a number of days since 01JAN1960, which would be some date way in the future.

Deleting the zip files from an external folder having 2 year old using SAS

Iam trying with the below code and I could able to delete the files which is in zip format but I need to know how we can filter to delete only the 2 year old files . The file name will be SFRE_BIL_SIT_20160812_134317_PAM_FILES1.zip
when doing the filtering for deleting the 2 year old fiIes I need to concentrate only the year and month part for the file to do the deletion because the day part would be changing . Please find below the code
options mlogic;
%macro delete_all_zip_files_in_folder(folder);
filename filelist "&folder";
data _null_;
dir_id = dopen('filelist');
total_members = dnum(dir_id);
do i = 1 to total_members;
member_name = dread(dir_id,i);
if scan(lowcase(member_name),2,'.')='zip' then do;
file_id = mopen(dir_id,member_name,'i',0);
if file_id > 0 then do;
freadrc = fread(file_id);
rc = fclose(file_id);
rc = filename('delete',member_name,,,'filelist');
rc = fdelete('delete');
end;
rc = fclose(file_id);
end;
end;
rc = dclose(dir_id);
run;
%mend;
%delete_all_zip_files_in_folder(C:\Users\UCS1MKP\Desktop\test)
If your files are named in the same manner - 3 part code then date, all separated with _ you can use scan(). You allready used scan() to check for files with zip extension. You can do the same to extract date part from file name and then parse it and convert to numbers.
Solution 1
Fourth word of member_name is YYYYMMDD part:
datestring = scan(member_name,4,'_')
First 4 digits are year:
year = input(substr(datestring,1,4),best.)
Then 2 digits for month:
month = input(substr(datestring,5,2),best.)
Finally 2 digits for day (just in case you will need day):
day = input(substr(datestring,5,2),best.)
At the end just compare with actual month and year:
if month = month(today()) and year = year(today()) then do;
*your code for deleting file;
end;
But t will work only for given month. To make it delete files older than 2 years you can do something like below.
Solution 2
Allready having all date parts construct date:
date = mdy(month, day, year)
Then compare it with actual date minus 2 years. To shift date exacly 2 years from today you should use intnx() function
if intnx('year', today(), -2, 'S') > date do;
*your code for deleting file;
end;

SAS: attempting to build a loop for uploading multiple files

I'm attempting to build a loop in SAS to upload several files, and am running into a few issues to work through. Current code:
%Macro Weatherupload(File=, output=);
proc import datafile = &File;
out = &output;
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output (keep=Wban_Number _YearMonthDay DewPoint Temp _Avg_Dew_Pt _Avg_Temp year month day);
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
drop _Avg_Dew_Pt _Avg_Temp _YearMonthDay;
run;
%Mend WeatherPrepare;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
%Weatherupload(File=name, output=output)
%WeatherPrepare(input=output, output=final)
end;
end;
run;
The goal is to run through several files, in several folders, listed in month + day + rest of title, and (at the moment) upload two variables of data from them. Later I will want to add in merging the files, and doing some more data work, but for the moment it's the macro issues and uploading that are holding it up.
Is there a way to either use proc upload in a loop, or use another data step in the loop?
I get the error "more positional variables than (something)" (I forget exact error, but it lists positional variables). I've tried adding and removing commas in the macros, but have not been able to get rid of this error. Any ideas?
I don't think you can call macro's like you have in your data step. I think you're intending to use Call Execute.
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
end;
run;
Alternatively, assuming you're trying to read all files in a folder, I think you should be creating a list of file names in a data set, use a data step with the filename option to input all files at once instead. Here's a brief method on how to do it if all where in a single folder: https://communities.sas.com/docs/DOC-10426
Here is a page that has code to get a list of files into a data set
http://www.sascommunity.org/wiki/Making_Lists
since your macros have neither conditionals (%if) nor loops (%do)
then I suggest you use them as parameterized %incudes
Here is a tool to read the list-of-files data set and call a program
http://www.sascommunity.org/wiki/Call_Execute_Parameterized_Include
note: in proc import always set guessingrows to the max value;
in v9.3 that is 2147483647;
Got it sorted out, based on the first answer. Eventual code:
%Macro Weatherupload(File=, output=);
proc import datafile = "&File"
out = &output
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output;
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare;
%Macro WeatherPrepare2(input=, output=);
data &output;
set &input;
DewPoint = Input(DewPoint, 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
Wban_Number = Wban;
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare;
%Macro Append(merge=);
data temperatures;
set temperatures &merge;
%Mend Append;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
jzero = put(j, z2.);
name = compress('C:\Users\DILLON.SAXE\Documents\'||i||jzero||'.tar'||'\'||i||jzero||'daily.txt');
name2 = compress('C:\Users\DILLON.SAXE\Documents\'||'QCLCD'||i||jzero||'\'||i||jzero||'daily.txt');
output = compress('weather'||i||j);
final = compress('final'||i||j);
if 1000*i+j < 200708 then
do;
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
else
do;
call execute('%Weatherupload(File='||name2||', output='||output||')');
call execute('%WeatherPrepare2(input='||output||', output='||final||')');
end;
call execute('%Append(merge='||final||')');
end;
end;
drop i j jzero name name2 output final;
run;