Append data after checking a condition - if-statement

I have a yearly table to which, on a monthly basis, I append data from another table. However, I need to check the max date in the monthly table before appending. If the max date in the monthly table is the same as in the YTD table, do not append; otherwise append. How can I achieve this in SAS?
I tried using append but don't know how to check the dates before appending.

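You can use a macro.
First, store the latest date of each table in a macro variable and prepare a macro to run the append operation:
proc sql noprint;
/* latest date already present in the yearly (target) table */
select max(yourDtcolumn) into :var
from yourTable;
/* latest date in the monthly (source) table; assumes the same column name */
select max(yourDtcolumn) into :curMonth
from sourcetable;
quit;
%macro append;
proc sql;
insert into yourtable
select * from sourcetable;
quit;
%mend;
Then compare the two values and append only when the monthly table is newer:
%macro verify;
%if &curMonth > &var %then %do;
%append;
%end;
%mend;
Finally, call the macro to execute:
%verify;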

I'd skip the check, and simply stack the existing data (excluding the current month, if it exists) and the new data.
/* Get period from new data */
proc sql ;
select min(period) into :LATEST
from new ;
quit ;
/* Append new to master, and save back master */
data perm.master ;
set perm.master (where=(period < &LATEST))
new ;
run ;
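If you would rather keep the check and use PROC APPEND, as the question mentions, here is a minimal sketch. The table names work.yearly and work.monthly and the date column dt are assumptions for illustration, not names from the question.

```sas
/* Hypothetical names: work.yearly (master), work.monthly (new data), date column dt */
proc sql noprint;
select max(dt) format=best12. into :max_yearly from work.yearly;
select max(dt) format=best12. into :max_monthly from work.monthly;
quit;

data _null_;
/* a missing max (empty table) compares lower than any real date */
if &max_monthly > &max_yearly then
call execute('proc append base=work.yearly data=work.monthly; run;');
else put 'NOTE: yearly table is already up to date, nothing appended.';
run;
```

The CALL EXECUTE statement queues the PROC APPEND step so that it runs right after the DATA _NULL_ step finishes, which avoids wrapping the check in a macro.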

Related

fdelete in sas, delete files from a backup

I want to delete files from a backup directory if it contains more than three files.
filename parent '/abc/cde';
If there are fewer than three files in the parent directory, the SAS code does nothing.
A sample file name:
ABC_1117_02MAY2016.txt
- all names have the same length.
If there are more than three files, the SAS code extracts the date substring from each name, e.g.
02MAY2016,
because I want to delete all files with dates earlier than the third largest.
data all_files;
keep substr_date;
did=dopen("parent");
if dnum(did)>3 then do;
do i=1 to dnum(did);
substr_date=substr(dread(did,i),10,9);
output;
end;
end;
run;
I sort it
proc sort data=all_files;
by descending substr_date;
run;
These are the files I don't want to delete:
data backup;
set all_files(obs=3);
run;
I create a table with everything I want to delete:
proc sql;
create table delete as
select*from all_files except select*from backup;
quit;
How can I remove the files listed in the 'delete' table? I know I'm supposed to use the fdelete function:
%macro test;
%do i=1 %to &sqlobs;
fdelete('/abc/cde/ABC_1117_&something.
%end;
%mend;
Can I use a macro variable for i, since only the date in the name changes?
Thanks for the help,
aola
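Once you have the delete table, I would write the macro like so. FDELETE takes a fileref rather than a path, so the macro assigns a temporary fileref first:
%macro delete_file(file=);
%local rc fref;
%let fref = delfile;
%let rc = %sysfunc(filename(fref, &file.));
%if &rc = 0 and %sysfunc(fexist(&fref)) %then %let rc = %sysfunc(fdelete(&fref));
%let rc = %sysfunc(filename(fref));
%mend;
Then you can build the calls and run them:
proc sql noprint;
select cats('%delete_file(file=/abc/cde/ABC_1117_',substr_date,'.txt)')
into :dellist separated by ' '
from delete;
quit;
&dellist.;
I would probably just save the whole filename in the delete dataset and then use that rather than hardcoding ABC_1117_, but that's your call.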
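The "save the whole filename" variant could look roughly like this. It is only a sketch: it assumes a dataset work.delete_files with a character column fname holding the full path of each file, and both names are made up for illustration.

```sas
/* Assumes work.delete_files has a character column fname with full paths
   (hypothetical names, not from the question). */
data _null_;
set work.delete_files;
length fref $8;
fref = ' ';                   /* blank value: SAS generates a fileref name */
rc = filename(fref, fname);   /* point the fileref at the physical file */
if rc = 0 and fexist(fref) then rc = fdelete(fref);
if rc ne 0 then put 'WARNING: could not delete ' fname;
rc = filename(fref);          /* release the fileref */
run;
```

Doing it in a DATA step means no macro and no generated call list: each row of the dataset is handled as it is read.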

Reading a list of names into a SAS macro

I am trying to read a list of values into a macro, so that the macro variable would contain the table name and create a column that would contain the table name.
My attempt, which is wrong, uses the code below and errors out because of the line " '&tbl' as Table_Dt ". The code below is inefficient, so feel free to enhance it. Thanks for your help.
%macro flat(tbl);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
'&tbl' as Table_Dt
FROM &tbl..flat_file;
QUIT;
%mend flat;
%flat(flat0113);
%flat(flat0213);
...
%flat(flat1213);
As you are basically processing a list, this could also be done using call execute. No need to write all the information to macro variables. All tables/libraries are already stored in the sashelp tables and therefore are ready for list processing.
data _null_;
set sashelp.vslib (where=(substr(libname,1,4) = 'FLAT')) end =eof;
if _n_ = 1 then call execute ('proc sql exec feedback stimer noprint outobs=5;');
call execute ('
CREATE TABLE '|| libname ||' AS
SELECT ID,
DOB,
"'||compress(libname)||'" as Table_Dt
FROM '||compress(libname)||'.flat_file
;
');
if eof then call execute ('QUIT;');
run;
Macro variables inside quotation marks resolve only in double quotes, not single quotes. For a more efficient approach, you can use the following modified code. I am assuming that you are reading from libraries named flat0113, flat0213, etc.
Step 1: Get a list of all the libnames that start with "flat"
proc sql noprint;
select distinct libname
into :tbl_list separated by ' '
from sashelp.vmember
where libname LIKE 'FLAT%'
;
quit;
/* count the names in the list */
%let total_tbls = %sysfunc(countw(&tbl_list));
This will create two macro variables: &tbl_list, and &total_tbls.
&tbl_list holds the values FLAT0113 FLAT0213 ... FLAT1213.
&total_tbls holds the total number of values in &tbl_list.
Step 2: Loop through the newly created list
%macro readTables;
%do i = 1 %to &total_tbls;
%let tbl = %scan(&tbl_list, &i);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
"&tbl" as Table_Dt
FROM &tbl..flat_file;
quit;
%end;
%mend;
%readTables;
This will read each individual value from &tbl_list one by one until the very end of the list.
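The quoting rule mentioned at the start of this answer is easy to check in isolation. A small sketch, where the value assigned to tbl is just an example:

```sas
%let tbl = flat0113;   /* example value only */
data _null_;
single = '&tbl';       /* single quotes: stays as the literal text &tbl */
double = "&tbl";       /* double quotes: resolves to flat0113 */
put single= double=;
run;
```

This is why '&tbl' as Table_Dt in the original macro did not produce the table name, while "&tbl" does.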

SAS loop through datasets

I have multiple tables in a library called snap1:
cust1, cust2, cust3, etc
I want to generate a loop that gets the record count of the same column in each of these tables and then inserts the results into a different table.
My desired output is:
Table Count
cust1 5,000
cust2 5,555
cust3 6,000
I'm trying this but it's not working:
%macro sqlloop(data, byvar);
proc sql noprint;
select &byvar.into:_values SEPARATED by '_'
from %data.;
quit;
data_&values.;
set &data;
select (%byvar);
%do i=1 %to %sysfunc(count(_&_values.,_));
%let var = %sysfunc(scan(_&_values.,&i.));
output &var.;
%end;
end;
run;
%mend;
%sqlloop(data=libsnap, byvar=membername);
First off, if you just want the number of observations, you can get that trivially from dictionary.tables or sashelp.vtable without any loops.
proc sql;
select memname, nlobs
from dictionary.tables
where libname='SNAP1';
quit;
This is fine to retrieve number of rows if you haven't done anything that would cause the number of logical observations to differ - usually a delete in proc sql.
Second, if you're interested in the number of valid responses, there are easier non-loopy ways too.
For example, given whatever query that you can write determining your table names, we can just put them all in a set statement and count in a simple data step.
%let varname=mycol; *the column you are counting;
%let libname=snap1;
proc sql;
select cats("&libname..",memname)
into :tables separated by ' '
from dictionary.tables
where libname=upcase("&libname.");
quit;
data counts;
set &tables. indsname=ds_name end=eof; *9.3 or later;
retain count dataset_name;
if _n_=1 then count=0;
if ds_name ne lag(ds_name) and _n_ ne 1 then do;
output;
count=0;
end;
dataset_name=ds_name;
count = count + ifn(&varname.,1,1,0); *true, false, missing; *false is 0 only;
if eof then output;
keep count dataset_name;
run;
Macros are rarely needed for this sort of thing, and macro loops like you're writing even less so.
If you did want to write a macro, the easier way to do it is:
Write code to do it once, for one dataset
Wrap that in a macro that takes a parameter (dataset name)
Create macro calls for that macro as needed
That way you don't have to deal with %scan and troubleshooting macro code that's hard to debug. You write something that works once, then just call it several times.
proc sql;
select cats('%mymacro(name=',"&libname..",memname,')')
into :macrocalls separated by ' '
from dictionary.tables
where libname=upcase("&libname.");
quit;
&macrocalls.;
Assuming you have a macro, %mymacro, which does whatever counting you want for one dataset.
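For illustration, %mymacro could be something like the sketch below. It is an assumption, not part of the original answer: it reuses &varname from the earlier %let statement and collects results in a table called work.counts, a name chosen here. It has to be defined before &macrocalls. is executed.

```sas
/* Hypothetical %mymacro: counts non-missing values of &varname in one dataset
   and appends the result to work.counts (table name assumed here). */
%macro mymacro(name=);
proc sql noprint;
create table work._one as
select "&name" as dataset_name length=41,
       count(&varname.) as count
from &name.;
quit;
proc append base=work.counts data=work._one;
run;
%mend mymacro;
```

Because the macro handles exactly one dataset, you can test it with a single call such as %mymacro(name=snap1.cust1) before generating the full list of calls.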
* Updated *
In the future, please post the log so we can see what is specifically not working. I can see some issues in your code, particularly where your macro variables are being declared, and a select statement that is not doing anything. Here is an alternative process to achieve your goal:
Step 1: Read all of the customer datasets in the snap1 library into a macro variable:
proc sql noprint;
select memname
into :total_cust separated by ' '
from sashelp.vmember
where upcase(memname) LIKE 'CUST%'
AND upcase(libname) = 'SNAP1';
quit;
Step 2: Count the total number of obs in each data set, output to permanent table:
%macro count_obs;
%do i = 1 %to %sysfunc(countw(&total_cust) );
%let dsname = %scan(&total_cust, &i);
%let dsid=%sysfunc(open(snap1.&dsname) );
%let nobs=%sysfunc(attrn(&dsid,nobs) );
%let rc=%sysfunc(close(&dsid) );
data _total_obs;
length Member_Name $15.;
Member_Name = "&dsname";
Total_Obs = &nobs;
format Total_Obs comma8.;
run;
proc append base=Total_Obs
data=_total_obs;
run;
%end;
proc datasets lib=work nolist;
delete _total_obs;
quit;
%mend;
%count_obs;
You will need to delete the permanent table Total_Obs if it already exists, but you can add code to handle that if you wish.
If you want to get the total number of non-missing observations for a particular column, do the same code as above, but delete the 3 %let statements below %let dsname = and replace the data step with:
data _total_obs;
length Member_Name $7.;
set snap1.&dsname end=eof;
retain Member_Name "&dsname";
if(NOT missing(var) ) then Total_Obs+1;
if(eof);
format Total_Obs comma8.;
run;
(Update: Fixed %do loop in step 2)

Check if a value exists in a dataset or column

I have a macro program with a loop (for i in 1 to n). For each i I have a table with many columns (variables). One of these columns is named var and has 3 possible values: a, b and c.
So for each table i, I want to check whether its column var contains the value "c". If it does, I want to export the table to an Excel sheet. Otherwise, I will concatenate the table with the others.
Can you please tell me how I can do this?
OK, in your macro at step i you have to do something like this:
proc sql;
select min(sum(case when var = 'c' then 1 else 0 end),1) into :trigger from table_i;
quit;
Then the macro variable trigger will equal 1 if you have to export and 0 if you have to concatenate. Next, code something like this:
%if &trigger = 1 %then %do;
proc export data = table_i blah-blah-blah;
run;
%end;
%else %do;
data concate_data;
set concate_data table_i;
run;
%end;
Without knowing the whole nine yards of your problem, I'd venture that you may not need a macro at all, if you don't mind exporting to .CSV instead of native xls or xlsx. IMHO, if you use PROC EXPORT you can't embed fancy formats anyway, so you are better off just using .CSV in most settings. If you need to include column headings, you need to tap into metadata (dictionary tables) and add a few lines.
filename outcsv '/share/test/'; /*define the destination for CSV, change it to fit your real settings*/
/*This is to Cat all of the tables first, use VIEW to save space if you must*/
data want1;
set table: indsname=_dsn;
dsn=_dsn;
run;
/*Below is a classic double DOW-loop implementation*/
data want;
file outcsv(test.csv) dsd; /*This is your output CSV file, comma delimited with quotes*/
do until (last.dsn);
set want1;
by dsn notsorted; /*Use this as long as your group is clustered*/
if var='c' then _flag=1; /*_flag value will be carried on to the next DOW, only reset when back to top*/
end;
do until (last.dsn);
set want1;
by dsn notsorted;
if _flag=1 then put (_all_) (~); /*if condition meets, output to CSV file*/
else output; /*Otherwise remaining in the Cat*/
end;
drop dsn _flag;
run;

An efficient way to update multiple columns using join

I am really struggling with this, guys.
The table that needs to be updated has ~15M rows and ~200 columns.
I need to update a few columns using a work table.
This is (partly) what I need to do:
%macro condition;
%if &row_count>0 %then %do;
data _null_;
set W4TWGKJ6 end=final;
if _n_ = 1 then call execute("proc sql ;");
call execute
("update dds.insurance_policy set X_STORNO_BY_VERSION="||TOSNUM||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";,
update dds.insurance_policy set STATUS_CHANGE_DT="||ISSUE_DT||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";");
if final then call execute('quit;'); run;
%end;
%mend;
%condition;
I first check if there are rows in table (&row_count)
if there are,
I update 2 columns (I need to update 5, I just cut them from the example)
using a work table called W4TWGKJ6.
This update takes forever.
In fact, I stopped the process every single time, as it worked for hours without returning anything....
Does anyone know a better solution for this problem?
Thanks in advance,
Gal.
I'd suggest using the MODIFY statement in a DATA step.
Both tables should use the same column names for the BY variables and be sorted by those variables.
data dds.insurance_policy;
modify
dds.insurance_policy
W4TWGKJ6 (keep= POLICY_NO X_INSURER_SERIAL_NO /* key variables */
X_STORNO_BY_VERSION STATUS_CHANGE_DT /* ... other variables from source to update target */ )
updatemode=nomissingcheck;
by POLICY_NO X_INSURER_SERIAL_NO;
if _iorc_ = %sysrc(_SOK) then do;
* Update row ;
replace;
end;
else _error_ = 0;
run;
See "SAS: How not to overwrite a dataset when the "where" condition in a "Modify" statement does not hold?" for a complete reference of _iorc_ return values.