SAS Selecting the most recent dataset in a lib automatically - sas

i am working with a library that is updated every month or so, and i need a way to select the most recent dataset each month, i tried two methods that would show me the latest table, one that makes a table ordering from the modified date
proc sql;
create table tables as
select memname, modate
from dictionary.tables
where libname = 'SASHELP'
order by modate desc;
quit;
and one that gives me just the latest modified one
proc sql;
select memname into> latest_dataset
from dictionary.tables
where libname='WORK'
having crdate=max(crate);
%put &=latest_dataset;
and i would like to put these latest datasets in a table, but i don't know how, or if there is another easier way to do this, i am still very much new to SAS programming so i'm lost, any help is appreciated.

Use Proc APPEND to put the latest data set into a table. You are essentially accumulating rows.
Use SQL :INTO to obtain (place into a macro variable) the libname.memname of the data set that should be appended.
Example:
The task of determining the newest data set and appending it to a base table is also in a macro so the the code can be easily rerun in the example.
%macro append_newest;
%local newest_table;
proc sql noprint;
select catx('.', libname, memname) into :newest_table
from dictionary.tables
where libname = 'WORK'
and memtype = 'DATA'
having crdate = max(crdate);
%put NOTE: &=newest_table;
create view newest_view as
select "&newest_table" as row_source length=41, *
from &newest_table
;
proc append base=work.master data=newest_view;
run;
%mend;
* create an empty for accumulating new observations;
data work.master;
length row_source $41;
set one (obs=0);
run;
data work.one;
set sashelp.class;
where name between 'A' and 'E';
%append_newest;
data work.two;
set sashelp.class;
where name between 'Q' and 'ZZ';
%append_newest;
data work.three;
set sashelp.class;
where name between 'E' and 'Q';
%append_newest;
Will produce this master table that accumulates the little pieces that come in day by day.
You would want additional constraints such as a unique key in order to prevent appending the same data more than once.

Related

Append new column to existing table using SAS

I have a do loop in which I do calculation on new variable and results are stored as additional column, this column-s (at each iteration) should be attached to the output table defined by macro.
Here on SO something similar has been asked but the answer is not acceptable, the last answer is not compatible with sas command but very close, getting incomplete script with following:
proc sql;
update &outlib..&out.
set var._iqr = b.&var._iqr
from &outlib..&out. as a
left join cal_resul as b
on a.id_client=b.id_client
and a.reference_date=b.reference_date;
quit;
Here is my attempt which works but very slow:
proc sql; create table &outlib..&out. as select * from &inlib..&in.; quit; /* the input is as a basis for output table */
proc sql; alter table &outlib..&out. add &var._iqr numeric; quit; /* create empty column to be filled at each iteration */
proc sql;
update &outlib..&out. as a
set &var._iqr=(select b.&var._iqr from cal_resul as b
where a.id_client=b.id_client
and a.reference_date=b.reference_date
and a.data_source=b.data_source);
quit;
Attempt 2:
This is somewhat faster:
proc sort data=cal_resul; by id_client reference_date data_source; run;
data &outlib..&out.;
update &outlib..&out. cal_resul;
by id_client reference_date data_source;
run;
Simple left join (adding new column into existing table is way faster) but with left join I did not figure out how I can update (always retain the same dataset) the &outlib..&out. at each iteration. Many thanks for any help;
If you want to ADD a variable to a dataset you will have to make a new dataset. (Your ALTER TABLE statement will create a new dataset and copy over all of the observations.)
Looks like your data has three key variables. So use those in merging the new data to the old.
For example to make a new variable in HAVE named EXAMPLE_IQR using the variable EXAMPLE in the dataset NEW you could use code like this. I have used macro variables to show how you might use those macro variables as the parameters to a macro. It sounds like you don't want the process to add new observations to the existing dataset so I have added a check for that using the IN= dataset option.
%let base=work.have;
%let indata=work.new;
%let var=example;
data &base ;
merge &base(in=inbase)
&indata(keep=id_client reference_date data_source &var
rename=(&var=&var._iqr)
)
;
by id_client reference_date data_source;
if inbase;
run;

SAS EG : Selecting the most recent dataset from a library

I use sas EG V5.1. I need to select the most recent dataset saved within a permanent library. How can I do that without having to look at the library?
employee_2016_09_04
employee_2016_09_15
first_2016_09_04
first_2016_09_14
I need to select the most recent tables of either category and these are SAS datasets. I currently have a macro variable defined for the date which I update manually each time I run a code. Any help is appreciated. Thanks
You can use dictionary tables.
Using modification (modate) or creation column (crdate).
proc sql;
create table tables as
select memname, modate
from dictionary.tables
where libname = 'SASHELP'
order by modate desc;
quit;
Or use sort on variable name (memname).
proc sql;
create table tables as
select memname
from dictionary.tables
where libname = 'SASHELP'
order by memname desc;
quit;
Or the same thing using sashelp views
data tables;
set sashelp.vtable;
where libname = 'SASHELP';
keep memname modate;
run;
proc sort data=tables;
by descending modate;
run;

How to obtain the number of records of a dataset in SAS

I want to count the number of records in a dataset in SAS. There is a function the make this thing in a simple way? I used R ed for obtain this information there was the length() function. Morover I need the number of record to compute some percetages so I need this value not in a table but in a value that can be used for other data step. How can I fix?
Thanks in advance
Here is another solution, using SAS dictionaries,
proc sql;
select nobs into: num_obs
from dictionary.tables
where libname = "WORK" and memname = "A"
;
quit;
It is easy to get the size of many datasets by modifying the above code,
proc sql;
create table test as
select memname, nobs
from dictionary.tables
where libname = "WORK" and memname like "A%"
;
quit;
data _null_;
set test;
call symput(memname, nobs);
run;
The above code will give you the sizes of all data sets with name starting with "a" in the temporary/work library.
Assuming this is a basic SAS table that you've created, and not modified or appended to, the best way is to use the meta data held in a dataset (the Number of tries is held in a piece of meta data called "nobs"), without reading through the dataset its self and place it in a macro variable. You can do this in the following way:
Data _null_;
i=1;
If i = 0 then set DATASETTOCOUNT nobs= mycount;
Call symput('mycount', mycount);
Run;
%put &mycount.;
You will now have a macro variable that contains the number of rows in your dataset, that you can call on in other data steps using &mycount.

Reading a list of name to a SAS Macro

I am trying to read a list of values into a macro, so that the macro variable would contain the table name and create a column that would contain the table name.
My attempt, which is wrong, was trying to use the code below, and erroring out because of the line " '&tbl' as Table_Dt ". The code below is inefficient, so feel free to enhance it. Thanks for your help.
%macro flat(tbl);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
'&tbl' as Table_Dt
FROM &tbl..flat_file;
QUIT;
%mend flat;
%flat(flat0113);
%flat(flat0213);
...
%flat(flat1213);
As you are basically processing a list, this could also be done using call execute. No need to write all the information to macro variables. All tables/libraries are already stored in the sashelp tables and therefore are ready for list processing.
data _null_;
set sashelp.vslib (where=(substr(libname,1,4) = 'FLAT')) end =eof;
if _n_ = 1 then call execute ('proc sql exec feedback stimer noprint outobs=5;');
call execute ('
CREATE TABLE '|| libname ||' AS
SELECT ID,
DOB,
"'||compress(libname)||'" as Table_Dt
FROM '||compress(libname)||'.flat_file
;
');
if eof then call execute ('QUIT;');
run;
Macros in quotation marks will only resolve with double quotes, not single. If you want to do a more efficient way, you can do so with the following modified code. I am assuming that you are reading from libraries named flat0113, flat0213, etc.
Step 1: Get a list of all the libnames with the word "flat" in it
proc sql noprint;
select distinct libname
, count(libname)
into: tbl_list separated by ' '
, total_tbls
from sashelp.vmember
where libname LIKE 'FLAT%'
;
quit;
This will create two macro variables: &tbl_list, and &total_tbls.
&tbl_list holds the values flat0113 flat0213 flat ... flat1213.
&total_tbls holds the total number of values in &tbl_list.
Step 2: Loop through the newly created list
%macro readTables;
%do i = 1 %to &total_tbls;
%let tbl = %scan(tbl_list, &i);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
"&tbl" as Table_Dt
FROM &tbl..flat_file;
quit;
%end;
%mend;
%readTables;
This will read each individual value from &tbl_list one by one until the very end of the list.

SAS : How to iterate a dataset elements within the proc sql WHERE statement?

I need to create multiple tables using proc sql
proc sql;
/* first city */
create table London as
select * from connection to myDatabase
(select * from mainTable
where city = 'London');
/* second city */
create table Beijing as
select * from connection to myDatabase
(select * from mainTable
where city = 'Beijing');
/* . . the same thing for other cities */
quit;
The names of those cities are in the sas table myCities
How can I embed the data step into proc sql in order to iterate through all cities ?
proc sql noprint;
select quote(city_varname) into :cities separated by ',' from myCities;
quit;
*This step above creates a list as a macro variable to be used with the in() operator below. EDIT: Per Joe's comment, added quote() function so that each city will go into the macro-var list within quotes, for proper referencing by in() operator below.
create table all_cities as
select * from connection to myDatabase
(select * from mainTable
where city in (&cities));
*this step is just the step you provided in your question, slightly modified to use in() with the macro-variable list defined above.
One relatively simple solution to this is to do this entirely in a data step. Assuming you can connect via libname (which if you can connect via connect to you probably can), let's say the libname is mydb. Using a similar construction to Max Power's for the first portion:
proc sql noprint;
select city_varname
into :citylist separated by ' '
from myCities;
select cats('%when(var=',city_varname,')')
into :whenlist separated by ' '
from myCities;
quit;
%macro when(var=);
when "&var." output &var.;
%mend when;
data &citylist.;
set mydb.mainTable;
select(city);
&whenlist.;
otherwise;
end;
run;
If you're using most of the data in mainTable, this probably wouldn't be much slower than doing it database-side, as you're moving all of the data anyway - and likely it would be faster since you only hit the database once.
Even better would be to pull this to one table (like Max shows), but this is a reasonable method if you do need to create multiple tables.
You need to put your proc sql code into a SAS Macro.
Create a macro-variable for City (in my example I called the macro-variable "City").
Execute the macro from a datastep program. Since the Datastep program processes one for each observation, there is no need to create complex logic to iterate.
data mycities;
infile datalines dsd;
input macrocity $ 32.;
datalines;
London
Beijing
Buenos_Aires
;
run;
%macro createtablecity(city=);
proc sql;
/* all cities */
create table &city. as
select * from connection to myDatabase
(select * from mainTable
where city = "&city.");
quit;
%mend;
data _null_;
set mycities;
city = macrocity;
call execute('%createtablecity('||city||')');
run;
Similar to the other solutions here really, maybe a bit simpler... Pull out a distinct list of cities, place into macros, run SQL query within a do loop.
Proc sql noprint;
Select distinct city, count(city) as c
Into :n1-:n999, :c
From connection to mydb
(Select *
From mainTable)
;
Quit;
%macro createTables;
%do a=1 %to &c;
Proc sql;
Create table &&n&a as
Select *
From connection to myDb
(Select *
From mainTable
Where city="&&n&a")
;
Quit;
%end;
%mend createTables;
%createTables;