Reading a list of names into a SAS macro

I am trying to pass a list of values to a macro, so that the macro variable holds the table name and the macro creates a column containing that table name.
My attempt, shown below, is wrong: it fails because of the line " '&tbl' as Table_Dt ". The code is also inefficient, so feel free to improve it. Thanks for your help.
%macro flat(tbl);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
'&tbl' as Table_Dt
FROM &tbl..flat_file;
QUIT;
%mend flat;
%flat(flat0113);
%flat(flat0213);
...
%flat(flat1213);

As you are basically processing a list, this could also be done using call execute. No need to write all the information to macro variables. All tables/libraries are already stored in the sashelp tables and therefore are ready for list processing.
data _null_;
set sashelp.vslib (where=(substr(libname,1,4) = 'FLAT')) end =eof;
if _n_ = 1 then call execute ('proc sql exec feedback stimer noprint outobs=5;');
call execute ('
CREATE TABLE '|| libname ||' AS
SELECT ID,
DOB,
"'||compress(libname)||'" as Table_Dt
FROM '||compress(libname)||'.flat_file
;
');
if eof then call execute ('QUIT;');
run;

Macro variables inside quotation marks resolve only within double quotes, not single quotes. If you want a more efficient approach, you can use the following modified code. I am assuming that you are reading from libraries named flat0113, flat0213, etc.
Step 1: Get a list of all the libnames with the word "flat" in it
proc sql noprint;
select distinct libname
, count(distinct libname)
into :tbl_list separated by ' '
, :total_tbls
from sashelp.vmember
where libname LIKE 'FLAT%'
;
quit;
This will create two macro variables: &tbl_list, and &total_tbls.
&tbl_list holds the values flat0113 flat0213 flat ... flat1213.
&total_tbls holds the total number of values in &tbl_list.
Step 2: Loop through the newly created list
%macro readTables;
%do i = 1 %to &total_tbls;
%let tbl = %scan(&tbl_list, &i);
proc sql exec feedback stimer noprint outobs=5;
CREATE TABLE &tbl as
SELECT
ID,
DOB,
"&tbl" as Table_Dt
FROM &tbl..flat_file;
quit;
%end;
%mend;
%readTables;
This will read each individual value from &tbl_list one by one until the very end of the list.
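As a minimal sketch of the quoting rule mentioned above (the value flat0113 is borrowed from the question, and the data step exists only for illustration), the same macro variable can be placed in single and in double quotes and written to the log:

```sas
%let tbl = flat0113;

data _null_;
  single = '&tbl';   /* single quotes: the macro reference is left as literal text */
  double = "&tbl";   /* double quotes: the macro reference is resolved first */
  put single= double=;
run;
```

The log should show single=&tbl and double=flat0113, which is why "&tbl" as Table_Dt works where '&tbl' as Table_Dt does not.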

How to select columns only containing the certain string in SAS [duplicate]

I would like to know whether it is possible to keep only the columns whose names contain a certain character.
For example, let's say that I have columns: name, surname, sex, age.
I want to keep only columns that start with letter 's' (surname and sex).
How do I do that?
There are several ways to filter variable names.
For prefixes or lists of variables it's pretty easy. For suffixes or more complex patterns it gets more complicated. In general you can shortcut lists as follows:
_numeric_ : all numeric variables
_character_ : all character variables
_all_ : all variables
prefix1 - prefix# : all variables with the same prefix assuming they're numbered
prefix: : all variables that start with prefix
firstVar -- lastVar : variables based on location between first and last variable, including the first and last.
firstVar-numeric-lastVar : variables that are numeric based on location between first and last variable
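A short sketch of these shortcuts, using sashelp.class (whose variables are Name, Sex, Age, Height, Weight, in that order); the output data set names are made up for illustration:

```sas
data by_position;
  set sashelp.class(keep=name--age);              /* Name, Sex, Age */
run;

data by_type;
  set sashelp.class(keep=_numeric_);              /* Age, Height, Weight */
run;

data by_prefix;
  set sashelp.class(keep=h: w:);                  /* Height, Weight */
run;

data by_position_and_type;
  set sashelp.class(keep=name-character-weight);  /* Name, Sex */
run;
```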
Anything more complex requires that you filter it via the metadata list. SAS keeps some metadata about each data set, so you can query that information to build your lists. Data about columns and types are in the sashelp.vcolumn view or the dictionary.columns table.
To filter all columns that have the word mpg for example:
*generate variable list;
proc sql noprint;
select name into :var_list separated by " "
from sashelp.vcolumn
where libname = 'SASHELP' and memname = 'CARS'
and lowcase(name) like '%mpg%';
quit;
*check log for results;
%put &var_list;
*verification from original table;
proc contents data=sashelp.cars;
run;
*example of usage;
data want;
set sashelp.cars;
keep &var_list;
run;
Some more details are available in this blog post and here (documentation).
If you want to keep only variables that start with an s, then use the name prefix list operator :.
data want;
set have(keep=s:);
run;
It's possible.
The code below stores the names of the matching columns of a table in a macro variable. After running the code, &NMVAR holds the names of the columns you want.
PROC SQL;
SELECT
NAME
INTO
:NMVAR SEPARATED BY ' ' /* SAVE ALL MATCHING NAMES IN ONE MACRO VARIABLE */
FROM SASHELP.VCOLUMN
WHERE
LIBNAME EQ "YOUR LIBNAME" AND /* THE NAME OF LIB MUST BE WRITTEN IN UPPERCASE */
MEMNAME EQ "YOUR TABLE" AND /* THE NAME OF 'TABLE/DATA SET' MUST BE WRITTEN IN UPPERCASE */
UPCASE(SUBSTR(NAME,1,1)) EQ "S"; /* COLUMN NAMES KEEP THEIR ORIGINAL CASE, SO COMPARE IN UPPERCASE */
QUIT;
For complex variable name selection filtering, such as regular expressions, or lookup in external metadata control table, you will need to process the metadata of the table itself to construct source that can be applied.
This example demonstrates two, of many, ways that source code can be generated.
metadata table from target table, Proc CONTENTS
process metadata, Proc SQL
construct source code
Expectation of name lists < 64K
SQL INTO :<macro-variable> for source code expected to be < 64K characters
Very large name lists or robust
A macro that streams source code from metadata table
From a data set with 50,000 variables select the columns whose name contains 2912
data have;
retain id 'HOOPLA12345' x1-x50000 .;
stop;
run;
* obtain metadata of target table;
proc contents noprint data=have
out=varlist_table
( keep=name
where= (
prxmatch('/x.*2912.*/',name) /* name selection criteria */
)
);
run;
* Short lists;
* construct source code for name list;
proc sql noprint;
select name into :varlist separated by ' ' from varlist_table;
quit;
data want;
set have (keep=&varlist); /* apply generated source code */
run;
* Arbitrary or Long lists expected;
%macro stream_column (data=, column=);
%local dsid index &column;
%let dsid=%sysfunc(open(&data(keep=&column)));
%if &dsid %then %do;
%syscall SET(dsid);
%do %while (0=%sysfunc(fetch(&dsid)));
&&&column. /* emit column value from table */
%end;
%let dsid = %sysfunc(close(&dsid));
%end;
%mend;
options mprint;
data want2;
set have (keep=
/* stream source code as macro text emissions */
%stream_column(data=varlist_table,column=name)
);
run;

SAS Selecting the most recent dataset in a lib automatically

I am working with a library that is updated every month or so, and I need a way to select the most recent dataset each month. I tried two methods that show me the latest table. The first creates a table ordered by modification date:
proc sql;
create table tables as
select memname, modate
from dictionary.tables
where libname = 'SASHELP'
order by modate desc;
quit;
and one that gives me just the latest modified one
proc sql;
select memname into :latest_dataset
from dictionary.tables
where libname='WORK'
having crdate=max(crdate);
quit;
%put &=latest_dataset;
I would like to put these latest datasets into a table, but I don't know how, or whether there is an easier way to do this. I am still very new to SAS programming, so any help is appreciated.
Use Proc APPEND to put the latest data set into a table. You are essentially accumulating rows.
Use SQL INTO : to place the libname.memname of the data set that should be appended into a macro variable.
Example:
The task of determining the newest data set and appending it to a base table is wrapped in a macro so that the code can easily be rerun in the example.
%macro append_newest;
%local newest_table;
proc sql noprint;
select catx('.', libname, memname) into :newest_table
from dictionary.tables
where libname = 'WORK'
and memtype = 'DATA'
having crdate = max(crdate);
%put NOTE: &=newest_table;
create view newest_view as
select "&newest_table" as row_source length=41, *
from &newest_table
;
proc append base=work.master data=newest_view;
run;
%mend;
* create an empty table for accumulating new observations;
* work.one does not exist yet, so the structure is taken from sashelp.class;
data work.master;
length row_source $41;
set sashelp.class (obs=0);
run;
data work.one;
set sashelp.class;
where name between 'A' and 'E';
%append_newest;
data work.two;
set sashelp.class;
where name between 'Q' and 'ZZ';
%append_newest;
data work.three;
set sashelp.class;
where name between 'E' and 'Q';
%append_newest;
Will produce this master table that accumulates the little pieces that come in day by day.
You would want additional constraints such as a unique key in order to prevent appending the same data more than once.
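One way to guard against appending the same data twice is to check row_source before appending. The macro below is only a sketch under stated assumptions: append_once and work._to_append are hypothetical names, the base table is assumed to already exist with a row_source column (as work.master does above), and the incoming data set is assumed not to have a row_source column of its own.

```sas
/* Hypothetical helper: append &data to &base only if it is not yet recorded in row_source */
%macro append_once(data=, base=work.master);
  %local already;
  proc sql noprint;
    select count(*) into :already trimmed
    from &base
    where row_source = "%upcase(&data)";
  quit;
  %if &already > 0 %then %do;
    %put NOTE: %upcase(&data) is already in &base, nothing appended.;
    %return;
  %end;
  proc sql;
    /* tag every row with the name of the data set it came from */
    create view work._to_append as
    select "%upcase(&data)" as row_source length=41, *
    from &data;
  quit;
  proc append base=&base data=work._to_append;
  run;
%mend append_once;

%append_once(data=work.one)
%append_once(data=work.one)  /* second call is skipped */
```

This only protects against reloading a data set under the same name; a unique key on the rows themselves would still be needed to catch overlapping data delivered under a new name.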

How to obtain the number of records of a dataset in SAS

I want to count the number of records in a dataset in SAS. Is there a function that does this in a simple way? In R I used the length() function to obtain this information. Moreover, I need the number of records to compute some percentages, so I need the value not in a table but in a form that can be used in other data steps. How can I do this?
Thanks in advance
Here is another solution, using SAS dictionaries,
proc sql;
select nobs into: num_obs
from dictionary.tables
where libname = "WORK" and memname = "A"
;
quit;
It is easy to get the size of many datasets by modifying the above code,
proc sql;
create table test as
select memname, nobs
from dictionary.tables
where libname = "WORK" and memname like "A%"
;
quit;
data _null_;
set test;
call symputx(memname, nobs); /* symputx converts the number without leading blanks or a conversion note */
run;
The above code will give you the sizes of all data sets with name starting with "a" in the temporary/work library.
Assuming this is a basic SAS table that you've created, and not modified or appended to, the best way is to use the metadata held in the dataset (the number of rows is held in a piece of metadata called "nobs"), without reading through the dataset itself, and place it in a macro variable. You can do this in the following way:
Data _null_;
i=1;
If i = 0 then set DATASETTOCOUNT nobs= mycount; /* never executes; nobs= is filled in at compile time */
Call symputx('mycount', mycount); /* symputx strips the leading blanks of the converted number */
Stop; /* the SET statement never reads a row, so end the step explicitly */
Run;
%put &mycount.;
You will now have a macro variable that contains the number of rows in your dataset, which you can use in other data steps as &mycount.
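As a sketch of how such a count can feed a percentage calculation in a later step (sashelp.class stands in for your own data set, and the grouping by sex is only an illustration):

```sas
/* Store the row count of sashelp.class in the macro variable mycount */
data _null_;
  if 0 then set sashelp.class nobs=n;  /* never executes; n is set at compile time */
  call symputx('mycount', n);
  stop;
run;

/* Use the count in another step: share of all rows per value of sex */
proc sql;
  select sex,
         count(*) as n_rows,
         count(*) / &mycount as pct format=percent8.1
  from sashelp.class
  group by sex;
quit;
```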

SAS - Creating variables from macro variables

I have a SAS dataset which has 20 character variables, all of which are names (e.g. Adam, Bob, Cathy etc..)
I would like dynamic code to create variables called Adam_ref, Bob_ref, etc., which will still work if there is a different dataset with different names (i.e. I don't want to manually define each variable).
So far my approach has been to use proc contents to get all variable names and then use a macro to create macro variables Adam_ref, Bob_ref etc..
How do I create actual variables within the dataset from here? Do I need a different approach?
proc contents data=work.names
out=contents noprint;
run;
proc sort data = contents; by varnum; run;
data contents1;
set contents;
Name_Ref = compress(Name||"_Ref");
call symput (NAME, NAME_Ref);
%put _user_;
run;
If you want to create an empty dataset whose variables are named after values you have in macro variables, you could do something like this.
Save the values into macro variables that are named by some pattern, like v1, v2 ...
proc sql;
select compress(Name||"_Ref") into :v1-:v20 from contents;
quit;
If you don't know how many values there are, you have to count them first, I assumed there are only 20 of them.
Then, if all your variables are character variables of length 100, you create a dataset like this:
%macro create_dataset;
data want;
length %do i=1 %to 20; &&v&i $100 %end;
;
stop;
run;
%mend;
%create_dataset; run;
This is how you can do it if you have the values in macro variable, there is probably a better way to do it in general.
If you don't want to create an empty dataset but only change the variable names, you can do it like this:
proc sql;
select name into :v1-:v20 from contents;
quit;
%macro rename_dataset;
data new_names;
set have(rename=(%do i=1 %to 20; &&v&i = &&v&i.._ref %end;));
run;
%mend;
%rename_dataset; run;
You can use PROC TRANSPOSE with an ID statement.
This step creates an example dataset:
data names;
harry="sally";
dick="gordon";
joe="schmoe";
run;
This step is essentially a copy of your step above that produces a dataset of column names. I will reuse the dataset namerefs throughout.
proc contents data=names out=namerefs noprint;
run;
This step adds the "_Ref" suffix to the names defined before and drops everything else. The variable "name" comes from the column attributes of the dataset output by PROC CONTENTS.
data namerefs;
set namerefs (keep=name);
name=compress(name||"_Ref");
run;
This step produces an empty dataset with the desired columns. The variable "name" is again obtained by looking at column attributes. You might get a harmless warning in the GUI if you try to view the dataset, but you can otherwise use it as you wish and you can confirm that it has the desired output.
proc transpose out=namerefs(drop=_name_) data=namerefs;
id name;
run;
Here is another approach which requires less coding. It does not require running proc contents, does not require knowing the number of variables, nor creating a macro function. It also can be extended to do some additional things.
Step 1 is to use the built-in dictionary views to get the desired variable names. The appropriate view for this is dictionary.columns, which is also available as sashelp.vcolumn. The dictionary libref can be used only in proc sql, while the sashelp view can be used anywhere. I tend to use the sashelp view since I work in Windows with DMS and can always interactively view the sashelp library.
proc sql;
select compress(Name||"_Ref") into :name_list
separated by ' '
from sashelp.vcolumn
where libname = 'WORK'
and memname = 'NAMES';
quit;
This produces a space-delimited macro variable with the desired names.
Step 2: To build the empty data set, this code will work (it assumes a length, for example 100, for every new character variable):
Data New ;
length &name_list $100 ;
stop ;
run ;
You can avoid assuming lengths, or create a populated dataset with new variable names, by using a slightly more complicated select statement.
For example
select compress(Name)||"_Ref $"||compress(put(length,best.))
into :name_list
separated by ' '
will generate a macro variable which retains the previous length for each variable. Because each name then carries its own length, remove the trailing $100 from the length statement in step 2.
To create a populated data set using the rename= data set option, replace the select statement as follows:
select compress(Name)||" = "||compress(Name)||"_Ref"
into :name_list
separated by ' '
Then replace the Step 2 code with the following:
Data New ;
set names (rename = ( &name_list)) ;
run ;
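Putting the rename variant together end to end, here is a minimal sketch. It reuses the three-variable example dataset from the PROC TRANSPOSE answer above; the macro variable name rename_list is an assumption made for this illustration.

```sas
/* Example input: three character variables */
data names;
  harry = "sally";
  dick  = "gordon";
  joe   = "schmoe";
run;

/* Build old=new pairs such as harry=harry_Ref from the column metadata */
proc sql noprint;
  select catx('=', name, cats(name, '_Ref'))
    into :rename_list separated by ' '
  from dictionary.columns
  where libname = 'WORK' and memname = 'NAMES';
quit;

%put NOTE: &=rename_list;

/* Apply the generated pairs with the rename= data set option */
data new;
  set names(rename=(&rename_list));
run;
```

Because the pairs are generated from the metadata, the same code should work unchanged for a dataset with different variable names, as long as libname and memname are given in uppercase.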