List only the column names of a dataset - sas

I am working with SAS in a UNIX environment and I want to view only the column names of a dataset. I have tried proc contents and proc print, but both list a lot of other irrelevant information that I do not want: it fills up my PuTTY screen and the information I need scrolls out of view.
I also tried to get this from the SAS metadata, but that is not working either.
I tried:
2? proc sql;
select *
from dictionary.tables
where libname='test' and memname='sweden_elig_file_jul';
quit;
5?
NOTE: No rows were selected.
6?
NOTE: PROCEDURE SQL used (Total process time):
real time 0.27 seconds
cpu time 0.11 seconds

You're using the wrong dictionary table to get column names...
proc sql ;
select name
from dictionary.columns
where libname = 'WORK' and memname = 'MYDATA' /* names are stored in uppercase; WORK library assumed */
;
quit ;
Or using PROC CONTENTS
proc contents data=mydata out=meta (keep=NAME) noprint ; /* noprint suppresses the full listing */
run ;
proc print data=meta ; run ;

Here's one I've used before to get a list of columns with a little bit more information, you can add the keep option as in the previous answer. This just demonstrates how to create a connection to the metadata server, in case that is useful to anyone viewing this post.
libname fetchlib meta
library="libraryName" metaserver="metaDataServerAddress"
password="yourPassword" port=1234
repname="yourRepositoryName" user="yourUserName";
proc contents data=fetchlib.YourDataSetName
memtype=DATA
out=outputDataSet
nodetails
noprint;
run;

For a pure macro approach, try the following:
%macro mf_getvarlist(libds
,dlm=%str( )
)/*/STORE SOURCE*/;
/* declare local vars */
%local outvar dsid nvars x rc dlm;
/* open dataset in macro */
%let dsid=%sysfunc(open(&libds));
%if &dsid %then %do;
%let nvars=%sysfunc(attrn(&dsid,NVARS));
%if &nvars>0 %then %do;
/* add first dataset variable to the local macro variable */
%let outvar=%sysfunc(varname(&dsid,1));
/* add remaining variables with supplied delimiter */
%do x=2 %to &nvars;
%let outvar=&outvar.&dlm%sysfunc(varname(&dsid,&x));
%end;
%End;
%let rc=%sysfunc(close(&dsid));
%end;
%else %do;
%put unable to open &libds (rc=&dsid);
%let rc=%sysfunc(close(&dsid));
%end;
&outvar
%mend;
Usage:
%put List of Variables=%mf_getvarlist(sashelp.class);
Returns:
List of Variables=Name Sex Age Height Weight
source: https://github.com/sasjs/core/blob/main/base/mf_getvarlist.sas

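Dictionary tables store libname and memname in uppercase, which is why the original query returned no rows:
proc sql;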
select *
from dictionary.tables
where libname="TEST" and memname="SWEDEN_ELIG_FILE_JUL";
quit;
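As a minimal sketch of a case-insensitive variant: because the DICTIONARY tables hold LIBNAME and MEMNAME in uppercase, wrapping the literals in UPCASE() sidesteps the mismatch, and querying dictionary.columns returns just the column names. The library and dataset names are the ones from the question.

```sas
proc sql;
  /* upcase() lets the names be typed in any case */
  select name
    from dictionary.columns
    where libname = upcase('test')
      and memname = upcase('sweden_elig_file_jul');
quit;
```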

Related

Parse all column names from table and put into macro variable (SAS 9.4)

Hi, I have a table which contains around 300 variables and I need to get all the column names (variables) into one macro variable (%let vars = [list of names of those 300 columns]).
Does anybody know how I could do that?
SAS keeps this metadata in its dictionary tables, so it is easy to get the column names of a dataset from there, such as:
proc sql;
select name into: vars separated by ' ' from dictionary.columns where libname='SASHELP' and memname='CLASS';
quit;
%put &vars;
Or use the VCOLUMN view in a data step:
data _null_;
set sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS')) end=last;
length vars $32767; /* $100 would truncate a list of hundreds of names */
retain vars;
vars=catx(' ', vars,name);
if last then call symputx('vars',vars);
run;
%put &vars;
There is a macro for this in the MacroCore library which works as follows:
%put List of Variables=%mf_getvarlist(sashelp.class);
Reproduced below:
/**
@file
@brief Returns dataset variable list direct from header
@details WAY faster than dictionary tables or sas views, and can
also be called in macro logic (is pure macro). Can be used in open code,
eg as follows:
%put List of Variables=%mf_getvarlist(sashelp.class);
returns:
> List of Variables=Name Sex Age Height Weight
@param libds Two part dataset (or view) reference.
@param dlm= provide a delimiter (eg comma or space) to separate the vars
@version 9.2
@author Allan Bowe
@copyright GNU GENERAL PUBLIC LICENSE v3
**/
%macro mf_getvarlist(libds
,dlm=%str( )
)/*/STORE SOURCE*/;
/* declare local vars */
%local outvar dsid nvars x rc dlm;
/* open dataset in macro */
%let dsid=%sysfunc(open(&libds));
%if &dsid %then %do;
%let nvars=%sysfunc(attrn(&dsid,NVARS));
%if &nvars>0 %then %do;
/* add first dataset variable to the local macro variable */
%let outvar=%sysfunc(varname(&dsid,1));
/* add remaining variables with supplied delimiter */
%do x=2 %to &nvars;
%let outvar=&outvar.&dlm%sysfunc(varname(&dsid,&x));
%end;
%End;
%let rc=%sysfunc(close(&dsid));
%end;
%else %do;
%put unable to open &libds (rc=&dsid);
%let rc=%sysfunc(close(&dsid));
%end;
&outvar
%mend;

Execute proc step only if it doesn't cause log error (inside macro) SAS

I'm trying to test different covariance structures inside macro with Proc Mixed.
%macro cov(type);
proc mixed data=tmp order=data;
class sub trt visit;
model var = trt visit trt*visit / S cl;
repeated visit /subject=sub type=&type.;
ods output FitStatistics=min_var_&type.;
run;
%mend;
Some of the covariance structures I need to fit cause errors, and I'm trying to find a way to execute this proc mixed step only if it doesn't cause an error for the given value of &type.
I have been working with %sysfunc but haven't been able to resolve this yet.
%IF %SYSFUNC(EXIST(min_var_&type.)) %THEN %DO;
data help_&type.;
set min_var_&type.;
run;
%end;
This produces the datasets correctly, but errors still appear in the log for those values of &type that cannot be fitted.
You can redirect the log to a file like this:
filename logfile "\\SERVER\LOG\mylog.log";
proc printto log=logfile new;
run;
Then, when your PROC MIXED is finished, you can filter the log file for the string "ERROR":
....YOUR PROC MIXED...
/*come back to normal log*/
proc printto;
run;
/*check the log file*/
DATA CHECKLOG;
LENGTH ROWS $200;
LABEL ROWS = 'Messages from LOG';
INFILE "\\SERVER\LOG\mylog.log" TRUNCOVER;
INPUT ROWS &;
LINE = _N_;
IF SUBSTR(ROWS,1,5)='ERROR' /*OR SUBSTR(ROWS,1,7)='WARNING'*/ THEN
OUTPUT;
RUN;
This gives you all the ERROR lines (and WARNING lines, if needed) in a dataset.
Then check whether that table is empty.
If it is, you can continue your script.
One way to check:
proc sql;
select * from checklog;
quit;
%put n=&sqlobs;
If sqlobs is greater than 0, then you have errors.
You can test sqlobs inside a macro like this:
%macro checklog;
proc sql;
select * from checklog;
quit;
%if (&sqlobs>0) %then ...
%else ...
%mend checklog;
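The check-then-continue step could be sketched as follows. This is only an illustration under assumptions: the macro name keep_if_clean is hypothetical, it expects the CHECKLOG dataset built above to exist, and it reuses the min_var_&type / help_&type names from the question. It counts rows with count(*) rather than relying on sqlobs, so nothing is printed.

```sas
/* Hypothetical helper: keep the fit statistics only when the log scan found no errors */
%macro keep_if_clean(type);
  %local n_errors;
  proc sql noprint;
    /* count(*) always returns one row, so n_errors is always set */
    select count(*) into :n_errors from checklog;
  quit;
  %if &n_errors > 0 %then %do;
    %put NOTE: type=&type produced &n_errors error line(s), skipping.;
  %end;
  %else %if %sysfunc(exist(min_var_&type)) %then %do;
    data help_&type;
      set min_var_&type;
    run;
  %end;
%mend keep_if_clean;

%keep_if_clean(cs)
```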

Find a string in SAS and capture variable name and position of the variable in macro variables

I have a very large number of datasets that are not consistently formatted - I am trying to read them into SAS and normalize them.
The basic need here is to locate a 'key column' that contains a certain string - from there I know what to do with all the variables to the left and right of that column.
The 'GREP' macro from the sas website (http://support.sas.com/kb/33/078.html) seems like it can handle this, but I need help adapting the code in the following ways:
1 - I only need to search one dataset at a time, already in the 'work' library.
2 - I need to capture the name of the variable (and the position number of it) that prints to the log at the end of this macro. This seems like it would be easy but it just returns the last column in the dataset instead of the (correct) column that prints to the log at the end.
Current code below:
%macro grep(librf,string); /* parameters are unquoted, libref name, search string */
%let librf = %upcase(&librf);
proc sql noprint;
select left(put(count(*),8.)) into :numds
from dictionary.tables
where libname="&librf";
select memname into :ds1 - :ds&numds
from dictionary.tables
where libname="&librf";
%do i=1 %to &numds;
proc sql noprint;
select left(put(count(*),8.)) into :numvars
from dictionary.columns
where libname="&librf" and memname="&&ds&i" and type='char';
/* create list of variable names and store in a macro variable */
%if &numvars > 0 %then %do;
select name into :var1 - :var&numvars
from dictionary.columns
where libname="&librf" and memname="&&ds&i" and type='char';
quit;
data _null_;
set &&ds&i;
%do j=1 %to &numvars;
if &&var&j = "&string" then
put "String &string found in dataset &librf..&&ds&i for variable &&var&j";
%end;
run;
%end;
%end;
%mend;
%grep(work,Source Location);
The log returns: "String Source Location found in dataset WORK.RAW_IMPORT for variable C" (the third), which is correct.
I just need usable macro variables equal to "C" and "3" at the end. This macro will be part of a larger macro (or a prelude to it) so the two macro variables need to reset with each dataset I run through it. Thanks for any help offered.
Please find the modification below. I create global macro variables for the dataset name and the variable name, and then feed them to the VARNUM function to get the variable position (changes marked with ****):
%macro grep(librf,string);
%let librf = %upcase(&librf);
proc sql noprint;
select left(put(count(*),8.)) into :numds
from dictionary.tables
where libname="&librf";
select memname into :ds1 - :ds&numds
from dictionary.tables
where libname="&librf";
%do i=1 %to &numds;
proc sql noprint;
select left(put(count(*),8.)) into :numvars
from dictionary.columns
where libname="&librf" and memname="&&ds&i" and type='char';
/* create list of variable names and store in a macro variable */
%if &numvars > 0 %then %do;
select name into :var1 - :var&numvars
from dictionary.columns
where libname="&librf" and memname="&&ds&i" and type='char';
quit;
%global var_pos var_nm var_ds;
data _null_;
set &&ds&i;
%do j=1 %to &numvars;
**** ADDED NEW CODE HERE ****;
if &&var&j = "&string" then do; /* IF-DO nesting */
call symputx("var_nm","&&var&j"); /* Global macro variable for variable name */
call symputx("var_ds","&&ds&i"); /* Global macro variable for dataset name */
put "String &string found in dataset &librf..&&ds&i for variable &&var&j";
end; /* close the IF-DO block */
%end;
run;
**** ADDED NEW CODE HERE ****;
%let dsid=%sysfunc(open(&librf..&var_ds,i)); /* Open data set */
%let var_pos=%sysfunc(varnum(&dsid,&var_nm)); /* Variable position */
%let rc=%sysfunc(close(&dsid)); /* Close data set */
%end;
%end;
%mend;
%grep(work,Source Location);
%put &=var_nm &=var_ds &=var_pos;

SAS connect to Teradata - Using 2 accounts (switch)

Can someone please help? I have not used SAS in a few years and need some assistance with connecting to Teradata.
I want to connect to Teradata using ACCT1 if the time of day is between 7pm-6:59am or with Acct2 if the time of day is between 7am-6:59pm.
%let
acct1="mismktdev"
acct2="mismktprod"
%include
%macro t_cnnct;
options nomprint;
connect to teradata (tdpid="&tpidxyz" user="&misuid"
password="&mispwd" account="&acct1" mode=teradata);
options mprint;
proc sql;
connect to teradata (user="&terauser" password="&terapwd" mode=teradata);
execute (SET QUERY_BAND = 'Application=PrimeTime;Process=Daily;script=pt_add_history_v30.sas;' for session ) by teradata;
%mend t_cnnct;
proc sql;
Sel * from tblname;
You can use %let timenow=%sysfunc(time()); to get the time at which the program is running (as seconds since midnight) and then in your macro do something like:
%macro test();
%let timenow=%sysfunc(time());
%put &timenow;
%if %sysevalf(&timenow >= %sysfunc(inputn(19:00,time5.)) or &timenow < %sysfunc(inputn(07:00,time5.))) %then %do;
/* Your Code for 7pm - 6:59am Here*/
%end;
%else %do;
/*Code for 7am - 6:59pm here*/
%end;
%mend;
%test();
Your idea to use SAS macro variables is just right. Here is a macro to define all the global SAS macro variables you need (including changing the "account" string):
%macro t_cnnct;
%global tdserver terauser terapwd teraacct teradb;
%let tdserver=myTDPID;
%let terauser=myuserID;
%let terapwd=mYTDpassword;
%let teradb=myTDdatabase;
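%local now;
%let now=%sysfunc(time());
/* %sysevalf is needed because time() returns a fractional number of seconds */
/* 7am-6:59pm uses mismktprod (acct2 in the question), otherwise mismktdev (acct1) */
%if %sysevalf(&now >= %sysfunc(inputn(07:00,time5.))
and &now < %sysfunc(inputn(19:00,time5.)))
%then %let teraacct=mismktprod;
%else %let teraacct=mismktdev;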
%mend t_cnnct;
Note the SAS macro variable values are specified without double-quotes! Use double-quotes when they are referenced in your code.
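A small illustration of that quoting rule, using the teraacct variable from the macro above with an example value: the macro processor resolves references inside double quotes but leaves single-quoted text alone.

```sas
%let teraacct=mismktdev;   /* value stored without quotes */
%put "&teraacct";          /* double quotes: reference is resolved */
%put '&teraacct';          /* single quotes: reference is left as literal text */
```

The first %put writes "mismktdev" to the log, while the second writes '&teraacct' unchanged, which is why the connect string uses double quotes.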
Next, just run the macro before your PROC SQL code to set the variables and use those variables in your connect string:
%t_cnnct;
proc sql;
connect to teradata (user="&terauser" password="&terapwd" account="&teraacct"
server="&tdserver" mode=teradata);
execute (
SET QUERY_BAND = 'Application=PrimeTime;Process=Daily;script=pt_add_history_v30.sas;' for session
) by teradata;
create table mySASds as
select *
from connection to teradata (
select *
from &teradb..tablename
);
quit;
Note that the tdpid= option you were using is an alias for the server= option (which I prefer). Also, the query band you set will remain in effect for the entire PROC SQL run.
And here is an example of a SAS libref that uses the same macro variables:
libname myTD teradata user="&terauser" password="&terapwd" account="&teraacct"
server="&tdserver" database="&teradb" mode=teradata;

How to detect how many observations in a dataset (or if it is empty), in SAS?

I wonder if there is a way of detecting whether a data set is empty, i.e. it has no observations.
Or, put another way, how to get the number of observations in a specific data set.
So that I can write an If statement to set some conditions.
Thanks.
It's easy with PROC SQL. Do a count and put the results in a macro variable.
proc sql noprint;
select count(*) into :observations from library.dataset;
quit;
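Since the question asks for an IF test, here is a minimal sketch of how the stored count might drive one. It uses sashelp.class as a stand-in dataset, and the macro name print_if_any is hypothetical; %if is wrapped in a macro so it also works on releases that do not allow %if in open code.

```sas
proc sql noprint;
  select count(*) into :observations from sashelp.class;
quit;

/* Hypothetical wrapper: print the data only when it has rows */
%macro print_if_any;
  %if &observations > 0 %then %do;
    proc print data=sashelp.class;
    run;
  %end;
  %else %do;
    %put NOTE: dataset is empty, nothing to print.;
  %end;
%mend print_if_any;

%print_if_any
```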
There are lots of different ways, I tend to use a macro function with open() and attrn(). Below is a simple example that works great most of the time. If you are going to be dealing with data views or more complex situations like having a data set with records marked for deletion or active where clauses, then you might need more robust logic.
%macro nobs(ds);
%let DSID=%sysfunc(OPEN(&ds.,IN));
%let NOBS=%sysfunc(ATTRN(&DSID,NOBS));
%let RC=%sysfunc(CLOSE(&DSID));
&NOBS
%mend;
/* Here is an example */
%put %nobs(sashelp.class);
Here's the more complete example that @cmjohns was talking about. It will return 0 if it is empty, -1 if it is missing, and has options to handle deleted observations and where clauses (note that using a where clause can make the macro take a long time on very large datasets).
Usage Notes:
This macro will return the number of observations in a dataset. If the dataset does not exist then -1 will be returned. I would not recommend this for use with ODBC libnames, use it only against SAS tables.
Parameters:
iDs - The libname.dataset that you want to check.
iWhereClause (Optional) - A where clause to apply
iNobsType (Optional) - Either NOBS OR NLOBSF. See SASV9 documentation for descriptions.
Macro definition:
%macro nobs(iDs=, iWhereClause=1, iNobsType=nlobsf, iVerbose=1);
%local dsid nObs rc;
%if "&iWhereClause" eq "1" %then %do;
%let dsID = %sysfunc(open(&iDs));
%end;
%else %do;
%let dsID = %sysfunc(open(&iDs(where=(&iWhereClause))));
%end;
%if &dsID %then %do;
%let nObs = %sysfunc(attrn(&dsID,&iNobsType)); /* honor the iNobsType parameter */
%let rc = %sysfunc(close(&dsID));
%end;
%else %do;
%if &iVerbose %then %do;
%put WARNING: MACRO.NOBS.SAS: %sysfunc(sysmsg());
%end;
%let nObs = -1;
%end;
&nObs
%mend;
Example Usage:
%put %nobs(iDs=sashelp.class);
%put %nobs(iDs=sashelp.class, iWhereClause=height gt 60);
%put %nobs(iDs=this_dataset_doesnt_exist);
Results
19
12
-1
Installation
I recommend setting up a SAS autocall library and placing this macro in your autocall location.
PROC SQL is not efficient on large datasets. ATTRN is a good method, but this can also be accomplished in Base SAS. Here is an efficient solution that returns the number of observations, even for billions of rows, by reading just one row:
data DS1;
set DS nobs=i;
if _N_ =2 then stop;
No_of_obs=i;
run;
The trick is producing an output even when the dataset is empty.
data CountObs;
i=1;
set Dataset_to_Evaluate point=i nobs=j; * 'point' avoids review of full dataset*;
No_of_obs=j;
output; * Produces a value before "stop" interrupts processing *;
stop; * Needed whenever 'point' is used *;
keep No_of_obs;
run;
proc print data=CountObs;
run;
The above code is the simplest way I've found to produce the number of observations even when the dataset is empty. I've heard NOBS can be tricky, but the above can work for simple applications.
A slightly different approach:
proc contents data=library.dataset out=nobs;
run;
proc summary data=nobs nway;
class nobs;
var delobs;
output out=nobs_summ sum=;
run;
This will give you a dataset with one observation; the variable nobs has the value of number of observations in the dataset, even if it is 0.
I guess I am trying to reinvent the wheel here with so many answers already. But I do see some other methods trying to count from the actual dataset - this might take a long time for huge datasets. Here is a more efficient method:
proc sql;
select nlobs from sashelp.vtable where libname = "LIBRARY" and memname = "DATASET"; /* names must be given in uppercase */
quit;