Get data set with maximum date in name by proc SQL - sas

Suppose I have some data sets in library lib whose names look like Table_YYYYMMDD (e.g. Table_20150101).
I want to get the name of the data set with the maximum date (YYYYMMDD) and store it in a macro variable.
I'm using proc sql and selecting from dictionary.tables.
First I extract the YYYYMMDD part of the name. Then I convert it to a date and find the MAX. I also want to be sure that there is at least one data set in the library.
proc sql;
select put(MAX(input(scan(memname, 2, '_'), yymmdd8.)), yymmddn8.)
into :mvTable_MaxDate
from dictionary.tables
where libname = 'LIB';
quit;
So,
Is it right to use SAS functions like scan in proc sql?
How can I check that the query result is not empty (i.e. that mvTable_MaxDate doesn't hold a missing value)?
Thanks for your help:)

The cause of the error is that you are using the INPUTN() function, which expects its second argument to be a text literal or the name of a variable. Changing to INPUT() avoids the error.
Also note that you need to upcase the literal value of the library name in your where clause. Dictionary.tables stores libnames in uppercase.
As written, the value of the macro variable will be a SAS date value. If you want it formatted as YYMMDDN8. you will need to add that.
Here's an example:
74 data a_20151027
75 a_20141022
76 a_20130114
77 ;
78 x=1;
79 run;
NOTE: The data set WORK.A_20151027 has 1 observations and 1 variables.
NOTE: The data set WORK.A_20141022 has 1 observations and 1 variables.
NOTE: The data set WORK.A_20130114 has 1 observations and 1 variables.
80
81 proc sql noprint;
82 select COALESCE(MAX(input(scan(memname, 2, '_'), yymmdd8.)), 0)
83 into :mvTable_MaxDate
84 from dictionary.tables
85 where libname = 'WORK';
86 quit;
87
88 %put &mvTable_MaxDate;
20388
89 %put %sysfunc(putn(&mvTable_MaxDate,yymmddn8));
20151027
As a side-comment, often life becomes much easier if you can just combine all your data into one dataset, and store the dataset name date suffix as a variable.
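On the second part of the question (checking for an empty result), here is a minimal sketch built on the COALESCE() call in the log above. With COALESCE, the macro variable holds 0 when no date could be extracted, so the check is a plain comparison. The macro name report_max_date is made up for illustration, and the sketch assumes no data set is genuinely dated 19600101 (SAS date value 0).

```sas
/* Sketch only: report_max_date is a hypothetical macro name.                */
/* WORK and the A_ prefix come from the example log above.                   */
%macro report_max_date;
  %local mvTable_MaxDate;

  proc sql noprint;
    select coalesce(max(input(scan(memname, 2, '_'), yymmdd8.)), 0)
      into :mvTable_MaxDate
      from dictionary.tables
      where libname = 'WORK';
  quit;

  /* COALESCE turned "no date found" into 0, so 0 means nothing matched */
  %if &mvTable_MaxDate = 0 %then
    %put NOTE: No data sets with a date suffix were found.;
  %else
    %put NOTE: Latest data set is A_%sysfunc(putn(&mvTable_MaxDate, yymmddn8.));
%mend report_max_date;

%report_max_date
```

If the library also holds data sets whose names contain an underscore but no date, it is probably worth adding a condition on memname to the where clause so that INPUT() only sees the names you expect.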

Related

How to create a datetime macro variable in SAS

%let mydate = "01JUN2021 00:00:00.000"dt;
This does not work. How do I create a datetime macro variable without using proc sql or data step?
The pure macro solution is:
%let mydate = %sysfunc(dhms(%sysfunc(mdy(6,1,2021)), 0, 0, 0));
%put &=mydate; * PRINTS THE UNFORMATTED VALUE STORED;
%put %sysfunc(sum(&mydate), datetime22.); * PRINTS THE DATETIME VALUE FORMATTED;
Output:
MYDATE=1938124800
01JUN2021:00:00:00
You can of course perform the dhms() and mdy() functions on separate lines if that is clearer for you.
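For example, the same calculation split into two steps might look like this. The intermediate name mydate_d is just an illustration, not something from the question.

```sas
/* Step 1: SAS date value (days since 01JAN1960) */
%let mydate_d = %sysfunc(mdy(6, 1, 2021));

/* Step 2: SAS datetime value (seconds since 01JAN1960) at midnight of that date */
%let mydate = %sysfunc(dhms(&mydate_d, 0, 0, 0));

%put &=mydate_d &=mydate;
```

The date step gives 22432, and 22432 * 86400 is the 1938124800 shown in the output above.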
Compare this to what your original code is doing:
%let mydate="01jun2021:00:00:00"dt;
%put &=mydate;
Prints:
MYDATE="01jun2021:00:00:00"dt
Notice how in your approach the string "01jun2021:00:00:00"dt has been saved into the macro variable, rather than the actual numeric datetime value 1938124800. Sometimes with this approach SAS is unable to translate the literal to a numeric datetime value at the point where you use it.
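If you would rather keep writing the value as a datetime literal but still store the number, one option is to let the macro processor evaluate the literal with %sysevalf. This is a sketch of an alternative, not what the question's code does.

```sas
/* %sysevalf evaluates the datetime literal and returns its numeric value */
%let mydate = %sysevalf("01JUN2021:00:00:00"dt);

%put &=mydate;                                /* unformatted number */
%put %sysfunc(putn(&mydate, datetime22.));    /* same value, formatted */
```

The stored value should match the 1938124800 produced by the dhms()/mdy() version.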
Try %let mydate = '1Jan2021:0:0:1'dt;
Note that it uses single quotes and there's no space between the date and the time.
Your posted macro variable works fine in SAS code.
82 %let mydate = "01JUN2021 00:00:00.000"dt;
83
84 data test;
85 now = datetime();
86 then = &mydate;
87 diff = intck('dtday',then ,now);
88 format now then datetime20. ;
89 put (_all_) (=);
90 run;
now=16JUN2021:08:18:33 then=01JUN2021:00:00:00 diff=15
NOTE: The data set WORK.TEST has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
If you need to use the value in pass through SQL code then you will need to set the macro variable to text that the remote database's implementation of SQL will recognize as a datetime value.
So perhaps
%let myts = timestamp '2021-06-01 00:00:00.000';

Produce custom table in SAS with a subsetted data set

I want to use SAS, e.g. proc report, to produce a custom table within my workflow.
Why: Previously, I used proc export (dbms=excel), did some very basic stats by hand, and copied and pasted them into an Excel sheet to complete the report. Recently, I've started to use ODS Excel to print all the relevant data to Excel sheets, but since ODS Excel always overwrites the whole workbook (and hence also the handcrafted stats), I now want to streamline the process.
The task itself is actually very straightforward. We have some information about IDs, age, and registration, so something like this:
data test;
input ID $ AGE CENTER $;
datalines;
111 23 A
. 27 B
311 40 C
131 18 A
. 64 A
;
run;
The goal is to produce a table report which should look like this structure-wise:
                  ID    NO-ID   Total
Count             3     2       5
Age (mean)        27    45.5    34.4
Count by Center:
A                 2     1       3
B                 0     1       1
C                 1     0       1
It seems, proc report only takes variables as columns but not a subsetted data set (ID NE .; ID =''). Of course I could just produce three reports with three subsetted data sets and print them all separately but I hope there is a way to put this in one table.
Is proc report the right tool for this and if so how should I proceed? Or is it better to use proc tabulate or proc template or...?
I found a way to get very close to what I wanted. First of all, I had to introduce a new variable vID (valid ID: 0 = not valid, 1 = valid) in the data set, like so:
data test;
input ID $ AGE CENTER $;
if ID = '' then vID = 0;
else vID = 1;
datalines;
111 23 A
. 27 B
311 40 C
131 18 A
. 64 A
;
run;
After this I was able to use proc tabulate, as suggested by @Reeza in the comments, to build a table which pretty much resembles what I initially aimed for:
proc tabulate data = test;
class vID Center;
var age;
keylabel N = 'Count';
table N age*mean Center*N, vID ALL;
run;
Still, I wonder if there is a way without introducing the new variable at all and just use the SAS counters for missing and non-missing observations.
UPDATE:
@Reeza pointed out that proc format can be used to assign a label to missing/non-missing ID values. In combination with the missing option in proc tabulate (which keeps missing values as a class level), this delivers the output without introducing a new variable:
proc format;
value $ id_fmt
' ' = 'No-ID'
other = 'ID'
;
run;
proc tabulate data = test missing;
format ID $id_fmt.;
class ID Center;
var age;
keylabel N = 'Count';
table N age*(mean median) Center*N, (ID=' ') ALL;
run;
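As a quick cross-check of the ID / No-ID columns, the counts and mean ages can also be computed with proc sql. This is only a sketch for verifying the numbers, not a replacement for the tabulate report; id_group, n_obs and mean_age are names made up here, and the data step simply repeats the test data from the question.

```sas
data test;
  input ID $ AGE CENTER $;
  datalines;
111 23 A
. 27 B
311 40 C
131 18 A
. 64 A
;
run;

proc sql;
  /* Group on the fly by whether ID is missing; no extra variable is stored */
  select case when missing(ID) then 'No-ID' else 'ID' end as id_group,
         count(*)  as n_obs,
         mean(AGE) as mean_age
    from test
    group by calculated id_group;
quit;
```

For this data it should agree with the target table in the question: 3 observations with mean age 27 for ID, and 2 observations with mean age 45.5 for No-ID.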

Formats export with variable names in SAS

I created a format for a variable as follows
proc format;
value now 0=M
1=F
;
run;
and now I apply this to a dataset.
Data X;
set X2;
format Var1 now.;
run;
and I want to export this format using cntlout
proc format library=work cntlout=form; run;
This gives me the list of formats in the library catalog, but it does not give me the name of the variable each format is attached to.
How can I create a dataset with list of formats and the attached variables to it?
So I can see which format is linked to what variable.
If you just want to look up the variables in a specific dataset, PROC CONTENTS is often faster than using SASHELP.VCOLUMN or DICTIONARY.COLUMNS, particularly when there are lots of libraries/datasets defined.
57 proc contents data=x out=myvars(keep=name format) noprint;
58 run;
NOTE: The data set WORK.MYVARS has 1 observations and 2 variables.
59
60 data _null_;
61 set myvars;
62 put _all_;
63 run;
NAME=Var1 FORMAT=NOW _ERROR_=0 _N_=1
NOTE: There were 1 observations read from the data set WORK.MYVARS.
Assuming you want this for a specific library you can use the SASHELP.VCOLUMN dataset. This dataset contains the formats for all variables and you can filter it as desired.
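A minimal sketch of that query, assuming the library of interest is WORK (as in the question) and that you only care about variables with a format attached. The output table name var_formats is made up for illustration.

```sas
proc sql;
  create table var_formats as
  select memname,   /* data set name            */
         name,      /* variable name            */
         format     /* format attached, if any  */
    from sashelp.vcolumn
    where libname = 'WORK'     /* library names are stored in uppercase */
      and format ne ''         /* keep only variables with a format     */
    order by memname, name;
quit;
```

You can then join this to the cntlout= data set on the format name if you also want the value/label pairs next to each variable.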

What's the easiest way to get SAS to do this?

I have a dataset that looks like this but with many, many more variable pairs:
Stuff2016 Stuff2008 Earth2016 Earth2008 Fire2016 Fire2008
123456 5646743 45 456 456 890101
541351 543534534 45 489 489 74456
352352 564889 98 489489 1231 189
464646 542235423 13 15615 1561 78
987654 4561889 44 1212 12121 111
For each pair of almost identically named variables,
I want SAS to subtract 2016 data - 2008 data without typing the variable names.
What's the easiest way to tell SAS to do this without having to specifically type the variable names? Is there a way to tell it to subtract every other variable minus the one that precedes it without mentioning the specific variable names?
Thanks a lot!!!!
I would probably recommend three arrays but you could do it with one. This highly depends on the order of the variables which isn't a good assumption in my book. Also, how would you name the results automatically?
data want;
set have;
array vars(*) stuff2016--fire2008;
array diffs(*) diffs1-diffs20; *something big enough to hold the differences;
do i=1 to dim(vars) by 2; *step through the pairs: 2016 variable first, then 2008;
diffs((i+1)/2) = vars(i)-vars(i+1);
end;
drop i;
run;
Instead, I'd highly suggest you use the dictionary tables to query your variable names and dynamically generate your variable lists which are then passed onto three different arrays, one for 2016, one for 2008 and one for the difference. The libname and memname are stored in uppercase in the Dictionary table so keep that in mind.
data have;
input Stuff2016 Stuff2008 Earth2016 Earth2008 Fire2016 Fire2008;
cards;
123456 5646743 45 456 456 890101
541351 543534534 45 489 489 74456
352352 564889 98 489489 1231 189
464646 542235423 13 15615 1561 78
987654 4561889 44 1212 12121 111
;
run;
proc sql;
select name into :var2016 separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2016'
order by name;
select name into :var2008 separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2008'
order by name;
select catx("_", compress(name, ,'d'), "diff") into :vardiff separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2016'
order by name;
quit;
%put &var2016.;
%put &var2008.;
%put &vardiff.;
data want;
set have;
array v2016(*) &var2016;
array v2008(*) &var2008;
array diffs(*) &vardiff;
do i=1 to dim(v2016);
diffs(i)=v2016(i)-v2008(i);
end;
run;
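Because the three arrays are matched purely by position, it may be worth checking that the two lists have the same number of names before running the data step. A small sketch, assuming the macro variables var2016 and var2008 created above are both non-empty; check_pairs is a made-up macro name.

```sas
%macro check_pairs;
  %local n2016 n2008;

  /* Count the space-separated names in each list */
  %let n2016 = %sysfunc(countw(&var2016, %str( )));
  %let n2008 = %sysfunc(countw(&var2008, %str( )));

  %if &n2016 ne &n2008 %then
    %put WARNING: &n2016 variables end in 2016 but &n2008 end in 2008.;
  %else
    %put NOTE: &n2016 variable pairs found.;
%mend check_pairs;

%check_pairs
```

Equal counts do not prove that every name has a partner, but a mismatch is a cheap early signal that the arrays would be misaligned.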

proc contents is truncating the values in the output dataset. How to get full values in the output dataset?

I'm trying to get all the data set names in a library into a data set.
proc datasets library=LIB1 memtype=data ;
contents data=_all_ noprint out=Datasets_in_Lib1(keep=memname) ;
run;
The final data set (Datasets_in_Lib1) has all the data set names that are in LIB1, but the names are truncated to 6 characters. Is there any way to get the full names of the data sets without truncation?
Ex: If dataset name is x123456789, the Datasets_in_Lib1 will have x12345 only.
Thanks in advance,
Sam.
Agree with the comment, in 9.3 memname is $32. I don't think there was a version of SAS where data set names were limited to 6 characters. They went from 8 characters to 32 characters in v7 (I think).
Here's a log from running your code, showing it works as you want in 9.3
51 data work.x123456789;
52 x=1;
53 run;
NOTE: The data set WORK.X123456789 has 1 observations and 1 variables.
54
55 proc datasets library=work memtype=data nolist;
56 contents data=_all_ noprint out=Datasets_in_Lib1(keep=memname) ;
57 run;
NOTE: The data set WORK.DATASETS_IN_LIB1 has 1 observations and 1 variables.
58
59 data _null_;
60 set Datasets_in_Lib1;
61 put _all_;
62 run;
MEMNAME=X123456789 _ERROR_=0 _N_=1
NOTE: There were 1 observations read from the data set WORK.DATASETS_IN_LIB1.
You can also query the sashelp.vtable to obtain the list of datasets:
proc sql;
create table mem_list as
select memname
from sashelp.vtable
where libname='LIB1' and memtype='DATA';
quit;
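Inside proc sql the same information is available directly from dictionary.tables, which sashelp.vtable is a view of. A sketch, assuming a library named LIB1 is already assigned as in the question; the memname column there is $32, so the names are not truncated.

```sas
proc sql;
  create table mem_list as
  select memname
    from dictionary.tables
    where libname = 'LIB1'      /* library names are stored in uppercase */
      and memtype = 'DATA';     /* data sets only, no views              */
quit;
```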