How to create a macro variable from a list of dates in a SAS table

I have a work table named WORK.WEEK_YEAR_FESTIVITA with two records, with dates 08dec2022 and 09dec2022 in the field "HolidayDate", and I would like to convert this list into a macro variable like "Elenco_Date = '08dec2022'd,'09dec2022'd", and so on for any other dates in the table WORK.WEEK_YEAR_FESTIVITA. I tried this PROC SQL:
proc sql noprint;
select distinct HolidayDate into : Elenco_Date separated by "'d,"
from WORK.WEEK_YEAR_FESTIVITA;
quit;
%put &Elenco_Date;
but the result is:
ELENCO_DATE = 08DEC2022'd,09DEC2022;
and not
ELENCO_DATE = '08DEC2022'd,'09DEC2022'd;
as desired
Do you have suggestions?
Thanks

Create the date literal strings in the column list part of the SELECT statement.
select distinct quote(put(HolidayDate,date9.))||'d'
into :Elenco_Date separated by ','
from WORK.WEEK_YEAR_FESTIVITA
;
Or you could just store the raw number of days into the macro variable.
select distinct HolidayDate format=6.
into :Elenco_Date separated by ','
from WORK.WEEK_YEAR_FESTIVITA
;
That works because there is no difference between using
where date = "08DEC2022"d
and
where date = 22987
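The first approach can be tried end to end with a small sketch. The sample data step below is a hypothetical stand-in for WORK.WEEK_YEAR_FESTIVITA, and the dataset name work.check is an assumption for illustration only. This variant builds the literal with CATS and single quotes instead of QUOTE(), which wraps in double quotes by default; both forms are valid date literals.

```sas
/* Hypothetical sample data standing in for WORK.WEEK_YEAR_FESTIVITA */
data work.week_year_festivita;
  input HolidayDate :date9.;
  format HolidayDate date9.;
datalines;
08DEC2022
09DEC2022
;
run;

proc sql noprint;
  /* Build each date literal as text: quote, DATE9 value, quote, d */
  select distinct cats("'", put(HolidayDate, date9.), "'d")
    into :Elenco_Date separated by ','
    from work.week_year_festivita;
quit;

/* Write the resolved list to the log for inspection */
%put &=Elenco_Date;

/* Use the list wherever a comma-separated list of date values is valid */
data work.check;
  set work.week_year_festivita;
  where HolidayDate in (&Elenco_Date);
run;
```

The %PUT line should show the list in the '08DEC2022'd,'09DEC2022'd form requested in the question.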

Related

Macro variable (date) not working as expected in query

I have several SAS (PROC SQL) queries that use MIN(startdate) and MAX(enddate).
To avoid having to calculate these every time, I want to do this once at the beginning and store the results in macro variables, but I get an error every time.
What is going wrong, or how can I achieve this?
Thanks in advance for the help!
This works:
WHERE DATE BETWEEN
(SELECT MIN(startdate) format=yymmddn8. FROM work.mydata)
AND (SELECT MAX(enddate) format=yymmddn8. FROM work.mydata)
The format of DATE is YYMMDDN8. and its length is 8.
Creating macro variables:
PROC SQL;
SELECT MIN(startdate), MAX(enddate)
INTO :start_date, :end_date
FROM work.mydata;
QUIT;
/*Formatting the macro variable:*/
%macro format(value,format);
%if %datatyp(&value)=CHAR
%THEN %SYSFUNC(PUTC(&value, &format));
%ELSE %LEFT(%QSYSFUNC(PUTN(&value,&format)));
%MEND format;
Tried:
WHERE DATE BETWEEN "%format(&start_date, yymmddn8.)" AND "%format(&end_date, yymmddn8.)"
Error message:
ERROR: Expression using equals (=) has components that are of different data types
First, you are missing the d suffix when providing the dates for the BETWEEN operator.
WHERE DATE BETWEEN "%format(&start_date, yymmddn8.)"d AND "%format(&end_date, yymmddn8.)"d
But keep in mind that the string in a date literal must be in date9. format, for example:
"4NOV2022"d
Second, you don't need to format the date for this WHERE condition. A date is numeric, and the numeric value would work fine.
WHERE DATE BETWEEN &start_date AND &end_date
If you really want to have the date formatted, you can format it directly inside PROC SQL:
PROC SQL;
SELECT
MIN(startdate) format=date9.,
MAX(enddate) format=date9.
INTO
:start_date,
:end_date
FROM
work.mydata;
QUIT;
and then
WHERE DATE BETWEEN "&start_date"d AND "&end_date"d
Note that in a PROC SQL query the format attached to a variable does not carry over to the result of aggregate functions, like MIN() and MAX(), performed on that variable. For numeric values PROC SQL will use the BEST8. format when converting the number into a string to store in the macro variable. You can remove the extra spaces this causes by adding the TRIMMED keyword.
proc sql noprint;
select min(startdate), max(enddate)
into :start_date trimmed
, :end_date trimmed
from work.mydata
;
quit;
Do not add quotes around the values generated by expanding the macro variables. That would generate a string literal and not a numeric literal.
where date between &start_date and &end_date
If you want the values put into the macro variables by the into syntax to be formatted in some other way you need to attach the format as part of the query.
For example if you wanted the value to be something that could be used to generate a date literal, that is a string that the DATE informat understands, then use the DATE format. Make sure the width used is long enough to include all four digits of the year.
proc sql noprint;
select min(startdate) format=date9.
, max(enddate) format=date9.
into :start_date trimmed
, :end_date trimmed
from work.mydata
;
quit;
...
where date between "&start_date"d and "&end_date"d
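To see what the macro variables actually hold, a minimal sketch along these lines can help. The sample rows are hypothetical stand-ins for work.mydata; only the variable names startdate and enddate come from the question.

```sas
/* Hypothetical sample data standing in for work.mydata */
data work.mydata;
  input startdate :yymmdd8. enddate :yymmdd8.;
  format startdate enddate yymmddn8.;
datalines;
20221104 20221110
20221101 20221130
;
run;

proc sql noprint;
  /* No format attached: the macro variables receive raw day counts */
  select min(startdate), max(enddate)
    into :start_date trimmed, :end_date trimmed
    from work.mydata;
quit;

/* Raw numbers, usable directly as numeric literals in a WHERE clause */
%put &=start_date &=end_date;

/* The same values rendered as DATE9. text, for reading in the log only */
%put %sysfunc(putn(&start_date,date9.)) %sysfunc(putn(&end_date,date9.));
```

Keeping the raw numbers in the macro variables and formatting them only for display avoids having to remember which macro variable needs quotes and a d suffix.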

SAS - Keep only columns listed in a separate dataset

I have two datasets. The first, big_dataset, has around 3000 columns, most of which are never used. The second, column_list, contains a single column called column_name with around 100 values. Each value is the name of a column I want to keep.
I want to filter big_dataset so that only columns in column_list are kept, and the rest are discarded.
If I were using Pandas dataframes in Python, this would be a trivial task:
cols = column_list['column_name'].tolist()
smaller_dataset = big_dataset[cols]
However, I can't figure out the SAS equivalent. Proc Transpose doesn't let me turn the rows into headers. I can't figure out a statement in the data step that would let this work, and as far as I'm aware this isn't something that Proc SQL could handle. I've read through the docs on Proc Datasets and that doesn't seem to have what I need either.
To obtain a list of columns from column_list to use against big_dataset, you can query the column_list table and put the result into a macro variable. This can be achieved with PROC SQL and the SEPARATED BY clause:
proc sql noprint;
select column_name
into :cols separated by ','
from column_list;
create table SMALLER_DATASET AS
select &cols.
from WORK.BIG_DATASET;
quit;
Alternatively you may use SEPARATED BY ' ' and then use the resulting list in a KEEP statement or dataset option:
proc sql noprint;
select column_name
into :cols separated by ' '
from column_list;
quit;
data small_dataset;
set big_dataset (keep=&cols.);
/* or keep=&cols.; */
run;

appending text to all columns at once in sas

I have tables with columns named col1 to col10.
I would like to add a prefix to every column name, so that they become italy_col1 to italy_col10.
How can I achieve this without a macro?
Since I am joining multiple tables, I want to add the text "Italy" to all columns in table 1 and "USA" to all columns in table 2. I tried the example below, but it doesn't suit my requirement:
https://support.sas.com/kb/48/674.html
The cats function appends to all the values in the columns of the tables. Any suggestions?
One way is to generate macro variables and then use those in your code.
First get lists of variables to rename from TABLE1 and TABLE2.
proc sql noprint;
select catx('=',name,cats('Italy_',name)) into :rename1 separated by ' '
from dictionary.columns
where libname="WORK" and memname="TABLE1" and upcase(name) ne 'ID'
;
select catx('=',name,cats('USA_',name)) into :rename2 separated by ' '
from dictionary.columns
where libname="WORK" and memname="TABLE2" and upcase(name) ne 'ID'
;
quit;
Then use the list of rename pairs in your code that merges the datasets.
data want;
merge table1(rename=(&rename1)) table2(rename=(&rename2));
by id;
run;
Note this will only work when the number of variables to rename is small enough to fit into a single macro variable. If the list is longer just use another method, such as a data step, to generate the same code.
Also watch out for variable names that are too long. SAS has a limit of 32 bytes for variable names, so adding 4 or 6 extra characters might result in names that are too long. You might just truncate to 32 characters, but then you risk forming duplicate names.
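One possible way to generate the code with a data step, when the rename list does not fit in a macro variable, is to write the statements to a temporary file and %INCLUDE it. This is a sketch of a different technique from the merge above: it renames the variables of TABLE1 in place with PROC DATASETS. The fileref name code and the sample TABLE1 are assumptions for illustration.

```sas
/* Hypothetical stand-in for TABLE1 */
data work.table1;
  id = 1; col1 = 10; col2 = 20;
run;

/* Temporary file that will hold the generated code (fileref name is arbitrary) */
filename code temp;

data _null_;
  /* SASHELP.VCOLUMN is the view over DICTIONARY.COLUMNS */
  set sashelp.vcolumn end=last;
  where libname = 'WORK' and memname = 'TABLE1' and upcase(name) ne 'ID';
  file code;
  /* Open the RENAME statement once, on the first selected variable */
  if _n_ = 1 then put 'proc datasets lib=work nolist; modify table1; rename';
  /* One old=new pair per variable */
  put '  ' name '=Italy_' name;
  /* Close the statement and the procedure after the last variable */
  if last then put '; run; quit;';
run;

/* Run the generated code; SOURCE2 echoes it to the log */
%include code / source2;
```

Because the code is written to a file, its length is not bound by the maximum length of a macro variable. The 32-byte limit on variable names still applies to the generated names.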

Refer to column name using variable in PROC SQL

I'm new to SAS so please bear with me.
I have monthly data for the trailing 7 months. It goes through a PROC TRANSPOSE such that the resulting table has columns named FEB2015, MAR2015,...,AUG2015. These columns will change each month I rerun my program so that the earliest month will go from Feb to Mar, etc. in successive months of reruns. I want to be able to reference this "earliest month" later on in the program. For example, I'd like to run a PROC SQL that returns rows that have no values in the FEB2015 column but a value under 1000 in the AUG2015 Column and I'd like to do this based on the fact that the columns are named after last month, and the month 7 months ago.
Here's an example of code I'd be trying to run. Assume the table has columns row_ID, FEB2015, MAR2015, APR2015, MAY2015, JUN2015, JUL2015, AUG2015, all integers.
%let first = put(intnx('month',today(),-7,'begin'), MONYY7.);
%let second = put(intnx('month',today(),-6,'begin'), MONYY7.);
%let last = put(intnx('month',today(),-1,'begin'), MONYY7.);
PROC SQL noprint;
SELECT row_id, &first, &second, &last
FROM mytable
WHERE &first is missing
and &second is not missing
and &last < 1000;
QUIT;
I think the values in the macro variables are just being read as strings and are not being recognized as the name of the column. I've tried wrapping them in NLITERAL() but haven't had any luck.
Thanks!
Of course the macro variable values are strings. Macro variable values are ALWAYS strings. The problem is that your strings are not valid names of variables. Variable names cannot have parentheses or quotes in them. If you want to call a function in macro code you need to nest the call inside of the %SYSFUNC() macro function.
%let first = %sysfunc(intnx(month,%sysfunc(today()),-7,begin), MONYY7.);
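The same pattern can be applied to the other two macro variables from the question. The sketch below assumes the table mytable and its month-named columns exist as described there; NOPRINT is dropped so that the query actually prints its result.

```sas
/* Column names such as FEB2015, built from today's date at macro time */
%let first  = %sysfunc(intnx(month,%sysfunc(today()),-7,begin), MONYY7.);
%let second = %sysfunc(intnx(month,%sysfunc(today()),-6,begin), MONYY7.);
%let last   = %sysfunc(intnx(month,%sysfunc(today()),-1,begin), MONYY7.);

/* Check the generated names in the log before using them */
%put &=first &=second &=last;

proc sql;
  /* The macro variables now resolve to plain column names */
  select row_id, &first, &second, &last
    from mytable
    where &first is missing
      and &second is not missing
      and &last < 1000;
quit;
```

Note that inside %SYSFUNC the arguments month and begin are written without quotes, since macro code treats everything as text already.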

Delete the last 3 letters set sas variables

I would like to delete the last 3 letters from a set of variables in a SAS dataset.
(I have to concatenate 2 datasets.)
And in the first set, the variables are named for example:
abc ,
def ,
ghi ...
While in the 2nd set they are named:
abc_1A ,
def_1A ,
ghi_1A ...
How can I delete the '_1A' from the 100+ variables? I don't want to simply add '_1A' to my first Dataset
Thanks
There are a few options. Here is one 'safe' option and one 'unsafe' option.
Safe option:
* Write a macro to rename a variable to its name without the _1A.;
* Arguments: var = variable name, len = length desired of final variable name.;
%macro rename_shorter(var=,len=);
&var. = %substr(&var.,1,&len.)
%mend rename_shorter;
* Create a list of calls from dictionary.columns;
proc sql;
select cats('%rename_shorter(var=',name,',len=',length(name)-3,')')
into :renamelist separated by ' '
from dictionary.columns
where libname='SASHELP' and upcase(memname)='CLASS';
quit;
* Call that list;
data want;
set sashelp.class(rename=(&renamelist.));
run;
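Applied to the names from the question, the safe option might look like the sketch below. The dataset work.have_namedwrong and its values are hypothetical, and the PRXMATCH filter is an addition that restricts the rename to variables whose names end in _1A, so that other variables are left alone.

```sas
/* Hypothetical dataset with the suffixed variable names from the question */
data work.have_namedwrong;
  abc_1A = 1; def_1A = 2; ghi_1A = 3;
run;

/* Emits one old=new pair, keeping the first &len characters of the name */
%macro rename_shorter(var=,len=);
  &var. = %substr(&var.,1,&len.)
%mend rename_shorter;

proc sql noprint;
  /* Single quotes keep the macro call as text until the list is used */
  select cats('%rename_shorter(var=',name,',len=',length(name)-3,')')
    into :renamelist separated by ' '
    from dictionary.columns
    where libname = 'WORK'
      and upcase(memname) = 'HAVE_NAMEDWRONG'
      and prxmatch('/_1A\s*$/i', name) > 0;
quit;

/* The macro calls execute here and expand to abc_1A = abc and so on */
data work.want;
  set work.have_namedwrong(rename=(&renamelist.));
run;
```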
An unsafe option, in the sense that it doesn't check that things are aligned correctly, is
proc sql;
create table want as
select * from have_namedright H
outer union corr
select * from have_namedwrong W
;
quit;