I have created a numeric variable using the Prompt Manager in EG.
This variable is called HYr for the highest year of data that I am pulling.
When running the program I create 4 new variables based on the highest year and this is where I am having issues.
I have the following:
%Let Yr2 = &HYr. - 1;
%Let Yr3 = "&HYr." - 2;
%Let Yr4 = &HYr. - 3;
%Let Yr5 = '&HYr.' - 4;
I am trying to subtract the value from the year and the new variable will be used in determining date ranges that are being pulled. I am trying several things and learning in the process but I am still stuck.
I know it is probably just a simple syntax issue, and given enough time I will probably work it out, but no one in my office has better SAS skills than I do, and mine are limited.
Thanks in advance.
Use %EVAL() to do calculations with integers and macro variables.
%let HYR = 2018;
%Let Yr2 = %eval(&HYr. - 1);
%Let Yr5 = %eval(&HYr. - 4);
%put HYR: &hyr;
%put YR2: &yr2.;
%put YR5: &yr5.;
EDIT: If you were trying to do other calculations that included decimals you would need to use %SYSEVALF instead.
%let HYR = 2018;
%Let Yr2 = %sysevalf(&HYr. - 0.1);
%Let Yr5 = %sysevalf(&HYr. - 0.4);
%put HYR: &hyr;
%put YR2: &yr2.;
%put YR5: &yr5.;
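To see why the original %LET statements did not subtract anything: the macro processor stores the text to the right of the equals sign as-is, double quotes become part of the value, and single quotes block macro variable resolution. A minimal sketch, assuming HYr is 2018 as in the answer above (the names raw, dq, sq and fixed are illustrative only):

```sas
%let HYr = 2018;

/* A plain %LET stores text; no arithmetic is performed */
%let raw = &HYr. - 1;
%put raw: &raw;        /* value is the text 2018 - 1 */

/* Double quotes are kept as part of the value */
%let dq = "&HYr." - 2;
%put dq: &dq;          /* value is the text "2018" - 2 */

/* Single quotes prevent the macro variable from resolving */
%let sq = '&HYr.' - 4;
%put sq: &sq;          /* the reference stays unresolved inside the quotes */

/* %EVAL forces integer arithmetic on the resolved text */
%let fixed = %eval(&HYr. - 1);
%put fixed: &fixed;    /* 2017 */
```

Running this and reading the log makes it easy to compare the four forms side by side.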
I am working on a macro for regressions using the following code:
%Macro Regression;
%let index = 1;
%do %until (%Scan(&Var2,&index," ")=);
%let Ind = %Scan(&Var2,&index," ");
ods output SelectionSummary = SelectionSummary;
proc reg data = Regression2 plots = none;
model &Ind = &var / selection = stepwise maxstep=1;
output out = summary R = RSQUARE;
run;
quit;
%if &index = 1 %then %do;
data final;
set selectionsummary;
run;
%end;
%else %do;
data final;
set final selectionsummary;
run;
%end;
%let index = %eval(&Index + 1);
%end;
%mend;
%Regression;
This code works and gives me a table which highlights, for each dependent variable, the independent variable that explains the most variation in it.
I'm looking for a way to run this so that the regression gives me the three best independent variables for explaining the dependent variable, each as if it had been chosen as the first variable. For example:
models chosen:
GDP = Human Capital
GDP = Working Capital
GDP = Growth
DependentVar Ind1 Ind2 Ind3 Rsq1 Rsq2 Rsq3
GDP human capital working capital growth 0.76 0.75 0.69
or
DependentVar Independent1 Rsq
GDP human capital 0.76
GDP working capital 0.75
GDP growth 0.69
EDIT:
It would be an absolute bonus if there is a way to put stepwise maxstep = 3 and have the best three independent variable combinations for each dependent variable with the condition that the first independent variable is unique.
TIA.
Try the STOP=3 option on your MODEL statement. It will fit the best model with up to three variables. However, it does not work with the stepwise option, but it will work with the R-square-based selection methods such as MAXR, used here:
model &Ind = &var / selection = maxR stop=3;
If you only want to consider three-variable models, include start=3 as well.
model &Ind = &var / selection = maxR stop=3 start=3;
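For the other layout requested in the question (the three best single-variable models per dependent variable), one possible variant is SELECTION=RSQUARE limited to one-variable models, with BEST=3 to keep the top three. This is a sketch, not the answerer's method: it uses sashelp.class as stand-in data, and the ODS table name SubsetSelSummary is quoted from memory of the PROC REG documentation, so confirm it with ODS TRACE ON before relying on it.

```sas
/* Assumption: SubsetSelSummary is the ODS table for RSQUARE selection; */
/* verify with "ods trace on;" in your SAS release.                     */
ods output SubsetSelSummary = top3;

proc reg data = sashelp.class plots = none;
    /* START=1 STOP=1 restricts the search to one-variable models; */
    /* BEST=3 reports at most the three with the highest R-square. */
    model weight = age height / selection = rsquare start = 1 stop = 1 best = 3;
run;
quit;
```

Inside the %Regression macro, weight and age height would be replaced by &Ind and &var, and the appended table would hold up to three rows per dependent variable.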
When developing ETL programs with SAS Data Integration (DI) Studio, each transformation you specify has a neat user interface for specifying the columns of the datasets (tables) you create, along with their type, length and format.
When the existing transformations cannot do the job and you need user-written code, you should still do as much as possible with the point-and-click interface, because:
it enables impact analysis and reverse impact analysis
your successors will probably not dive into the Base SAS code unless they really need to
if they ever need to change the columns or their format, they will expect that changing the specifications in the user interface does the job
Fortunately, SAS discloses the format of the output to the programmer of a user-written transformation in macro variables.
So I often write my code inside the user-written transformation as
%macro format_output(data);
%do _col = 0 %to &&&data._col_count - 1;
length &&&data._col&_col._name &&&data._col&_col._type&&&data._col&_col._length;
%if %length(&&&data._col&_col._format) %then %do;
format &&&data._col&_col._name &&&data._col&_col._format;
%end;
%end;
%mend;
data &_OUTPUT1 (keep=&_OUTPUT1_keep);
%format_output(_OUTPUT1);
set &_INPUT1;
* actual code ;
run;
Does SAS supply a built-in macro or something similar to do the same thing?
For completeness: the way SAS discloses the structure of the output looks like
%let _OUTPUT1 = myLib.myTable;
%let _OUTPUT1_connect = ;
%let _OUTPUT1_engine = ;
%let _OUTPUT1_memtype = DATA;
%let _OUTPUT1_options = %nrquote();
%let _OUTPUT1_alter = %nrquote();
%let _OUTPUT1_path = %nrquote(/myTable_A5E5JYT6.C7000MOL%(WorkTable%));
%let _OUTPUT1_type = 1;
%let _OUTPUT1_label = %nrquote();
/* List of target columns to keep */
%let _OUTPUT1_keep = myID first_item <16 more items> last_item;
%let _OUTPUT1_col_count = 19;
%let _OUTPUT1_col0_name = myID;
%let _OUTPUT1_col0_table = myLib.myTable;
%let _OUTPUT1_col0_length = 8;
%let _OUTPUT1_col0_type = ;
%let _OUTPUT1_col0_format = 13.;
%let _OUTPUT1_col0_informat = 13.;
%let _OUTPUT1_col0_label = %nrquote(my ID);
%let _OUTPUT1_col0_input0 = myID;
%let _OUTPUT1_col0_exp = ;
%let _OUTPUT1_col0_input = myID;
%let _OUTPUT1_col0_input_count = 1;
%let _OUTPUT1_col1_name = first_item;
%let _OUTPUT1_col1_table = myLib.myTable;
%let _OUTPUT1_col1_length = 8;
%let _OUTPUT1_col1_type = $;
%let _OUTPUT1_col1_format = $CHAR8.;
%let _OUTPUT1_col1_informat = $CHAR8.;
%let _OUTPUT1_col1_label = %nrquote(first data item in my table);
%let _OUTPUT1_col1_input0 = first_item;
%let _OUTPUT1_col1_exp = ;
%let _OUTPUT1_col1_input = first_item;
%let _OUTPUT1_col1_input_count = 1;
<documentation about 16 more columns>
%let _OUTPUT1_col18_name = last_item;
%let _OUTPUT1_col18_table = myLib.myTable;
%let _OUTPUT1_col18_length = 16;
%let _OUTPUT1_col18_type = $;
%let _OUTPUT1_col18_format = $16.;
%let _OUTPUT1_col18_informat = $16.;
%let _OUTPUT1_col18_label = %nrquote(last data item in my table);
%let _OUTPUT1_col18_input0 = last_item;
%let _OUTPUT1_col18_exp = ;
%let _OUTPUT1_col18_input = last_item;
%let _OUTPUT1_col18_input_count = 1;
%let _OUTPUT1_filetype = WorkTable;
In SAS DI Studio, a user-written node can have more than one output table, each visible and accessible from the node icon. By right-clicking a table you can view and change its details (table name, column type, length, format, informat, etc.). Please note I am using version 4.6.
In my screenshot, my user-written node already has three output tables. To add a fourth one:
Right-click on the node
Select "Add Work Table"
A new window will open for you to enter the table details
In the code DI auto-generates for your node, SAS assigns _OUTPUT1, _OUTPUT2, _OUTPUT3 and _OUTPUT4 to your tables, so in your code you can reference them through the corresponding macro variables (&_OUTPUT1, &_OUTPUT2, and so on).
Once you have added your tables, you can see and edit all the output table details (length, format, informat) by selecting their properties from the visual interface (point and click), so you won't need a macro for that.
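As a rough sketch of how user-written code might write to two of those outputs at once: the %LET statements below only simulate what DI Studio would generate, so that the step runs outside DI; the table names and the split on the sex column of sashelp.class are purely illustrative.

```sas
/* Simulated DI-generated macro variables (assumption: inside DI Studio */
/* these are set for you and must not be redefined in your code).       */
%let _INPUT1  = sashelp.class;
%let _OUTPUT1 = work.out_m;
%let _OUTPUT2 = work.out_f;

/* Route each input row to one of the two output tables */
data &_OUTPUT1 &_OUTPUT2;
    set &_INPUT1;
    if sex = 'M' then output &_OUTPUT1;
    else output &_OUTPUT2;
run;
```

Inside a real node, only the DATA step would be kept, and the column definitions would come from the properties of each output table.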
The following macro performs an inner join between two tables, keeping one column from each table in addition to the joining column:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT t1.&xc, t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
I want to find a way to use several columns in &xc, &yc and &by, since I don't think I can pass vectors of variables.
My idea is to pass parameters as vectors of strings instead of simple variables, for example xc = {"col1" "col2"}, and loop through them
using %let some_var= %sysfunc(dequote(&some_string)); to convert them back to variables.
Applied to xc only, it would become something like:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT
%do i = 1 %to %NCOL(&xc)
%let xci = %sysfunc(dequote(&xc[1]));
t1.&xci,
%end;
t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
But this loop fails. How could I make it work?
Note: this is a simplified example. My ultimate ambition is to build join macros that are as concise as possible and that integrate data quality checks.
Really, this would be much easier to code using SAS dataset options instead of building complicated macro logic.
proc sql ;
create table want2 as
select *
from sashelp.class(keep=name age)
natural inner join sashelp.class(keep=name height weight)
;
quit;
I would suggest learning how to use data step code instead of SQL code. For most normal data manipulations it is clearer and simpler. Say you wanted to combine IN1 and IN2 on the variable ID (both datasets sorted by ID) and keep the variables A and B from IN1 and the variables X and Y from IN2.
data out ;
merge in1 in2 ;
by id ;
keep id a b x y ;
run;
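Note that a plain MERGE keeps IDs found in either dataset. If the goal is the equivalent of an inner join, the IN= dataset option can restrict the output to IDs present in both. A minimal sketch with made-up data (in_1 and in_2 are illustrative flag names); it matches an SQL inner join only when ID is unique in at least one of the datasets:

```sas
data in1;
    input id a b;
    datalines;
1 10 100
2 20 200
3 30 300
;

data in2;
    input id x y;
    datalines;
2 5 50
3 6 60
4 7 70
;

/* MERGE with BY requires both inputs sorted by the BY variable */
proc sort data=in1; by id; run;
proc sort data=in2; by id; run;

data out;
    merge in1(in=in_1 keep=id a b)
          in2(in=in_2 keep=id x y);
    by id;
    if in_1 and in_2;   /* keep only ids found in both inputs */
run;
```

With this data, OUT contains the rows for id 2 and id 3.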
Second, I would resist the urge to generate too complex a web of macro code. It will make the programs harder to understand for the next programmer, including yourself two weeks later. Your particular example does not look like something that is worth coding as a macro. You are not really typing less information, just using a few commas in place of where your SQL code would have had keywords like FROM or JOIN.
Now to answer your actual question. To pass a list of values to a macro, use a delimited list. When at all possible use a space as the delimiter, and especially avoid using a comma as the delimiter. This will be easier to type, easier to pass into the macro and easier to use, since it matches the SAS language, as you can see in the data step above. If you really need to generate code like SQL syntax that uses commas, then have the macro code generate them where needed.
%macro ij
(x= /* First dataset name */
,y= /* Second dataset name */
,by= /* BY variable list */
,to= /* Output dataset name. If empty use data step to generate DATAn work name */
,xc= /* Variable list from first dataset */
,yc= /* Variable list from second dataset */
);
%if not %length(&to) %then %do;
* Let SAS generate a name for new dataset ;
data ; run;
%let to=&syslast ;
proc delete data=&to; run;
%end;
%if not %length(&xc) %then %let xc=*;
%if not %length(&yc) %then %let yc=*;
%local i sep ;
proc sql ;
create table &to as
select
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)
%let sep=,;
%end;
%do i=1 %to %sysfunc(countw(&xc,%str( ))) ;
&sep.T1.%scan(&xc,&i,%str( ))
%end;
%do i=1 %to %sysfunc(countw(&yc,%str( ))) ;
&sep.T2.%scan(&yc,&i,%str( ))
%end;
from &x T1 inner join &y T2 on
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)=T2.%scan(&by,&i)
%let sep=%str(and );
%end;
;
quit;
%mend ij ;
Try it:
options mprint;
%ij(x=sashelp.class,y=sashelp.class,by=name,to=want,xc=age,yc=height weight);
SAS LOG:
MPRINT(IJ): proc sql ;
MPRINT(IJ): create table want as select T1.name ,T1.age ,T2.height ,T2.weight from sashelp.class
T1 inner join sashelp.class T2 on T1.name=T2.name ;
NOTE: Table WORK.WANT created, with 19 rows and 4 columns.
MPRINT(IJ): quit;
Instead of vectors, think simple lists.
Pass your variable lists as unquoted, space-separated lists of values. The values are SAS variable names that can be scanned out as tokens.
%macro ij (x=, ...);
...
%local i token;
%let i = 1;
%do %while (%length(%scan(&X,&i)));
%let token = %scan(&X,&i);
&token.,/* emit the token as source code */
%let i = %eval(&i+1);
%end;
...
%mend;
%ij ( x = one two three, ... )
Be sure to localize all your macro variables to prevent unwanted side effects outside the macro.
For consistency I try to use i/o related macro parameters that mimic SAS Procs -- data=, out=, file=, ...
Some would say named arguments are verbose!
If your 'proto-code' expects the xci symbol to be some sort of serially numbered variable, it is not. You would have to use %local xc&i; %let xc&i= for assignment, and &&xc&i for resolution. Also, your original code references &from which is not passed.
Building is fun. I would also recommend surveying past conference papers and SAS literature for similar works that may already meet your goal.
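The numbered-variable pattern mentioned above (%local xc&i for assignment, &&xc&i for resolution) can be sketched as follows. The macro name demo_list and its parameter are hypothetical and serve only to show the double-ampersand lookup:

```sas
%macro demo_list(list);
    %local i n;
    /* Count the space-separated tokens in the list */
    %let n = %sysfunc(countw(&list, %str( )));

    /* Store each token in its own numbered local variable: xc1, xc2, ... */
    %do i = 1 %to &n;
        %local xc&i;
        %let xc&i = %scan(&list, &i, %str( ));
    %end;

    /* && resolves to &, so &&xc&i becomes &xc1, &xc2, ... on the second pass */
    %do i = 1 %to &n;
        %put xc&i = &&xc&i;
    %end;
%mend demo_list;

%demo_list(col1 col2)
```

For the call shown, the log should list xc1 = col1 and xc2 = col2. In most cases scanning the list directly, as in the loop above, is simpler than materializing numbered variables.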
You could start with a space-separated list of column names and avoid looping entirely:
/*Define list of columns*/
%let COLS = A B C;
%put COLS = &COLS;
/*Add table alias prefix*/
%let REGEX = %sysfunc(prxparse(s/(\S+)/t1.$1/));
%let COLS = %sysfunc(prxchange(&REGEX,-1,&COLS));
%put COLS = &COLS;
%syscall prxfree(REGEX);
/*Condense multiple spaces to a single space*/
%let COLS = %sysfunc(compbl(&COLS));
%put COLS = &COLS;
/*Replace spaces with commas*/
%let COLS = %sysfunc(translate(&COLS,%str(,),%str( )));
%put COLS = &COLS;
In the end, as @Tom noted, SAS dataset options are more convenient, and using them one doesn't need to loop over variables.
Here is the macro I came up with:
*--------------------------------------------------------------------------------------------- ;
* JOIN ;
* Performs any join (defaults to inner join). ;
* By default left table is overwritten (convenient for successive left joins) ;
* Performs a natural join so columns should be renamed accordingly through 'rename' parameters ;
*----------------------------------------------------------------------------------------------;
%macro join
(data1= /* left table */
,data2= /* right table */
,keep1= /* columns to keep (default: keep all), don't use with drop */
,keep2=
,drop1= /* columns to drop (default: none), don't use with keep */
,drop2=
,rename1= /* rename pairs, such as: old1 = new1 old2 = new2 */
,rename2=
,j=ij /* join type, either ij lj or rj */
,out= /* created table, by default data1 (left table is overwritten)*/
);
%if not %length(&out) %then %let out = &data1;
%if %length(&keep1) %then %let keep1 = keep=&keep1;
%if %length(&keep2) %then %let keep2 = keep=&keep2;
%if %length(&drop1) %then %let drop1 = drop=&drop1;
%if %length(&drop2) %then %let drop2 = drop=&drop2;
%if %length(&rename1) %then %let rename1 = rename=(&rename1);
%if %length(&rename2) %then %let rename2 = rename=(&rename2);
%let kdr1 =;
%let kdr2 =;
%if (%length(&keep1) | %length(&drop1) | %length(&rename1)) %then %let kdr1 = (&keep1&drop1 &rename1);
%if (%length(&keep2) | %length(&drop2) | %length(&rename2)) %then %let kdr2 = (&keep2&drop2 &rename2);
%if &j=lj %then %let j = LEFT JOIN;
%if &j=ij %then %let j = INNER JOIN;
%if &j=rj %then %let j = RIGHT JOIN;
proc sql;
create table &out as select *
from &data1&kdr1 t1 natural &j &data2&kdr2 t2;
quit;
%mend;
Reproducible Examples:
data temp1;
input letter $ number1 $;
datalines;
a 1
a 2
a 3
b 4
c 8
;
data temp2;
input letter $ letter2 $ number2 $;
datalines;
a c 666
b d 0
;
* left join on common columns into new table temp3;
%join(data1=temp1,data2=temp2,j=lj,out=temp3)
* inner join by default, overwriting temp1, after renaming to join on another column;
%join(data1=temp1,data2=temp2,drop2=letter,rename2= letter2=letter)
I have a parameter named _ID that accepts multiple values from a list and sends them to my stored process. Say I have sent four values (1,2,3,4); I receive them in my stored process as:
_ID0 = 4
_ID1 = 1
_ID2 = 2
_ID3 = 3
_ID4 = 4
_ID_COUNT = 4
I'm receiving and filtering them as follows.
%let ID = "&_ID";
%let Count = "&_ID_COUNT";
%macro IDs;
%global _ID0;
/* If more than one ID value was selected then cycle through the values */
%if %eval(&_ID0 ge 2) %then %do;
%do i=1 %to &_ID0;
&&_ID&i
%end;
%end;
/* If only one ID value was selected */
%else &_ID
%mend;
****************************;
%macro filter;
%if &Count ne "0" %then %do;
%stpbegin;
proc sql noprint;
create table users as
select *
from work.users
where id in(%IDs);
run;
%stpend;
%end;
%mend; %filter;
There is no error in the log for the code above, but it's not filtering anything. If the users table has values 1-10 in the id column, it should be reduced to only 1, 2, 3, 4.
user
id
1
2
3
4
5
6
7
8
9
10
after filter i want
user
id
1
2
3
4
I don't know what's wrong, or whether I missed a better approach.
Your code should work. The %IDS macro could be more concise and might need more logic to deal with the inconsistency of how macro variables are created when count is less than 2. So this macro will make sure that the 0 and 1 variables are populated (at least while the macro is running) before trying to loop over them.
%macro IDs;
%local i ;
%let _id0 = &_id_count ;
%if &_id0 = 1 %then %let _id1 = &_id ;
%if &_id0 = 0 %then -99999 ;
%do i=1 %to &_id0;
&&_ID&i
%end;
%mend IDs;
Based on your example data it should work like this:
%put (%ids);
(1 2 3 4)
What value do you want to emit if they don't select any values? I have set this example up to generate -99999, but your other macro should already be skipping the call in that case, so it shouldn't matter.
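One way to check the filtering outside the stored process server is to simulate the prompt macro variables with %LET and run the query against a small test table. This is only a sketch: the macro name id_list and the table users_filtered are hypothetical, and the %LET values mimic the four-value example from the question. Writing to a different table than the one being read also avoids the PROC SQL warning about a CREATE TABLE statement recursively referencing its target.

```sas
/* Simulated prompt variables (assumption: set by the server in production) */
%let _ID_COUNT = 4;
%let _ID  = 1;
%let _ID0 = 4;
%let _ID1 = 1;
%let _ID2 = 2;
%let _ID3 = 3;
%let _ID4 = 4;

/* Emit the selected IDs as a space-separated list */
%macro id_list;
    %local i;
    %if &_ID_COUNT = 0 %then -99999;        /* nothing selected: match no rows */
    %else %if &_ID_COUNT = 1 %then &_ID;    /* single value: only _ID is set   */
    %else %do i = 1 %to &_ID_COUNT;
        &&_ID&i
    %end;
%mend id_list;

/* Test table with ids 1 to 10 */
data users;
    do id = 1 to 10;
        output;
    end;
run;

proc sql;
    create table users_filtered as
    select *
    from users
    where id in (%id_list);
quit;
```

With these values, users_filtered should contain the four rows with id 1 to 4. If it does here but not in the stored process, the prompt variables are probably not arriving as expected, and a %put _global_; at the top of the stored process would show what was actually received.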