The following variables seem to be standard macro variables in much of the SAS code I come across. Can someone explain them, please?
&dsin.
&dsout.
&cj_yyyymm_1.
&cj_yyyymm_2.
No, those are not "standard" macro variables. There are automatic macro variables, which you can view with
%put _automatic_;
And some other system-generated macro variables are sometimes just stored as regular global macro variables, which you can view with:
%put _global_;
or
%put _all_;
The last one prints all currently defined macro variables; run it at startup and you'll see just the ones SAS defines.
What you show there are macro variables that perhaps are standard for your company, but don't have any standard meaning. I would posit that &dsin is an input dataset to a macro, and &dsout is an output dataset, and the other two are year/month stamped variables, but they don't have any official, standard definition, nor would I say those are particularly commonly seen.
Those are not generic, they are specific to your program or company but you can make educated guesses. DS is a common abbreviation for data set.
&dsin. = Input data set
&dsout. = Output data set
&cj_yyyymm_1. = some date parameter, probably like 202110
&cj_yyyymm_2. = some other date parameter....
CJ could mean something specific at your company or may reference something in your code.
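As a guess at how such names are typically used, here is a hypothetical macro that takes them as parameters. The macro name copy_month, the column yyyymm, and the data set names are all assumptions made up for illustration; none of them come from your code.

```sas
/* Hypothetical sketch: every name here is an assumption for illustration */
%macro copy_month(dsin=, dsout=, cj_yyyymm_1=, cj_yyyymm_2=);
  data &dsout.;
    set &dsin.;
    /* keep rows whose (assumed) numeric yyyymm column is in the range */
    where yyyymm between &cj_yyyymm_1. and &cj_yyyymm_2.;
  run;
%mend copy_month;

%copy_month(dsin=work.have, dsout=work.want,
            cj_yyyymm_1=202101, cj_yyyymm_2=202110)
```

Searching your programs for the %let statement or the %macro parameter list that sets these names will tell you what they actually mean in your code base.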
I have macros in my code that are created and used in random order.
It's showing "reference not resolved" for "%put &pincuk", and syntax error for the "&pincuk". It works fine when I run the code twice. I'm guessing it happens when SAS is reaching the &pincuk before the macro is created. For example,
data x.fcastukcalc;
do day=&daycountuk + 1 to &daycountkorea;
retain fcast &ukdmax;
fcast=(fcast * &pincuk) + fcast;
output;
end;
run;
/* then this, later on */
data _null_;
keep deathsuk inc1 inc2 pinc1 pinc2 pinc;
set x.uk;
inc1=deathsuk - lag1(deathsuk);
pinc1=(inc1 / lag1(deathsuk));
inc2=lag1 (deathsuk) - lag2(deathsuk);
pinc2=(inc2 / lag2(deathsuk));
pinc=(pinc1 + pinc2) / 2;
call symputx('pincuk', pinc);
run;
%put &pincUK;
If, on the first run through, x.uk does not exist, or has zero rows, the call symputx('pincuk', pinc); will never be reached. So, in a 'random' situation of code running (let's be more generous and say a developmental situation), your expectations may have been subverted by a misremembered operation or a subtle change of state. Check your code for %SYMDEL statements. During development of a large macro, you may be submitting parts of the interior of a macro without a complete 'simulation' of the state expected to exist during an actual call of the macro.
Start a fresh SAS session and see if the problem with the code persists and can be reproduced more directly.
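One way to make the dependency explicit is to run the step that creates the macro variable before any step that references it, and to check that it exists. This is a minimal sketch, assuming the x libref and the x.uk data set from the question; the end=last option and the check_pincuk macro are my additions, not part of the original code.

```sas
/* Run the step that creates the macro variable FIRST */
data _null_;
  set x.uk end=last;
  inc1  = deathsuk - lag1(deathsuk);
  pinc1 = inc1 / lag1(deathsuk);
  inc2  = lag1(deathsuk) - lag2(deathsuk);
  pinc2 = inc2 / lag2(deathsuk);
  pinc  = (pinc1 + pinc2) / 2;
  /* store the value from the last row only */
  if last then call symputx('pincuk', pinc);
run;

/* Hypothetical guard: report clearly if the variable was never created */
%macro check_pincuk;
  %if not %symexist(pincuk) %then
    %put ERROR: pincuk was not created (is x.uk missing or empty?);
  %else
    %put NOTE: pincuk=&pincuk;
%mend check_pincuk;
%check_pincuk
```

The DATA step that uses &pincuk would then come after this point, so the reference resolves on the first run of a fresh session.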
To be specific, your question is about macro symbols, often referred to as macro variables. Macros themselves are names for groups of SAS programming statements.
From help (good read)
When SAS compiles program text, two delimiters trigger macro processor activity:
&name
refers to a macro variable. Replacing Text Strings Using Macro Variables explains how to create a macro variable. The form &name is called a macro variable reference.
%name
refers to a macro. Generating SAS Code Using Macros explains how to create a macro. The form %name is called a macro call.
The text substitution produced by the macro processor is completed before the program text is compiled and executed. The macro facility uses statements and functions that resemble the statements and functions that you use in the DATA step. An important difference, however, is that macro language elements can enable only text substitution and are not present during program or command execution.
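To illustrate the two triggers and the "substitute first, compile second" order, here is a small sketch. The names dsn and show are hypothetical, and sashelp.class is assumed to be available, as it is in a standard installation.

```sas
%let dsn = sashelp.class;   /* macro variable: just stored text */

%macro show(ds);            /* macro: a named group of statements */
  proc print data=&ds(obs=3);
  run;
%mend show;

options mprint;             /* write the generated statements to the log */
%show(&dsn)
/* After substitution SAS compiles:
   proc print data=sashelp.class(obs=3); run; */
```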
------ Edit (added) ------
Writing code with lots of abstractions (i.e. macro variables) requires a certain level of discipline and system design. A macro must be compiled before it can be called, however the macro variables (i.e. symbols) it resolves within do NOT need to exist at compilation time, only at macro invocation (call) time. For old fuddies the concept is like a mail merge boilerplate with too many fields.
Macro variables can be local (as a parameter in the macro definition, as explicitly stated in a %LOCAL, or as an assignment to a previously undefined symbol). Reliance on GLOBAL macro variables should be reduced to a minimum or none, as should an over-reliance on variables expected to exist in the caller scope. Dependence on a variable being global should be explicitly stated with a %GLOBAL in the macro's source.
An assignment to a variable that has not been declared %LOCAL can be a problem because the assignment could accidentally (unexpectedly) replace the value of a declared or existing variable in an outer (or calling) scope, and be the cause of "it don't work right" problems. Good discipline is to explicitly %LOCAL all variables within a macro definition; the macro system does not have a strict mode (as found in other languages) that reports problematic macro variables.
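A classic case of this is a loop counter shared by two macros. The sketch below uses hypothetical macros named inner and outer; both use the counter i.

```sas
%macro inner;
  %local i;   /* without this line, the loop below reuses the caller's i */
  %do i = 1 %to 3;
    %put inner i=&i;
  %end;
%mend inner;

%macro outer;
  %local i;
  %do i = 1 %to 2;
    %inner
    %put outer i=&i;
  %end;
%mend outer;

%outer
```

If the %local statement in inner is removed, inner's loop leaves the caller's i at 4, so outer's loop stops after its first pass instead of running twice.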
I have a stubborn lecturer who insists on defining all macro variables inside the parentheses of the %MACRO statement, like this:
%MACRO TEST(Var1= , Var2= , Var3= );
What are the advantages of this? What are the advantages of actually defining your function like this instead:
%LET var1= <Insert long list of 50 variables here>;
%LET var2= <name of input data>;
%LET var3= <group by variables>;
%MACRO TEST;
I argue that the second option provides clarity and a neat coding structure, could anyone point out any other advantages or disadvantages of the two methods?
Two main points:
Use of global variables is widely considered to be bad practice.
Using your system, how would you write out multiple calls to the same macro in different places in your code? How would you keep track of which parameter lists correspond to which macro calls?
Macro variable scope - having variables only available within the macro ensures that any previously declared macro variables are not used by accident. If you accidentally run things out of order, with your method you're likely to run into issues.
It makes it clear what parameters the macro requires - otherwise you have to read the code, find all the & and declare them at the top.
Less typing overall
You can set default values within the parameter list and then only list/call what options you need when declaring.
Macro definition:
%macro test(var1 = , var2 = , var3 = 25);
Macro call/execution:
%test(var1 = 5, var2 = 4);
What value will var3 have in the macro?
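A quick way to check is to print the parameters from inside the macro. This sketch reuses the definition above and only adds a %put to the body.

```sas
%macro test(var1 = , var2 = , var3 = 25);
  %put var1=&var1 var2=&var2 var3=&var3;
%mend test;

%test(var1 = 5, var2 = 4)             /* var3 falls back to its default, 25 */
%test(var1 = 5, var2 = 4, var3 = 10)  /* the default is overridden */
```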
You're still using very simple cases and a lot of the more complicated usages work better when you have parameters. Consider the case with calling the same macro 50 times for different parameters which happen to be in a data set. You can use CALL SYMPUTX() for each but then you'll run into timing issues of when the macro is called and such. Whereas using CALL EXECUTE and the parameters inline makes it very easy.
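A minimal sketch of the CALL EXECUTE approach follows, assuming a hypothetical parameter table named params with one row per macro call. The table contents and the body of %test are made up for illustration.

```sas
/* Hypothetical parameter table: one row per macro call */
data params;
  input var1 var2;
  datalines;
5 4
7 2
;
run;

%macro test(var1 = , var2 = , var3 = 25);
  %put var1=&var1 var2=&var2 var3=&var3;
%mend test;

data _null_;
  set params;
  /* %nrstr delays execution of %test until this step has finished */
  call execute(cats('%nrstr(%test)(var1=', var1, ', var2=', var2, ')'));
run;
```

Each row generates one call with its parameters inline, so there is no shared global state to keep in sync between calls.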
PS. In general, the odds are 99% your lecturer will be correct at this point in time when you're starting out. Assuming that will help you frame your questions differently, rather than trying to prove someone wrong (which is how your question is coming across) you'll be looking at understanding how something works. Also, it's possible that your lecturer will be online as well so if they see your questions at some point you won't come across as a know-it-all kid. Ultimately that's your choice though.
It depends on how the macros will be used. Global macro variables can be very helpful and, as you pointed out, provide clarity if they are used correctly. For example, if I have a bunch of SAS programs that need to run in order to generate data sets or reports, I would put them in a wrapper program and use global macro variables.
%Let year = 2019;
%Let State = CA;
%let Dept = DOE;
%macro MakeRpt;
%include "MakeData.sas";
/* ... more %include statements ... */
%include "GenerateReport.sas";
%mend;
%makeRpt;
However, if I'm making a macro "utility" that will be called by a user whenever they need it, using local macro variables makes the most sense. Whether global or local makes more sense is really a question of how the macro will be used.
The only time you'd ever want to do this is if you have global variables that will appear throughout the program. For example, it is not uncommon to have special setup or initialization programs to hold commonly-referred values, especially when going between development and production. This can make things easier to handle when promoting a program, or easier to adjust if certain things change later on (such as a directory location or hostname).
For example, the below macro can change some global macro variables to point to certain directories that differ between two servers depending on where the code is run.
%macro dev_prod;
%global directory inlib outlib;
/* %str() masks the hyphens so %if compares text instead of trying to subtract */
%if %superq(syshostname) = %str(production-server.company.com) %then %do;
%let directory = C:\prodlocation;
%let inlib = C:\prodlib;
%let outlib = C:\outlib;
%end;
%else %if %superq(syshostname) = %str(dev-server.company.com) %then %do;
%let directory = C:\devlocation;
%let inlib = C:\devlib;
%let outlib = C:\outlib;
%end;
%mend;
%dev_prod;
In general, you want to use local macro variables in macros that perform specific functions. For example, the macro below runs a regression on variables in a data set:
%macro regression(data=, dep=, indep=);
proc reg data=&data.;
model &dep. = &indep.;
run;
quit;
%mend;
%regression(data=sashelp.cars, dep=horsepower, indep=msrp);
In a modern language (e.g. Python), you could do something like
def do_a_thing(foo, bar):
    thing = (... do a thing to foo(bar) ...)
    return thing
How does one do this (or something similar enough) in SAS? In my concrete application I have defined a bunch of functions, and need to do the same thing to all of them, so I thought it would be nice to have a function that takes a function as an argument and then does the thing to that function, and then apply it where needed. The "obvious" solution doesn't work, e.g. in a proc fcmp doing this:
function do_a_thing(foo,bar);
thing = (... do a thing to foo(bar) ...);
return(thing);
endsub;
This fails because SAS doesn't know about any function called foo, and throws an error.
I expect the answer involves some macro trickery, but I find the macro system somewhat opaque and can't quite figure it out. What's the best way to do this?
Show the code where you defined a bunch of functions.
Macro, at its core, is a text-generating system with side effects. Macro can perform dispatch-like processing using indirect resolution; see the answer "Invoke a Macro Using A Macro Variable Name".
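As a rough sketch of that kind of dispatch, the macro below calls whichever macro is named in its first argument. The macros double, square, and apply are hypothetical names for illustration, and this works at the macro (text) level, not on FCMP functions.

```sas
%macro double(x);
  %eval(2 * &x)
%mend double;

%macro square(x);
  %eval(&x * &x)
%mend square;

/* apply: resolve &fn first, then call the macro with that name */
%macro apply(fn, x);
  %&fn(&x)
%mend apply;

%put double: %apply(double, 5);
%put square: %apply(square, 5);
```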
If you are trying to code a general purpose function invokable from DATA Step, SQL, %SYSFUNC, or DS2 both Proc FCMP and DS2 can create user defined functions (UDF). The method(or function)-name to invoke (or dispatch, or APPLY) would likely have to be passed as a string into said UDF.
You will also want to look into DOSUB and DOSUBL
Details
The DOSUBL function enables the immediate execution of SAS code after a text string is passed. Macro variables that are created or updated during the execution of the submitted code are exported back to the calling environment.
DOSUBL returns a value of zero if SAS code was able to execute, and returns a nonzero value if SAS code was not able to execute.
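A minimal sketch of DOSUBL passing a value back through a macro variable follows. The variable name answer is hypothetical, and it is created beforehand on the assumption that updating an existing variable is the safest case.

```sas
%let answer = ;   /* create the variable first so the side session updates it */

data _null_;
  /* single quotes stop the macro code from running while this step compiles */
  rc = dosubl('%let answer = %eval(6 * 7);');
  put rc=;
run;

%put answer=&answer;
```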
As for modern... SAS SCL had CALL APPLY ages ago -- sadly SCL never made it to the Foundation product or escaped the confines of SAS.
You haven't really shown an example where this might be required (or even useful).
But in general in SAS you would use code generation to implement that type of indirection. For example, your second "function" could be a function-style macro, that is, a macro that only emits part of a statement to be included in the actual SAS program you want to create.
%macro do_a_thing(function,arglist);
&function(&arglist)
%mend;
Then you might use it in a program
data want ;
set have ;
mean = %do_a_thing(mean,of _numeric_);
std = %do_a_thing(std,of _numeric_);
run;
For more complex things you will have more trouble. The new-ish DOSUBL() function might help, in that it allows you to run multiple steps in a separate execution space. But for most things the performance cost might be too high to make it worthwhile.
I don't really know anything about SAS, but in general you would need some way to distinguish between making a function call and passing a parameter.
A Google search for the title of this question led me here, perhaps it might help you:
https://communities.sas.com/t5/SAS-Procedures/how-to-pass-a-parameter-with-macro-variables-into-macro/td-p/330627#messageview_4
In SAS, why cannot we write
let name = abc;
put "&name";
Why do we have to include the % sign like this:
%let name = abc;
%put &name;
Imagine I am writing the statements in the main body of the code, not inside a data step.
Also, is the second way of writing it same as:
%macro test;
%let name = abc;
%put &name;
%mend;
The %LET and %PUT statements are part of the SAS macro processor and not part of base SAS. The % (and &) triggers are what activate the macro processor and allow it to recognize that these strings need to be processed before they are passed to the SAS compiler/interpreter.
You cannot use an assignment statement like
x = 3.5 ;
outside of a data step (or some proc that support these types of statements).
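A small side-by-side sketch of the two kinds of statement, using the name variable from the question:

```sas
%let name = abc;     /* macro statement: valid in open code */
%put &name;

data _null_;
  x = 3.5;           /* assignment statement: needs a DATA step */
  put x= "&name";    /* PUT statement; &name is resolved before the step compiles */
run;
```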
To your second question, if you wrap the macro statements inside a macro definition, the main effects will be:
The macro variable NAME will be defined as local to the macro if it does not already exist.
Nothing will happen until you invoke the macro. The %macro statement begins the definition of a macro, so all of the code up to the corresponding %mend statement defines the macro. To execute it you will need to invoke the macro using syntax like %test.
%let and %put are part of the SAS macro language. Macro language statements are (with one or two particular exceptions) prefixed by % to tell the SAS macro parser to operate on them.
They do entirely different things from the non-% versions, except when it works out to the same thing. You can write put "&mvar."; as long as it's in a data step (as that's a data step statement). Macro commands/functions/statements are allowed in open code sometimes (and not in others).
Writing it inside an actual macro is more-or-less the same. There are scoping issues, though; &name won't be available outside of that macro, unless it's been declared global.
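The scoping difference can be seen with a short sketch. It assumes NAME was not already defined earlier in the session; the macro test_global and the variable name2 are hypothetical additions.

```sas
%macro test;
  %let name = abc;   /* local to %test unless NAME already exists outside */
  %put inside: &name;
%mend test;
%test

/* %symexist returns 1 if the variable exists, 0 if it does not */
%put exists after the macro? %symexist(name);

%macro test_global;
  %global name2;     /* declared global, so it survives the macro */
  %let name2 = abc;
%mend test_global;
%test_global
%put outside: &name2;
```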
What exactly is %sysfunc in SAS, and what can it be used for? Does it give access to many functions that we can use to get various results?
It's a macro function that can execute most built-in SAS functions, as well as user-defined functions. However, I believe the function inside cannot be a macro function. There are also some other exceptions; for example, you cannot use INPUT or PUT with %sysfunc (use INPUTN/INPUTC or PUTN/PUTC instead). %sysfunc can also apply a format to the output of the function being executed. Here are papers by Chris Yindra and Paul Hamilton outlining the usage (in my opinion, the latter is clearer):
http://www2.sas.com/proceedings/sugi23/Advtutor/p44.pdf
http://www.lexjansen.com/pnwsug/2005/how/SysFunc.pdf
The SAS macro language only has around 30 builtin functions. %sysfunc allows you to access the 100s of standard SAS functions from within the macro language.
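A few typical uses, as a minimal sketch; the macro variable names are made up for illustration.

```sas
/* optional second argument: a format applied to the function's result */
%let today   = %sysfunc(today(), date9.);

/* character arguments are written without quotes inside %sysfunc */
%let nwords  = %sysfunc(countw(a b c));

/* PUTN stands in for PUT, which %sysfunc does not allow */
%let rounded = %sysfunc(putn(3.14159, 8.2));

%put today=&today nwords=&nwords rounded=&rounded;
```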