Put with macro variable as format - sas

I have a dataset with a variable called pt, with values such as 8.1, 8.2, 8.3, and a variable called mean, with values like 8.24, 8.1, 8.234. The two are paired with each other.
I want the format used in my put call to come from the values of the pt variable.
I get the errors "Expecting an arithmetic expression", "the symbol is not recognized and will be ignored" and "syntax error" from my code, with the &fmt. part underlined:
if pt=&type;
call symput("fmt",pt);
fmt_mean = putn(mean,&fmt.);
Thanks in advance for your help.

The macro processor's work is done before SAS compiles and runs the data step. So trying to place the value into a macro variable and then use it immediately to generate and execute SAS code will not work.
But since you are using the PUTN() function it can use the value of an actual variable, so there is no need to put the format into a macro variable.
fmt_mean = putn(mean,pt);
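As a minimal sketch of that, assuming pt is stored as a character variable holding format specifications such as 8.2 (the dataset names have and want are made up for illustration):

```sas
/* Hypothetical sample data: pt holds the format to apply, mean the value */
data have;
  input pt $ mean;
  datalines;
8.1 8.24
8.2 8.1
8.3 8.234
;
run;

data want;
  set have;
  length fmt_mean $ 8;
  /* PUTN reads the format from pt at run time, row by row */
  fmt_mean = putn(mean, pt);
run;

proc print data=want;
run;
```

Each row is formatted with its own width and number of decimals, and no macro variable is involved.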

Please post your data set and data step; your description is hard to understand.
However, the solution seems to be simple: do not use macro variables. You don't need them here. Unlike the put() function, which expects the format to be known at compile time (which is where macro variables can be used), its analog putn() takes the format as a second argument that is evaluated at run time, so it can be a variable. It runs a little slower because of that flexibility. So your code can look like this:
data ...;
set ...(keep=mean pt);
fmt_mean = putn(mean, pt);
run;
where the pt variable may be numeric, e.g. 8.2, or character, e.g. '8.2'.
If you want to understand how the SAS macro processor works and what call symput does, look here:
https://stackoverflow.com/a/69979074/7864377

Related

Check whether a SAS date is valid before using MDY()?

Is there a way to check whether three variables (month, day, year) can actually build a valid SAS date before handing those variables over to MDY() (other than checking all possible cases)?
Right now I am dealing with a couple of thousand input variables and let SAS put them together - there are a lot of date variables which cannot work like month=0, day=33, year=10 etc. and I'd like to catch them. Otherwise I will get way too many Notes like
NOTE: Invalid argument to function MDY(13,12,2014)
which then eventually culminate in Warnings like
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
I would really like to prevent those Warnings, and I thought the best way would be to actually check the validity of the date. Any recommendations?
Use an INFORMAT instead, then you can use the ?? modifier to suppress errors.
month=0;
day=33;
year=10;
date = input(cats(put(year,z4.),put(month,z2.),put(day,z2.)),??yymmdd8.);
SAS documentation: ? or ?? (Format Modifiers for Error Reporting)
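To flag the bad rows rather than just silence the notes, the same idea can be sketched like this, assuming the three parts are numeric variables named month, day and year (the dataset name checked and the flag valid are made up for illustration):

```sas
/* Hypothetical input: one row per month/day/year combination */
data checked;
  input month day year;
  /* Build a yyyymmdd string; ?? suppresses the note and the _ERROR_ flag */
  date = input(cats(put(year, z4.), put(month, z2.), put(day, z2.)), ?? yymmdd8.);
  /* date is missing whenever the parts do not form a real date */
  valid = not missing(date);
  format date date9.;
  datalines;
12 13 2014
13 12 2014
0 33 10
;
run;

proc print data=checked;
run;
```

Rows with valid=0 can then be reviewed or filtered without the log hitting the ERRORS= limit.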

How to choose indexed assignment variable dynamically in SAS?

I am trying to build a custom transformation in SAS DI. This transformation will "act" on columns in an input data set, producing the desired output. For simplicity let's assume the transformation will use input_col1 to compute output_col1, input_col2 to compute output_col2, and so on up to some specified number of columns to act on (let's say 2).
In the Code Options section of the custom transformation users are able to specify (via prompts) the names of the columns to be acted on; for example, a user could specify that input_col1 should refer to the column named "order_datetime" in the input dataset, and either make a similar specification for input_col2 or else leave that prompt blank.
Here is the code I am using to generate the output for the custom transformation:
data cust_trans;
set &_INPUT0;
i=1;
do while(i<3);
call symputx('index',i);
result = myfunc("&&input_col&index");
output_col&index = result; /*what is proper syntax here?*/
i = i+1;
end;
run;
Here myfunc refers to a custom function I made using proc fcmp which works fine.
The custom transformation works fine if I do not try to take into account the variable number of input columns to act on (i.e. if I use "&&input_col&i" instead of "&&input_col&index" and just use the column result on the output table).
However, I'm having two issues with trying to make the approach more dynamic:
I get the following warning on the line containing
result = myfunc("&&input_col&index"):
WARNING: Apparent symbolic reference INDEX not resolved.
I do not know how to have the assignment to the desired output column happen dynamically; i.e., depending on the iteration of the do loop I'd like to assign the output value to the corresponding output column.
I feel confident that the solution to this must be well known amongst experts, but I cannot find anything explaining how to do this.
Any help is greatly appreciated!
You can't use macro variables that depend on data step variables in this manner. Macro variables are resolved when the code is compiled, not when it runs.
So you either have to
%do i = 1 %to .. ;
which is fine if you're in a macro (it won't work outside of an actual macro), or you need to use an array.
data cust_trans;
set &_INPUT0;
array in[2] &input_col1 &input_col2; *or however you determine the input columns;
array output_col[2]; *automatically names the results;
do i = 1 to dim(in);
result = myfunc(in[i]); *You quote the input - I cannot see what your function is doing, but it is probably wrong to do so;
output_col[i] = result; *assigns to output_col1, output_col2, ... according to i;
end;
run;
That's the way you'd normally do that. I don't know what myfunc does, and I also don't know why you quote "&&input_col&index." when you pass it to it, but that would be a strange way to operate unless you want the name of the input column as text (and don't want to know what data is in that variable). If you do, then pass vname(in[i]) which passes the name of the variable as a character.
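For completeness, the %do route could look roughly like the following. It is only a sketch: sashelp.class, the two %let statements and round() are stand-ins for the DI input table (&_INPUT0), the prompt values and myfunc, and the macro name cust_trans is made up.

```sas
/* Stand-ins for the values the DI prompts would supply */
%let input_col1 = height;
%let input_col2 = weight;

%macro cust_trans(n=2);
  data cust_trans;
    set sashelp.class;               /* stand-in for &_INPUT0 */
    %do i = 1 %to &n;
      /* &&input_col&i resolves to &input_col1, then to height, and so on */
      output_col&i = round(&&input_col&i);   /* round() stands in for myfunc */
    %end;
  run;
%mend cust_trans;

%cust_trans(n=2)
```

Here the loop runs while the code is being generated, so the data step that SAS compiles simply contains one assignment statement per column.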

How do I retrieve numerical value of macro argument set in data step

I've gone in circles on this one for 1.5 hours, so I'm giving in and asking for help here. What I'm trying to do is dead simple but I cannot for the life of me find a link describing the process.
I have the following data step:
data _null_;
some_date = "01JAN2000"D;
call symput('macro_input_date',left(put(some_date,date9.)));
%useful_macro(&macro_input_date);
run;
where a date value is passed to a macro function (I'm new to these). I'd like to use the numeric value of the date value - let's be wild and say I want to get the value of the year, multiply it by the day value, and subtract the remainder after dividing the month value by 3. I can't seem to get just the year value out of the input. I've tried various things such as
symget, both "naked" and prepended with "%", with arguments that represent all possible permutations of the following variants:
have a naked reference to the variable, e.g. macro_input_date
enclose in single quotes, e.g. 'macro_input_date'
enclose in double quotes, e.g. "macro_input_date"
prepend with the ampersand, e.g. &macro_input_date
direct call to %sysfunc(year(<argument as variously specified above>))
Can anyone tell me what I am missing?
Thanks!
Given that you asked about macro functions, I'll guess that your example date processing is just an example. Talking about macro functions in general, it's important to understand that a macro function will (generally) not be doing any processing of its own, it will just be generating some data step code to do some task. So, for something like your contrived example, the data step code would be something like:
data out;
set in; * Assume this contains a numeric called 'some_date';
result = year(some_date) * day(some_date) - mod(month(some_date), 3);
run;
To macroise this, you don't need to transfer the data values to the macro, you just need to transfer the variable name:
%macro date_func(var=);
year(&var) * day(&var) - mod(month(&var), 3)
%mend;
data out;
set in; * Assume this contains a numeric called 'some_date';
result = %date_func(var=some_date);
run;
Note that the value of the var parameter here is the literal text some_date, not the value of the some_date data step variable. There are other ways to do it of course - you could actually pass this macro a date literal and it would still work:
data out;
set in; * Assume this contains a numeric called 'some_date';
result = %date_func(var="21apr2017"d);
run;
so it all depends on exactly what you're trying to do... maybe you want to assign the result to another macro variable, so it doesn't need to be part of a data step at all, in which case you could do a similar thing with %sysfunc functions etc.
If you're just trying to get the year, you would do something like:
data _null_;
some_date = "01JAN2000"D;
call symput('macro_input_date',left(put(some_date,date9.)));
yearval = substr(symget('macro_input_date'),6,4);
put yearval=;
run;
Your macro value (&macro_input_date) is not the actual date value (14610) but is the text 01JAN2000. So you cannot use the year function (unless you INPUT it back), you would use substr to grab the year part.
Of course, this is all sort of pointless as going to/from macro variable doesn't really accomplish much here.
Are you just having trouble with date literals? Your data step code
data _null_;
some_date = "01JAN2000"D;
call symput('macro_input_date',left(put(some_date,date9.)));
run;
is just going to do the same thing as
%let macro_input_date=01JAN2000 ;
Now if you want to treat that string of characters as if it represents a date then you need to either wrap it up as a date literal
"&macro_input_date"d
Or convert it.
%sysfunc(inputn(&macro_input_date,date9.))
Why not just store the actual date value into the macro variable?
call symputx('macro_input_date',some_date);
Then it wouldn't look like a date to you but it would look like a date to the YEAR() function.
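As a rough sketch of that last approach, using the contrived calculation from the question (the macro variable names yr, dy, mo and result are made up for illustration):

```sas
data _null_;
  some_date = "01JAN2000"d;
  /* Store the numeric date value (days since 01JAN1960), not formatted text */
  call symputx('macro_input_date', some_date);
run;

/* Macro code runs after the data step has finished, so the value is available */
%let yr = %sysfunc(year(&macro_input_date));
%let dy = %sysfunc(day(&macro_input_date));
%let mo = %sysfunc(month(&macro_input_date));

/* year * day - remainder of month / 3 */
%let result = %eval(&yr * &dy - %sysfunc(mod(&mo, 3)));

%put macro_input_date=&macro_input_date result=&result;
```

Note that the macro statements sit after the run; statement. Placed inside the data step, they would be resolved before the step executes and before call symputx has created the variable.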

SAS new variable name using macro variable

I am trying to create a new variable based on the value of a macro variable. However, SAS highlights 'vari' as red, seemingly indicating that I am doing something wrong. The statement still seems to get executed correctly though. Any thoughts?
%let i=7;
data d1;
set d1;
vari&i=7;
run;
The SAS syntax highlighter is an aid, but there are many situations where it is not "correct". Particularly for the macro language, it can't always guess how symbols will resolve. It doesn't have all the information (or intelligence) of the SAS word scanner/tokenizer. I use syntax highlighting as a hint that something might be wrong, but I ignore it when I've checked the code and confirmed it is correct.
The code in your example is fine.

Stata : generate/replace alternatives?

I have been using Stata for several years now, along with other languages like R.
Stata is great, but there is one thing that annoys me: the generate/replace behaviour, and especially the "... already defined" error.
It means that if we want to run a piece of code twice, and this piece of code contains the definition of a variable, that definition needs two lines:
capture drop foo
generate foo = ...
In other languages such as R it takes just one line.
So is there another way to define variables that combines "generate" and "replace" in one command?
I am unaware of any way to do this directly. Further, as @Roberto's comment implies, there are reasons why simply issuing a generate command will not overwrite the contents of a variable (that is what replace is for).
To be able to do this while maintaining data integrity, you would need to issue two separate commands, as your question points out (explicitly dropping the existing variable before generating the new one). I see this as a way in which Stata forces the user to be clear about his/her intentions.
It might be noted that Stata is not alone in this regard. SQL Server, for example, requires the user drop an existing table before creating a table with the same name (in the same database), does not allow multiple columns with the same name in a table, etc. and all for good reason.
However, if you are really set on being able to issue a one-liner in Stata to do what you desire, you could write a very simple program. The following should get you started:
program mkvar
    version 13
    syntax anything=exp [if] [in]
    capture confirm variable `anything'
    if !_rc {
        drop `anything'
    }
    generate `anything' `exp' `if' `in'
end
You would then save the program as mkvar.ado in a directory where Stata will find it (e.g., C:\ado\personal\ on Windows; if you are unsure, type sysdir), and call it using:
mkvar newvar=expression [if] [in]
Now, I haven't tested the above code much so you may have to do a bit of de-bugging, but it has worked fine in the examples I've tried.
On a closing note, I'd advise you to exercise caution when doing this - certainly you will want to be vigilant with regard to altering your data, retain a copy of your raw data while a do file manipulates the data in memory, etc.