Quotation mark SAS (+) PROC FORMAT value|invalue - sas

I'm still stuck on how SAS treats special characters.
%macro mFormat();
%do i=1 %to &numVar. ;
proc format library = work ;
invalue $ inf&&nomVar&i..s
%do j=1 %to &&numMod&i.;
"%superq(tb&i.mod&j.)" = &j.
%end;
;
run;
proc format library = work ;
value f&&nomVar&i..s
%do k=1 %to &&numMod&i.;
&k. = "%superq(tb&i.mod&k.)"
%end;
;
run;
%end;
%mend mFormat;
%mFormat();
As you can see, the program is supposed to create a format and an informat for each variable. My only problem arises when the variable name resolves to Brand, which contains values such as:
GOTAN-GOTAN
FRANCES-FRANCES
+&DECO-+DECO&
etc ...
These names lead to this error:
“ERROR: This range is repeated, or values overlap:”
I hope I can force SAS to read those names. Or perhaps this is not the best approach for generating formats and informats for variables that contain these characters (&, %, -, ', ").

Because your macro uses so many global macro variables, it's hard to see the problem. That error message indicates that your macro is generating duplicate ranges for PROC FORMAT. The complete error message should tell you which range is in error; if that is all you see, my guess is that more than one of your macro variables resolves to a blank.
There is no restriction on using hyphens when defining PROC FORMAT ranges. I made up this little example to illustrate:
proc format library = work ;
invalue infs
'GOTAN-GOTAN' = 1
'FRANCES-FRANCES' = 2
'+&DECO-+DECO&' = 3;
value fs
1 = 'GOTAN-GOTAN'
2 = 'FRANCES-FRANCES'
3 = '+&DECO-+DECO&';
run;
data a;
test = 'FRANCES-FRANCES';
in_test = input(test,infs.);
put test= in_test= in_test= fs.;
run;
Although you may find some trick to solve your macro problem, I'd suggest you toss that out and use the CNTLIN option of PROC FORMAT to use a data set to create your custom formats and informats. That would certainly make things easier to maintain and might also help create some useful metadata for your project. Here is a simple example to create the same format and informat as above:
data fmt_defs;
length fmtname start label $32 type $1;
fmtname = 'INFS';
type = 'I';
start = 'GOTAN-GOTAN'; label = '1'; output;
start = 'FRANCES-FRANCES'; label = '2'; output;
start = '+&DECO-+DECO&'; label = '3'; output;
fmtname = 'FS';
type = 'N';
start = '1'; label='GOTAN-GOTAN'; output;
start = '2'; label='FRANCES-FRANCES'; output;
start = '3'; label='+&DECO-+DECO&'; output;
run;
proc format library = work cntLin=fmt_defs;
run;
You can find much more information about PROC FORMAT in the online documentation.
Good luck,
Bob
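As a rough sketch of the same CNTLIN idea, the control data set can be built from the distinct values themselves rather than typed in by hand. The data set HAVE, the variable BRAND, and the names INFBRAND and FBRAND are made up for illustration; only the CNTLIN columns (fmtname, type, start, label) come from the example above.

```sas
/* Hypothetical input: a character variable BRAND with repeated values */
data have;
   length brand $32;
   infile datalines truncover;
   input brand $32.;
   datalines;
GOTAN-GOTAN
FRANCES-FRANCES
+&DECO-+DECO&
GOTAN-GOTAN
;

/* One row per distinct value; each row becomes one range */
proc sort data=have(keep=brand) out=levels nodupkey;
   by brand;
run;

data fmt_defs;
   length fmtname $32 start label $32 type $1;
   set levels;
   /* numeric informat: text -> code */
   fmtname = 'INFBRAND'; type = 'I';
   start = brand; label = cats(_n_); output;
   /* numeric format: code -> text */
   fmtname = 'FBRAND'; type = 'N';
   start = cats(_n_); label = brand; output;
   keep fmtname type start label;
run;

/* Group the rows of each format together before loading them */
proc sort data=fmt_defs;
   by fmtname type;
run;

proc format library=work cntlin=fmt_defs;
run;

/* Round trip: single quotes keep & and % away from the macro processor */
data _null_;
   code = input('+&DECO-+DECO&', infbrand.);
   put code= code= fbrand.;
run;
```

Because the values only ever pass through DATA step variables, the macro processor never sees the `&`, `%`, or quote characters, so no macro quoting is needed.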

I think the hyphen is the problem for the samples you provided. Maybe you could use a character replacement function such as TRANSLATE to change the hyphen (or other problem characters) to something else, like a space or an underscore.
%Let Test=One-Two;
%Put &test;
%Let Test=%sysfunc(translate(&test,%str(_),%str(-)));
%Put &test;

Related

Matching SAS character variables to a list

So I have a vector of search terms, and my main data set. My goal is to create an indicator for each observation in my main data set where variable1 includes at least one of the search terms. Both the search terms and variable1 are character variables.
Currently, I am trying to use a macro to iterate through the search terms, and for each search term, indicate if it is in the variable1. I do not care which search term triggered the match, I just care that there was a match (hence I only need 1 indicator variable at the end).
I am a novice when it comes to SAS macros and loops, but I have tried searching and piecing together code from some online sites. Unfortunately, when I run it, it does nothing; it does not even give me an error.
I have put the code I am trying to run below.
*for example, I am just testing on one of the SASHELP data sets;
*I take the first five team names to create a search list;
data terms; set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
*I will be searching through the baseball data set;
data test; set sashelp.baseball;
run;
%macro search;
%local i name_list next_name;
proc SQL;
select distinct search_term into : name_list separated by ' ' from work.terms;
quit;
%let i=1;
%do %while (%scan(&name_list, &i) ne );
%let next_name = %scan(&name_list, &i);
*I think one of my issues is here. I try to loop through the list, and use the find command to find the next_name and if it is in the variable, then I should get a non-zero value returned;
data test; set test;
indicator = index(team,&next_name);
run;
%let i = %eval(&i + 1);
%end;
%mend;
Thanks
Here's the temporary array solution, which is fully data driven. The numbered comments in the code correspond to these steps:
1. Store the number of terms in a macro variable to set the size of the array
2. Load the terms to search for into a temporary array
3. Loop through the array for each observation and search for the terms
4. Exit the loop as soon as a term is found, to help speed up the process
/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;
%put &num_search_terms.;
data flagged;
*declare array;
array _search(&num_search_terms.) $ _temporary_;
/*2*/
*load array into memory;
if _n_ = 1 then do j=1 to &num_search_terms.;
set terms;
_search(j) = search_term;
end;
set test;
*set flag to 0 for initial start;
flag = 0;
/*3*/
*loop through and create flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
if find(team, _search(i), 'it')>0 then flag=1;
end;
drop i j search_term ;
run;
Not sure I totally understand what you are trying to do but if you want to add a new binary variable that indicates if any of the substrings are found just use code like:
data want;
set have;
indicator = index(term,'string1') or index(term,'string2')
... or index(term,'string27') ;
run;
Not sure what a "vector" would be but if you had the list of terms in a dataset you could easily generate that code from the data. And then use %include to add it to your program.
filename code temp;
data _null_;
set term_list end=eof;
file code ;
if _n_ =1 then put 'indicator=' @ ;
else put ' or ' @;
put 'index(term,' string :$quote. ')' @;
if eof then put ';' ;
run;
data want;
set have;
%include code / source2;
run;
If you did want to think about creating a macro to generate code like that then the parameters to the macro might be the two input dataset names, the two input variable names and the output variable name.
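A minimal sketch of such a macro, assuming the code-generation approach above. The macro name flag_terms and all of its parameter names are invented here for illustration; the sample call reuses the terms data set and the team variable from the question.

```sas
/* Hypothetical wrapper: data= and terms= are the two input data sets,   */
/* var= and termvar= the two input variables, flag= the output variable. */
%macro flag_terms(data=, terms=, var=, termvar=, out=, flag=indicator);
   filename _code temp;

   /* Write one assignment statement, one search term per line */
   data _null_;
      set &terms end=eof;
      file _code;
      if _n_ = 1 then put "&flag = " @;
      else put '   or ' @;
      put "index(&var, " &termvar :$quote. ') > 0';
      if eof then put ';';
   run;

   /* Pull the generated statement into the step that does the work */
   data &out;
      set &data;
      %include _code / source2;
   run;

   filename _code clear;
%mend flag_terms;

%flag_terms(data=sashelp.baseball, terms=terms, var=team,
            termvar=search_term, out=flagged)
```

Comparing each INDEX result with 0 keeps the flag at 0 or 1 even when the term list has only one row.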

put values to a file using functions without creating new variables

I am processing a dataset, the contents of which I do not know in advance. My target SAS instance is 9.3, and I cannot use SQL as that has certain 'reserved' names (such as "user") that cannot be used as column names.
The puzzle looks like this:
data _null_;
set some.dataset; file somefile;
/* no problem can even apply formats */
put name age;
/* how to do this without making new vars? */
put somefunc(name) max(age);
run;
I can't put var1=somefunc(name); put var1; as that may clash with a source variable named var1.
I'm guessing the answer is to make some macro function that will read the dataset header and return me a "safe" (non-clashing) variable, or an fcmp function in a format, but I thought I'd check with the community to see - is there some "old school" way to outPUT directly from a function, in a data step?
Temporary array?
data _null_;
set sashelp.class;
array _n[*] _numeric_;
array _f[3] _temporary_;
put _n_ @;
do _n_ = 1 to dim(_f);
_f[_n_] = log(_n[_n_]);
put _f[_n_]= @;
end;
put ;
run;
1 _f[1]=2.6390573296 _f[2]=4.2341065046 _f[3]=4.7229532216
2 _f[1]=2.5649493575 _f[2]=4.0342406382 _f[3]=4.4308167988
3 _f[1]=2.5649493575 _f[2]=4.1789920363 _f[3]=4.5849674787
4 _f[1]=2.6390573296 _f[2]=4.1399550735 _f[3]=4.6298627986
5 _f[1]=2.6390573296 _f[2]=4.1510399059 _f[3]=4.6298627986
6 _f[1]=2.4849066498 _f[2]=4.0483006237 _f[3]=4.4188406078
7 _f[1]=2.4849066498 _f[2]=4.091005661 _f[3]=4.4367515344
8 _f[1]=2.7080502011 _f[2]=4.1351665567 _f[3]=4.7229532216
9 _f[1]=2.5649493575 _f[2]=4.1351665567 _f[3]=4.4308167988
The PUT statement does not accept a function invocation as a valid item for output.
A DATA step does not do columnar functions as you indicated with max(age) (so it would be even less likely to use such a function in PUT ;-)
Avoid name collisions
My recommendation is to use a variable name that is highly unlikely to collide.
_temp_001 = somefunc(<var>);
_temp_002 = somefunc2(<var2>);
put _temp_001 _temp_002;
drop _temp_:;
or
%let tempvar = _%sysfunc(rand(uniform, 1e15),z15.);
&tempvar = somefunc(<var>);
put &tempvar;
drop &tempvar;
%symdel tempvar;
Repurpose
You can re-purpose any automatic variable that is not important to the running step. Some omni-present candidates include:
numeric variables:
_n_
_iorc_
_threadid_
_nthreads_
first.<any-name> (only tweak after first. logic associated with BY statement)
last.<any-name>
character variables:
_infile_ (requires an empty datalines;)
_hostname_
avoid
_file_
_error_
I think you would be pretty safe choosing some unlikely to collide names. An easy way to generate these and still make the code somewhat readable would be to just hash a string to create a valid SAS varname and use a macro reference to make the code readable. Something like this:
%macro get_low_collision_varname(iSeed=);
%local try cnt result;
%let cnt = 0;
%let result = ;
%do %while ("&result" eq "");
%let try = %sysfunc(md5(&iSeed&cnt),hex32.);
%if %sysfunc(anyalpha(%substr(&try,1,1))) gt 0 %then %do;
%let result = &try;
%end;
%let cnt = %eval(&cnt + 1);
%end;
&result
%mend;
The above code takes a seed string and just adds a number to the end of it. It increments the number until the md5() function returns a value that is a valid SAS variable name. You could then even check the target dataset to make sure the variable doesn't already exist. If it does, build that logic into the above macro.
Test it:
%let my_var = %get_low_collision_varname(iSeed=this shouldnt collide);
%put &my_var;
data _null_;
set sashelp.class;
&my_var = 1;
put _all_;
run;
Results:
Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 C34FD80ED9E856160E59FCEBF37F00D2=1 _ERROR_=0 _N_=1
Name=Alice Sex=F Age=13 Height=56.5 Weight=84 C34FD80ED9E856160E59FCEBF37F00D2=1 _ERROR_=0 _N_=2
This doesn't specifically answer the question of how to achieve it without creating new varnames, but it does give a practical workaround.
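The existence check mentioned above could look something like this sketch, which uses the OPEN, VARNUM and CLOSE functions through %sysfunc. The macro name var_exists and its parameters are made up for illustration.

```sas
/* Hypothetical helper: resolves to 1 if VAR exists in DS, otherwise 0 */
%macro var_exists(ds=, var=);
   %local dsid pos rc;
   %let pos = 0;
   %let dsid = %sysfunc(open(&ds));
   %if &dsid %then %do;
      /* VARNUM returns the variable's position, or 0 if it is not there */
      %let pos = %sysfunc(varnum(&dsid, &var));
      %let rc = %sysfunc(close(&dsid));
   %end;
   %eval(&pos > 0)
%mend var_exists;

%put age exists: %var_exists(ds=sashelp.class, var=age);
%put hash exists: %var_exists(ds=sashelp.class, var=C34FD80ED9E856160E59FCEBF37F00D2);
```

A loop like the one in get_low_collision_varname could call this and keep incrementing its counter while the result is 1.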

Using quotes in macro variables used in defineKey

I'm trying to build a macro around the solution to this question
My reproducible example here doesn't do anything useful, it's just to highlight the syntax error I'm getting.
The line rc = mx.defineKey(&groups) works in the first case, executing rc = mx.defineKey('grp1','grp2').
In the second case however, where I define &groups differently but aiming at the same value, it fails with error:
NOTE: Line generated by the macro variable "GROUPS".
21 'grp1','grp2'
   _
   386
   _
   200
   76
MPRINT(TEST2): rc = mx.defineKey('grp1','grp2');
ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.
ERROR 386-185: Expecting an arithmetic expression.
ERROR 200-322: The symbol is not recognized and will be ignored.
ERROR 76-322: Syntax error, statement will be ignored.
Here is the working example followed by the non-working one. I would like to:
Understand why second case is not working
Get it to work by defining groups from grp_list the appropriate way.
Reproducible code:
data have;
input grp1 $ grp2 $ number;
datalines;
a c 3
b d 4
;
%macro test;
data want;
set have;
if _n_ = 1 then do;
declare hash mx();
%let groups = 'grp1','grp2';
%put rc = mx.defineKey(&groups);
%put rc = mx.defineKey('grp1','grp2');
rc = mx.defineKey(&groups);
rc = mx.definedata('number');
rc = mx.definedone();
end;
run;
%mend;
%test
%macro test2;
data want;
set have;
if _n_ = 1 then do;
declare hash mx();
%let grp_list = grp1 grp2;
%let groups = %str(%')%qsysfunc(tranwrd(&grp_list,%str( ),%str(%',%')))%str(%');
%put rc = mx.defineKey(&groups);
%put rc = mx.defineKey('grp1','grp2');
rc = mx.defineKey(&groups);
rc = mx.definedata('number');
rc = mx.definedone();
end;
run;
%mend;
%test2
You need to remove the macro quoting as it is adding invisible control characters that are messing up your hash definition. Change the relevant line to this:
rc = mx.defineKey(%unquote(&groups));
You might also consider defining a separate macro to quote each of your list items, and using double quotes rather than single quotes, e.g.
%macro quotelist(list);
%local i word;
%do %while(1);
%let i = %eval(&i + 1);
%let word = %scan(&list,&i,%str( ));
%if &word = %then %return;
%sysfunc(quote(&word)) /*De-indent to avoid adding extra spaces*/
%end;
%mend;
%put %sysfunc(translate(%quotelist(a b c),%str(,),%str( )));
This avoids the additional hassles associated with resolving macro variables inside single quotes.
Don't embed your macro logic into the middle of the DATA step. It might confuse you into thinking that it runs while the data step is running instead of running before the data step as it actually does.
Why are you jumping through hoops to add single quotes? Just use double quotes.
%macro test2;
%let grp_list = grp1 grp2;
%let groups = "%sysfunc(tranwrd(&grp_list,%str( ),"%str(,)"))";
%put rc = mx.defineKey(&groups);
%put rc = mx.defineKey("grp1","grp2");
data want;
set have;
if _n_ = 1 then do;
declare hash mx();
rc = mx.defineKey(&groups);
rc = mx.definedata("number");
rc = mx.definedone();
end;
run;
%mend test2;
This problem is due to macro quoting. The value looks fine when you print it to the log with %put. However, when you pass it to the defineKey method, the SAS parser sees the additional macro-quoting characters. That is why it fails at the compilation phase, before execution. ("The symbol is not recognized" refers to the macro quoting.) If you remove the macro quoting, it will work.
%macro test2;
data want;
set have;
if _n_ = 1 then do;
declare hash mx();
%let grp_list = grp1 grp2;
%let groups = %str(%')%qsysfunc(tranwrd(&grp_list,%str( ),%str(%',%')))%str(%');
%put rc = mx.defineKey(&groups);
%put rc = mx.defineKey('grp1','grp2');
rc = mx.defineKey(%unquote(&groups));
rc = mx.definedata('number');
rc = mx.definedone();
end;
run;
%mend;

How to force SAS to interpret macro vars as numbers in if condition

SAS is interpreting in_year_tolerance and abs_cost as text vars in the below if statement. Therefore the if statement is testing the alphabetical order of the vars rather than the numerical order. How do I get SAS to treat them as numbers? I have tried sticking the if condition, and the macro vars, in %sysevalf but that made no difference. in_year_tolerance is 10,000,000 and cost can vary, but in the test run starts around 20,000,000 before dropping to 9,000,000 at which point it should exit the loop but doesn't.
%macro set_downward_caps(year, in_year_tolerance, large, small, start, end, increment);
%do c = &start. %to &end. %by &increment.;
%let nominal_down_large_&year. = %sysevalf(&large. + (&c. / 1000));
%let nominal_down_small_&year. = %sysevalf(&small. + (&c. / 100));
%let real_down_large_&year. = %sysevalf((1 - &&nominal_down_large_&year.) * &&rpi&year.);
%let real_down_small_&year. = %sysevalf((1 - &&nominal_down_small_&year.) * &&rpi&year.);
%rates(&year.);
proc means data = output.s_&scenario. noprint nway;
var transbill&year.;
output out = temporary (drop = _type_ _freq_) sum=cost;
run;
data _null_;
set temporary;
call symputx('cost', put(cost,best32.));
run;
data temp;
length scenario $ 30;
scenario = "&scenario.";
large = &&real_down_large_&year.;
small = &&real_down_small_&year.;
cost = &cost.;
run;
data output.summary_of_caps;
set output.summary_of_caps temp;
run;
%let abs_cost = %sysevalf(%sysfunc(abs(&cost)));
%if &in_year_tolerance. > &abs_cost. %then %return;
%end;
%mend set_downward_caps;
Use %sysevalf(&in_year_tolerance > &abs_cost,boolean).
As you have seen, compares are text based. If you put it in an %eval() or %sysevalf() the values will be interpreted as numbers.
The ,boolean option tells %sysevalf to return 1 (true) or 0 (false).
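A small sketch of the difference, assuming the cost value carries a decimal point (a guess; the question does not show the actual value). The figure 9000000.5 is made up.

```sas
%let in_year_tolerance = 10000000;
%let abs_cost = 9000000.5;   /* made-up value with a decimal point */

/* %IF evaluates its condition the way %EVAL does: integers only.  */
/* The decimal point makes it compare the two operands as text,    */
/* and the text "1..." sorts before "9...".                        */
%put NOTE: eval     = %eval(&in_year_tolerance > &abs_cost);

/* %SYSEVALF compares the operands as floating-point numbers */
%put NOTE: sysevalf = %sysevalf(&in_year_tolerance > &abs_cost, boolean);
```

With these values the first line should print 0 and the second 1, which matches the loop failing to exit in the question.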

How do I work out the data type of my macro variable in SAS

How do I print out the data type of a macro variable in the log?
%macro mymacro(dt2);
%LET c_mth = %SYSFUNC(intnx(month,&dt2.d,-1,e),date9.) ;
%put &c_mth;
%mend;
%mymacro('01sep2014')
I have a bunch of macro variables assigned using %let or into:.
My problem is that I'm trying to evaluate a bunch of Boolean conditions on dates, but I suspect that some of my variables are strings and some are dates.
I have cast them in my code, but to triple check, surely there is a way to return the type to the log.
I want something similar to using str() or mode() or is.numeric() in R.
Hi,
The SAS macro language is weird. : )
As Reeza said, macro variables do not have a type, they are all text.
But, if you use Boolean logic (%IF statement), and both operands are integers, the macro language will do a numeric comparison rather than a character comparison.
So you can use the INPUTN() function to convert the date strings to SAS dates (number of days since 01Jan1960), and then compare those. Here's an example, jumping off from your code:
%macro mymacro(dt1,dt2);
%local c_mth1 c_mth2 n_mth1 n_mth2;
%let c_mth1 = %sysfunc(intnx(month,&dt1.d,-1,e),date9.) ;
%let c_mth2 = %sysfunc(intnx(month,&dt2.d,-1,e),date9.) ;
%let n_mth1 = %sysfunc(inputn(&c_mth1,date9.)) ;
%let n_mth2 = %sysfunc(inputn(&c_mth2,date9.)) ;
%put &c_mth1 -- &n_mth1;
%put &c_mth2 -- &n_mth2;
%if &n_mth1<&n_mth2 %then %put &c_mth1 is before &c_mth2;
%else %put &c_mth1 is NOT before &c_mth2;
%mend;
Log from a sample call:
236 %mymacro('01feb1960','01mar1960')
31JAN1960 -- 30
29FEB1960 -- 59
31JAN1960 is before 29FEB1960
--Q.
Macro variables do not have a type, they are all text.
You have to make sure the variable is passed in a way that makes sense to the program and generates valid SAS code.
%let date1=01Jan2014;
%let date2=31Jan2014;
data _null_;
x = "&date1"d > "&date2"d;
y = "&date2"d > "&date1"d;
z = "&date2"d-"&date1"d;
put 'x=' x;
put 'y=' y;
put 'z=' z;
run;
Log should show:
x=0
y=1
z=30
If your macro variables resolve to date literals, you can use intck combined with %eval to compare them, e.g.
%let mvar1 = '01jan2015'd;
%let mvar2 = '01feb2015'd;
/*Prints 1 if mvar2 > mvar1*/
%put %eval(%sysfunc(intck(day,&mvar1,&mvar2)) > 0);