compress function in SAS - sas

I want to remove multiple blanks from a string.
%let CDate = 25Mar2015;
data _null_;
call symput('TitleDate',cat(put("&CDate."d,monname9.),', ',year("&CDate."d)));
run;
%put &TitleDate; /* March, 2015 */
%put Title is &TitleDate; /* multiple blanks between 'is' and 'March' */
I tried compress: %put Title is %sysfunc(compress(&TitleDate)); But it returns Title is March without year part.

Very close, two modifications:
CALL SYMPUTX instead of CALL SYMPUT
WORDDATE20. format instead of the cat/put combinations
call symputx('TitleDate',put("&CDate."d,worddate20.));
EDIT (to answer question in comments):
The compress function takes 3 arguments; the first is mandatory and the last two are optional.
The compress function doesn't work because what SAS is seeing is:
compress(March 25, 2015)
This is a valid call, just not the one you intended, which is why no error is generated. The comma splits the text into two arguments: the first is March 25 and the second, 2015, is treated as the list of characters to remove. So the 2 and the 5 are removed from the first argument and the year never appears in the result.
If you did want to use the compress function to pass in a character value you would need to quote it using the %quote function, but that gets rid of all the spaces and I think you just wanted to get rid of the leading spaces.
%put Title is %sysfunc(compress(%quote(&TitleDate.)));
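If the goal is only the month and year without the padding, one option is to build the string with CATX, which strips leading and trailing blanks from each piece before joining them, and store it with CALL SYMPUTX. This is a minimal sketch of that idea, not the only way to do it:

```sas
%let CDate = 25Mar2015;

data _null_;
  /* CATX trims each argument and joins them with the ', ' delimiter */
  call symputx('TitleDate',
               catx(', ', put("&CDate."d, monname9.), year("&CDate."d)));
run;

%put Title is &TitleDate;   /* Title is March, 2015 */
```

Because both CATX and CALL SYMPUTX strip blanks, no macro-side cleanup with compress is needed.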

Related

Using SAS finance XIRR function on multiple rows with different numbers of variables

I am trying to use the SAS XIRR function on a dataset. The syntax is:
finance('XIRR',value1, value2, value3...valuen,date1,date2,date3...daten);
My problem is that the data has different numbers of values/dates on each row. There could be up to 122 values/dates per row.
Where there are missing values the XIRR function fails, so I set all missing values to 0. Now the function fails as the 'missing' dates are now Jan1960. Anyone got any ideas?
in the code below cf1-cf122 are the cash flow values and ed1-ed122 are the dates.
/* remove blanks */
data irrtable3;
set irrtable2;
array change _numeric_;
do over change;
if change=. then change=0;
end;
run;
/* create irr */
data irrtable4;
set irrtable3;
IRR=finance('XIRR',OF CF1-CF122,OF ED1-ED122);
run;
You can use codegen to construct a dynamic FINANCE(..) call, with a variable number of arguments, that is resolved by the macro system at DATA step run-time.
Using RESOLVE to compute the result in the macro environment for many, many rows will likely be noticeably slower than plain DATA step code.
Example:
data have;
v1=-10000; d1=mdy(1, 1, 2008);
v2=2750; d2=mdy(3, 1, 2008);
v3=4250; d3=mdy(10, 30, 2008);
v4=3250; d4=mdy(2, 15, 2009);
v5=2750; d5=mdy(4, 1, 2009);
output;
call missing(v5,d5); output;
call missing(v4,d4); output;
call missing(v3,d3); output;
call missing(v2,d2); output;
run;
options missing=' ';
data want;
set have;
args = catx(',', of v1-v5, of d1-d5);
result = resolve( cats (
'%sysfunc(FINANCE(XIRR,', args, '))'
));
run;
options missing='.';
From what I can tell (and I don't work with Finance functions, so I'm not an expert), if you have all of the 'filled' arguments prior to the 'unfilled', you are okay to just set everything that's missing to zero (both on the 'value' and 'date' side). Using the example Richard provides (which is the one from the SAS documentation):
data want2;
set have;
array v v1-v5;
array d d1-d5;
do _i_ = 1 to dim(v);
if missing(v[_i_]) then do;
v[_i_]=0; d[_i_]=0;
end;
end;
args = catx(',', of v1-v5, of d1-d5);
result =FINANCE('XIRR',of v1-v5, of d1-d5);
run;
That works and gets the same result as Richard's, and is probably faster.
This does require the 0s to all be at the end. If they're interspersed, you can't use CALL SORTN to move them all to one end, and your data is too big to use with RESOLVE, then I would construct this entirely in the macro language. You could do a few things, all of which are too long for this answer, but the simplest is probably to generate code for every row and put each piece behind if _n_ = 5 then do; &row5code.; end; for that row. This would be very long, certainly, but should be faster than RESOLVE (just a lot less maintainable). You could also do a CALL EXECUTE for each row, which is also slow but a possibility, or even DOSUBL.

Remove single quotes in list of values in macro variable

I have a project with multiple programs. Each program has a proc SQL statement which will use the same list of values for a condition in the WHERE clause; however, the column type of one database table needed is a character type while the column type of the other is numeric.
So I have a list of "Client ID" values I'd like to put into a macro variable as these IDs can change, and I would like to change them once in the variable instead of in multiple programs.
For example, I have this macro variable set up like so and it works in the proc SQL which queries the character column:
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
Proc SQL part:
...IN &CLNT_ID_STR.
I would like to create another macro variable, say CLNT_ID_NUM, which takes the first variable (CLNT_ID_STR) but removes the quotes.
Desired output: (179966, 200829, 201104, 211828, 264138)
Proc SQL part: ...IN &CLNT_ID_NUM.
I've tried using the sysfunc, dequote and translate functions but have not figured it out.
TRANSLATE doesn't seem to want to allow a null string as the replacement.
Below uses TRANSTRN, which has no problem translating single quote into null:
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
%let want=%sysfunc(transtrn(&clnt_id_str,%str(%'),%str())) ;
%put &want ;
(179966, 200829, 201104, 211828, 264138)
It uses the macro quoting function %str() to mask the meaning of a single quote.
Three other ways to remove single quotes are COMPRESS, TRANSLATE and PRXCHANGE
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');
%let id_list_1 = %sysfunc(compress (&CLNT_ID_STR, %str(%')));
%let id_list_2 = %sysfunc(translate(&CLNT_ID_STR, %str( ), %str(%')));
%let id_list_3 = %sysfunc(prxchange(%str(s/%'//), -1, &CLNT_ID_STR));
%put &=id_list_1;
%put &=id_list_2;
%put &=id_list_3;
----- LOG -----
ID_LIST_1=(179966, 200829, 201104, 211828, 264138)
ID_LIST_2=( 179966 , 200829 , 201104 , 211828 , 264138 )
ID_LIST_3=(179966, 200829, 201104, 211828, 264138)
It really doesn't matter that TRANSLATE replaces each ' with a single blank, because the list is interpreted in a numeric context, where the extra blanks are ignored.
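To tie this back to the original goal, here is a small sketch of deriving CLNT_ID_NUM once from CLNT_ID_STR and using it in PROC SQL. The table work.clients_num and its rows are hypothetical, made up only so the example runs on its own:

```sas
%let CLNT_ID_STR = ('179966', '200829', '201104', '211828', '264138');

/* Derive the numeric list from the character list by dropping the quotes */
%let CLNT_ID_NUM = %sysfunc(compress(&CLNT_ID_STR, %str(%')));
%put &=CLNT_ID_NUM;

/* Hypothetical table with a numeric ID column */
data work.clients_num;
  input clnt_id;
  datalines;
179966
123456
264138
;
run;

proc sql;
  /* Resolves to: where clnt_id in (179966, 200829, 201104, 211828, 264138) */
  select clnt_id
  from work.clients_num
  where clnt_id in &CLNT_ID_NUM.;
quit;
```

With this sample data the query should return the rows for 179966 and 264138. Only CLNT_ID_STR has to be edited when the IDs change.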

SAS call symput in data step

I am facing this issue with sas data step. My requirement is to get a list of variables such as
total_jun2018 = sum(jun2018, dep_jun2018);
total_jul2018 = sum(jul2018, dep_jul2018);
Data final4;
set final3;
by hh_no;
do i=0 to &tot_bal_mnth.;
bal_mnth = put(intnx('month',"&min_Completed_dt."d, i-1), monyy7.);
call symputx('bal_mnth', bal_mnth);
&bal_mnth._total=sum(&bal_mnth., Dep_&bal_mnth.);
output;
end;
But I get an error that macro variable bal_mnth is not resolved. It did run successfully once, but I want the output for every iteration in sequence, and it only produced the output for the last loop iteration (i=6): Total_DEC2018=sum(DEC2018, DEP_DEC2018);
Any help will be appreciated!
Thanks,
Ajay
This is a common issue when learning SAS Macro. The problem is that the macro processor needs to resolve &bal_mnth to a value when the data step is first submitted for execution, but the CALL SYMPUT doesn't execute until the data step is actually executed, so at the time you submit the code, there is no value available for &bal_mnth.
In this case you don't need bal_mnth to be created as a variable in the data set, so you could replace the line that starts bal_mnth = put(intnx(...)) with a %let bal_mnth = ... statement. The %let executes while the data step is being submitted, so its value will be available when you need it.
My proposed %let statement will need to wrap the functions in at least one SYSFUNC call, which is left as an exercise for the reader :-)
It looks like you want to generate a series of assignment statements like:
total_jun2018 = sum(jun2018, dep_jun2018);
total_jul2018 = sum(jul2018, dep_jul2018);
...
total_jan2019 = sum(jan2019, dep_jan2019);
This is what is known as wallpaper code.
If your variable names were easier, such as dep1 to dep18, then it would be easy to use arrays to process the data. With your current naming convention, the problem of generating the array statements is not much different from the problem of generating a series of assignment statements.
You can create a macro so that you could use a %DO loop to generate your wallpaper code.
%local i bal_mnth;
%do i=0 %to &tot_bal_mnth.;
%let bal_mnth = %sysfunc(intnx(month,"&min_Completed_dt."d, &i-1), monyy7.);
total_&bal_mnth = sum(&bal_mnth , Dep_&bal_mnth );
%end;
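Since %LOCAL and %DO are only valid inside a macro definition, the loop above needs a wrapper. A minimal sketch follows, where the macro name make_totals is hypothetical, the two %let values are the same sample settings used for illustration, and final3 is the data set from the question. It uses %eval for the month offset to keep the arithmetic explicit:

```sas
%let tot_bal_mnth = 7;
%let min_Completed_dt = 01JUN2018;

%macro make_totals;   /* hypothetical macro name */
  %local i bal_mnth;
  %do i = 0 %to &tot_bal_mnth.;
    /* Month name such as MAY2018, computed while the code is being generated */
    %let bal_mnth = %sysfunc(intnx(month, "&min_Completed_dt."d, %eval(&i - 1)), monyy7.);
    total_&bal_mnth = sum(&bal_mnth, Dep_&bal_mnth);
  %end;
%mend make_totals;

options mprint;   /* show the generated assignment statements in the log */

data final4;
  set final3;
  by hh_no;
  %make_totals
run;
```

The macro call sits inside the data step, so the generated assignment statements become part of that step before it compiles.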
Or you could just generate the code to a file with a data step.
%let tot_bal_mnth = 7;
%let min_Completed_dt=01JUN2018;
filename code temp;
data _null_;
file code;
length bal_mnth $7 ;
do i=0 to &tot_bal_mnth.;
bal_mnth = put(intnx('month',"&min_Completed_dt."d, i-1), monyy7.);
put 'total_' bal_mnth $7. ' = sum(' bal_mnth $7. ', Dep_' bal_mnth $7. ');';
end;
run;
So the generated file of code looks like this:
total_MAY2018 = sum(MAY2018, Dep_MAY2018);
total_JUN2018 = sum(JUN2018, Dep_JUN2018);
total_JUL2018 = sum(JUL2018, Dep_JUL2018);
total_AUG2018 = sum(AUG2018, Dep_AUG2018);
total_SEP2018 = sum(SEP2018, Dep_SEP2018);
total_OCT2018 = sum(OCT2018, Dep_OCT2018);
total_NOV2018 = sum(NOV2018, Dep_NOV2018);
total_DEC2018 = sum(DEC2018, Dep_DEC2018);
You can then use %include to run it in your data step.
data final4;
set final3;
by hh_no;
%include code / source2 ;
run;
I would like to offer another point of view: the difficulty you are having here results from the use of a wide data shape, with lots of columns.
Rather than working with your data in this shape, you could first transpose from wide to long, so that instead of having lots of total_xxx columns you just have 3: total, total_dep and date, with one row per month. Once it's in this format, it will be much easier to work with, potentially allowing you to avoid resorting to macros and wallpaper code.
Suggested reading:
Transpose wide to long with dynamic variables
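As a rough sketch of the wide-to-long idea, assuming a tiny made-up data set (the name wide and its values are hypothetical, with columns named like those in the question):

```sas
/* Hypothetical sample in the wide shape */
data wide;
  input hh_no jun2018 dep_jun2018 jul2018 dep_jul2018;
  datalines;
1 100 10 200 20
2 300 30 400 40
;
run;

/* One row per household and original column */
proc transpose data=wide out=long(rename=(col1=amount)) name=source_var;
  by hh_no;
  var jun2018 dep_jun2018 jul2018 dep_jul2018;
run;

/* Split the old column name into a type and a real date value */
data long2;
  set long;
  length type $7;
  if upcase(source_var) =: 'DEP_' then do;
    type  = 'deposit';
    month = input(substr(source_var, 5), monyy7.);
  end;
  else do;
    type  = 'balance';
    month = input(source_var, monyy7.);
  end;
  format month monyy7.;
run;

/* The total per household and month is now a plain aggregate */
proc sql;
  create table totals as
  select hh_no, month, sum(amount) as total
  from long2
  group by hh_no, month;
quit;
```

No macro code is involved, and adding another month to the input only means listing two more columns in the VAR statement.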

Declaring macro variable with evaluation of function as value

I'm new to SAS. I ran into a problem when trying to declare a macro variable with the result of some operation as its value.
data _null_;
%let var1 = 12345;
%let var2 = substr(&var1., 4,5);
run;
I get that var2 has value substr(&var1., 4,5) (a string) instead of 45 as I would like. How to make the variable declaration evaluate the function?
Sorry if the question is trivial. I looked in the documentation for a bit but couldn't find an answer.
There is a macro equivalent called %substr() which can be used as follows:
%let var1 = 12345;
%let var2 = %substr(&var1., 4,2);
%put var2 = &var2;
Note that the data and run statements are not required for macro language processing and the 3rd argument to %substr() (and substr()) specifies the length you want, not the position of the last character, which is why I used 2 instead of 5.
Edit: Also, if there is no macro equivalent then you could use %sysfunc() to make use of the data step function in macro code. See the documentation for full details as there are some quirks, such as not using quotes and a few exceptions to the list of data step functions that can be used.
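For illustration, here is what the %sysfunc() route might look like for the same example. Note the unquoted arguments; the macro variable name today is just an example:

```sas
%let var1 = 12345;

/* Data step SUBSTR called through %sysfunc: no quotes around the arguments */
%let var2 = %sysfunc(substr(&var1., 4, 2));
%put &=var2;   /* VAR2=45 */

/* The optional second argument of %sysfunc applies a format to the result */
%let today = %sysfunc(today(), date9.);
%put &=today;
```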

Pass Character dynamically to a macro

I'm trying to pass a file name to a macro. The macro runs once a month, so I'm trying to store the output file with a month prefix. In the current code someone has to manually provide a file name every month (Sep17_Sales, Oct17_Sales etc.). I want to automate this so that SAS generates files with the name of the month prefixed to the data file.
Macro:
%macro sales (outdata = , dt =);
Current Code
%Sales(Outdata = Sep17_Sales, dt = '2017-09-01');
%Sales(Outdata =Oct17_Sales, dt ='2017-10-01');
My approach:
data _null_;
current_date = today();
current_month = intnx('month', current_date, 0, "Begginning");
Name = "_Sales";
Result = put(current_month, monyy7.) || name;
run;
%Sales(Outdata=Result, dt='2017-10-01');
When I try to pass the parameter, it throws error. I tried changing Result to %Let Result and pass a reference &Result to the macro but it also fails.
Any suggestion how to solve this? Thank you for all the help!!
What you are doing there is assigning a value to a data step variable called Result. The name Result doesn't mean anything outside the context of that datastep and therefore does not resolve to anything when you call your macro. What you are doing instead is telling your macro that your output file should be called "Result".
You could fix that by replacing your Result= line with call symput('Result',put(current_month, monyy7.) || name); which effectively creates a macro variable called "Result", then call your sales macro like so: %Sales(Outdata=&Result, dt='2017-10-01');
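Put together, that first fix might look like the sketch below. It swaps in CALL SYMPUTX and CATS (my substitution, so no stray blanks end up in the name) and uses 'B' for the INTNX alignment; %Sales is the macro from the question:

```sas
data _null_;
  current_month = intnx('month', today(), 0, 'B');
  /* CATS joins the pieces without padding; CALL SYMPUTX also strips blanks */
  call symputx('Result', cats(put(current_month, monyy7.), '_Sales'));
run;

%put &=Result;   /* e.g. RESULT=OCT2017_Sales when run in October 2017 */

/* The macro variable exists once the data step has finished running */
%Sales(Outdata=&Result, dt='2017-10-01');
```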
OR, you could scratch all that and simply call your macro like this:
%sales(outdata=%sysfunc(today(),monyy7.)_Sales, dt='2017-10-01');
Going further, assuming the second argument (dt) is always meant to be the first day of the month formatted as yyyy-mm-dd and enclosed in single quotes (although if that is the case I see little use in specifying it as a parameter of the macro), you could make the call even more dynamic:
%sales(outdata=%sysfunc(today(),monyy7.)_Sales, dt=%str(%')%sysfunc(intnx(month,%sysfunc(today()),0,B),E8601DA.)%str(%'));
If that date can be enclosed in double quotes, this can be simplified a little as:
%sales(outdata=%sysfunc(today(),monyy7.)_Sales, dt="%sysfunc(intnx(month,%sysfunc(today()),0,B),E8601DA.)");