I have the following code where I rename column names; I would like to keep only the variables created by the macro. I do realize I can drop the old variables but am curious if there is a keep option I can place inside the macro.
For example, in the data step, I would want to keep only the variables created by the call %transform_this(JUNE19);
Thanks!
%macro transform_this(x);
&x._Actual=input(Current_Month, 9.0);
&x._Actual_Per_Unit = input(B, 9.);
&x._Budget=input(C, 9.);
&x._Budget_Per_Unit=input(D, 9.);
&x._Variance=input(E, 9.);
&x._Prior_Year_Act=input(G, 9.);
Account_Number=input(H, 9.);
Account_Description=put(I, 35.);
&x._YTD_Actual=input(Year_to_Date, 9.);
&x._YTD_Actual_Per_Unit=input(L, 9.);
%mend transform_this;
data June_53410_v1;
set June_53410;
%transform_this(JUNE19);
if Account_Description='Account Description' then DELETE;
Drop Current_Month B C D E G H I Year_to_Date L M N;
run;
keep June19_: Account_:;
This keeps all variables whose names start with June19_ or Account_, which are evidently the ones you need.
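Put together with the data step from the question, a minimal sketch might look like this. It assumes the %transform_this macro and the June_53410 data set from the question, and that no variable you want to discard shares either prefix:

```sas
data June_53410_v1;
  set June_53410;
  %transform_this(JUNE19);
  if Account_Description='Account Description' then delete;
  /* prefix lists: every variable whose name starts with June19_ or Account_ */
  keep June19_: Account_:;
run;
```

The KEEP statement takes the place of the DROP statement here, so the old columns no longer need to be listed one by one.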
am curious if there is a keep option I can place inside the macro.
You can definitely use keep in your macro:
%macro transform_this(x);
keep &x._Actual &x._Actual_Per_Unit
&x._Budget &x._Budget_Per_Unit
&x._Variance &x._Prior_Year_Act
Account_Number Account_Description
&x._YTD_Actual &x._YTD_Actual_Per_Unit
;
&x._Actual=input(Current_Month, 9.0);
/* ...and the rest of your code */
%mend transform_this;
Any reason you thought you couldn't?
Add two sentinel variables to the data step, one declared before the macro call and one after. Use the double-dash (--) name range list in a keep statement, then remove the sentinels with the drop= option on the output data set.
data want (drop=sentinel1 sentinel2); /* remove sentinels */
set have;
retain sentinel1 0;
%myMacro (…)
retain sentinel2 0;
…
keep sentinel1--sentinel2; * keep all variables created by code between sentinel declarations;
run;
Name Range Lists
Name range lists rely on the order of variable definition, as shown in
the following table:
Name Range Lists
Variable List      Included Variables
x -- a             all variables in order of variable definition, from variable x to variable a inclusive
x -NUMERIC- a      all numeric variables from variable x to variable a inclusive
x -CHARACTER- a    all character variables from variable x to variable a inclusive
Note: Notice that name range lists use a double hyphen ( -- ) to designate
the range between variables, and numbered range lists use a single
hyphen to designate the range.
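As a small illustration of a typed name range list, here is a sketch that uses sashelp.class, a sample data set shipped with SAS. It assumes the usual variable order in that data set (Name, Sex, Age, Height, Weight); the output name numeric_only is made up for this example.

```sas
data numeric_only;
  set sashelp.class;
  /* keep only the numeric variables defined from Age through Weight */
  keep Age -numeric- Weight;
run;
```

Because the range depends on definition order, the same statement can select different variables if the input data set is reordered.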
Related
I'm running SAS code on a data set with thousands of rows (typical). In a data step, I need to create 2 new variables that hold row sums: for each observation of y, one sum over the variables whose names start with X and one over those whose names start with Z. Obviously I cannot write out each variable I need to sum, because that would be impossible in my actual data set. I think the answer is a loop of some sort, but I'm not having any luck finding a solution online that doesn't require listing all of the variables.
A much smaller example data set is listed below of what I need the data to look like at the end.
So far I have tried something like this, but I KNOW it is far off. I am really stuck on how to get it to recognize the variable names and stop when it hits the last X or last Z.
DATA sample1 (drop = i);
set data;
do i = i to 10;
answer = sum(i);
end;
run
You can use a variable list shortcut with the colon (:) suffix.
of X: refers to every variable whose name starts with X, so sum(of X:) sums all of them.
data want;
set have;
sumx = sum(of X:);
sumZ = sum(of Z:);
*if you know the end of the series;
sumx = sum(of X1-X4);
sumZ = sum(of Z1-Z5);
run;
Different ways of specifying the variable list are illustrated here.
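To see the prefix list in action without the real data, here is a self-contained sketch. The data set have and its values are invented for illustration; only the sum(of X:) and sum(of Z:) calls come from the answer above.

```sas
/* hypothetical sample data: one id column, two X columns, three Z columns */
data have;
  input y $ X1 X2 Z1 Z2 Z3;
  datalines;
a 1 2 3 4 5
b 6 7 8 9 10
;
run;

data want;
  set have;
  sumX = sum(of X:);  /* X1 + X2 */
  sumZ = sum(of Z:);  /* Z1 + Z2 + Z3 */
run;
```

One thing to watch: the prefix list picks up every variable defined so far whose name starts with that prefix, so avoid giving the result variable a name that begins with X or Z.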
I am trying to find the optimized way to do this :
I want to delete all observations in which a character variable STARTS with one of several possible strings, such as:
"Subtotal" "Including:"
So if the value starts with any of these strings (or many others that I didn't write here), the observation should be deleted from the dataset.
The best solution would be a macro variable containing all the values, but I don't know how to set that up. (I tried %let list = Subtotal Including:, but SAS treats them as variable names rather than values.)
I did this :
data a ; set b ;
if findw(product,"Subtotal") then delete ;
if findw(product,"Including:") then delete;
...
...
I would appreciate any suggestions! Thanks
First figure out what SAS code you want. Then you can begin to worry about how to use macro logic or macro variables.
Do you just want to exclude the strings that start with the values?
data want ;
set have ;
where product not in: ("Subtotal" "Including");
run;
Or do you want to subset based on the first "word" in the string variable?
where scan(product,1) not in ("Subtotal" "Including");
Or perhaps case insensitive?
where lowcase(scan(product,1)) not in ("subtotal" "including");
Now if the list of values is small enough (less than 64K bytes) then you could put the list into a macro variable.
%let list="Subtotal" "Including";
And then later use the macro variable to generate the WHERE statement.
where product not in: (&list);
You could even generate the macro variable from a dataset of prefix values.
proc sql noprint;
select quote(trim(prefix)) into :list separated by ' '
from prefixes
;
quit;
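The pieces above can be strung together roughly as follows. This is only a sketch: the contents of have and prefixes are made-up sample rows, while the names product, prefix, prefixes, and list follow the snippets in this answer.

```sas
/* hypothetical sample data */
data have;
  infile datalines truncover;
  input product $40.;
  datalines;
Subtotal fruit
Apples
Including: pears
Bananas
;
run;

/* hypothetical lookup table of prefixes to exclude */
data prefixes;
  length prefix $20;
  input prefix $;
  datalines;
Subtotal
Including
;
run;

/* build a quoted, space-separated list: "Subtotal" "Including" */
proc sql noprint;
  select quote(trim(prefix)) into :list separated by ' '
  from prefixes
  ;
quit;

data want;
  set have;
  where product not in: (&list);  /* the colon makes this a prefix comparison */
run;
```

Keeping the prefixes in a data set means the exclusion list can grow without touching the code, as long as the generated macro variable stays under the length limit mentioned above.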
I'm new to SAS. I ran into a problem when trying to declare a macro variable whose value is the result of some operation.
data _null_;
%let var1 = 12345;
%let var2 = substr(&var1., 4,5);
run;
I get that var2 has value substr(&var1., 4,5) (a string) instead of 45 as I would like. How do I make the variable declaration evaluate the function?
Sorry if the question is trivial. I looked in the documentation for a bit but couldn't find an answer.
There is a macro equivalent called %substr() which can be used as follows:
%let var1 = 12345;
%let var2 = %substr(&var1., 4,2);
%put var2 = &var2;
Note that the data and run statements are not required for macro language processing. Also, the 3rd argument to %substr() (and substr()) specifies the length you want, not the position of the last character, which is why I used 2 instead of 5.
Edit: Also, if there is no macro equivalent, you can use %sysfunc() to call the data step function from macro code. See the documentation for full details, as there are some quirks, such as not quoting arguments and a few data step functions that cannot be used.
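As a rough sketch of the %sysfunc() route, the following calls two data step functions from open macro code. substr() is used only to mirror the example above (in practice %substr() is simpler), and the macro variable name today is made up for this illustration.

```sas
%let var1 = 12345;

/* data step substr() called through %sysfunc; note the arguments are not quoted */
%let var2 = %sysfunc(substr(&var1., 4, 2));
%put var2 = &var2;

/* an optional second argument to %sysfunc formats the result */
%let today = %sysfunc(today(), date9.);
%put today = &today;
```

The first %put should write var2 = 45 to the log, the same result as the %substr() version.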
In a basic data step I'm creating a new variable and I need to filter the dataset based on this new variable.
data want;
set have;
newVariable = 'aaa';
*lots of computations that change newVariable ;
*if xxx then newVariable = 'bbb';
*if yyy AND not zzz then newVariable = 'ccc';
*etc.;
where newVariable ne 'aaa';
run;
ERROR: Variable newVariable is not on file WORK.have.
I usually do this in 2 steps, but I'm wondering if there is a better way.
(Of course, you could always write a complex where statement based on variables present in WORK.have. But in this case the computation of newVariable is too complex, and it is more efficient to do the filter in a 2nd data step.)
I couldn't find any info on this, I apologize for the dumb question if the answer is in the documentation and I didn't find it. I'll remove the question if needed.
Thanks!
Use a subsetting if statement:
if newVariable ne 'aaa';
In general, if <condition>; is equivalent to if not(<condition>) then delete;. The delete statement tells SAS to abandon this iteration of the data step and go back to the start for the next iteration. Unless you have used an explicit output statement before your subsetting if statement, this will prevent a row from being output.
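Here is a minimal one-step sketch of the idea. It uses sashelp.class (a sample data set shipped with SAS) in place of have, and the age rule is an arbitrary stand-in for the real computations.

```sas
data want;
  set sashelp.class;
  length newVariable $3;
  newVariable = 'aaa';
  /* stand-in for the real logic that changes newVariable */
  if age > 13 then newVariable = 'bbb';
  /* subsetting IF: only rows that pass continue to the implicit output */
  if newVariable ne 'aaa';
run;
```

Unlike WHERE, which is evaluated against the incoming data set before the row enters the step, the subsetting IF runs at the point where it appears, so it can see variables computed earlier in the same step.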
I have been working on converting a number of variables in my table to numerical types from characters. I discovered the method to alter one variable and can continue doing so for each variable. However, I wanted to solicit SE because I am having trouble developing a sustainable solution.
How can I edit multiple variables at once in SAS Studio 3.5?
My attempt thus far:
What works:
data work.want(rename=(age_group='Age Group'n));
set work.import;
age_group=input('Age Group'n,8.);
drop 'Age Group'n;
run;
What doesn't work:
data work.want(rename=(age_group='Age Group'n), rename=(dwelling_type='Dwelling Type'n));
set work.import;
age_group=input('Age Group'n,8.);
dwelling_type=input('Dwelling Type'n,8.);
drop 'Age Group'n, 'Dwelling Type'n;
run;
For starters, your RENAME= data set option is incorrect (and the DROP statement should not contain a comma). I don't recommend using that type of variable notation though, so I'm going to suggest labels instead. To convert multiple variables, use an array. You do have to list them out at least once, though, in the array statement.
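data work.want;
set work.import;
/* new numeric variables, created by the array statement */
array num_vars(*) age_group dwelling_type;
/* existing character variables, listed in the same order */
array char_vars(*) 'Age Group'n 'Dwelling Type'n;
do i=1 to dim(num_vars);
num_vars(i) = input(char_vars(i), 8.);
end;
label age_group = 'Age Group'
dwelling_type = 'Dwelling Type';
/* drop the loop counter and the original character columns */
drop i 'Age Group'n 'Dwelling Type'n;
run;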
If you wanted to do a RENAME as a dataset option, you would do it as follows: no commas, and the keyword rename only once.
(rename=(age_group='Age Group'n dwelling_type='Dwelling Type'n));
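In context, the corrected option could be used like this. This is a sketch based on the "what doesn't work" attempt from the question; it assumes work.import exists as described and that name literals are allowed in the session (VALIDVARNAME=ANY).

```sas
data work.want(rename=(age_group='Age Group'n dwelling_type='Dwelling Type'n));
  set work.import;
  age_group = input('Age Group'n, 8.);
  dwelling_type = input('Dwelling Type'n, 8.);
  /* no commas in a DROP statement; the originals are dropped before the rename is applied */
  drop 'Age Group'n 'Dwelling Type'n;
run;
```

This keeps the original column names on the converted numeric variables, at the cost of continuing to use name literals downstream.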