I'm having trouble assigning values to a character variable.
I have a long, transposed dataset with three variables: date (the variable I transposed by beforehand), var (which takes three different values, the names of my previous variables) and col1 (which holds the values of my previous variables).
Now I want to create a fourth variable that also takes three different values. My problem is that I can create the variable, but with my code it always ends up with missing values.
data pair2;
set data1;
if var="BNNESR" or var="BNNESR_r" or var="BNNESR_t" then output;
length all $ 20;
all=" ";
if var="BNNESR" then all="pdev";
if var="BNNESR_t" then all="trigger";
if var="BNNESR_r" then all="rdev";
drop var;
run;
Afterwards I want to transpose it back using the "all" variable. I know I could just rename the old variables before I transpose and then keep them.
But the full calculation continues beyond this and will eventually be turned into a macro, where doing it that way is not so easy.
Your program will just subset the input data and add a new variable that is empty because you are writing the data out before you assign any value to the new variable.
Use a subsetting IF (or WHERE) statement instead of using an explicit OUTPUT statement. Once your data step has an explicit OUTPUT statement then SAS no longer automatically writes the observation at the end of the data step iteration.
data pair2;
set data1;
if var="BNNESR" or var="BNNESR_r" or var="BNNESR_t" ;
length all $20;
if var="BNNESR" then all="pdev";
else if var="BNNESR_t" then all="trigger";
else if var="BNNESR_r" then all="rdev";
drop var;
run;
Since the list in the IF statement matches the values in the recode step then perhaps you want to just use a DELETE statement instead?
data pair2;
set data1;
length all $20;
if var="BNNESR" then all="pdev";
else if var="BNNESR_t" then all="trigger";
else if var="BNNESR_r" then all="rdev";
else delete;
drop var;
run;
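For completeness, here is a sketch of the WHERE variant mentioned above, followed by the transpose back to wide form that the question describes. It assumes the BY variable is named date and the values are in col1, as stated in the question; the output dataset name wide is made up for illustration.

```sas
/* WHERE filters rows as they are read, so no explicit OUTPUT is needed */
data pair2;
  set data1;
  where var in ("BNNESR", "BNNESR_r", "BNNESR_t");
  length all $20;
  select (var);
    when ("BNNESR")   all = "pdev";
    when ("BNNESR_t") all = "trigger";
    when ("BNNESR_r") all = "rdev";
    otherwise;                       /* cannot happen after the WHERE */
  end;
  drop var;
run;

/* Transpose back: one row per date, one column per value of ALL */
proc sort data=pair2;
  by date;
run;

proc transpose data=pair2 out=wide(drop=_name_);
  by date;
  id all;
  var col1;
run;
```

PROC TRANSPOSE will complain if the same value of all occurs more than once within a date, so this only works if each date has at most one row per recoded value.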
In a SAS data step I have a character variable called "varName". This variable stores the name of another variable. In the example below, it stores the name of the numeric variable "changeMe":
data TMP;
length
varName $32
changeMe 8
;
varName = 'changeMe';
/*??? How to change the content of variable that varName holds ???*/
run;
Now the question is: how do I change the content of the variable that varName holds?
The use case is that varName acts as a dynamic pointer to different variables that I want to manipulate in a big SAS data set.
DATA Step does not directly provide for named indirect assignment.
In some cases, the indirect assignment requirement might indicate you want to perform a Proc TRANSPOSE data transformation. If the variable names and values are provided in a transaction data set, and the data has BY group variables, your better solution might be to TRANSPOSE the transaction data and merge that transform to the master data using an UPDATE or MODIFY statement.
Regardless, you can array variables of a given type and iterate the array looking for the target requiring assignment.
Example:
data want;
set sashelp.class;
varname = 'name';
varvalue = 'Scooter';
array chars _character_;
do _n_ = 1 to dim(chars);
if upcase (vname(chars(_n_))) = upcase(varname) then do;
chars(_n_) = varvalue;
end;
end;
run;
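The target variable in the question is numeric, so the same idea applies with a _numeric_ array. A minimal sketch, where the new value 42 and the index name _i are chosen only for illustration:

```sas
data tmp;
  length varName $32 changeMe 8;
  varName = 'changeMe';

  /* The array picks up the numeric variables defined so far.
     _i is created later, so it is not a member of the array. */
  array nums {*} _numeric_;

  do _i = 1 to dim(nums);
    /* Compare the name of each array member with the name stored in varName */
    if upcase(vname(nums{_i})) = upcase(varName) then nums{_i} = 42;
  end;
  drop _i;
run;
```

The lookup is a linear scan over the array on every row, so for a wide dataset it may be worth restricting the array to the candidate variables instead of using _numeric_.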
CALL EXECUTE is another workable solution.
data TMP;
length
varName $32
changeMe 8
;
varName = 'changeMe';
run;
data _null_;
set TMP end=eof;
if _n_ = 1 then call execute('data %trim(&syslast.); modify %trim(&syslast.);');
call execute(cats(varName)||' = rand("uniform",1,0);');
if eof then call execute('run;');
run;
Log:
NOTE: CALL EXECUTE generated line.
1 + data WORK.TMP; modify WORK.TMP;
2 + changeMe = rand("uniform",1,0);
3 + run;
NOTE: There were 1 observations read from the data set WORK.TMP.
NOTE: The data set WORK.TMP has been updated. There were 1 observations rewritten, 0 observations
added and 0 observations deleted.
I have a lot of data sets that I need to have the same structure - same variables, same order. I have a data set serving as a template ("all" in the code below). Other data are given this form by listing both the template data set (with obs=0) and a particular data set ("some" in the code below) in the same set statement. This works just fine.
I then want to loop through the variables. If one of them is missing (as it will be, if it's not present in the particular data set), it should be set to the value of the previous variable. var2 should get the value from var1 etc.
This should be done within each row. This works fine if done in a separate data step, but doesn't work if done in the same data step described above.
If done in the same data step, the values inserted for missing values will always be from row 1. Why is this? Can I achieve the wanted result without using another data step?
/* All the variables a complete data set should contain.*/
data all;
format var1-var5 $20.;
run;
/* Actual data have some of these variables, but not all. var1 is never missing, all other variables might be*/
data some;
var1="Obs 1, Value 1";
var4="Obs 1, Value 4";
output;
var1="Obs 2, Value 1";
var4="Obs 2, Value 4";
output;
run;
/* Not working - The values inserted when the conditional is true are all from row 1*/
data dont_want;
set all(obs=0) some;
array chars{*} _character_;
do i=1 to dim(chars);
if missing(chars{i}) then chars{i}=chars{i-1};
end;
drop i;
run;
/* Working*/
data temp;
set all(obs=0) some;
run;
data want;
set temp;
array chars{*} _character_;
do i=1 to dim(chars);
if missing(chars{i}) then chars{i}=chars{i-1};
end;
drop i;
run;
The values for the "extra" variables are being RETAINed since they are sourced from the ALL dataset. Any variable that is sourced from an input dataset is NOT reset to missing at the start of the data step iteration. Since those variables are not on the SOME dataset they do not change when an observation is read from it.
Just add code to reset them to missing. If you want to do it without knowing the names of the variables you might consider re-ordering the code.
You could define and clear the array after the compiler has "seen" the ALL dataset but before the run-time has read the SOME dataset.
data dont_want;
if 0 then set all;
array chars{*} _character_;
call missing(of chars{*});
set some;
do i=2 to dim(chars);
if missing(chars{i}) then chars{i}=chars{i-1};
end;
drop i;
run;
Or add an explicit OUTPUT statement and reset them after that.
data dont_want;
set all(obs=0) some;
array chars{*} _character_;
do i=2 to dim(chars);
if missing(chars{i}) then chars{i}=chars{i-1};
end;
drop i;
output;
call missing(of _all_);
run;
A couple of things to do if you want to use the implicit OUTPUT:
1. Prep the PDV prior to the data-reading SET using a non-reading SET
2. Set up the array based on the prepped PDV
3. Clear the array
4. Read the data with SET
5. Impute your data
6. Output
Example:
data dont_want;
if 0 then set all some; * non reading set preps the PDV;
array chars{*} _character_;
call missing(of chars(*)); * clears all auto-retained data set variables;
set all(obs=0) some; * data reading set;
* shifting an array to the right requires right-to-left processing;
do i=dim(chars) to 2 by -1;
if missing(chars{i}) then chars{i}=chars{i-1};
end;
*** OR COPY right into empty slots, repeating prior copy if needed;
do i=2 to dim(chars);
if missing(chars{i}) then chars{i}=chars{i-1};
end;
drop i;
* implicit output;
run;
I want to be able to create a flag, here called timeflag, that is set to 1 for every first and last entry of a certain Session, designated by logflag. What I have is the following but this gives me null data points:
data OUT.TENMAY_TIMEFLAG;
set IN.TENMAY_LOGFLAG;
if first.logflag then timeflag = 1;
if last.logflag then timeflag = 1;
run;
What is it about the first. and last. functions that I am not understanding here or is it that I have 2 if statements?
To have SAS create FIRST. and LAST. automatic variables you need to use a BY statement. If you want the new variable to be coded 1/0 then no need for the IF statement, just assign the automatic variable to a new permanent variable. To make one variable that is 1 for the first and the last then just use an OR.
data want;
set have;
by logflag ;
timeflag = first.logflag or last.logflag ;
run;
data OUT.TENMAY_TIMEFLAG;
set IN.TENMAY_LOGFLAG;
by logflag;
if first.logflag then timeflag = 1;
if last.logflag then timeflag = 1;
run;
P.S. in this case the dataset IN.TENMAY_LOGFLAG should be sorted by logflag.
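If the input is not already in logflag order, a sort step before the BY is one way to get there. A sketch, assuming a scratch dataset named work.logflag_sorted (a name invented here):

```sas
/* Sort into a work copy so the original dataset is left untouched */
proc sort data=IN.TENMAY_LOGFLAG out=work.logflag_sorted;
  by logflag;
run;

data OUT.TENMAY_TIMEFLAG;
  set work.logflag_sorted;
  by logflag;
  /* 1 on the first and last row of each logflag group, 0 otherwise */
  timeflag = (first.logflag or last.logflag);
run;
```

If the rows are already grouped by session but the groups are not in ascending order, sorting would change the row order; in that situation adding the NOTSORTED option to the BY statement (by logflag notsorted;) may be the better fit.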
I've tried something like this :
data wynik;
set dane;
if x>3 then x3=3*x;
else set dane2; x3=x2;set dane;
run;
dane and dane2 have the same number of rows.
The result is interesting: the condition x>3 still holds after setting dane2, but SAS always takes the first observation. That is, it doesn't pass along the current state of the hidden loop counter. My question is: does SAS have/use a hidden loop with a counter while iterating through a dataset which could be accessed by the user?
Edit:
Maybe I should add "without explicit loops" to the title, but those would also be welcome.
Making some assumptions:
data dane;
do x = 1 to 5;
output;
end;
run;
data dane2;
do x2 = 5 to 1 by -1;
output;
end;
run;
data wynik;
merge dane dane2;
if x > 3 then x3=3*x;
else x3=x2;
put x3=;
run;
That uses the side-by-side merge (merge with no by statement) to get you both values at once.
To answer your followup question:
does SAS have/use a hidden loop with a counter while iterating through a dataset which could be accessed by the user?
Yes, it does; _n_ defines the current loop iteration (as long as it isn't modified externally, which it can be - it is just a regular variable that's not written out to the dataset). So you could similarly do the following:
data wynik;
set dane;
if x > 3 then x3=x*3;
else do;
set dane2 point=_n_;
x3=x2;
end;
put x3=;
run;
The side-by-side merge is preferred because it will be faster, unless you very infrequently need to look at DANE2. It's also easier to code.
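To see the counter for yourself, a quick sketch using the dane dataset built above is to write _n_ to the log next to the value read on each iteration:

```sas
data _null_;
  set dane;
  /* _n_ is incremented once per data step iteration */
  put _n_= x=;
run;
```

Keep in mind that _n_ counts iterations of the data step, which matches the row number only when each iteration reads exactly one row. That holds for the POINT=_n_ example above because dane and dane2 have the same number of rows.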
Assume that I have a SAS data step, where I subtract every observation (say, I only have variable X) from its mean:
data tmp;
set tmp;
x = x-2;
run;
Let's say mean is not always 2 and I have another script that creates a text file with one line, which contains:
x = x-2;
Now, the question is, is there any way I can have something like:
data tmp;
set tmp;
load text_file;
run;
To do the same thing as the first data step? In other words, I want a solution that relies on using the content of the file (either as I showed in the data step or within a macro).
%INCLUDE will do what you want. Assuming your text file "c:\mycode.sas" has the line
x=x-2;
then you can do this:
data tmp;
set tmp;
%include "c:\mycode.sas";
run;
I'd note that this is a really, really bad way to do this, but it's what you asked for.
If I wanted to subtract the mean of x from x (standardizing the data), I'd either use PROC STDIZE, or do this:
proc means data=tmp;
var x;
output out=x_mean mean=x_bar;
run;
data want;
set tmp;
if _n_ = 1 then set x_mean;
x=x-x_bar;
run;
Or, PROC STDIZE (included in SAS/STAT):
proc stdize data=tmp out=want_std method=mean;
var x;
run;
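If the real goal is to get a computed value into later code, a macro variable is another common route that avoids the external text file. A sketch, assuming the macro variable name x_bar and the output dataset name want_macro (both made up here):

```sas
/* Store the mean of x in a macro variable */
proc sql noprint;
  select mean(x) format=best32.
    into :x_bar trimmed
    from tmp;
quit;

data want_macro;
  set tmp;
  /* Parentheses keep the expression valid if the mean is negative */
  x = x - (&x_bar);
run;
```

The value passes through text on its way into the macro variable, so some precision can be lost compared with the "if _n_ = 1 then set x_mean" approach; the BEST32. format is there to limit that.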