Hi, I have 25 variables named Name1, Name2, Name3, ..., Name25. Some of them contain the value 0 instead of a name.
I want to concatenate all the names from 1 to 25 and drop those that have the value 0.
I am trying:
data test;
set need;
new_name = catx (',',Name1,Name2, Name3, Name4, Name5, Name6, Name7, Name8, Name9, Name10, Name11, Name12, Name13, Name14, Name15, Name16, Name17, Name18, Name19, Name20, Name21, Name22, Name23, Name24, Name25);
run;
But I am getting a result like 0,0,0,0,0,Amit,0,0,0,Dave,0,0,0,Pam,0,0,0,0,Deepka,0,0,0,0,0,0. I only need the names in the result, not the 0 values. The variables are of character type.
Any help would be much appreciated.
This should work:
data have;
input (Name1-Name25) (:$20.);
datalines;
0 0 0 0 0 Amit 0 0 0 Dave 0 0 0 Pam 0 0 0 0 Deepka 0 0 0 0 0 0
Tom Steve 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Richard 0 0 0 0
;
run;
First I replaced the 0s with blanks, so that catx skips them when concatenating.
Data have;
set have;
array Name[25];
do i=1 to 25;
if Name[i]='0' then do;
Name[i]='';
end;
end;
run;
Your code then works just fine.
proc sql;
create table want as
select catx (',',Name1,Name2, Name3, Name4, Name5, Name6, Name7, Name8, Name9, Name10, Name11, Name12, Name13, Name14, Name15, Name16, Name17, Name18, Name19, Name20, Name21, Name22, Name23, Name24, Name25) as new_name
from have
;
quit;
This is naturally not the only way to do this. The best way would probably be to fix the issue at the source, that is, to put proper null values into the table instead of 0s.
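As one such alternative, here is a minimal sketch that skips the 0s during concatenation in a single data step. It assumes the have data set created above; the array name nm, the loop index i and the length of 600 are illustrative choices, not part of the original answer.

```sas
data want;
  set have;
  /* 25 names of up to 20 characters plus 24 commas fit in 600 bytes */
  length new_name $600;
  array nm[*] Name1-Name25;
  do i = 1 to dim(nm);
    /* append only real names; catx also ignores blank arguments */
    if nm[i] ne '0' then new_name = catx(',', new_name, nm[i]);
  end;
  drop i;
run;
```

This leaves the original Name variables untouched, which may matter if the 0s are needed elsewhere.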
Related
I need to do the following with SAS.
I have a dataset like this;
ID flg_1 flg_2 flg_3 ... flg_200
1 0 1 0 ... 1
2 1 0 0 ... 0
3 0 0 1 ... 0
4 1 1 1 .... 0
....
I would like to create a new column containing the names of the flags that are equal to 1. I mean:
ID flg_1 flg_2 flg_3 ... flg_200 NEW_VAR
1 0 1 0 ... 1 flg_2-flg_200
2 1 0 0 ... 0 flg_1
3 0 0 1 ... 0 flg_3
4 1 1 1 .... 0 flg_1-flg_2-flg_3
....
Could you help me?
thanks
Use a variable-based array to iterate over the flags and the VNAME function to retrieve each variable name.
Example:
/* make sure the accumulator variable named 'flagged' is wide
* enough to accommodate the case of all variables being flagged.
*/
data want;
set have;
attrib flagged length=$1600 label='List of variables that were flagged';
array flags flg_1-flg_200;
do _n_ = 1 to dim(flags);
if flags(_n_) then flagged = catx(',', flagged, vname(flags(_n_)));
end;
run;
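The example above separates the names with commas, while the question asks for hyphens. A small sketch of the same technique with '-' as the delimiter follows, assuming only three flags so that it can be run as is; the data set names, the variable new_var, the loop index i and the length of 30 are illustrative.

```sas
data have;
  input ID flg_1 flg_2 flg_3;
  datalines;
1 0 1 0
2 1 0 0
3 0 0 1
4 1 1 1
;
run;

data want;
  set have;
  /* wide enough for all three names plus two hyphens */
  length new_var $30;
  array flags[*] flg_1-flg_3;
  do i = 1 to dim(flags);
    /* vname returns the name of the variable behind the array element */
    if flags[i] = 1 then new_var = catx('-', new_var, vname(flags[i]));
  end;
  drop i;
run;
```

For ID 4 this should give flg_1-flg_2-flg_3, matching the layout requested in the question.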
Suppose I have arbitrary diagnostic codes, such as 001, v58, ..., 142, ... How can I construct one column per code that is 1 for the records where that code was found?
Input:
id found code
1 1 001
2 0 v58
3 1 v58
4 1 003
5 0 v58
......
......
15000 0 v58
Output:
id code_001 code_v58 code_003 .......
1 1 0 0
2 0 0 0
3 0 1 0
4 1 0 0
5 0 0 0
.........
.........
You will want to TRANSPOSE the values and name the pivoted columns according to data (value of code) with an ID statement.
Example:
In real-world data it is often the case that missing diagnoses should be flagged as zero; that has to be done in a subsequent step.
data have;
/* patient 2 has two diagnosis results (v58 and 003) */
input id found code $;
datalines;
1 1 001
2 0 v58
2 1 003
3 1 v58
4 1 003
5 0 v58
;
proc transpose data=have out=want(drop=_name_) prefix=code_;
by id;
id code; * column name becomes <prefix><code>;
var found;
run;
* missing occurs when an id was not diagnosed with a code;
* if that means the flag should be zero (for logistic modeling perhaps)
* the missings need to be changed to zeroes;
data want;
set want;
array codes code_:;
do _n_ = 1 to dim(codes); /* repurpose automatic variable _n_ for loop index */
if missing(codes(_n_)) then codes(_n_) = 0;
end;
run;
I have a dataset like this (type is an indicator):
datetime type
...
ddmmyy:10:30:00 0
ddmmyy:10:31:00 0
ddmmyy:10:32:00 1
ddmmyy:10:33:00 0
ddmmyy:10:34:00 1
ddmmyy:10:35:00 0
...
I am trying to extract the rows with type 1 together with the previous and the next row, that is, a (-1,+1) window around each type 1 row.
datetime type
...
ddmmyy:10:31:00 0
ddmmyy:10:32:00 1
ddmmyy:10:33:00 0
ddmmyy:10:34:00 1
ddmmyy:10:35:00 0
...
I found a similar post here. I copied and pasted the code, but I am not quite sure what 'x' means in that code. SAS gives me 'File WORK.x does not exist'.
Can someone help me out? Thanks.
The X data set in the other post is the same source table that you are filtering, so the logical order of the code is:
The data step reads every row of the table 'have'; _N_ holds the current row number.
If type = 1, then SET have POINT=... jumps directly to the given row of 'have' and OUTPUT writes that row to the new table 'want'; processing then continues with the next row. The pointer can be the current, previous or next row number. (The two IF statements handle the first and last rows, where there is no previous or no next row.)
Full Working Code:
data have;
length datetime $23.;
input datetime $ type ;
datalines;
ddmmyy:10:30:00 0
ddmmyy:10:31:00 0
ddmmyy:10:32:00 1
ddmmyy:10:33:00 0
ddmmyy:10:34:00 1
ddmmyy:10:35:00 0
;
run;
data want;
set have nobs=nobs;
if type = 1 then do;
current = _N_;
prev = current - 1;
next = current + 1;
if prev > 0 then do;
set have point = prev;
output;
end;
set have point = current;
output;
if next <= nobs then do;
set have point = next;
output;
end;
end;
run;
proc sort data=want noduprecs;
by _all_;
run;
Note: I added an extra proc sort step to remove duplicate rows.
Output:
datetime=ddmmyy:10:31:00 type=0
datetime=ddmmyy:10:32:00 type=1
datetime=ddmmyy:10:33:00 type=0
datetime=ddmmyy:10:34:00 type=1
datetime=ddmmyy:10:35:00 type=0
For your example data, which does not have any id or group variable, this should be pretty straightforward. Instead of thinking about moving back and forth in the file, just create new variables that contain the previous (LAG_TYPE) and next (LEAD_TYPE) value of TYPE. Then your requirement to keep the observation before one with TYPE=1 translates to keeping the observations where LEAD_TYPE=1, and likewise LAG_TYPE=1 for the observation after it.
Let's convert your sample data into a dataset.
data have ;
input datetime :$15. type ;
cards;
ddmmyy:10:30:00 0
ddmmyy:10:31:00 0
ddmmyy:10:32:00 1
ddmmyy:10:33:00 0
ddmmyy:10:34:00 1
ddmmyy:10:35:00 0
;
Rather than actually subsetting the observations, I will make a new variable KEEP that is true for the records that meet your criteria.
data want ;
recno+1;
set have end=eof;
lag_type=lag(type);
if not eof then set have(firstobs=2 keep=type rename=(type=lead_type));
else lead_type=.;
keep= (type=1 or lag_type=1 or lead_type=1) ;
run;
Here is the result.
recno datetime type lag_type lead_type keep
1 ddmmyy:10:30:00 0 . 0 0
2 ddmmyy:10:31:00 0 0 1 1
3 ddmmyy:10:32:00 1 0 0 1
4 ddmmyy:10:33:00 0 1 1 1
5 ddmmyy:10:34:00 1 0 0 1
6 ddmmyy:10:35:00 0 1 . 1
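To go from the KEEP flag to the actual subset, a minimal follow-up sketch is shown here. It assumes the want data set produced by the step above; the output name window is an illustrative choice.

```sas
data window;
  set want;
  /* keep only the flagged rows */
  where keep = 1;
  /* the helper variables are no longer needed in the result */
  drop recno lag_type lead_type keep;
run;
```

With the sample data this should leave observations 2 through 6, since only the first row has KEEP=0.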
Move the next observation up so that the current and next values sit on the same row, and use LAG to compare the current observation with the previous one.
data have;
length datetime $23.;
input datetime $ type ;
datalines;
ddmmyy:10:30:00 0
ddmmyy:10:31:00 0
ddmmyy:10:32:00 1
ddmmyy:10:33:00 0
ddmmyy:10:34:00 1
ddmmyy:10:35:00 0
;
run;
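data want;
merge have have(firstobs=2 keep=type rename=(type=_type));
_lag = lag(type); /* call LAG on every row so that its queue stays in step */
if max(type, _type, _lag); /* true if the current, next or previous type is 1 */
drop _type _lag;
run;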
I wish to get the last number in each sequence of equal numbers. For example, I have the following dataset:
X
1
1
0
0
0
1
1
0
Given that sequence of numbers, I need to extract the last number of each sequence of ones, up to the point where a 0 appears. This is what I want:
X Seq
1 1
1 2
1 3
0 1
0 2
0 3
1 1
1 2
0 1
1 1
1 2
1 3
0 1
I need to create a new dataset with the numbers in bold, that is:
Seq1
3
2
3
Thanks for any advice.
One more option: use the NOTSORTED option on the BY statement.
Data want;
Set have;
By x NOTSORTED;
Retain count;
If first.x then count=1;
Else count+1;
If last.x and x = 1 then output; /* keep only the runs of ones, as asked */
Keep count;
Run;
You can create group variables using a lag and then keep the last observation of each group you have created:
data temp;
input x $;
datalines;
1
1
1
0
0
0
1
1
0
1
1
1
0
;
run;
data temp2;
set temp;
retain flag;
if lag(x) > x then flag = _n_;
if x = '0' then delete; /* x is character, so compare with a character literal */
run;
data temp3 (keep = seq1);
set temp2;
seq1 + 1;
by flag;
if first.flag then seq1 = 1;
if last.flag then save = 1;
if missing(save) then delete;
run;
Use proc summary with the notsorted option here:
data math;
input x;
datalines;
1
1
1
0
0
0
1
1
0
1
0
1
1
1
1
0
1
;
run;
proc summary data=math;
by x notsorted;
class x;
output Out=z;
run;
data z (drop=_type_ x);
set z (rename=(_FREQ_=COUNT));
where _type_=1 and x=1; /* if you are looking for a number other than 1, replace it here */
run;
proc print data=z noobs;
run;
Here's a solution using only one data step with a retain statement.
data have;
input x @@;
output;
datalines;
1 1 1 0 0 0 1 1 0 1 0 1 1 1 1 0 1
;
data want(keep = count);
set have end = last;
retain x_previous . count .;
if x = 0 then do;
if x_previous = 1 then do;
output;
count = 0;
end;
end;
else if x = 1 then count + 1;
if last = 1 and count > 0 then output;
x_previous = x;
run;
Results
count
-----
3
2
1
4
1
I have a dataset with patients who have multiple courses during a treatment phase.
The data set looks like:
C 1 1 0
C 0 0 1
C 1 1 0
C 0 0 1
The first two rows: the patient starts at row 1 and finishes at row 2. This is the first course of patient C.
The second two rows: patient C starts again at row 3 and finishes at row 4.
How can I create an identifier for these two courses using the first. and last. variables in SAS?
Expected output should look like this;
C 1 1 0 23
C 0 0 1 23
C 1 1 0 24
C 0 0 1 24
C 1 1 1 25
The identifier should be the same within one course and differ from course to course within the same patient.
Thanks.
Assuming the third variable, whatever it is, is your 'end state', the following works. It is probably not the easiest method, but hopefully it is clear. I don't know if First/Last will actually help in this situation, except for detecting when the ID switches.
The idea is to look for V3=1 and then set a flag to 1. If the flag is 1, the next record increments the course counter and resets the flag, and the process continues. RETAIN is used to hold the values of Flag and Course across rows.
data have;
input ID $ v1-v3;
cards;
C 1 1 0
C 0 0 1
C 1 1 0
C 0 0 1
D 1 0 0
D 0 1 0
D 0 0 1
;
run;
data want;
set have;
BY ID;
retain flag 0 course;
if first.ID then do;
Course=1;
flag=0;
end;
if flag=1 then do;
course=course+1;
flag=0;
end;
else if v3=1 and flag=0 then flag=1;
run;
proc print;
run;