I have data like this:
laonno debit childno credit
1234 4162.98 . .
1234 0.02 . .
. . 1234 1387.66
. . 1234 1387.66
. . 1234 1387.66
I need the output below: when the debit sum equals the credit sum, a flag should be generated for those observations.
laonno debit childno credit flag
1234 4162.98 . . matched
1234 0.02 . . N
. . 1234 1387.66 matched
. . 1234 1387.66 matched
. . 1234 1387.66 matched
The number of data rows will vary, but whenever the sum of debit matches credit, the flag should be "MATCHED".
If your data is representative, here is one way
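data want (drop=s);
  if _N_ = 1 then do;
    /* h: running credit total (s) per childno */
    dcl hash h ();
    h.definekey ('childno');
    h.definedata ('s');
    h.definedone ();
    /* hh: loan numbers for which a matching debit was found */
    dcl hash hh ();
    hh.definekey ('laonno');
    hh.definedone ();
    /* first pass: accumulate the credits for each childno */
    do until (lr);
      set yy(where=(childno)) end=lr;
      if h.find() ne 0 then s = credit;
      else s = sum(s, credit);
      h.replace();
    end;
  end;
  /* second pass: flag every row */
  set yy;
  s = .;
  /* debit row whose amount equals the credit total for the same number */
  if h.find(key : laonno) = 0 & round(s, .001) = debit then do;
    flag = 'Matched';
    hh.ref();
  end;
  else flag = 'N';
  /* credit row whose childno had a matched debit earlier in the file */
  if hh.check(key : childno) = 0 then flag = 'Matched';
run;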
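The step above reads a data set named yy that the question never shows. A minimal sketch for building it from the sample rows, assuming all four columns are numeric and that yy is the intended name (it is taken from the SET statements in the answer), could look like this:

```sas
/* Sketch only: recreate the question's sample rows as data set yy. */
/* Assumption: all four columns are numeric; "." is a missing value. */
data yy;
  input laonno debit childno credit;
  datalines;
1234 4162.98 . .
1234 0.02 . .
. . 1234 1387.66
. . 1234 1387.66
. . 1234 1387.66
;
run;
```

Run it before the hash step. Note that the credit rows are flagged only because the matching debit row comes earlier in the file, as it does in the sample.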
Players numbered 1 to 50 stand in a row in order. The coach says: "Odd-numbered athletes out!" The remaining athletes re-queue and renumber. The coach orders again: "Odd-numbered athletes out!" This continues until only one person is left. What was that athlete's original number? What if the coach instead keeps ordering "Even-numbered athletes out!"? Who is left at the end?
I know I need to use a loop in SAS to answer the question, but I could only write the code below:
data a;
do i=1 to 50;
output;
end;
run;
proc sql;
select i
from a
where mod(i,2**5)=0;
quit;
But it won't work for keeping the last odd-numbered athlete. Could you figure out a way to simulate this process with a loop? Thanks so much.
@Doris, welcome :-)
Try this. The Final_Player data set contains the number of the final player in the simulation.
For the even problem, simply change mod(_N_, 2) = 1 to mod(_N_, 2) = 0. Feel free to ask.
data _null_;
  /* h holds the players still in the row, in order */
  dcl hash h(ordered : 'y');
  h.definekey('p');
  h.definedone();
  dcl hiter ih('h');
  /* i collects the players to remove in the current round */
  dcl hash i(ordered : 'Y');
  i.definekey('id');
  i.definedone();
  dcl hiter ii('i');
  do p = 1 to 50;
    h.add();
  end;
  id = .;
  do while (h.num_items > 1);
    /* _N_ is the player's position in the current row */
    do _N_ = 1 by 1 while (ih.next() = 0);
      if mod(_N_, 2) = 1 then do;
        i.add(key : p, data : p);
      end;
    end;
    do while (ii.next() = 0);
      rc = h.remove(key : id);
    end;
    i.clear();
  end;
  h.output(dataset : 'Final_Player');
run;
Just use algebra.
want = 2 ** floor( log2(n) );
So if you are starting with an arbitrary dataset you can find the one observation you need directly.
data want;
  point = 2**floor(log2(nobs));
  set a point=point nobs=nobs;
  /* PUT must come before STOP, otherwise it is never executed */
  put i= ;
  output;
  stop;
run;
Here is an example using an array that shows how it works.
data test;
  array x [15];
  do index=1 to dim(x); x[index]=index; end;
  do iteration=1 by 1 while(n(of x[*])>1);
    do index= 2**(iteration-1) to dim(x) by 2**iteration ;
      x[index]=.;
    end;
    put iteration= (x[*]) (3.);
  end;
  do index=1 to dim(x) until(x[index] ne .);
  end;
  put index= x[index]= ;
run;
iteration=1 . 2 . 4 . 6 . 8 . 10 . 12 . 14 .
iteration=2 . . . 4 . . . 8 . . . 12 . . .
iteration=3 . . . . . . . 8 . . . . . . .
index=8 x8=8
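To check the closed form against the process itself, here is a rough sketch that simulates the "odd positions out" rounds for every row length from 1 to 50 and compares the survivor with 2**floor(log2(n)). The names check, alive, survivor, formula and agree are made up for this illustration.

```sas
/* Sketch only: brute-force check of the formula for n = 1 to 50. */
data check;
  array alive[50] _temporary_;
  do n = 1 to 50;
    /* players 1..n start in the row */
    do k = 1 to 50;
      alive[k] = (k <= n);
    end;
    remaining = n;
    do while (remaining > 1);
      pos = 0;                      /* position within the current row */
      do k = 1 to n;
        if alive[k] then do;
          pos = pos + 1;
          if mod(pos, 2) = 1 then do;   /* odd position: out */
            alive[k] = 0;
            remaining = remaining - 1;
          end;
        end;
      end;
    end;
    /* the first player still marked alive is the survivor */
    do survivor = 1 to n until (alive[survivor]);
    end;
    formula = 2 ** floor(log2(n));
    agree = (survivor = formula);
    output;
  end;
  keep n survivor formula agree;
run;
```

If the formula holds, agree should be 1 on every row. Changing mod(pos, 2) = 1 to mod(pos, 2) = 0 simulates the "even out" variant instead; there player 1 always sits in position 1 and is never removed, so the formula column no longer applies.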
I have the following dataset
data have;
input SUBJID VISIT$ PARAMN ABLF$ AVAL;
cards;
1 screen 1 . 151
1 random 1 YES .
1 visit1 1 . .
1 screen 2 . 65.5
1 random 2 YES 65
1 visit1 2 . .
1 screen 3 . .
1 random 3 YES 400
1 visit1 3 . 420
;
run;
I want to create another variable called BASE that captures the value of AVAL (when there is an actual value in place) when ABLF=YES and then drag it down until a new PARAMN is encountered.
Basically I want the output to look like this
SUBJID VISIT$ PARAMN ABLF$ AVAL BASE;
1 screen 1 . 151 .
1 random 1 YES . .
1 visit1 1 . . .
1 screen 2 . 65.5 65
1 random 2 YES 65 65
1 visit1 2 . . 65
1 screen 3 . . 400
1 random 3 YES 400 400
1 visit1 3 . 420 400
I used the following code:
data want;
set have;
by SUBJID PARAMN;
if first.PARAMN and ABLF=' ' then BASE=.;
if ABLF='YES' then BASE=AVAL;
retain BASE;
run;
However, when I run this, the data doesn't look exactly like what I want above.
RETAIN does not look like the right tool for this. RETAIN can only move data forward in the file. It cannot move it backwards.
Looks like there is just one observation with the "BASE" value. So just merge it back onto the data.
data want;
merge have
have(keep=subjid paramn aval ablf rename=(aval=BASE ablf=xx)
where=(xx='YES'))
;
by SUBJID PARAMN;
drop xx;
run;
PROC SQL:
proc sql;
select a.*,b.aval as BASE from have a left join have(drop=visit where=(ablf='YES')) b
on a.subjid=b.subjid and a.paramn=b.paramn;
quit;
Double do loop:
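data want;
  /* first pass over the BY group: pick up the baseline value */
  /* temp is not retained, so it resets to missing for each new group */
  do until(last.paramn);
    set have;
    by subjid paramn notsorted;
    if ablf='YES' then temp=aval;
  end;
  /* second pass over the same group: write the value to every row */
  do until(last.paramn);
    set have;
    by subjid paramn notsorted;
    base=temp;
    output;
  end;
  drop temp;
run;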
I have a series of string variables with missing values. I would like to use flat substitution. For instance, if variable x has 3 available values, there should be a 33.333% chance that a missing value is assigned each of those values. How would I do this?
DATA have;
INPUT id a $ b $ c $ x;
CARDS;
1 Y Male . 5
2 Y Female . 4
3 . Female Tall 4
4 Y . Short 2
5 N Male Tall 1
;
Run;
You could use temporary arrays to store the possible values. Then generate a random index into the array.
DATA have;
INPUT id a $ b $ c $ x;
CARDS;
1 Y Male . 5
2 Y Female . 4
3 . Female Tall 4
4 Y . Short 2
5 N Male Tall 1
;
data want ;
  set have ;
  /* _temporary_ keeps the array elements out of the output data set */
  array possible_b (2) $8 _temporary_ ('Male','Female') ;
  if missing(b) then b=possible_b(1+int(rand('uniform')*dim(possible_b)));
run;
I did this by generating random numbers and hard-coding the limits. There should be an easier way to do this, but for the purposes of the question this should work.
option missing='';
data begin;
input a $;
cards;
a
.
b
c
.
e
.
f
g
h
.
.
j
.
;
run;
data intermediate;
set begin;
if a EQ '' then help= rand("uniform");
else help=.;
run;
data wanted;
  set intermediate;
  /* limits at 1/3 and 2/3 give each value the same chance */
  if a EQ '' then do;
    if 0<=help<1/3 then a='V1';
    else if 1/3<=help<2/3 then a='V2';
    else if 2/3<=help then a='V3';
  end;
  drop help;
run;
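As a possibly simpler variant of the same idea, the limits can be replaced by a random index into a temporary array, so adding a fourth value only means extending the list. This is a sketch under assumptions: the data set name wanted_alt, the array name vals and the seed are invented for the example, and the values V1 to V3 are the placeholders used above.

```sas
/* Sketch only: equal-chance fill without hard-coded limits. */
data wanted_alt;
  set begin;
  array vals[3] $8 _temporary_ ('V1', 'V2', 'V3');
  /* optional: fix the seed so the result is repeatable */
  if _n_ = 1 then call streaminit(12345);
  /* ceil(3 * u) is 1, 2 or 3 with equal probability for u in (0,1) */
  if missing(a) then a = vals[ceil(dim(vals) * rand('uniform'))];
run;
```

Because the index is computed from dim(vals), the intermediate step and the help variable are not needed here.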
I am working with data that derives from an 'indicate all that apply' question. Two raters were asked to complete the question for a unique subject list. The data looks something like this.
ID | Rater | Q1A | Q1B | Q1C | Q1D
----------------------------------
1  | 1     | A   | F   | E   | B
1  | 2     | E   | G   |     |
2  | 1     | D   | C   | A   |
2  | 2     | C   | D   | A   |
I want to compare the two raters' answers for each ID and determine whether answers for Q1A-Q1D are the same. I am not interested in the direct comparisons between each rater by ID for Q1A, Q1B, etc. individually. I want to know if all the values in Q1A-Q1D as a set are the same. (E.g., in the example data above, the raters for ID 2 would be identical). I am assuming I would do this with an array. Thanks.
Here is a similar solution that also uses call sortc, but with arrays and retained variables.
Create example dataset
data ratings;
infile datalines truncover;
input ID Rater (Q1A Q1B Q1C Q1D) ($);
datalines;
1 1 A F E B
1 2 E G
2 1 D C A
2 2 C D A
3 1 A B C
3 2 A B D
;
Do the comparison
data compare(keep=ID EQUAL);
set ratings;
by ID;
format PREV_Q1A PREV_Q1B PREV_Q1C PREV_Q1D $1.
EQUAL 1.;
retain PREV_:;
call sortc(of Q1:);
array Q(4) Q1:;
array PREV(4) PREV_:;
if first.ID then do;
do _i = 1 to 4;
PREV(_i) = Q(_i);
end;
end;
else do;
EQUAL = 1;
do _i = 1 to 4;
if Q(_i) NE PREV(_i) then EQUAL = 0;
end;
output;
end;
run;
Results
ID EQUAL
1 0
2 1
3 0
This looks like a job for call sortc:
data have;
infile cards missover;
input ID Rater (Q1A Q1B Q1C Q1D) ($);
cards;
1 1 A F E B
1 2 E G
2 1 D C A
2 2 C D A
3 1 A B C
3 2 A B D
;
run;
/*You can use an array if you like, but this works fine too*/
data temp /view = temp;
set have;
call sortc(of q:);
run;
data want;
set temp;
/*If you have more questions, extend the double-dash list to cover all of them*/
by ID Q1A--Q1D notsorted;
/*Replace Q1D with the name of the variable for the last question*/
IDENTICAL_RATERS = not(first.Q1D and last.Q1D);
run;
Sort, Concatenate, then compare.
data want ;
set ratings;
by id;
call sortc(of Q1A -- Q1D);
rating = cats(of Q1A -- Q1D);
retain rater1 rating1 ;
if first.id then rater1=rater;
if first.id then rating1=rating;
if not first.id ;
rater2 = rater ;
rating2 = rating;
match = rating1=rating2 ;
keep id rater1 rater2 rating1 rating2 match;
run;
I have dataset M
number id_no date
1 123 3/3/2012
2 123 3/3/2012
3 . .
4 . .
How do I copy 123 and 3/3/2012 into obs 3 & 4?
This should get you there.
data one;
input
number id_no date mmddyy10.;
format date mmddyy10.;
datalines;
1 123 3/3/2012
2 123 3/3/2012
3 . .
4 . .
5 456 .
;
run;
proc sort data = one;
by number;
run;
data two;
  set one;
  retain _id_no _date;
  /* remember the latest non-missing value; fill a missing one from it */
  if not missing(id_no) then _id_no = id_no;
  else id_no = _id_no;
  if not missing(date) then _date = date;
  else date = _date;
  drop _id_no _date;
run;