Dynamic summing of n columns in SAS using arrays

Dynamic summing of n columns in SAS using arrays - sas

I have a problem in SAS where I have to sum n columns(Time(1) to time(N)) where the N is defined as a variable in another column(Min_Remain_wthdrw_Prd).
I am writing the below code but it is not working:
data certain;set certain;
array t(*) t1-t60;
do while(i<=Min_Remain_wthdrw_Prd);
S_Disc=sum(t(1)-t(i));
end;
end;
run;
Kindly help

You have too many end statements, and you can just use a regular do loop...
data certain ;
set certain ;
array t(*) t1-t60 ;
S_Disc = 0 ;
do i = 1 to Min_Remain_wthdrw_Prd ;
S_Disc+t{i} ;
end ;
run;

Related

Mixing macro-DO-loops with data step DO-loops

Some context:
I have a string of digits (not ordered, but with known range 1 - 78) and I want to extract the digits to create specific variables with it, so I have
"64,2,3" => var_64 = 1; var_02 = 2; var_03 = 1; (the rest, like var_01 are all set to missing)
I basically came up with two solutions, one is using a macro DO loop and the other one a data step DO loop. The non-macro solution was to fist initialize all variables var_01 - var_78 (via a macro), then to put them into an array and then to gradually set the values of this array while looping through the string, word-by-word.
I then realized that it would be way easier to use the loop iterator as a macro variable and I came up with this MWE:
%macro fast(w,l);
do p = 1 to &l.;
%do j = 1 %to 9;
if &j. = scan(&w.,p,",") then var_0&j. = 1 ;
%end;
%do j = 10 %to 78;
if &j. = scan(&w.,p,",") then var_&j. = 1 ;
%end;
end;
%mend;
data want;
string = "2,4,64,54,1,4,7";
l = countw(string,",");
%fast(string,l);
run;
It works (no errors, no warnings, expected result) but I am unsure about mixing macro-DO-loops and non-macro-DO-loops. Could this lead to any inconsistencies or should I just stay with the non-macro solution?

Your current code is comparing numbers like 1 to strings like "1".
&j. = scan(&w.,p,",")
It will work as long as the strings can be converted into numbers, but it is not a good practice. It would be better to explicitly convert the strings into numbers.
input(scan(&w.,p,","),32.)
You can do what you want with an array. Use the number generated from the next item in the list as the index into the array.
data want;
string = "2,4,64,54,1,4,7";
array var_ var_01-var_78 ;
do index=1 to countw(string,",");
var_[input(scan(string,index,","),32.)]=1;
end;
drop index;
run;

SAS: Coffee anyone?

I tried this in C# but have not had much success. So I am now trying in SAS. Using an EG session and my SAS code, we work with the list of students in SASHELP.CLASS.
These people want to get to know each other and have a monthly random pairing to go on a Coffee Date.
Rules:
A random Coffee Date List is Generated monthly;
I store each months pairing into a Historical Dataset, which I append monthly.
One person cannot have coffee with the same person within a 6 month period. So we keep a separate dataset for historical purposes with 3 Vars:
LastDate,InviterID,InvitedID
We check each pairing against the Historical list of which we only load the most recent 6 months data into a temp dataset for checking purposes.
If no recent matched pair is found, a new matched pair is added to a new Paired Dataset, and the 2 names (Rows) are removed from the original Participants dataset until the dataset has less than 2 rows. (a single person cannot be paired with another)
Unfortunately we have 19 people in this list so one person will be left out until we can add a new participant. Is anyone interested in joining our coffee club? :-)
So I start by deriving and ID (n) from the dataset, and I only keep the Name
Data Participants(Keep=ID Name);
FORMAT ID 8.;
set SASHelp.class;
ID=_n_;
run;
These 19 People will be my Participants in the Coffee Club.
I more or less follow the line of thought:
data _null_;
randvar = ceil(rand('UNIFORM') * 100000);
call symput('RANDSEED', randvar);
run;
data CR.names2(keep=MEMID randid);
set CR.MasterNames;
randid = rand('UNIFORM');
run;
proc sort data=CR.names2 ; by randid; run;
data CR.pairs(keep=pairgrp MEMID);
set CR.names2 nobs=num_peeps;
pairgrp+1;
if pairgrp > floor(num_peeps/2) then pairgrp=1;
run;
proc sort data=CR.pairs; by pairgrp;run;
proc transpose data=CR.pairs
out=CR.pairs2 (drop=_NAME_);
var memid;
by pairgrp;
run;
Data CR.Pairs3;
set CR.pairs2;
rename COL1=InviterID COL2=InvitedID;
run;
But I get stuck :-(
I need help with the rest please...
Has anyone else done this type of random pairing successfully before? I am grasping straws here...
Any help much appreciated.
Len

Here is my idea. This is far from efficient. Esp. when NOBS is getting big, as there is a cartesian product involved. Also I cheated on the odd number by adding another row in that case.
Prepare data and generate empty result table.
Create a list of all possible pairings (combinations) excluding recent pairings.
Random sort and descend through the list until every element has been picked once.
Append to result table.
There is a drawback as there might be members who will not get pairings as all possible partners are already picked. To avoid that we could iterate until we get a maximum of pairings.
EDIT: Added iteration. Now the program makes draws randomly until everyone is matched or a threshold is reached.
This problem should probably be implemented in a matrix orientated language like IML or R.
data Participants(Keep=ID Name) ;
set SASHelp.class nobs = num_peeps ;
ID=_n_ ;
output ;
if _n_ = 1 and mod(num_peeps,2) then do ; /* get even number of members: empty ID to pair with last participant*/
name = 'empty' ;
id = 0 ;
output ;
end ;
run ;
data list_of_meetings ;
length iteration InviterID InvitedID 8. ;
run ;
/****
iter = number of club meetings
hist = length of memory for pairings
tries = number of iterations to pair everyone
****/
%macro loop_coffee (iter=, hist=6, tries= 10) ;
proc sql noprint ;
select max(0,max(iteration)) + 1 into :base
from list_of_meetings ;
quit ;
%do i = &base. %to &iter. ; /* loop through number of meetings */
proc sort data = list_of_meetings (where=(iteration >= &i - &hist )) out = lookup nodupkey ; by InviterID InvitedID ; run ; /* get memory of pairings */
proc sql ; /* list all acceptable pairs */
create table all_pairs as
select a.ID as InviterID, b.ID as InvitedID
from Participants a
inner join Participants b
on a.ID lt b.ID
left join lookup c /* exclude the memory */
on a.ID eq c.InviterID and b.ID eq c.InvitedID
where c.InviterID is NULL ;
quit ;
%let j = 0 ;
%let all_pairs = 0 ;
%do %until (&all_pairs | &j > &tries) ; /* iterate and random sort until all members are paired */
%let j = %eval( &j + 1 ) ;
data all_pairs;
set all_pairs;
randnum = ranuni(12345 + &i + &j);
run;
proc sort data = all_pairs ; by randnum ; run ; /* random sort */
data out_pairs ; /* select the pairs: no. of IDs/2 */
declare hash h() ;
h.defineKey("ID") ;
h.defineDone() ;
do until ( eof1 ) ;
set Participants (keep= ID) end = eof1 ;
rc = h.add () ; /* populate list of members */
end ;
do until ( eof2 ) ;
set all_pairs (keep= InviterID InvitedID) end = eof2 ;
rc1 = h.check (key:InviterID) ;
rc2 = h.check (key:InvitedID) ;
if rc1 = 0 and rc2 = 0 then do ;
rc = h.remove (key:InviterID) ; /* delete member from list if paired */
rc = h.remove (key:InvitedID) ;
output ;
end ;
if h.num_items = 0 then do ;
call symput('all_pairs', 1 ) ;
stop ;
end;
end ;
stop ;
keep InviterID InvitedID ;
run ;
%end ;
data list_of_meetings ;
set list_of_meetings (where=(iteration ne .))
Out_pairs (in=pairs) ;
if pairs then iteration = &i. ;
run ;
%end ;
%mend ;
%loop_coffee (iter=10,hist=6,tries=10) ;

Polynomials in character form to numeric (SAS)

I have a SAS dataset which contains one column of polynomials. For example, X1**(-2)+X1**(2).
Is there a function to transform this into a numeric expression?
Many thanks,

If I understand you correctly, I don't think there is a specific function that will easily let you do this. You have two options - write your own logic to interpret the polynomial expressions, or use call execute to have SAS write out a (potentially very long) data step for you, assuming that the polynomials are all entered as valid data step code. Here's a call execute approach:
data have;
input x1 polynomial $255.;
infile datalines truncover;
datalines;
1 X1**(-2)+X1**(2)
2 X1**(-1)+X1**(1)
3 X1**(1)+X1**(-1)
;
run;
data _null_;
set have end = eof;
if _n_ = 1 then call execute('data want; set have; select(_n_);');
call execute(catx(' ','when(',_N_,') y =',polynomial,';'));
if eof then call execute('end; run;');
run;

Convert them to macro variables, and then resolve them into a calculation...
Using the dataset example in user667489's answer :
/* Create numbered macro variables, 1 per row of data */
data _null_ ;
set have end=eof ;
call symputx(cats('POLY',_n_),polynomial) ;
if eof then call symputx('POLYN',_n_) ;
run ;
%MACRO ROWLOOPER ;
%DO N = 1 %TO &POLYN ;
if _n_ = &N then result = &&POLY&N ;
%END ;
%MEND ;
data want ;
set have ;
/* Not very efficient, looping over all polynomials on each row of data */
/* So for 3 rows, you'll perform 9 iterations here */
%ROWLOOPER ;
run ;
Or, alternatively, write your dataset out into a SAS program, and %inc that program :
data _null_ ;
file "polynomials.sas" ;
set have end=eof ;
if _n_ = 1 then do ;
put "data poly;" ;
put " set have;" ;
end ;
put " result = " polynomial ";" ;
if eof then put "run;" ;
run ;
%inc "polynomials.sas" ;

SAS: generate abstractly long and large dataset

Trying to do some performance testing
I can't figure out a macro
%generate(n_rows,n_cols);
that would generate a table with n_rows and n_cols, filled with random numbers/strings
I tried using this link:
http://bi-notes.com/2012/08/benchmark-io-performance/
But I quickly encounter a memory issue
Thanks!

Try this. I added a 2 input parameters. So now you have a number of numerics and a number of characters. Also the ability to define the output dataset name.
%macro generate(n_rows,n_num_cols,n_char_cols,outdata=test,seed=0);
data &outdata;
array nums[&n_num_cols];
array chars[&n_char_cols] $;
temp = "abcdefghijklmnopqrstuvwxyz";
do i=1 to &n_rows;
do j=1 to &n_num_cols;
nums[j] = ranuni(&seed);
end;
do j=1 to &n_char_cols;
chars[j] = substr(temp,ceil(ranuni(&seed)*18),8);
end;
output;
end;
drop i j temp;
run;
%mend;
%generate(10,10,10,outdata=test);

Summing a table with an unknown number of variables?

I'm fairly new with SAS. I've used it a bit in the past but am really rusty.
I've got a table that looks like this:
Key Group1 Metric1 Group2 Metric2 Group3 Metric3
1 . r 20 .
1 . . t 3
For several unique keys.
I want everything to appear on one row so it looks like.
Key Group1 Metric1 Group2 Metric2 Group3 Metric3
1 . r 20 t 3
Another wrinkle is I don't know how many group and metric columns I'll have (although I'll always have the same number).
I'm not sure how to approach this. I'm able to get a list of column names and use them in a macro, I'm just not sure what proc or datastep function I need to use to collapse everything down. I would be extremely greatful for any suggestions.

There's a very simple way to do this using a nice trick. I've answered similar questions on this before, see here for one of them. This should achieve exactly what you're after.

You can use 2 temporary arrays (one for the character variables, and another for the numeric), and fill them with the non-blank values accordingly. When you reach last.key, you can load the temporary arrays back into the source variables.
If you know the maximum length of the character variables in advance, you can hard code it, but if not you can determine it dynamically.
This assumes that for each key, each variable is only populated once. Otherwise it will take the last value it sees for a particular variable within each key.
%LET LIB = work ;
%LET DSN = mydata ;
%LET KEYVAR = key ;
/* Get column name/type/max length */
proc sql ;
/* Numerics */
select name, count(name) into :NVARNAMES separated by ' ', :NVARNUM
from dictionary.columns
where libname = upcase("&LIB")
and memname = upcase("&DSN")
and name ^= upcase("&KEYVAR")
and type = 'num' ;
/* Characters */
select name, count(name), max(length) into :CVARNAMES separated by ' ', :CVARNUM, :CVARLEN
from dictionary.columns
where libname = upcase("&LIB")
and memname = upcase("&DSN")
and name ^= upcase("&KEYVAR")
and type = 'char' ;
quit ;
data flatten ;
set &LIB..&DSN ;
by &KEYVAR ;
array n{&NVARNUM} &NVARNAMES ;
array nt{&NVARNUM} _TEMPORARY_ ;
array c{&CVARNUM} &CVARNAMES ;
array ct{&CVARNUM} $&CVARLEN.. _TEMPORARY_ ;
retain nt ct ;
if first.&KEYVAR then do ;
call missing(of nt{*}, of ct{*}) ;
end ;
/* Load non-missing numeric values into temporary array */
do i = 1 to dim(n) ;
if not missing(n{i}) then nt{i} = n{i} ;
end ;
/* Load non-missing character values into temporary array */
do i = 1 to dim(c) ;
if not missing(c{i}) then ct{i} = c{i} ;
end ;
if last.&KEYVAR then do ;
/* Load numeric back into original variables */
call missing(of n{*}) ;
do i = 1 to dim(n) ;
n{i} = nt{i} ;
end ;
/* Load character back into original variables */
call missing(of c{*}) ;
do i = 1 to dim(c) ;
c{i} = ct{i} ;
end ;
output ;
end ;
drop i ;
run ;

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Dynamic summing of n columns in SAS using arrays - sas

You have too many end statements, and you can just use a regular do loop... data certain ; set certain ; array t(*) t1-t60 ; S_Disc = 0 ; do i = 1 to Min_Remain_wthdrw_Prd ; S_Disc+t{i} ; end ; run;

Related

Mixing macro-DO-loops with data step DO-loops

SAS: Coffee anyone?

Polynomials in character form to numeric (SAS)

SAS: generate abstractly long and large dataset

Summing a table with an unknown number of variables?

Categories

Resources