How to output a SAS format as PROC FORMAT syntax?

I have created a format based on a dataset. Now I want to store this format as a value-list as part of the proc format syntax in my sas program. Is there a way to accomplish this?
The reason for doing this is that I often need to make tables which group people's country background into groups similar to continents. Until now this has been done by joining the data, using country code as the key variable, with another dataset which contains a continents variable, and then applying a format $continents to the continents variable.
I want to be able to skip this join operation by making a format for continents that takes country codes as input values. I also want this format to be stored in the syntax file which produces the tables and not in a format catalog. Since the world has a lot of countries, writing this format manually seems prone to error.

This is just a guide; it hasn't been tested with every scenario, e.g. numeric, character and informat types, or multi-label/picture formats.
/* Create a dummy format */
data dummyfmt ;
  retain fmtname 'DUMMY' type 'N' ;
  do i = 1 to 10 ;
    start = i ;
    label = repeat(byte(round(ranuni(0) * (122 - 97 + 1),1) + 96),10) ;
    if i = 10 then hlo = 'O' ;
    output ;
  end ;
run ;
proc format cntlin=dummyfmt ; run ;
/* Dump the format back out to a dataset */
proc format cntlout=dump library=work ;
  select dummy ;
run ;
proc print heading=H ; run ;
/* Write out to log... */
data _null_ ;
  set dump end=eof ;
  if _n_ = 1 then do ;
    put "proc format ;" ;
    if type = 'N' then put " value " fmtname ;
    if type = 'C' then put " value $" fmtname ;
    if type = 'I' then put " invalue " fmtname ;
  end ;
  if hlo = 'O' then do ;
    if type in('N' 'C') then put " other = '" label +(-1) "'" ;
    if type = 'I' then put " other = " label ;
  end ;
  else do ;
    if type in('N' 'C') then put " " start " = '" label +(-1) "'" ;
    if type = 'I' then put " " start " = " label ;
  end ;
  if eof then do ;
    put " ;" ;
    put "run ;" ;
  end ;
run ;
You may need to modify the above depending on your format, especially if ranges are involved. The SEXCL and EEXCL columns would then be relevant.
/* Example output (from Log Window) */
proc format ;
value DUMMY
1 = 'bbbbbbbbbbb'
2 = 'hhhhhhhhhhh'
3 = 'ttttttttttt'
4 = 'fffffffffff'
5 = 'sssssssssss'
6 = 'bbbbbbbbbbb'
7 = 'aaaaaaaaaaa'
8 = 'ppppppppppp'
9 = 'eeeeeeeeeee'
other = 'wwwwwwwwwww'
;
run ;
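For formats with ranges, the remark about SEXCL and EEXCL could be handled roughly as follows. This is only a sketch: the format AGEGRP, the dataset DUMP_RNG and the helper variables LO, HI and SEP are made up for illustration, it covers a single numeric format only, and labels containing single quotes would need extra escaping.

```sas
/* Hypothetical numeric format with ranges, for illustration only */
proc format ;
  value agegrp
    low -< 18 = 'Minor'
    18  -< 65 = 'Adult'
    65 - high = 'Senior' ;
run ;

/* Dump it to a dataset, as in the answer above */
proc format cntlout=dump_rng library=work ;
  select agegrp ;
run ;

/* Write the ranges back out to the log */
data _null_ ;
  set dump_rng end=eof ;
  length lo hi $ 40 sep $ 3 ;
  if _n_ = 1 then put "proc format ;" / "  value " fmtname ;
  if index(hlo, 'O') then put "    other = '" label +(-1) "'" ;
  else do ;
    /* HLO flags an open-ended range: L = LOW, H = HIGH */
    if index(hlo, 'L') then lo = 'low' ;  else lo = strip(start) ;
    if index(hlo, 'H') then hi = 'high' ; else hi = strip(end) ;
    /* SEXCL / EEXCL = 'Y' means the start / end value is excluded */
    sep = cats(ifc(sexcl = 'Y', '<', ' '), '-', ifc(eexcl = 'Y', '<', ' ')) ;
    if lo = hi then put "    " lo "= '" label +(-1) "'" ;
    else put "    " lo sep hi "= '" label +(-1) "'" ;
  end ;
  if eof then put "  ;" / "run ;" ;
run ;
```

When START and END are equal, the sketch writes a single value instead of a range, which matches the single-value output of the original answer.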

Related

print warning to log if then else statement sas

I am writing an if/then/else statement, where the final else is :
if variable2 = 'foo' then variable = 'bar'
else variable = .
Can I print a custom 'warning' to the log file that has a list or array of the variable2 names where
variable = .
You can use the PUTLOG statement to write messages to the log.
if variable2 = 'foo' then variable = 'bar' ;
else do;
variable = . ;
putlog "WARNING: bad value " variable2 = ;
end ;
It looks like the OP wanted a single message with all the relevant variable names listed. There are various ways to do that... the easiest would be a quick PROC SQL on the output data set after this step finishes:
proc sql noprint;
select distinct variable2 into :bad_vals separated by ' '
from my_data
where variable = .
;
quit;
%put WARNING: Bad values of VARIABLE2: &bad_vals;
This would be fine for relatively small data sets; for big data sets you could avoid the extra pass through the data by maintaining a list of relevant values while the initial data step is running, and just print the message once at the end of the data step.
data mydata;
  length bad_vals $ 10000;
  retain bad_vals; /* keep the list across rows; without RETAIN it is reset on every iteration */
  drop bad_vals;
  set in end=end;
  ...blah...
  ...else do;
    variable = .;
    bad_vals = catx(' ', bad_vals, variable2);
  end;
  if end then do;
    putlog 'WARNING: Bad values of VARIABLE2: ' bad_vals;
  end;
run;
Alternatively, you could use a macro variable instead of the data step variable bad_vals, etc.
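A minimal sketch of the macro variable variant might look like this. The input dataset IN_DATA and its values are hypothetical, and VARIABLE is assumed to be numeric here so that it can hold a missing value.

```sas
/* Hypothetical input data, for illustration only */
data in_data ;
  input variable2 $ ;
  datalines ;
foo
baz
foo
qux
;
run ;

data mydata ;
  set in_data end=end ;
  length bad_vals $ 10000 ;
  retain bad_vals ;
  drop bad_vals ;
  if variable2 = 'foo' then variable = 1 ;
  else do ;
    variable = . ;
    bad_vals = catx(' ', bad_vals, variable2) ;
  end ;
  /* Hand the collected list to the macro layer once, on the last row */
  if end then call symputx('bad_vals', bad_vals) ;
run ;

%put WARNING: Bad values of VARIABLE2: &bad_vals ;
```

The list is still built in a retained data step variable; only the final hand-off goes through CALL SYMPUTX, so the message can be written after the step has finished.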

SAS: Coffee anyone?

I tried this in C# but have not had much success. So I am now trying in SAS. Using an EG session and my SAS code, we work with the list of students in SASHELP.CLASS.
These people want to get to know each other and have a monthly random pairing to go on a Coffee Date.
Rules:
A random Coffee Date List is Generated monthly;
I store each month's pairings in a historical dataset, which I append to monthly.
One person cannot have coffee with the same person within a 6 month period. So we keep a separate dataset for historical purposes with 3 Vars:
LastDate,InviterID,InvitedID
We check each pairing against the Historical list of which we only load the most recent 6 months data into a temp dataset for checking purposes.
If no recent matched pair is found, a new matched pair is added to a new Paired Dataset, and the 2 names (Rows) are removed from the original Participants dataset until the dataset has less than 2 rows. (a single person cannot be paired with another)
Unfortunately we have 19 people in this list so one person will be left out until we can add a new participant. Is anyone interested in joining our coffee club? :-)
So I start by deriving an ID (_n_) from the dataset, and I only keep the ID and the Name:
Data Participants(Keep=ID Name);
FORMAT ID 8.;
set SASHelp.class;
ID=_n_;
run;
These 19 People will be my Participants in the Coffee Club.
I more or less follow the line of thought:
data _null_;
randvar = ceil(rand('UNIFORM') * 100000);
call symput('RANDSEED', randvar);
run;
data CR.names2(keep=MEMID randid);
set CR.MasterNames;
randid = rand('UNIFORM');
run;
proc sort data=CR.names2 ; by randid; run;
data CR.pairs(keep=pairgrp MEMID);
set CR.names2 nobs=num_peeps;
pairgrp+1;
if pairgrp > floor(num_peeps/2) then pairgrp=1;
run;
proc sort data=CR.pairs; by pairgrp;run;
proc transpose data=CR.pairs
out=CR.pairs2 (drop=_NAME_);
var memid;
by pairgrp;
run;
Data CR.Pairs3;
set CR.pairs2;
rename COL1=InviterID COL2=InvitedID;
run;
But I get stuck :-(
I need help with the rest please...
Has anyone else done this type of random pairing successfully before? I am grasping at straws here...
Any help much appreciated.
Len
Here is my idea. It is far from efficient, especially when NOBS gets large, because a Cartesian product is involved. I also cheated on the odd number of members by adding a dummy row in that case.
Prepare the data and generate an empty result table.
Create a list of all possible pairings (combinations), excluding recent pairings.
Sort the list randomly and descend through it until every member has been picked once.
Append the picks to the result table.
There is a drawback: some members might not get a pairing because all of their possible partners have already been picked. To avoid that, we can iterate until we get the maximum number of pairings.
EDIT: Added iteration. The program now draws randomly until everyone is matched or a threshold on the number of tries is reached.
This problem should probably be implemented in a matrix-oriented language like IML or R.
data Participants(Keep=ID Name) ;
set SASHelp.class nobs = num_peeps ;
ID=_n_ ;
output ;
if _n_ = 1 and mod(num_peeps,2) then do ; /* get even number of members: empty ID to pair with last participant*/
name = 'empty' ;
id = 0 ;
output ;
end ;
run ;
data list_of_meetings ; /* one all-missing row; it is filtered out again when results are appended */
  length iteration InviterID InvitedID 8 ;
run ;
/****
iter  = number of club meetings
hist  = length of memory for pairings
tries = number of iterations to pair everyone
****/
%macro loop_coffee (iter=, hist=6, tries=10) ;
  %local i j base all_pairs ;
  proc sql noprint ;
    select max(0,max(iteration)) + 1 into :base
    from list_of_meetings ;
  quit ;
  %do i = &base. %to &iter. ; /* loop through number of meetings */
    /* get memory of pairings */
    proc sort data = list_of_meetings (where=(iteration >= &i - &hist )) out = lookup nodupkey ;
      by InviterID InvitedID ;
    run ;
    proc sql ; /* list all acceptable pairs */
      create table all_pairs as
      select a.ID as InviterID, b.ID as InvitedID
      from Participants a
      inner join Participants b
        on a.ID lt b.ID
      left join lookup c /* exclude the memory */
        on a.ID eq c.InviterID and b.ID eq c.InvitedID
      where c.InviterID is NULL ;
    quit ;
    %let j = 0 ;
    %let all_pairs = 0 ;
    %do %until (&all_pairs | &j > &tries) ; /* iterate and random sort until all members are paired */
      %let j = %eval( &j + 1 ) ;
      data all_pairs ;
        set all_pairs ;
        randnum = ranuni(12345 + &i + &j) ;
      run ;
      proc sort data = all_pairs ; by randnum ; run ; /* random sort */
      data out_pairs ; /* select the pairs: no. of IDs/2 */
        declare hash h() ;
        h.defineKey("ID") ;
        h.defineDone() ;
        do until ( eof1 ) ;
          set Participants (keep= ID) end = eof1 ;
          rc = h.add () ; /* populate list of members */
        end ;
        do until ( eof2 ) ;
          set all_pairs (keep= InviterID InvitedID) end = eof2 ;
          rc1 = h.check (key:InviterID) ;
          rc2 = h.check (key:InvitedID) ;
          if rc1 = 0 and rc2 = 0 then do ;
            rc = h.remove (key:InviterID) ; /* delete member from list if paired */
            rc = h.remove (key:InvitedID) ;
            output ;
          end ;
          if h.num_items = 0 then do ;
            call symputx('all_pairs', 1 ) ; /* SYMPUTX avoids the numeric-to-character conversion note */
            stop ;
          end ;
        end ;
        stop ;
        keep InviterID InvitedID ;
      run ;
    %end ;
    data list_of_meetings ;
      set list_of_meetings (where=(iteration ne .))
          Out_pairs (in=pairs) ;
      if pairs then iteration = &i. ;
    run ;
  %end ;
%mend ;
%loop_coffee (iter=10,hist=6,tries=10) ;
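To sanity-check a result like this against the six-month rule, one could look for pairs that meet again within the history window. The sketch below runs on a small hypothetical stand-in dataset (MEETINGS_DEMO) rather than on LIST_OF_MEETINGS itself. It assumes, as in the macro above, that each pair is always stored with the smaller ID as InviterID.

```sas
/* Hypothetical stand-in for list_of_meetings, for illustration only */
data meetings_demo ;
  input iteration InviterID InvitedID ;
  datalines ;
1 1 2
1 3 4
4 1 2
9 3 4
;
run ;

%let hist = 6 ;

/* Any row in REPEAT_CHECK is a pair that met again within &hist meetings */
proc sql ;
  create table repeat_check as
  select a.InviterID, a.InvitedID,
         a.iteration as first_meeting,
         b.iteration as next_meeting
  from meetings_demo a
  inner join meetings_demo b
    on  a.InviterID = b.InviterID
    and a.InvitedID = b.InvitedID
    and a.iteration < b.iteration
  where b.iteration - a.iteration <= &hist ;
quit ;
```

In the demo data the pair (1, 2) meets in iterations 1 and 4 and is flagged, while (3, 4) meets in iterations 1 and 9 and is not. Pointed at the real result table, an empty REPEAT_CHECK would suggest the rule held.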

In SAS, how to execute a variable string as an if-statement (var = "if x = y")?

I have a data set with a variable named "Condition" that I want to use in the code. I'm guessing I need to do it in a macro but I'm still learning how to write macros in SAS.
So if my data set is this:
Question,Answer,Condition,Result
Q1,1,Answer=1," "
Q2,2,Answer=1," "
Q3,3,Answer=4," "
Then I want the program to take the Condition variable as a string and then use it as an if statement:
if Condition then Result = "Correct";
Is this possible?
That is not easy to do. For your simple example you could do:
data want ;
set have ;
if cats('Answer=',answer) = condition then ....
But that will not generalize to situations where CONDITION references the values of other variables. You might be able to generate code from a set of unique values of CONDITION.
Sample data:
data have ;
infile cards dsd truncover ;
input Question $ Answer Condition :$30. Expected $ ;
cards;
Q1,1,Answer=1,"Correct"
Q2,2,Answer=1,"Wrong"
Q3,3,Answer=4,"Wrong"
;;;;
Generate code using unique values of CONDITION.
filename code temp ;
data _null_;
set have end=eof ;
by condition ;
file code ;
if _n_=1 then put 'SELECT ;' ;
if first.condition then put ' WHEN (' CONDITION= :$quote. ' AND (' condition ')) RESULT="CORRECT" ;' ;
if eof then put ' OTHERWISE RESULT="WRONG";'
/ 'END;'
;
run;
Use the generated code in a data step.
data want ;
set have ;
%inc code / source2;
run;
Sample log records:
252 data want ;
253 set have ;
254 %inc code / source2;
255 +SELECT ;
256 + WHEN (Condition="Answer=1" AND (Answer=1 )) RESULT="CORRECT" ;
257 + WHEN (Condition="Answer=4" AND (Answer=4 )) RESULT="CORRECT" ;
258 + OTHERWISE RESULT="WRONG";
259 +END;
NOTE: %INCLUDE (level 1) ending.
260 run;
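The generation step above uses BY CONDITION, which assumes HAVE is already sorted by CONDITION. If that cannot be relied on, a variant along these lines could extract the distinct conditions first. This is a sketch only: the dataset name CONDITIONS is made up, and the sample rows are deliberately unsorted.

```sas
/* Sample data as above, but deliberately not sorted by CONDITION */
data have ;
  infile cards dsd truncover ;
  input Question $ Answer Condition :$30. Expected $ ;
cards ;
Q1,3,Answer=4,"Wrong"
Q2,1,Answer=1,"Correct"
Q3,2,Answer=1,"Wrong"
;
run ;

/* One row per distinct condition (CONDITIONS is a hypothetical name) */
proc sort data=have(keep=Condition) out=conditions nodupkey ;
  by Condition ;
run ;

/* Generate one WHEN clause per distinct condition */
filename code temp ;
data _null_ ;
  set conditions end=eof ;
  file code ;
  if _n_ = 1 then put 'SELECT ;' ;
  put '  WHEN (' Condition= :$quote. ' AND (' Condition ')) RESULT="CORRECT" ;' ;
  if eof then put '  OTHERWISE RESULT="WRONG" ;' / 'END ;' ;
run ;

data want ;
  set have ;
  %inc code / source2 ;
run ;
```

Because the WHEN clauses are generated from the de-duplicated list, the order of rows in HAVE no longer matters.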

Polynomials in character form to numeric (SAS)

I have a SAS dataset which contains one column of polynomials. For example, X1**(-2)+X1**(2).
Is there a function to transform this into a numeric expression?
Many thanks,
If I understand you correctly, I don't think there is a specific function that will easily let you do this. You have two options - write your own logic to interpret the polynomial expressions, or use call execute to have SAS write out a (potentially very long) data step for you, assuming that the polynomials are all entered as valid data step code. Here's a call execute approach:
data have;
  infile datalines truncover; /* INFILE placed before INPUT so that TRUNCOVER applies to the read */
  input x1 polynomial $255.;
  datalines;
1 X1**(-2)+X1**(2)
2 X1**(-1)+X1**(1)
3 X1**(1)+X1**(-1)
;
run;
data _null_;
set have end = eof;
if _n_ = 1 then call execute('data want; set have; select(_n_);');
call execute(catx(' ','when(',_N_,') y =',polynomial,';'));
if eof then call execute('end; run;');
run;
Convert them to macro variables, and then resolve them into a calculation...
Using the dataset example in user667489's answer :
/* Create numbered macro variables, 1 per row of data */
data _null_ ;
set have end=eof ;
call symputx(cats('POLY',_n_),polynomial) ;
if eof then call symputx('POLYN',_n_) ;
run ;
%MACRO ROWLOOPER ;
%DO N = 1 %TO &POLYN ;
if _n_ = &N then result = &&POLY&N ;
%END ;
%MEND ;
data want ;
set have ;
/* Not very efficient, looping over all polynomials on each row of data */
/* So for 3 rows, you'll perform 9 iterations here */
%ROWLOOPER ;
run ;
Or, alternatively, write your dataset out into a SAS program, and %inc that program:
data _null_ ;
  file "polynomials.sas" ;
  set have end=eof ;
  if _n_ = 1 then do ;
    put "data poly;" ;
    put " set have;" ;
  end ;
  /* One guarded assignment per row, so each row is evaluated with its own polynomial */
  put " if _n_ = " _n_ "then result = " polynomial ";" ;
  if eof then put "run;" ;
run ;
%inc "polynomials.sas" ;

Summing a table with an unknown number of variables?

I'm fairly new with SAS. I've used it a bit in the past but am really rusty.
I've got a table that looks like this:
Key  Group1  Metric1  Group2  Metric2  Group3  Metric3
1            .        r       20               .
1            .                .        t       3
For several unique keys.
I want everything to appear on one row so it looks like.
Key  Group1  Metric1  Group2  Metric2  Group3  Metric3
1            .        r       20       t       3
Another wrinkle is I don't know how many group and metric columns I'll have (although I'll always have the same number).
I'm not sure how to approach this. I'm able to get a list of column names and use them in a macro; I'm just not sure which proc or data step function I need to use to collapse everything down. I would be extremely grateful for any suggestions.
There's a very simple way to do this using a nice trick. I've answered similar questions on this before. This should achieve exactly what you're after.
You can use 2 temporary arrays (one for the character variables, and another for the numeric), and fill them with the non-blank values accordingly. When you reach last.key, you can load the temporary arrays back into the source variables.
If you know the maximum length of the character variables in advance, you can hard code it, but if not you can determine it dynamically.
This assumes that for each key, each variable is only populated once. Otherwise it will take the last value it sees for a particular variable within each key.
%LET LIB = work ;
%LET DSN = mydata ;
%LET KEYVAR = key ;
/* Get column name/type/max length */
proc sql noprint ;
  /* Numerics */
  select name, count(name) into :NVARNAMES separated by ' ', :NVARNUM
  from dictionary.columns
  where libname = upcase("&LIB")
    and memname = upcase("&DSN")
    and upcase(name) ^= upcase("&KEYVAR") /* NAME keeps its original case, so compare in upper case */
    and type = 'num' ;
  /* Characters */
  select name, count(name), max(length) into :CVARNAMES separated by ' ', :CVARNUM, :CVARLEN
  from dictionary.columns
  where libname = upcase("&LIB")
    and memname = upcase("&DSN")
    and upcase(name) ^= upcase("&KEYVAR")
    and type = 'char' ;
quit ;
data flatten ;
  set &LIB..&DSN ;
  by &KEYVAR ;
  array n{&NVARNUM} &NVARNAMES ;
  array nt{&NVARNUM} _TEMPORARY_ ;
  array c{&CVARNUM} &CVARNAMES ;
  array ct{&CVARNUM} $ &CVARLEN _TEMPORARY_ ;
  /* Temporary arrays are retained automatically, so no RETAIN statement is needed */
  if first.&KEYVAR then do ;
    call missing(of nt{*}, of ct{*}) ;
  end ;
  /* Load non-missing numeric values into temporary array */
  do i = 1 to dim(n) ;
    if not missing(n{i}) then nt{i} = n{i} ;
  end ;
  /* Load non-missing character values into temporary array */
  do i = 1 to dim(c) ;
    if not missing(c{i}) then ct{i} = c{i} ;
  end ;
  if last.&KEYVAR then do ;
    /* Load numeric back into original variables */
    call missing(of n{*}) ;
    do i = 1 to dim(n) ;
      n{i} = nt{i} ;
    end ;
    /* Load character back into original variables */
    call missing(of c{*}) ;
    do i = 1 to dim(c) ;
      c{i} = ct{i} ;
    end ;
    output ;
  end ;
  drop i ;
run ;
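As a possible alternative to the array approach, the UPDATE statement can collapse rows per key when a dataset is applied to an empty copy of itself, because by default missing values in the transaction dataset do not overwrite existing ones. The sketch below makes several assumptions: the sample rows are invented to match the shape of the table in the question, the output name FLATTEN_UPD is arbitrary, and the data must already be sorted by the key. As with the array version, if a variable is populated more than once within a key, the last non-missing value wins.

```sas
/* Hypothetical sample data shaped like the table in the question */
data mydata ;
  infile datalines dsd truncover ;
  input Key Group1 $ Metric1 Group2 $ Metric2 Group3 $ Metric3 ;
  datalines ;
1,,,r,20,,
1,,,,,t,3
2,a,5,,,,
2,,,b,7,,
;
run ;

/* Master = empty copy of the data, transaction = the data itself */
data flatten_upd ;
  update mydata(obs=0) mydata ;
  by Key ;
run ;
```

This needs no knowledge of how many Group/Metric columns exist, since UPDATE works on whatever variables the dataset contains.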