Categorization of non-numeric data using SAS


I wonder if there is a way to categorize data when it is not numeric. Is there a way to specify all the conditions for the if-then statements in a single line?
Here is part of my code.
data new;
set old;
if target EQ 'purchase|laboratory|dept' then category = 'internal';
if target EQ 'purchase|office|member' then category ='internal';
if target EQ 'purchase|floor|ext' then category='external';
run;

Kul:
You can use if / then / else logic to perform all the assignments in a single chained statement:
if target EQ 'purchase|laboratory|dept' then category = 'internal'; else
if target EQ 'purchase|office|member' then category ='internal'; else
if target EQ 'purchase|floor|ext' then category='external';
Long runs of if / then / else can be stated equivalently in a select statement:
select (target);
when ('purchase|laboratory|dept') category = 'internal';
when ('purchase|office|member') category = 'internal';
when ('purchase|floor|ext') category = 'external';
otherwise category = 'other';
end;
The parentheses are required.
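If several values map to the same category, the IN operator is another way to get each category onto a single line. This is only a minimal sketch, assuming the same target values as in the question; the length statement and the 'other' fallback are additions for illustration, not part of the original code.

```sas
data new;
  set old;
  length category $8;  /* assumed length; long enough for internal/external */
  /* one line per category: list every matching value inside IN (...) */
  if target in ('purchase|laboratory|dept', 'purchase|office|member') then category = 'internal';
  else if target = 'purchase|floor|ext' then category = 'external';
  else category = 'other';  /* assumed fallback for unlisted values */
run;
```

This keeps the mapping in source code, so it suits short lists; for long lists the format approach is easier to maintain.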
A custom format can also be used for the case of many values mapping to a few category values, and for the case of processing target alone as a formatted categorical (for example class target; format target $target_cat.;). The benefit is that the mapping is stored as data instead of as SAS source code.
* mapping data;
data target_categories;
length target $60 category $20;
* the & modifier requires at least two blanks between the two values;
input target & category; datalines;
purchase|laboratory|dept  internal
purchase|office|member  internal
purchase|floor|ext  external
run;
* conform mapping data to proc format cntlin= requirements;
data format_data;
set target_categories;
start = target;
label = category;
fmtname = '$target_cat';
run;
* construct custom format;
proc format cntlin=format_data;
run;
Sample data
data old;
do x = 1 to 20;
target = 'purchase|laboratory|dept'; output;
target = 'purchase|office|member'; output;
target = 'purchase|floor|ext'; output;
end;
run;
Apply format using put
data new;
set old;
category = put (target,$target_cat.);
run;
You can also process the data without creating a second variable. For example:
proc tabulate data=old;
class target;
format target $target_cat.; * reported values will show internal/external;
table target, n / nocellmerge;
run;

Related

SAS 9.4: How to use different weights on the same variable without datastep or proc sql

I can't find a way to summarize the same variable using different weights.
I try to explain it with an example (of 3 records):
data pippo;
a=10;
wgt1=0.5;
wgt2=1;
wgt3=0;
output;
a=3;
wgt1=0;
wgt2=0;
wgt3=1;
output;
a=8.9;
wgt1=1.2;
wgt2=0.3;
wgt3=0.1;
output;
run;
I tried the following:
proc summary data=pippo missing nway;
var a /weight=wgt1;
var a /weight=wgt2;
var a /weight=wgt3;
output out=pluto (drop=_freq_ _type_) sum()=;
run;
Obviously it gives me a warning because I used the same variable "a" more than once (and I can't rename it!).
I have to store a huge amount of data with limited disk space, and otherwise I would have to construct about 120 fields (a0-a6, b0-b6, etc.) that are the same variables with fixed weights (wgt0-wgt5).
I want to store a dataset with 20 columns (a, b, c, ...) and 6 weights (wgt0-wgt5) and, on demand, run a "summary" without an intermediate data step that obliges me to create 120 fields.
Because of the volume of data (roughly 55 GB every month), I would also prefer not to use a proc sql statement such as:
proc sql;
create table pluto
as select sum(db.a * wgt1) as a0, sum(db.a * wgt1) as a1 , etc.
quit;
Is there a "super proc summary" that can summarize the same field with different weights?
Thanks in advance,
Paolo

I think there are a few options. One is the data step view that data_null_ mentions. Another is just running the proc summary however many times you have weights, and either using ods output with the persist=proc or 20 output datasets and then setting them together.
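The second option (one PROC SUMMARY pass per weight, then combining the results) could be sketched roughly as follows. This is a minimal sketch, assuming the pippo data set from the question; the macro name weighted_sums, its parameters, and the _sum1-_sum3 work data sets are hypothetical names chosen for illustration. Note that it reads the input once per weight, which matters at 55 GB.

```sas
%macro weighted_sums(data=pippo, var=a, nweights=3);
  %local k;
  /* one pass over the data per weight variable wgt1..wgtN */
  %do k = 1 %to &nweights;
    proc summary data=&data;
      var &var / weight=wgt&k;
      output out=_sum&k (drop=_type_ _freq_) sum=&var&k;
    run;
  %end;
  /* each _sumK has a single row, so a MERGE without BY lines them up side by side */
  data pluto;
    merge %do k = 1 %to &nweights; _sum&k %end; ;
  run;
%mend weighted_sums;

%weighted_sums()
```

The result is one row with a1, a2 and a3 holding the weighted sums of a under each weight.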
A third option, though, is to roll your own summarization. This is advantageous in that it only sees the data once - so it's faster. It's disadvantageous in that there's a bit of work involved and it's more complicated.
Here's an example of doing this with sashelp.baseball. In your actual case you'll want to use code to generate the array reference for the variables, and possibly for the weights, if they're not easily creatable using a variable list or similar. This assumes you have no CLASS variable, but it's easy to add that into the key if you do have a single (set of) class variable(s) that you want NWAY combinations of only.
data test;
set sashelp.baseball;
array w[5];
do _i = 1 to dim(w);
w[_i] = rand('Uniform')*100+50;
end;
output;
run;
data want;
set test end=eof;
i = .;
length varname $32;
sumval = 0 ;
sum=0;
if _n_ eq 1 then do;
declare hash h_summary(suminc:'sumval',keysum:'sum',ordered:'a');
h_summary.defineKey('i','varname'); *also would use any CLASS variable in the key;
h_summary.defineData('i','varname'); *also would include any CLASS variable in the data;
h_summary.defineDone();
end;
array w[5]; *if weights are not named in easy fashion like this generate this with code;
array vars[*] nHits nHome nRuns; *generate this with code for the real dataset;
do i = 1 to dim(w);
do j = 1 to dim(vars);
varname = vname(vars[j]);
sumval = vars[j]*w[i];
rc = h_summary.ref();
if i=1 then put varname= sumval= vars[j]= w[i]=;
end;
end;
if eof then do;
rc = h_summary.output(dataset:'summary_output');
end;
run;
One other thing to mention though... if you're doing this because you're doing something like jackknife variance estimation or that sort of thing, or anything that uses replicate weights, consider using PROC SURVEYMEANS which can handle replicate weights for you.

You can SCORE your data set using a customized SCORE data set that you can generate with a data step.
options center=0;
data pippo;
retain a 10 b 1.75 c 5 d 3 e 32;
run;
data score;
if 0 then set pippo;
array v[*] _numeric_;
retain _TYPE_ 'SCORE';
length _name_ $32;
array wt[3] _temporary_ (.5 1 .333);
do i = 1 to dim(v);
call missing(of v[*]);
do j = 1 to dim(wt);
_name_ = catx('_',vname(v[i]),'WGT',j);
v[i] = wt[j];
output;
end;
end;
drop i j;
run;
proc print;
run;
proc score data=pippo score=score;
id a--e;
var a--e;
run;
proc print;
run;
proc means stackods sum;
ods exclude summary;
ods output summary=summary;
run;
proc print;
run;

how to transpose data with multiple occurrences in sas

I have a 2 column dataset - accounts and attributes, where there are 6 types of attributes.
I am trying to use PROC TRANSPOSE to turn the 6 different attributes into 6 new columns, set to 1 where the account has that attribute and 0 where it doesn't.

This answer shows two approaches:
Proc TRANSPOSE, and
array based transposition using index lookup via hash.
If every account happens to lack the same attribute, the data alone cannot reveal the full set of attributes. Ideally the allowed or expected attributes should be listed in a separate table that is used as part of your data reshaping.
Proc TRANSPOSE
When working with a table of only account and attribute you will need to construct a view adding a numeric variable that can be transposed. After TRANSPOSE the result data will have to be further massaged, replacing missing values (.) with 0.
Example:
data have;
call streaminit(123);
do account = 1 to 10;
do attribute = 'a','b','c','d','e','f';
if rand('uniform') < 0.75 then output;
end;
end;
run;
data stage / view=stage;
set have;
num = 1;
run;
proc transpose data=stage out=want;
by account;
id attribute;
var num;
run;
data want;
set want;
array attrs _numeric_;
do index = 1 to dim(attrs);
if missing(attrs(index)) then attrs(index) = 0;
end;
drop index;
run;
proc sql;
drop view stage;
quit;
Advanced technique - Array and Hash mapping
In some cases Proc TRANSPOSE is deemed unusable by the coder or operator, perhaps because there are very many by groups and very many attributes. An alternate way to transpose attribute values into like-named flag variables is to code:
Two scans
Scan 1 determine attribute values that will be encountered and used as column names
Store list of values in a macro variable
Scan 2
Arrayify the attribute values as variable names
Map values to array index using hash (or custom informat per @Joe)
Process each group. Set arrayed variable corresponding to each encountered attribute value to 1.  Array index obtained via lookup through hash map.
Example:
* pass #1, determine attribute values present in data, the values will become column names;
proc sql noprint;
select distinct attribute into :attrs separated by ' ' from have;
* or make list of attributes from table of attributes (if such a table exists outside of 'have');
* select distinct attribute into :attrs separated by ' ' from attributes;
%put NOTE: &=attrs;
* pass #2, perform array based transposition;
data want2(drop=attribute);
* prep pdv, promulgate by group variable attributes;
if 0 then set have(keep=account);
array attrs &attrs.;
format &attrs. 4.;
if _n_=1 then do;
declare hash attrmap();
attrmap.defineKey('attribute');
attrmap.defineData('_n_');
attrmap.defineDone();
do _n_ = 1 to dim(attrs);
attrmap.add(key:vname(attrs(_n_)), data: _n_);
end;
end;
* preset all flags to zero;
do _n_ = 1 to dim(attrs);
attrs(_n_) = 0;
end;
* DOW loop over by group;
do until (last.account);
set have;
by account;
attrmap.find(); * lookup array index for attribute as column;
attrs(_n_) = 1; * set flag for attribute (as column);
end;
* implicit output one row per by group;
run;

One other option for doing this not using PROC TRANSPOSE is the data step array technique.
Here, I have a dataset that hopefully matches yours approximately. ID is probably your account, Product is your attribute.
data have;
call streaminit(2007);
do id = 1 to 4;
do prodnum = 1 to 6;
if rand('Uniform') > 0.5 then do;
product = byte(96+prodnum);
output;
end;
end;
end;
run;
Now, here we transpose it. We make an array with the six variables that could occur in HAVE. Then we iterate through the array to see if that variable is there. You can add a few additional lines to the if first.id block to set all of the variables to 0 instead of missing initially (I think missing is better, but YMMV).
data want;
set have;
by id;
array vars[6] a b c d e f;
retain a b c d e f;
if first.id then call missing(of vars[*]);
do _i = 1 to dim(vars);
if lowcase(vname(vars[_i])) = product then
vars[_i] = 1;
end;
if last.id then output;
run;
We could do it a lot faster if we knew how the dataset was constructed, of course.
data want;
set have;
by id;
array vars[6] a b c d e f;
if first.id then call missing(of vars[*]);
retain a b c d e f;
vars[rank(product)-96]=1;
if last.id then output;
run;
Your data doesn't really work that way, but you could make an informat that does the same mapping.
*First we build an informat relating the product to its number in the array order;
proc format;
invalue arrayi
'a'=1
'b'=2
'c'=3
'd'=4
'e'=5
'f'=6
;
run;
*Now we can use that!;
data want;
set have;
by id;
array vars[6] a b c d e f;
if first.id then call missing(of vars[*]);
retain a b c d e f;
vars[input(product,arrayi.)]=1;
if last.id then output;
run;
This last one is probably the absolute fastest option - most likely much faster than PROC TRANSPOSE, which tends to be one of the slower procs in my book, but at the cost of having to know ahead of time what variables you're going to have in that array.

Designing new RK number for unique record

I am a SAS developer. I am starting a project that requires me to assign an RK number to each unique record. Every extraction will contain some data that already exists in the target table and some that does not.
For example.
Source Data:
Name
A
B
C
D
E
Target Table:
Name RK
A 1
B 2
C 3
When I load, I want to insert D and E into the target table with RK 4 and 5 respectively. Currently, I can think of doing a hash lookup from the source against the target table. For data that is not matched by the hash object, the RK field will be blank. I would then take the max RK number from the target table and increment it by 1 for each of D and E as they are appended.
I am not sure if this is the most efficient way of doing so. Is there another more efficient way?

You could use a hash to determine whether some name (I'll call it value) already exists in the target table. However, new keys would have to be tracked, output at the end of the step, and then PROC APPEND'ed to the target table (I'll call it master).
For the case of just updating the master table with new RK values, a traditional SAS approach is to use a DATA step to MODIFY a unique keyed master table. The coding pattern is:
SET <source>
MODIFY <master> KEY=<value> / UNIQUE;
... _IORC_ logic ...
Example:
%* Create some source data and the master table;
data have1 have2 have3 have4 have5;
call streaminit(123);
value = 2020; output; output; output;
do _n_ = 1 to 2500;
value = ceil(5000 * rand('uniform')); * random integer from 1 to 5000;
select;
when (rand('uniform') < 0.20) output have1;
when (rand('uniform') < 0.20) output have2;
when (rand('uniform') < 0.20) output have3;
when (rand('uniform') < 0.20) output have4;
otherwise output have5;
end;
end;
run;
data have6;
do _n_ = 1 to 20;
value = 2020;
output;
end;
run;
* Create the unique keyed master table;
* Typically done once and stored in a permanent library.;
proc sql;
create table keys (value integer, RK integer);
create unique index value on work.keys (value);
quit;
%* A macro for adding new RK values as needed;
%macro RK_ASSIGN(master, data);
%local last;
proc sql noprint;
select max(RK) into :last trimmed from &master;
quit;
data &master;
retain newkey %sysevalf(0&last+0); %* trickery for 1st use case when max(RK) is .;
set &data;
modify &master key=value / unique;
if _iorc_ eq %sysrc(_DSENOM);
newkey + 1;
RK = newkey;
output;
_error_ = 0;
run;
%mend;
%* Use the macro to process source data;
%RK_ASSIGN(keys,have1)
%RK_ASSIGN(keys,have2)
%RK_ASSIGN(keys,have3)
%RK_ASSIGN(keys,have4)
%RK_ASSIGN(keys,have5)
%RK_ASSIGN(keys,have6)
You can see that the forced repeats of the 2020 value in the source data are only RK'd once in the master table, and there are no errors during processing.
If you want to backfill the source data with the found or assigned RK value there would be additional steps. You could update a custom format, or do a traditional left join. If you want to focus on backfill during a read over source data the HASH step + APPEND new RK's step might be preferable.
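The "traditional left join" backfill could look roughly like this. It is a minimal sketch, assuming the keys master table and the have1 source table from the example above, run after %RK_ASSIGN(keys,have1); the output name have1_rk is a hypothetical name chosen for illustration.

```sas
proc sql;
  /* attach the found or newly assigned RK to every source row */
  create table have1_rk as
  select s.*, m.RK
  from have1 as s
  left join keys as m
    on s.value = m.value;
quit;
```

Because the master table is uniquely keyed on value, each source row matches at most one master row, so the join does not multiply rows.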
Example 2 Master table is named values
HASH version with RK assignment added to source data. New RKs output and appended.
proc sql;
create table values (value integer, RK integer);
create unique index value on work.values (value);
quit;
%macro RK_HASH_ASSIGN(master,data);
%local last;
proc sql noprint;
select max(RK) into :last trimmed from &master;
quit;
data &data(drop=next_RK);
set &data end=end;
if _n_ = 1 then do;
declare hash lookup (dataset:"&master");
lookup.defineKey("value");
lookup.defineData("value", "RK");
lookup.defineDone();
declare hash newlookup (dataset:"&master(obs=0)");
newlookup.defineKey("value");
newlookup.defineData("value", "RK");
newlookup.defineDone();
end;
retain next_RK %sysevalf(0&last+0); %* trick;
* either load existing RK from hash, or compute and apply next RK value;
if lookup.find() ne 0 then do;
next_RK + 1;
RK = next_RK;
lookup.add();
newlookup.add();
end;
if end then do;
newlookup.output(dataset:'work.newmasters');
end;
run;
proc append base=&master data=work.newmasters;
proc delete data=work.newmasters;
run;
%mend;
%RK_HASH_ASSIGN(values,have1)
%RK_HASH_ASSIGN(values,have2)
%RK_HASH_ASSIGN(values,have3)
%RK_HASH_ASSIGN(values,have4)
%RK_HASH_ASSIGN(values,have5)
%RK_HASH_ASSIGN(values,have6)
%* Compare the two assignment strategies, no differences!;
proc sort force data=values(index=(value));
by RK;
run;
proc compare noprint base=keys compare=values out=diffs outnoequal;
by RK;
run;
----- LOG -----
2525 proc compare noprint base=keys compare=values out=diffs
outnoequal <------------- do not output when data is identical ;
;
2526 by RK;
2527 run;
NOTE: There were 215971 observations read from the data set WORK.KEYS.
NOTE: There were 215971 observations read from the data set WORK.VALUES.
NOTE: The data set WORK.DIFFS has 0 observations and 4 variables. <--- all the same ---
NOTE: PROCEDURE COMPARE used (Total process time):
real time 0.25 seconds
cpu time 0.26 seconds

I have a bit of a complex question regarding writing code for many variables that aren’t identical but contain the same prefix for each

Ok so here it goes,
I am working with a dataset containing Minimum Inhibitory Concentration (MIC) values for different antibiotics (About 30 different antibiotics). Each antibiotic has MIC values from different test-types and interpretations for each of those MICs.
Example:
The MIC variables for the antibiotic Amikacin have a common mnemonic suffix, AMK:
micmrAMK
interpmrAMK
micmsAMK
interpmsAMK
micvkAMK
interpvkAMK
micpxAMK
interppxAMK
micetAMK
interpetAMK
ALL the antibiotics have variables similar to the above (i.e. the micmr, micms, interpmr, etc. prefix is the same for each antibiotic; the only thing that changes is the last few letters, which correspond to the antibiotic name).
I am attempting to validate these data, I have a list of valid MIC values for each type of test. Is there a way to write a program that will check all the variables that start with “mic” so that I don’t have to specify each individual variable name?

I'm not a microbiologist, but I'm guessing your variable name construct carries a lot of information.
Variable name construct:
<mic|interp><susceptibility><antibiotic>
<mic|interp>
mic - minimum inhibitory concentration
interp - susceptibility interpretation
<susceptibility> - <mr|ms|vk|px|et>
mr - resistant
ms - sensitive
vk -
px -
et -
<antibiotic>
There are two approaches to validating the presence of MIC related variable names in the data set:
Way #1 - List a comparison of the data set variables to a constructed list of variables. The list is based on a pre-specified list of antibiotics, OR
Way #2 - Deconstruct the data set variable names into MIC variable name parts. Report on the data set variables and any possibly missing MIC variables.
Examples:
Simulate MIC data set - Make a data set with some MIC variable names
* simulate some data;
data have;
do sampleid = 1 to 1000;
length instrumentid $20.;
format rundate yymmdd10.;
length operator $10.;
array construct_names
micmrMarbo
interpetamk micetamk interpmramk micmramk interpmsamk micmsamk
interppxamk micpxamk interpvkamk micvkamk
interpetimi micetimi interpmrimi micmrimi interpmsimi micmsimi
interppximi micpximi interpvkimi micvkimi
interpmsfubar micmsfubar
interppxfubar micpxfoobar
;
do over construct_names;
construct_names = round(rand("normal", 50,9), 0.25);
end;
output;
end;
run;
Get metadata
* get data set variable names as data;
proc contents noprint data=have out=have_names(keep=varnum name);
run;
Way #1
* compute variable names for expected MIC naming constructs;
* match only expected antibody variables;
data expect_names(keep=sequence name);
* load arrays with construct parts;
array part1(2) $6 ('mic', 'interp');
array part2(5) $2 ('mr', 'ms', 'vk', 'px', 'et');
array part3(4) $10 ('AMK', 'IMIP', 'TOBI', 'TYPO'); /* 4 expected antibodies */
* construct expected names;
do part3_index = 1 to dim(part3);
do part2_index = 1 to dim(part2);
do part1_index = 1 to dim(part1);
sequence + 1;
name = cats(part1[part1_index], part2[part2_index], part3[part3_index]);
output;
end;
end;
end;
run;
* Way 1 data validation: compare data variable names to expectations;
proc sql;
create table name_comparison as
select
varnum,
coalesce(have.name,expect.name) as name,
case
when have.name is null and expect.name is not null then 'Expected MIC variable was not in the data set'
when have.name is not null and expect.name is null then 'NOT a MIC variable construct'
else 'OK'
end as status
from have_names as have
full join expect_names as expect
on upper(have.name) eq upper(expect.name)
order by have.varnum, expect.sequence
;
quit;
ods html file='compare.html' style=plateau;
proc print data=name_comparison;
var varnum;
var name / style=[fontfamily=monospace];
var status;
run;
ods html close;
The report is a simple listing showing how each variable name was evaluated.
Way #2
Deconstruct data set variable names and color coded grid report.
* Compute construct parts and check for completeness;
proc sql;
create table part1 (
order num, mnemonic char(6), meaning char(200)
);
insert into part1
values (1, 'mic', 'minimum inhibitory concentration')
values (2, 'interp', 'susceptibility interpretation')
;
create table part2 (
order num, mnemonic char(6), meaning char(200)
);
insert into part2
values (1, 'mr', '??')
values (2, 'ms', '??')
values (3, 'vk', '??')
values (4, 'px', '??')
values (5, 'et', '??')
;
create table mic_name_prefixes as
select
part1.order as part1z format=2.
, part1.mnemonic as part1
, part2.order as part2z format=2.
, part2.mnemonic as part2
, cats(part1.mnemonic,part2.mnemonic) as prefix
from part1 cross join part2
;
create table antibodies(label="Extract antibody from variable names with proper prefix") as
select
substr(upper(name),length(prefix)+1) as antibody
, min(varnum) as abz format=6.
from have_names
join mic_name_prefixes
on upper(name) like upper(cats(prefix,'%'))
group by antibody
order by abz
;
* sub select CROSS JOIN for complete grid;
* FULL JOIN for complete comparison;
create table name_grid_data as
select
abz, part1z, part2z
, grid.part1, grid.part2, grid.antibody
, coalesce(grid.name,have.name) as varname length=32
, not missing(have.name) as expected_found format=1.
from
( select PREFIX.*, AB.*, cats(part1,part2,antibody) as name
from mic_name_prefixes PREFIX
cross join antibodies AB
) as grid
full join have_names as have
on upper(have.name) = upper(grid.name)
order by
coalesce(abz,have.varnum+1e6), part2z, part1z
;
reset noprint;
select count(distinct antibody) into :abcount trimmed from name_grid_data;
select count(distinct 0) into :abmissing trimmed from name_grid_data where missing(antibody);
%let abcount = %eval(&abcount + &abmissing);
%put NOTE: &=abcount;
%macro cols (from,to);
/* needed for array statement in compute block */
%local index;
%do index = &from %to &to;
_c&index._
%end;
%mend;
ods html file = 'mic_names.html';
proc report data=name_grid_data spanrows missing;
column
part1 part2
antibody,varname /* 'display var under across var' trick, display will be shown */
antibody=ab,expected_found /* same trick with ab alias, to get _c#_ column for compute block logic */
placeholder
;
define part1 / group order=data ' ' style=header;
define part2 / group order=data ' ' style=header;
define antibody / across order=data ' ';
define ab / across order=data ' ' noprint; /* NOPRINT, _c#_ available, but not rendered */
define varname / ' ' style=[fontfamily=monospace];
define placeholder / noprint; /* required for 'display under across' trick */
/* right most column has access to all leftward columns */
compute placeholder;
array name_col %cols(3, %eval(2+&abcount)); /* array for _c#_ columns */
array have_col %cols(%eval(3+&abcount), %eval(2+2*&abcount)); /* array for _c#_ columns */
/* conditionally highlight the missing variables */
do index = 1 to &abcount - &abmissing;
if not missing ( name_col(index) ) then do;
if not have_col(index) then
call define (vname(name_col(index)), 'style', 'style=[background=lightred]');
else
call define (vname(name_col(index)), 'style', 'style=[background=lightgreen]');
end;
end;
endcomp;
run;
ods html close;
Color coded grid report

SAS Using Data Set to Create Other Data Sets

I am supposed to create a summary data set containing the mean, median, and standard deviation broken down by gender and group (using the CLASS statement). Using this summary data set, create four other data sets (in one DATA step) as follows:
(1) grand mean
(2) stats broken down by gender
(3) stats broken down by group
(4) stats broken down by gender and group
We were given the hint to use the CHARTYPE option.
I provided my attempted solution, but I don't think I did it in the way asked.
DATA CLINICAL;
*Use LENGTH statement to control the order of
variables in the data set;
LENGTH PATIENT VISIT DATE_VISIT 8;
RETAIN DATE_VISIT WEIGHT;
DO PATIENT = 1 TO 25;
IF RANUNI(135) LT .5 THEN GENDER = 'Female';
ELSE GENDER = 'Male';
X = RANUNI(135);
IF X LT .33 THEN GROUP = 'A';
ELSE IF X LT .66 THEN GROUP = 'B';
ELSE GROUP = 'C';
DO VISIT = 1 TO INT(RANUNI(135)*5);
IF VISIT = 1 THEN DO;
DATE_VISIT = INT(RANUNI(135)*100) + 15800;
WEIGHT = INT(RANNOR(135)*10 + 150);
END;
ELSE DO;
DATE_VISIT = DATE_VISIT + VISIT*(10 + INT(RANUNI(135)*50));
WEIGHT = WEIGHT + INT(RANNOR(135)*10);
END;
OUTPUT;
IF RANUNI(135) LT .2 THEN LEAVE;
END;
END;
DROP X;
FORMAT DATE_VISIT DATE9.;
RUN;
PROC MEANS DATA=CLINICAL;
CLASS GENDER GROUP;
OUTPUT OUT=SUMMARY
MEAN=
MEDIAN=
STDDEV= / AUTONAME;
RUN;

No, what they're asking you to do is:
Use the OUTPUT statement in PROC MEANS to create a summary dataset. Choose the appropriate TYPES and CLASS values in PROC MEANS such that all four sets of data are represented on the output.
Using a single data step that has four dataset names on the data statement, selectively output those rows to the correct dataset. You would use the _TYPE_ variable to determine which dataset a row would be output to.
CHARTYPE just means your _TYPE_ variable will look like 1001 instead of 9 (the binary representation, basically). 1001 indicates which class variables are used (the first and the fourth) to create that breakout. (With only two class variables, the possible values are 00, 01, 10, and 11.) This is sometimes easier for non-programmers who aren't used to thinking in binary (these values would be 0, 1, 2, and 3 in decimal without CHARTYPE, which might make it harder to tell which corresponds to which variable).
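Those two steps could be sketched roughly as follows. This is a minimal sketch, assuming the CLINICAL data set from the question and WEIGHT as the analysis variable; the four output data set names are hypothetical names chosen for illustration.

```sas
proc means data=clinical noprint chartype;
  class gender group;
  var weight;  /* assumed analysis variable */
  output out=summary mean= median= stddev= / autoname;
run;

data grand by_group by_gender by_gender_group;
  set summary;
  /* with CHARTYPE, _TYPE_ is a character string with one digit per class variable; */
  /* the leftmost digit is GENDER and the rightmost is GROUP */
  select (_type_);
    when ('00') output grand;            /* no class variable: grand statistics */
    when ('01') output by_group;         /* GROUP only */
    when ('10') output by_gender;        /* GENDER only */
    when ('11') output by_gender_group;  /* GENDER by GROUP */
    otherwise;
  end;
run;
```

Without NWAY or a TYPES statement, PROC MEANS writes all four _TYPE_ levels to the output data set, so a single data step can route them.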
