Using informats when creating a dataset from another dataset - sas

I've got a dataset that's full of data all in character format.
Now I want to create another dataset from this one and put everything in its correct decimal, date, or character format.
Here's what I'm trying.
data work.testout;
attrib account_open_date informat = mmddyy10.;
do i = 1 to nobs;
set braw.accounts point = i nobs = nobs;
output;
end;
stop;
run;
This gives me:
Variable 'account_open_date' from data set braw.accounts (at line 7 column 21) has a different type (character) to the variable type on the data vector (numeric)
What's the best way of doing this?

You cannot use an informat to convert a variable directly from character to numeric. At least in SAS proper, you cannot convert a variable from character to numeric, period, without using an intermediary. You must do something along the lines of the following:
data want;
set have(rename=(varwant=temp));
varwant=input(temp,MMDDYY10.);
drop temp;
run;
There you rename the (character) variable to a temporary name, then convert it to numeric using INPUT.
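For the original question, where several character columns need different target types, the same rename-and-INPUT pattern can be repeated per column. The following is a minimal sketch, not the asker's actual data: the columns account_id and balance and the sample rows are invented for illustration, and it assumes the dates are stored as mm/dd/yyyy text.

```sas
/* Hypothetical sample data: every column arrives as character */
data have;
  length account_id $8 account_open_date $10 balance $12;
  input account_id $ account_open_date $ balance $;
  datalines;
A001 01/15/2019 1250.75
A002 11/03/2020 80.10
;
run;

data want;
  /* Rename the character columns so the original names are free for the numeric versions */
  set have(rename=(account_open_date=_open_date balance=_balance));
  account_open_date = input(_open_date, mmddyy10.); /* text -> SAS date value */
  balance           = input(_balance, 12.);         /* text -> number */
  format account_open_date mmddyy10. balance 12.2;
  drop _open_date _balance;
run;
```

Columns that should stay character (account_id here) simply pass through the SET statement unchanged.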

Related

SAS character and numeric change with set statement

I am working to merge two data sets and get the following error:
Variable DOB has been defined as both character and numeric.
Here is my code. I know I need a set statement to change the character to numeric. I was thinking:
DATA Merged1;
SET Aug21 Aug22;
RUN;
set (rename=(DOB=DOBnum));
length DOB $ 10.;
DOB= put(DOBnum,f10. -L);
drop DOBnum;
Would this be placed before my Set statement to merge to Aug 21 Aug 22?
Thank you!
I tried to run the code but it would not merge; I'm unsure where the SET statement for DOB would go.
You do not need the second SET statement. You need to add the RENAME= dataset option to the dataset where it is mentioned in the first SET statement.
So something like:
DATA BOTH;
SET Aug21 Aug22(in=in2 rename=(DOB=DOBnum));
if in2 then DOB= put(DOBnum,f10. -L);
drop DOBnum;
RUN;
To get a more detailed answer provide more details about the variables and the types of values they contain. For example if DOB means Date of Birth then it does not make much sense to use the F format. If DOB should be an actual DATE then it should be numeric and not character. And if the version that is numeric has actual date values then converting them to text using the F format is going to generate strings that will be confusing for humans.
If you're a beginner I recommend separate steps so you can trace the work:
1. Convert dob from character to numeric
2. Append the two datasets together (assuming you're stacking the data sets)
3. Use a format to control how the date is displayed
*convert character to numeric SAS date;
data aug21_convert2num;
set aug21(rename=dob=dobchar);
dob = input(dobchar, anydtdte.);
drop dobchar;
run;
*append the two data sets;
data want;
set aug21_convert2num aug22;
format dob yymmdd10.;
run;

Converting type from character to numerical in SAS

First I had to format all of the categories to numbers and I used the code below to do that and was successful. I need to convert the type from character to numerical so that I can run analysis. I have tried the input function but that has not worked either. Any help would be greatly appreciated.
Proc Format;
Value $gender_num 'Male'=0 'Female'=1;
Value $att 'Yes'=0 'No'=1;
Value $bustrav 'Non-Travel'=1 'Travel_Frequently'=2 'Travel_Rarely'=3;
Value $dpt 'Research & Development'=1 'Human Resources'=2 'Sales'=3;
Value $edfd 'Life Sciences'=1 'Human Resources'=2 'Marketing'=3 'Medical'=4 'Technical Degree'=5 'Other'=6;
Value $ot 'Yes'=0 'No'=1;
Value $ms 'Divorced'=1 'Married'=2 'Single'=3;
Value $jr 'Healthcare Representative'=1 'Human Resources'=2 'Laboratory Technician'=3 'Manager'=4
'Manufacturing Director'=5 'Research Director'=6 'Research Scientist'=7 'Sales Executive'=8
'Sales Representative'=9;
Run;
Proc Print data=work.empatt;
format gender $gender_num.;
format attrition $att.;
format businesstravel $bustrav.;
format department $dpt.;
format educationfield $edfd.;
format overtime $ot.;
format maritalstatus $ms.;
format jobrole $jr.;
Run;
You're mixing Formats with Informats.
Format: "How do I display a number on the screen for you?"
Informat: "How do I convert text to a number?"
The VALUE statements in your first step above should be INVALUE statements. Then you use INPUT to translate. You also need to assign the result to a new variable - you can't just associate the informat with the variable and magically get a numeric.
proc format;
invalue sexi
'Male'=1
'Female'=2
;
quit;
data want;
set have;
sex_n = input(sex,sexi.);
run;
You can, if you want, keep the same variable name; I'll show that in the next step, also adding a format so the value "looks" right.
proc format;
invalue sexi
'Male'=1
'Female'=2
;
value sexf
1 = 'Male'
2 = 'Female'
;
quit;
data want;
set have;
sex_n = input(sex,sexi.);
format sex_n sexf.;
drop sex;
rename sex_n = sex;
run;
You drop the original one, then rename the new one to the original name. I use the _n suffix to make it clear what I'm doing, but it's not required; nor are the 'i' and 'f' suffixes in the format/informat (and in fact you could use the identical name if you wanted to), again just a pattern I use to make it easier to distinguish.

SAS: Converting numeric to character values

I am trying to convert a datetime20. value from numeric to character.
Currently I have numeric values like this: 01Jan2020:00:00:00 and I need to convert them to character values and get output like: 2020-01-01 00:00:00.0
What format and informat should be used for the above?
I have tried using the PUT function to convert numeric to character and tried many options, each time getting a different format. Should I also use the DHMS function before PUT?
There is not a native format that produces that string exactly. But it is not hard to build it in steps using existing formats. Or you could use the PICTURE statement in PROC FORMAT to build your own format.
If you don't really care about the time-of-day part of the datetime value, then this is an easy and clearly understood way to convert the numeric variable DT (a number of seconds) into a new character variable in that style. Use DATEPART() to get the date (number of days) from the datetime value, use the YYMMDD format to generate the 10-character string for the date, and then append the constant string of formatted zeros.
length dt_string $21.;
dt_string = put(datepart(dt),yymmdd10.)||' 00:00:00.0';
If you need the time of day part then you could also use the TOD format.
dt_string = put(datepart(dt),yymmdd10.)||put(dt,tod11.1);
Or you could use the format E8601DT21.1 and then change the letter T between the date and time to a space instead.
dt_string = translate(put(dt,E8601DT21.1),' ','T');
If you want to figure out what formats exist for datetime values and what the formatted results look like you could run a little program to pull the formats from the meta data and apply them to a specific datetime value.
data datetime_formats;
length format $50 string $80 ;
set sashelp.vformat;
where fmttype='F';
where also fmtinfo(fmtname,'cat')='datetime';
keep format string fmtname maxw minw maxd ;
format=cats(fmtname,maxw,'.','-L');
string=putn('01Jan2020:01:02:03'dt,format);
run;
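The PICTURE route mentioned earlier in this answer could be sketched as follows. This is an illustration under assumptions, not a definitive implementation: the format name dtspace is hypothetical, and the sketch only builds the whole-second part with datetime directives, then appends a constant '.0' as the concatenation examples do.

```sas
proc format;
  /* Hypothetical format name; the directives build yyyy-mm-dd hh:mm:ss */
  picture dtspace (default=19)
    other = '%Y-%0m-%0d %0H:%0M:%0S' (datatype=datetime);
run;

data _null_;
  dt = '01Jan2020:00:00:00'dt;
  length dt_string $21;
  /* Append a constant tenth-of-a-second digit, as in the DATEPART example */
  dt_string = put(dt, dtspace.) || '.0';
  put dt_string=;
run;
```

The DEFAULT= width matters here because the picture text is not the same length as the formatted result.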
A custom format can be defined to return the result of a user-defined function.
proc format;
value <format-name> (default=<width>)
other = [<function-name>()]
;
run;
Example:
options cmplib=(sasuser.functions);
proc fcmp outlib=sasuser.functions.temporal;
function E8601DTS (datetime) $21;
return (
translate (putn(datetime,'E8601DT21.1'),' ','T')
);
endsub;
run;
proc format;
value E8601DTS (default=21)
other = [E8601DTS()]
;
run;
data have;
do dt = '01jan2020:0:0'dt to '10jan2020:0:0'dt by '60:00't;
output;
end;
format dt datetime16.;
run;
ods html file='function-based-format.html';
proc print data=have(obs=4); title 'stock E8601DT';
proc print data=have(obs=4); title 'custom E8601DTS';
format dt E8601DTS.;
run;
ods html close;

SAS proc format dynamic values

PROC FORMAT;
VALUE $Gender 'M'='Male'
'F'='Female';
In the above I am hard-coding the mapping for PROC FORMAT as 'M'='Male'
and 'F'='Female'. In the same way, I need those values to be passed from a file, so that the values for PROC FORMAT are supplied dynamically. How do I do that?
That is, I need to read the mapping M=Male, F=Female from a file and pass it to PROC FORMAT dynamically.
What you are asking can be done if you place your data into a data set. It is a two-step process:
1- Get your raw data into a data set
2- Use that data set to create the desired format
So let's do this:
*step 1-;
DATA fmt;
Infile "Textfile.txt" DSD ;
Retain fmtname '$myfmt'; /*myfmt is what your format name*/;
Length start $2 label $50;
Input start label ;
RUN;
Since the code above creates a dataset with the Male/Female information, use that same dataset to create your format.
*Step 2;
PROC FORMAT CNTLIN=fmt;
RUN;
The simplest way to create a format from a data set is to use the CNTLIN= option in PROC FORMAT.
REQUIRED VARIABLES IN THE FORMAT DATASET (FMTNAME, START, AND LABEL)
Variable   Used for
FMTNAME    The format name
START      The left side of the formatting = sign (assumed character); must be unique unless defining a Multi-Label format
LABEL      The right side of the formatting = sign
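To see the whole CNTLIN= flow in one place without an external file, here is a small self-contained sketch. The mapping rows are supplied inline with DATALINES purely for illustration (in practice they would come from the text file), and the variable gender_text is a hypothetical name.

```sas
/* Mapping supplied inline for illustration; normally read from the external file */
data fmt;
  retain fmtname '$myfmt';   /* the leading $ marks a character format */
  length start $2 label $50;
  infile datalines dsd;      /* DSD: comma-delimited values */
  input start label;
  datalines;
M,Male
F,Female
;
run;

/* Build the format from the control data set */
proc format cntlin=fmt;
run;

/* Apply the new format */
data _null_;
  gender = 'F';
  gender_text = put(gender, $myfmt.);
  put gender= gender_text=;
run;
```

Changing the mapping then only requires editing the input rows, not the PROC FORMAT code.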

How to input to date format from a number

I'm trying to put a number, e.g. 20141001, into a date9. format, e.g. 01OCT2014.
I've tried to use the input function with an input format of yymmddn8. but SAS throws out 'informat could not be found or loaded'
Any ideas how to get around this? (Sample code below)
data _null_;
date=20141001;
output=input(date,yymmddn8.);
format output date9.;
put output=;
run;
You are almost there. Although there is a YYMMDDN format, there is no informat of the same name. Use the YYMMDD informat. The INPUT function also expects a character string as its first argument, and here the DATE variable is numeric. Redefine DATE as a character variable, e.g.
data _null_;
date='20141001';
output=input(date,yymmdd8.);
format output date9.;
put output=;
run;
Alternatively you could have used these assignments:
output = input('20141001',yymmdd8.);
or
output = '01oct2014'd;
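If the value really has to start out as a number, as in the question, one common approach is to turn it into text with PUT first and then read that text with INPUT. A minimal sketch, where the variable name sasdate is hypothetical:

```sas
data _null_;
  date = 20141001;                            /* numeric, as in the question */
  sasdate = input(put(date, 8.), yymmdd8.);   /* number -> '20141001' -> SAS date value */
  format sasdate date9.;
  put sasdate=;
run;
```

This avoids relying on the automatic numeric-to-character conversion that SAS would otherwise apply when a number is passed to INPUT.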