I have a SAS dataset called drug with two variables, code and short_description.
The code variable is character with length $5 and the short_description variable is character with length $25.
I would like to convert this SAS dataset into a format using the following code:
data work.drug_route;
set library.drug;
length fmtname $32.;
retain fmtname '$drug_description';
rename code=start;
rename short_description = label;
run;
proc format library= library.formats cntlin=work.drug_route;
run;
Does anyone have experience dealing with this kind of issue?
Much appreciated!
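The post does not say what goes wrong, so the following is only a sketch of one common way to lay out a CNTLIN data set, not a diagnosis. It assumes a small made-up stand-in for library.drug, stores the format in WORK rather than in library.formats, and names the format without the leading $ while setting type to 'C' to mark it as a character format.

```sas
/* Hypothetical sample rows standing in for library.drug */
data work.drug;
  length code $5 short_description $25;
  code = 'A01'; short_description = 'Oral tablet';          output;
  code = 'B02'; short_description = 'Intravenous infusion'; output;
run;

/* Control data set: PROC FORMAT reads fmtname, type, start and label */
data work.drug_route;
  set work.drug(rename=(code=start short_description=label));
  retain fmtname 'drug_description' type 'C';
  keep fmtname type start label;
run;

/* Build the format from the control data set (stored in WORK here) */
proc format library=work cntlin=work.drug_route;
run;

/* Quick check: look up one code through the new format */
data _null_;
  code = 'A01';
  description = put(code, $drug_description.);
  put code= description=;
run;
```

Keeping only the four control variables avoids passing unrelated columns to PROC FORMAT.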
I have two datasets, input12 and input18. Below is the code.
The new12 dataset has the variable score_date in format yymmn6.
data new12;
set input12;
run;
Now I add a new variable score_date to dataset new18:
%let score_date=201807;
data new18;
set input18;
format score_date yymmn6.;
run;
After concatenating the datasets new12 and new18, the date is not in yymmn6. format.
data new;
set new12 new18;
run;
This gives an unformatted date for new12 and a blank for new18 in new.
data new;
set new18 new12;
run;
This gives the correct date format for new18 and a blank for new12 in new.
Is there any reason for the improper format after concatenating?
Instead of relying on the source data sets for a variable's format, place a format statement in your data set stacking (concatenation) DATA step.
Example:
data new;
set
new12
new18
;
format score_date yymmn6.;
run;
If you mean to have your date representation stored in a macro symbol, as I infer from your posted code, you need to input the representation using the appropriate informat in order to get a SAS date value.
score_date = input ("&score_date", yymmn6.);
format score_date yymmn6.;
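Put together as a complete step, this might look like the sketch below. It assumes the macro symbol holds 201807 as in the question and uses a one-row made-up stand-in for input18; the variable id is hypothetical.

```sas
%let score_date = 201807;

/* Hypothetical one-row stand-in for input18 */
data input18;
  id = 1;
run;

data new18;
  set input18;
  /* Read the yyyymm text into a SAS date value (first day of that month) */
  score_date = input("&score_date", yymmn6.);
  format score_date yymmn6.;
run;

proc print data=new18;
run;
```

The double quotes matter here: the macro symbol does not resolve inside single quotes.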
An alternative is to set the macro symbol to the source code of a SAS date literal.
%let score_date = "01JUL2018"D;
and resolve it later in a DATA step, perhaps as
data new18;
score_date = &score_date;
format score_date yymmn6.;
run;
I tried the code below.
%let score_date="01dec2019"d;
data twelvenew;
set twelve;
score_date=&scoredate.;
format score_date yymmn6.;
run;
%let score_date="01jun2019"d;
data eightnew;
set eight;
score_date=&scoredate.;
format score_date yymmn6.;
run;
data final;
set eightnew twelvenew;
run;
I am getting dates only for eightnew in the final dataset; the others are missing.
Am I missing anything here?
First I had to map all of the categories to numbers; I used the code below to do that and it ran successfully. I now need to convert the type from character to numeric so that I can run the analysis. I have tried the INPUT function, but that has not worked either. Any help would be greatly appreciated.
Proc Format;
Value $gender_num 'Male'=0 'Female'=1;
Value $att 'Yes'=0 'No'=1;
Value $bustrav 'Non-Travel'=1 'Travel_Frequently'=2 'Travel_Rarely'=3;
Value $dpt 'Research & Development'=1 'Human Resources'=2 'Sales'=3;
Value $edfd 'Life Sciences'=1 'Human Resources'=2 'Marketing'=3 'Medical'=4 'Technical Degree'=5 'Other'=6;
Value $ot 'Yes'=0 'No'=1;
Value $ms 'Divorced'=1 'Married'=2 'Single'=3;
Value $jr 'Healthcare Representative'=1 'Human Resources'=2 'Laboratory Technician'=3 'Manager'=4
'Manufacturing Director'=5 'Research Director'=6 'Research Scientist'=7 'Sales Executive'=8 'Sales Representative'=9;
Run;
Proc Print data=work.empatt;
format gender $gender_num.;
format attrition $att.;
format businesstravel $bustrav.;
format department $dpt.;
format educationfield $edfd.;
format overtime $ot.;
format maritalstatus $ms.;
format jobrole $jr.;
Run;
You're mixing Formats with Informats.
Format: "How do I display a number on the screen for you?"
Informat: "How do I convert text to a number?"
Your code in the first step above should use INVALUE statements rather than VALUE statements. Then you use INPUT to translate. You also need to assign the result to a new variable - you can't just associate the informat with the variable and magically get a numeric.
proc format;
invalue sexi
'Male'=1
'Female'=2
;
quit;
data want;
set have;
sex_n = input(sex,sexi.);
run;
You can, if you want, keep the same variable name; I'll show that in the next step, also adding a format so the value "looks" right.
proc format;
invalue sexi
'Male'=1
'Female'=2
;
value sexf
1 = 'Male'
2 = 'Female'
;
quit;
data want;
set have;
sex_n = input(sex,sexi.);
format sex_n sexf.;
drop sex;
rename sex_n = sex;
run;
You drop the original one, then rename the new one to the original name. I use the _n suffix to make it clear what I'm doing, but it's not required; nor are the 'i' and 'f' suffixes in the format/informat (and in fact you could use the identical name if you wanted to), again just a pattern I use to make it easier to distinguish.
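Applied to two of the variables from the question, the same pattern might look like the sketch below. The informat names gender_in and att_in, the _n suffix and the sample rows are made up for illustration; the codes (Male=0, Female=1, Yes=0, No=1) are taken from the question's VALUE statements, and other=. is an added assumption so that unexpected text becomes a missing value.

```sas
proc format;
  /* Informats: convert text to a number */
  invalue gender_in 'Male' = 0  'Female' = 1  other = .;
  invalue att_in    'Yes'  = 0  'No'     = 1  other = .;
run;

/* Hypothetical sample rows standing in for work.empatt */
data work.empatt_sample;
  length gender $6 attrition $3;
  gender = 'Male';   attrition = 'Yes'; output;
  gender = 'Female'; attrition = 'No';  output;
run;

data work.empatt_num;
  set work.empatt_sample;
  /* INPUT applies the informat and returns a numeric value */
  gender_n    = input(gender, gender_in.);
  attrition_n = input(attrition, att_in.);
run;

proc print data=work.empatt_num;
run;
```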
I have many different datasets within a particular library, and I'm wondering whether there is a way to find the minimum and maximum date associated with a particular unique ID across ALL datasets in a library?
Currently, I can find a local minimum and local maximum date associated with a particular ID within a particular dataset, but this ID will show up again in different datasets and have its own minimum/maximum date associated with each of those datasets. I want to compare the dates for this unique ID throughout the entire library, so I can find the global minimum and global maximum date, but I do not know how to do this search across the entire library.
Currently my code looks like the following
DATA SUBSET_MIN_MAX (keep= MIN_DATE MAX_DATE UNIQUEID);
DO UNTIL (LAST.UNIQUEID);
set LIBRARY.&SAS_FILE_N;
BY UNIQUEID;
MIN_DATE = MIN(MIN_DATE,DATE);
MAX_DATE = MAX(MAX_DATE,DATE);
if last.UNIQUEID then output;
END;
format MIN_DATE MAX_DATE date9.;
RUN;
Thanks so much for any assistance.
Consider this approach, which uses a view and PROC SUMMARY.
data d1; set sashelp.class; date=height+ranuni(4); run;
data d2; set sashelp.class; date=height-rannor(5); run;
data d3; set sashelp.class; date=height-ranuni(3); run;
data alld/view=alld;
length indsname $64;
set work.d:(keep=name date) indsname=indsname;
source=indsname;
run;
proc summary data=alld nway missing;
class name;
var date;
output out=want(drop=_type_)
idgroup(max(date) out(source date)=source1 globalmax)
idgroup(min(date) out(source date)=source2 globalmin)
;
run;
proc print;
run;
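If the data sets do not share a name prefix, one possible variation is to build the list of members from DICTIONARY.TABLES. The sketch below is an assumption-laden example: the visit1 and visit2 data sets, the macro variable dslist and the output names are made up, and the WHERE clause is limited to WORK members starting with VISIT only so the example is self-contained. For a real library you would filter on its libref alone, assuming every member has uniqueid and date.

```sas
/* Hypothetical sample data sets */
data work.visit1;
  uniqueid = 1; date = '05JAN2020'd; output;
  uniqueid = 2; date = '10FEB2020'd; output;
run;
data work.visit2;
  uniqueid = 1; date = '20MAR2019'd; output;
  uniqueid = 2; date = '01DEC2021'd; output;
run;

/* Collect the two-level names of the data sets to stack */
proc sql noprint;
  select catx('.', libname, memname)
    into :dslist separated by ' '
    from dictionary.tables
    where libname = 'WORK'
      and memtype = 'DATA'
      and memname like 'VISIT%';
quit;

/* Stack them in a view so no extra copy of the data is written */
data work.all_visits / view=work.all_visits;
  set &dslist;
  keep uniqueid date;
run;

/* One row per ID with its minimum and maximum date across all members */
proc summary data=work.all_visits nway;
  class uniqueid;
  var date;
  output out=work.global_range(drop=_type_ _freq_)
         min=min_date max=max_date;
run;

proc print data=work.global_range;
  format min_date max_date date9.;
run;
```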
My question is about the append of two different tables that are supposed to have the same name/format/type/length variables.
I am trying to create a step in my SAS program where I don't allow my program to be executed if the format/type/length of variables with the same name is not the same.
For example, in one table I have a date stored as a string in the form "dd-mm-yyyy" and in the other table it is "yyyy-mm-dd" or "dd-mm-yyyy hh:mm:ss". After the append, our daily executions based on these input tables didn't work as expected. Sometimes the values come up as missing or out of order, since the formats are different.
I tried using the PROC COMPARE statement, which allowed me to check which variables have Differing Attributes (Type, Length, Format, InFormat and Labels).
proc compare base = SAS-data-set
compare = SAS-data-set;
run;
However, I only got a listing of the common variables with differing attributes, and I was not able to do anything with it.
I would like to know whether it is possible to get a structured output table with this information, so that I can use it as a control statement.
Creating an automatic task to do it would save me a lot of time.
You can use Proc CONTENTS to get information about a data set's variables. Do that for both data sets, and then you can use Proc COMPARE to create a data set informing you of the differences in variable attributes.
data cars1;
set sashelp.cars (obs=10);
date = today ();
format date date9.;
cars1_only = 1;
x = 1.458; label x = "x-factor";
run;
data cars2;
length type $50;
set sashelp.cars (obs=10);
format date yymmdd10.;
cars2_only = 1;
X = 1.548; label x = "X factor to apply";
run;
proc contents noprint data=cars1 out=cars1_contents;
proc contents noprint data=cars2 out=cars2_contents;
run;
data cars1_contents;
set cars1_contents;
upName = upcase(Name);
run;
data cars2_contents;
set cars2_contents;
upName = upcase(Name);
run;
proc sort data=cars1_contents; by upName;
proc sort data=cars2_contents; by upName;
run;
proc compare noprint
base=cars1_contents
compare=cars2_contents
outall
out=cars_contents_compare (where=(_TYPE_ ne 'PERCENT'))
;
by upName;
run;
There is also an ODS table you can capture directly without having to run Proc CONTENTS, but the capture is not 'data-rific'
ods output CompareVariables=work.cars_vars;
proc compare base=cars1 compare=cars2;
run;
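To use such a comparison as a control step, one option is to count the mismatches and stop when there are any. The following is a rough sketch under stated assumptions: the tables t1 and t2, the macro variable n_mismatch and the macro stop_if_mismatch are made-up names, only type, length and format are checked, and %abort cancel is just one of several ways to halt a submitted program.

```sas
/* Hypothetical pair of tables; name is deliberately longer in t2 */
data work.t1;
  set sashelp.class;
run;
data work.t2;
  length name $20;
  set sashelp.class;
run;

proc contents noprint data=work.t1 out=work.t1_contents;
run;
proc contents noprint data=work.t2 out=work.t2_contents;
run;

/* Count common variables whose type, length or format differ */
proc sql noprint;
  select count(*) into :n_mismatch trimmed
    from work.t1_contents as a
         inner join work.t2_contents as b
           on upcase(a.name) = upcase(b.name)
    where a.type   ne b.type
       or a.length ne b.length
       or a.format ne b.format;
quit;

%macro stop_if_mismatch;
  %if &n_mismatch > 0 %then %do;
    %put ERROR: &n_mismatch common variable(s) have differing attributes.;
    %abort cancel;  /* cancels the rest of the submitted code */
  %end;
%mend stop_if_mismatch;

%stop_if_mismatch
```

Place the append after the macro call so it only runs when no mismatch was found.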
I have a column named total_transaction. I want to append the date from 4 days ago to its name.
For example, if today is 20161220, I want the variable to be renamed total_transaction_20161216.
Please suggest a way to do this.
Just create a macro variable that stores the date in the required format and then use it in a rename statement within proc datasets.
%let datevar = %sysfunc(intnx(day,%sysfunc(today()),-4),yymmddn8.);
%put &=datevar.;
data have;
total_transaction=1;
run;
proc datasets lib=work nolist nodetails;
modify have;
rename total_transaction = total_transaction_&datevar.;
quit;