For a macro, I want to create an automated ATTRIB statement.
I have an Excel table with all variable names, formats, labels and lengths.
Now I want SAS to read each row and pass it into:
%LET Format_VARIABLE = FORMAT = &For LENGTH = &len LABEL = "&lab";
Any ideas how to achieve this?
Assuming you have a dataset (called metadata) containing all of your variable names (vname), formats (vfmt), lengths (vlen) and labels (vlbl) from Excel:
/* Create VNAME1-VNAMEx, VFMT1-VFMTx etc */
data _null_ ;
set metadata end=eof ;
call symputx(cats('VNAME',_n_),vname) ;
call symputx(cats('VFMT',_n_),vfmt) ;
call symputx(cats('VLEN',_n_),vlen) ;
call symputx(cats('VLBL',_n_),vlbl) ;
if eof then call symputx('VNUM',_n_) ;
run ;
%MACRO BUILD_ATTRIB ;
/* Iterate over each set of macro variables and resolve into `attrib` statement */
attrib %DO I = 1 %TO &VNUM ;
&&VNAME&I format=&&VFMT&I length=&&VLEN&I label="&&VLBL&I"
%END ;
;
%MEND ;
/* To use in a datastep */
data want ;
%BUILD_ATTRIB ;
set have ;
run ;
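To try the approach above without the Excel file, a small stand-in for the metadata dataset can be typed in directly. This is only a sketch: the three rows are hypothetical, and it assumes that vlen already carries the `$` prefix for character variables, since the macro passes the value straight into `length=`. Labels containing double quotes or ampersands would need extra handling.

```sas
/* Hypothetical metadata rows; in practice these come from the Excel sheet */
data metadata ;
  infile datalines dsd truncover ;
  length vname $32 vfmt $32 vlen $8 vlbl $256 ;
  input vname vfmt vlen vlbl ;
datalines ;
id,8.,8,Subject identifier
name,$20.,$20,Subject name
visit,date9.,8,Visit date
;
run ;

/* MPRINT writes the generated ATTRIB statement to the log for checking */
options mprint ;
```

With MPRINT on, the log shows the ATTRIB statement that %BUILD_ATTRIB resolves to, which makes it easy to spot a bad format or length in the metadata.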
I'm a SAS user.
I want to create year indicator columns from date values.
For example, here is my code below.
I want to create Y_2010, Y_2011, Y_2012, Y_2013 and Y_2014 in the work.total data set,
but only Y_2014 appears in the result.
How can I change the code to get the result I intended?
options mcompilenote = all;
%let a = Y_ ;
%macro B(YMIN, YMAX) ;
%do i = &YMIN %to &YMAX ;
DATA TOTAL ;
SET SASUSER.EMPDATA ;
IF YEAR(HIREDATE) = &i THEN &a&i = 1 ;
ELSE &a&i = 0 ;
RUN;
%end;
%mend;
%B (2010, 2014) ;
Because you are repeatedly re-creating the output dataset, only the final version is available. To fix the macro, move the %DO loop inside the DATA step so that all of the variables are generated in a single data step.
%macro B(YMIN, YMAX) ;
DATA TOTAL ;
SET SASUSER.EMPDATA ;
%do i = &YMIN %to &YMAX ;
IF YEAR(HIREDATE) = &i THEN &a&i = 1 ;
ELSE &a&i = 0 ;
%end;
RUN;
%mend;
But there is no need for a macro for this. Just use normal SAS statements. For example, you could use an ARRAY statement to define the variables and then loop over the array and set the values. Note that the result of a boolean expression in SAS is 0 when false and 1 when true, so you can eliminate the IF/THEN/ELSE statement and just use an assignment statement.
/* Outside the macro, the year range has to be defined as macro variables */
%let ymin = 2010 ;
%let ymax = 2014 ;
DATA TOTAL ;
SET SASUSER.EMPDATA ;
/* &a is the Y_ prefix defined earlier, so this declares Y_2010 - Y_2014 */
array &a &a&ymin - &a&ymax;
do i=&ymin to &ymax ;
&a[i-&ymin+1] = (year(hiredate)=i);
end;
drop i;
RUN;
I am trying to run this code
data swati;
input facility_id$ loan_desc : $50. sys_name :$50.;
cards;
fac_001 term_loan RM_platform
fac_001 business_loan IQ_platform
fac_002 business_loan BUSES_termloan
fac_002 business_loan RM_platform
fac_003 overdrafts RM_platform
fac_003 RCF IQ_platform
fac_003 term_loan BUSES_termloan
;
proc contents data=swati out=contents(keep=name varnum);
run;
proc sort data=contents;
by varnum;
run;
data contents;
set contents ;
where varnum in (2,3);
run;
data contents;
set contents;
summary=catx('_',name, 'summ');
run;
data _null_;
set contents;
call symput ("name" || put(_n_ , 10. -L), name);
call symput ("summ" || put (_n_ , 10. -L), summary);
run;
options mlogic symbolgen mprint;
%macro swati;
%do i = 1 %to 2;
proc sort data=swati;
by facility_id &&name&i.;
run;
data swati1;
set swati;
by facility_id &&name&i.;
length &&summ&i. $50.;
retain &&summ&i.;
if first.facility_id then do;
&&summ&i.="";
end;
if first.&&name&i. = last.&&name&i. then &&summ&i.=catx(',',&&name&i., &&summ&i.);
else if first.&&name&i. ne last.&&name&i. then &&summ&i.=&&name&i.;
run;
if last.facility_id ;
%end;
%mend;
%swati;
This code should create two new variables, loan_desc_summ and sys_name_summ, which hold all the loan_desc values and all the sys_name values for a facility on one line, separated by commas, for example (term_loan, business_loan) and (RM_platform, IQ_platform). But if a customer has only one loan_desc the loan_summ should only have its value twice.
The problem is that after running the do loop I get a dataset with only sys_name_summ and not loan_desc_summ. I want a dataset with all five variables: facility_id, loan_desc, sys_name, loan_desc_summ and sys_name_summ.
Could you please help me find out whether there is a problem in the do loop?
Your loop always starts with the same input dataset (SWATI) and generates a new dataset (SWATI1), so only the last pass through the loop has any effect. Each pass would need to start with the output of the previous one.
You also need to fix your logic for eliminating the duplicates.
For example you could change the macro to:
%macro swati;
data swati1;
set swati;
run;
%do i = 1 %to 2;
proc sort data=swati1;
by facility_id &&name&i.;
run;
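data swati1;
set swati1;
by facility_id &&name&i ;
length &&summ&i $500 ;
/* keep the running list across rows of the same facility */
retain &&summ&i ;
if first.facility_id then &&summ&i = ' ' ;
/* append each distinct value once */
if first.&&name&i then &&summ&i = catx(',',&&summ&i,&&name&i);
/* subsetting IF: keep only the last row per facility_id */
if last.facility_id ;
run;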
%end;
%mend;
Also your program could be a lot smaller if you just used arrays.
data want ;
set have ;
by facility_id ;
array one loan_desc sys_name ;
array two $500 loan_desc_summ sys_name_summ ;
retain loan_desc_summ sys_name_summ ;
do i=1 to dim(one);
if first.facility_id then two(i)=one(i) ;
else if not findw(two(i),one(i),',','t') then two(i)=catx(',',two(i),one(i));
end;
if last.facility_id;
drop i loan_desc sys_name ;
run;
If you want to make it more flexible you can put the list of variable names into a macro variable.
%let varlist=loan_desc sys_name;
You could then generate the list of new names easily.
%let varlist2=%sysfunc(tranwrd(&varlist,%str( ),_summ%str( )))_summ ;
Then you can use the macro variables in the ARRAY, RETAIN and DROP statements.
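Put together, the array version might look like the sketch below. It assumes the same HAVE dataset as above, already sorted by facility_id, and that the names in varlist are separated by single blanks, which the TRANWRD call relies on.

```sas
%let varlist  = loan_desc sys_name ;
%let varlist2 = %sysfunc(tranwrd(&varlist,%str( ),_summ%str( )))_summ ;

data want ;
  set have ;
  by facility_id ;
  array one &varlist ;          /* source variables */
  array two $500 &varlist2 ;    /* summary variables */
  retain &varlist2 ;
  do i = 1 to dim(one) ;
    if first.facility_id then two(i) = one(i) ;
    /* add a value only if it is not already in the list */
    else if not findw(two(i), one(i), ',', 't') then two(i) = catx(',', two(i), one(i)) ;
  end ;
  if last.facility_id ;
  drop i &varlist ;
run ;
```

Adding another variable to summarise then only requires changing the first %LET.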
I am trying to parse a delimited dataset with over 300 fields. Instead of listing all the input fields like
data test;
infile "delimited_filename.txt"
DSD delimiter="|" lrecl=32767 STOPOVER;
input field_A:$200.
field_B :$200.
field_C:$200.
/*continues on */
;
I am thinking I can dump all the field names into a file, read in as a sas dataset, and populate the input fields - this also gives me the dynamic control if any of the field names changes (add/remove) in the dataset. What would be some good ways to accomplish this?
Thank you very much - I just started sas, still trying to wrap my head around it.
This worked for me: basically, "write" the data step code using the macro language and then run it.
Note: my indata_header_file contains 5 columns: Variable_Name, Variable_Length, Variable_Type, Variable_Label, and Notes.
%macro ReadDsFromFile(filename_to_process, indata_header_file, out_dsname);
%local filename_to_process indata_header_file out_dsname;
/* This macro variable holds the generated code that reads the data file */
%local read_code input_in_line;
%put *** Processing file: &filename_to_process ...;
/* Read in the header file */
proc import OUT = ds_header
DATAFILE = &indata_header_file.
DBMS = EXCEL REPLACE; /* REPLACE flag */
SHEET = "Names";
GETNAMES = YES;
MIXED = NO;
SCANTEXT = YES;
run;
%let id = %sysfunc(open(ds_header));
%let NOBS = %sysfunc(attrn(&id.,NOBS));
%syscall set(id);
/*
Generates:
data &out_dsname.;
infile "&filename_to_process."
DSD delimiter="|" lrecl=32767 STOPOVER FIRSTOBS=3;
input
'7C'x
*/
%let read_code = data &out_dsname. %str(;)
infile &filename_to_process.
DSD delimiter=%str("|") lrecl=32767 STOPOVER %str(;)
input ;
/*
Generates:
<field_name> : $<field_length>;
*/
%do i = 1 %to &NObs;
%let rc = %sysfunc(fetchobs(&id., &i));
%let VAR_NAME = %sysfunc(getvarc(&id., %sysfunc(varnum(&id., Variable_Name)) ));
%let VAR_LENGTH = %sysfunc(getvarn(&id., %sysfunc(varnum(&id., Variable_Length)) ));
%let VAR_TYPE = %sysfunc(getvarc(&id., %sysfunc(varnum(&id., Variable_Type)) ));
%let VAR_LABEL = %sysfunc(getvarc(&id., %sysfunc(varnum(&id., Variable_Label)) ));
%let VAR_NOTES = %sysfunc(getvarc(&id., %sysfunc(varnum(&id., Notes)) ));
%if %upcase(%trim(&VAR_TYPE.)) eq CHAR %then
%let input_in_line = &VAR_NAME :$&VAR_LENGTH..;
%else
%let input_in_line = &VAR_NAME :&VAR_LENGTH..;
/* append in_line statement to main macro var */
%let read_code = &read_code. &input_in_line. ;
%end;
/* Close the fid */
%let rc = %sysfunc(close(&id));
%let read_code = &read_code. %str(;) run %str(;) ;
/* Run the generated code*/
&read_code.
%mend ReadDsFromFile;
Sounds like you want to generate code based on metadata. A data step is actually a lot easier to code and debug than a macro.
Let's assume you have metadata that describes the input data. For example, let's use the metadata about SASHELP.CARS. We can build our metadata from the existing DICTIONARY.COLUMNS metadata on that dataset. Let's set the INFORMAT to the FORMAT, since that table does not have INFORMAT values assigned.
proc sql noprint ;
create table varlist as
select memname,varnum,name,type,length,format,format as informat,label
from dictionary.columns
where libname='SASHELP' and memname='CARS'
;
quit;
Now let's make a sample text file with the data in it.
filename mydata temp;
data _null_;
set sashelp.cars ;
file mydata dsd ;
put (_all_) (:);
run;
Now we just need to use the metadata to write a program that could read that data. All we really need to do is define the variables and then add a simple INPUT firstvar -- lastvar statement to read the data.
filename code temp;
data _null_;
set varlist end=eof ;
by varnum ;
file code ;
if _n_=1 then do ;
firstvar=name ;
retain firstvar ;
put 'data ' memname ';'
/ ' infile mydata dsd truncover lrecl=1000000;'
;
end;
put ' attrib ' name 'length=' @;
if type = 'char' then put '$' @ ;
put length ;
if informat ne ' ' then put @10 informat= ;
if format ne ' ' then put @10 format= ;
if label ne ' ' then put @10 label= :$quote. ;
put ' ;' ;
if eof then do ;
put ' input ' firstvar '-- ' name ';' ;
put 'run;' ;
end;
run;
Now we can just run the generated code using %INCLUDE.
%include code / source2 ;
I have a SAS dataset which contains one column of polynomials. For example, X1**(-2)+X1**(2).
Is there a function to transform this into a numeric expression?
Many thanks,
If I understand you correctly, I don't think there is a specific function that will easily let you do this. You have two options - write your own logic to interpret the polynomial expressions, or use call execute to have SAS write out a (potentially very long) data step for you, assuming that the polynomials are all entered as valid data step code. Here's a call execute approach:
data have;
infile datalines truncover;
input x1 polynomial $255.;
datalines;
1 X1**(-2)+X1**(2)
2 X1**(-1)+X1**(1)
3 X1**(1)+X1**(-1)
;
run;
data _null_;
set have end = eof;
if _n_ = 1 then call execute('data want; set have; select(_n_);');
call execute(catx(' ','when(',_N_,') y =',polynomial,';'));
if eof then call execute('end; run;');
run;
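For the three sample rows, the statements pushed through call execute should amount to roughly the following data step (spacing aside). It is written out here only to show what is being generated; it assumes every polynomial is valid data step code that refers to variables in HAVE.

```sas
data want;
  set have;
  select(_n_);
    when(1) y = X1**(-2)+X1**(2);
    when(2) y = X1**(-1)+X1**(1);
    when(3) y = X1**(1)+X1**(-1);
  end;
run;
```

Each row is evaluated only against its own expression, because SELECT picks the WHEN clause that matches the current row number.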
Convert them to macro variables, and then resolve them into a calculation...
Using the dataset example in user667489's answer :
/* Create numbered macro variables, 1 per row of data */
data _null_ ;
set have end=eof ;
call symputx(cats('POLY',_n_),polynomial) ;
if eof then call symputx('POLYN',_n_) ;
run ;
%MACRO ROWLOOPER ;
%DO N = 1 %TO &POLYN ;
if _n_ = &N then result = &&POLY&N ;
%END ;
%MEND ;
data want ;
set have ;
/* Not very efficient, looping over all polynomials on each row of data */
/* So for 3 rows, you'll perform 9 iterations here */
%ROWLOOPER ;
run ;
Or, alternatively, write your dataset out into a SAS program, and %inc that program :
data _null_ ;
file "polynomials.sas" ;
set have end=eof ;
if _n_ = 1 then do ;
put "data poly;" ;
put " set have;" ;
end ;
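/* tie each expression to its own row, otherwise the last one overwrites the rest */
put " if _n_ = " _n_ "then result = " polynomial ";" ;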
if eof then put "run;" ;
run ;
%inc "polynomials.sas" ;
I have a dataset in which I store all the values to be used in IN clause elsewhere.
DATA INVALUES;
INPUT INVAL;
DATALINES;
1
2
3
;
RUN;
I have to use the invalues in another dataset as below.
DATA OUTPUT;
SET INPUT;
IF A IN ( --- INVAL values from dataset INVALUES ----);
;
RUN;
Could this be done in any way?
You can use a macro variable for this.
/* Put the inval values in a comma separated macro variable */
proc sql noprint;
select inval into :inval separated by ","
from invalues;
quit;
/* prints the macro variable in the log */
%put &inval; /* 1,2,3 */
/* use the macro variable in the IF statement; IN needs parentheses around the list */
DATA OUTPUT;
SET INPUT;
IF A IN (&inval);
RUN;
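If the values were character rather than numeric, each one would need quotes in the generated list. A minimal sketch, assuming a hypothetical lookup dataset CHARVALUES and a character variable A in INPUT:

```sas
/* Hypothetical lookup table with character values */
data charvalues ;
  input inval $ ;
datalines ;
A
B
C
;
run ;

proc sql noprint ;
  /* STRIP removes padding, QUOTE wraps each value in double quotes */
  select quote(strip(inval)) into :inlist separated by ','
  from charvalues ;
quit ;

%put &inlist ;

data output ;
  set input ;
  if a in (&inlist) ;
run ;
```

Very long lists can run into the maximum length of a macro variable, so this suits short lookup lists best.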
A neater way is to use a subquery in SQL, no need to worry about the datatype/quoting.
proc sql ;
create table output as
select *
from input
where a in(select distinct inval from invalues) ;
quit ;