SAS List of variables containing special characters - sas

I am new to SAS programming and I am trying to create a list of the old variable names so I can work with them (subset, rename, remove spaces, etc.). However, the variable names contain awkward characters (#, parentheses, single quotes, numbers, you name it). The variables are separated by ';' and the source file is in csv format. I need to do this for 44 different files, and each file has about 199 variables.
So far, I have tried building a macro variable that holds the list of variable names, but the code fails when I use &vars because of the special characters. I have looked at SAS paper 005-2013, but I am not sure how to use its functions in my code.
Any insights or directions would be appreciated. Here is the code I have tried so far:
1) Importing:
proc import datafile='file_oldname.csv'
dbms=csv
out= oldName
replace;
delimiter=',';
getnames=yes;
run;
2) Making my list of oldNames;
* A macro variable containing the old variable names;
* Using Proc Contents and Proc SQL;
proc contents data=oldName out=listOldName;
run;
options VALIDMEMNAME=EXTEND;
proc sql noprint;
select distinct(name) into:vars separated by " " from listOldName;
quit;
%put &vars;
&vars contains the list of variables, but if I try to use it, it fails because of the special characters.
How can I wrap &vars properly so that the variable names with special characters can be used? I also want to rename them afterwards.
Thanks a lot!

As your variable names contain special characters, they need to be referenced in "name literal!"n format.
Try:
options VALIDVARNAME=ANY;
proc sql noprint;
select distinct nliteral(name)
into:vars separated by ' '
from listOldName;
quit;
%put &=vars;
More info on name literals in documentation.
Edit - as kindly pointed out by @Tom, the nliteral function handily covers the conversion you require! Documentation for that is here.
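Since the goal is to rename the variables afterwards, the same nliteral() idea can be used to build old=new pairs directly. The following is only a sketch under stated assumptions: the toy data set, its variable names, and the new names v001, v002, ... (derived from the variable position) are made up for illustration and are not from the original question.

```sas
options validvarname=any;

/* Toy data set with awkward names (hypothetical, for illustration only) */
data oldName;
  'Cost (#)'n = 10;
  'Date of Birth'n = '01JAN2000'd;
  plain = 1;
run;

proc sql noprint;
  /* nliteral() leaves valid names alone and turns the rest into name literals. */
  /* The new name is simply v + zero-padded variable position (an assumption).  */
  select catx('=', nliteral(name), cats('v', put(varnum, z3.)))
    into :rename_pairs separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'OLDNAME';
quit;

%put &=rename_pairs;

/* Rename in place without rewriting the data */
proc datasets lib=work nolist;
  modify oldName;
  rename &rename_pairs;
run;
quit;
```

In a real run the new names would more likely come from a lookup table or a cleaning rule than from the variable position.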

When Proc IMPORT runs, it creates data sets whose variable names conform to the VALIDVARNAME setting in effect at import time. One alternative, therefore, is to set the option before the IMPORT to ensure 'clean' variable names, and to restore it afterwards.
%let setting = %sysfunc(getoption(VALIDVARNAME));
options validvarname=v7;
Proc IMPORT ... ;
...;
run;
options validvarname = &SETTING;
Example
filename have temp;
data _null_;
file have;
put 'Name,"Age","Date of Birth",School:First,Surprise!,-Feedback';
put 'Abel,0,Yesterday,Eastern Acacdemy,Balloons!,None';
run;
options validvarname=any;
proc import file=have replace out=have_any dbms=csv;
getnames=yes;
run;
options validvarname=v7;
proc import file=have replace out=have_v7 dbms=csv;
getnames=yes;
run;
options validvarname=any;
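To see what each setting produced, one option is to list the variable names of the two imported data sets side by side. This is a small sketch that assumes the example above has just been run in the same session, so that WORK.HAVE_ANY and WORK.HAVE_V7 exist.

```sas
/* Compare the variable names created under VALIDVARNAME=ANY and =V7 */
proc sql;
  select memname, varnum, name
    from dictionary.columns
    where libname = 'WORK'
      and memname in ('HAVE_ANY', 'HAVE_V7')
    order by memname, varnum;
quit;
```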

Related

Rename SAS columns using pattern matching

Working in SAS here, and have a lot of column names that I'd like to drop a pattern from. This is pretty straightforward in R:
colnames(data) <- gsub('drop_pattern', '', colnames(data))
But is there an equivalently elegant SAS way?
You can use the RENAME statement in PROC DATASETS to modify the names of variables in a dataset without having to make a new dataset.
proc datasets lib=mylib nolist;
modify mydata ;
rename freddrop_patterndy = freddy samdrop_patternmy=sammy ;
run;
quit;
You can use any number of functions, including those that support regular expressions, to construct a new name from an old name. For example if you just want to remove some constant text then something like this could work:
new_name = transtrn(old_name,'drop_pattern',trimn(' '));
You can use a query against the metadata of the variable names to generate the oldname=newname pairs into a macro variable.
proc sql noprint ;
select catx('=',name,transtrn(name,'drop_pattern',trimn(' ')))
into :rename_list separated by ' '
from dictionary.columns
where libname='MYLIB' and memname='MYDATA' and index(name,'drop_pattern')
;
quit;
Then you can use the macro variable in your code. You will probably need to skip this step if there are no names that need to be changed.
%if &sqlobs %then %do ;
proc datasets lib=mylib nolist;
modify mydata ;
rename &rename_list ;
run;
quit;
%end;
Note if you have set the VALIDVARNAME option to ANY then you will need to use the NLITERAL() function when generating the oldname=newname pairs to handle names that might not follow normal naming rules.
select catx('=',nliteral(name),nliteral(transtrn(name,'drop_pattern',trimn(' '))))
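For the regular-expression route mentioned above, a minimal sketch could use prxchange() in place of transtrn(). The data set and variable names below are the illustrative ones from this answer, created here in WORK so the example runs on its own. The %if guard is left out because the toy data is known to contain matching names.

```sas
/* Toy data set reusing the illustrative names from the answer */
data mydata;
  freddrop_patterndy = 1;
  samdrop_patternmy  = 2;
  other              = 3;
run;

proc sql noprint;
  /* s/drop_pattern// removes the pattern; -1 means replace every occurrence */
  select catx('=', name, prxchange('s/drop_pattern//', -1, trim(name)))
    into :rename_list separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'MYDATA'
      and prxmatch('/drop_pattern/', name);
quit;

%put &=rename_list;

proc datasets lib=work nolist;
  modify mydata;
  rename &rename_list;
run;
quit;
```

Any Perl-style pattern can replace the constant text, which is the closest analogue to the gsub() call in the question.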

Define a library in SAS where part of the folder name changes each time (when the macro is run by different users)

Topic: SAS Library
Difficulty: the path (file name and location) changes on each run, but only some parts of it change, not the full path (see the example below). The parts that change are highlighted in bold.
I want to write a single piece of code that covers every variation in file name and location. Is that possible?
%let path='C:\Data\variationstring\empcat**A**\person**34**_**1212**\persondata_empcatA_**34**';
libname test "&path";
proc import datafile="&path\Accounts_**34**.xls"
out=mydata
sheet="thefile";
getnames=no;
run;
When another user runs the program, the path above changes to:
%let path='C:\Data\variationstring\empcat**A**\person**49**_**1684**\persondata_empcatA_**49**';
libname test "&path";
proc import datafile="&path\Accounts_**49**.xls"
out=mydata
sheet="thefile";
getnames=no;
run;
Can anyone help me with this, please?
Thanks
Try to put it in a macro like this:
%macro import (macro_var1,macro_var2,macro_var3);
/* No quotes here: the path is quoted where it is used */
%let path=C:\Data\variationstring\empcatA\person&macro_var1._&macro_var2.\persondata_empcatA_&macro_var3.;
libname test "&path";
proc import datafile="&path\Accounts_&macro_var3..xls"
out=mydata
sheet="thefile";
getnames=no;
run;
%mend;
%import (34, 1212, 34);
%import (49, 1684, 49);
etc.
Wherever the path is used, wrap it in double quotes (") instead of single quotes ('); macro variables do not resolve inside single quotes.
In addition to Grigory's excellent suggestion, it looks to me like you're doing something that would be well served utilizing a data-driven programming approach.
If you have, say, an excel spreadsheet with all of your personnel records - let's say the first number is store and the second is employeeID - and you want to run one report per employeeID, then you write the macro like Grigory suggested; but you call the report from the first dataset.
So here:
proc import file="C:\Data\employeeID.xlsx"
out=employees dbms=xlsx
replace;
run;
%macro get_employee(store=, employeeID=, empCat=);
%let path=C:\Data\variationstring\empcat&empcat.\person&store._&employeeID.\persondata_empcat&empcat._&store.;
libname test "&path";
proc import datafile="&path\Accounts_&store..xls"
out=mydata
sheet="thefile";
getnames=no;
run;
%mend get_employee;
proc sql; *this generates macro calls, look at output to see what the macro variable contains;
select cats('%get_employee(employeeid=',employeeID,',store=',store,',empcat=',empcat,')')
into :get_emp_list separated by ' '
from employees;
quit;
&get_emp_list.; *This actually runs all those macro calls;
You can read my paper, Writing Code With Your Data for more details, or find other similar papers online.
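An alternative way to drive macro calls from a data set is CALL EXECUTE, which avoids the intermediate macro variable. The sketch below is a swapped-in technique, not the method used above. The data set contents and the %show_path macro are hypothetical, and the macro only writes the resolved path to the log so the example runs without any files on disk.

```sas
/* Hypothetical control data: one row per macro call */
data employees;
  input store employeeID empCat $;
  datalines;
34 1212 A
49 1684 A
;

/* Stand-in for the real import macro: just shows the path it would use */
%macro show_path(store=, employeeID=, empCat=);
  %put NOTE: C:\Data\variationstring\empcat&empCat.\person&store._&employeeID.;
%mend show_path;

data _null_;
  set employees;
  /* %nrstr delays macro execution until after the data step finishes */
  call execute(cats('%nrstr(%show_path)(store=', store,
                    ',employeeID=', employeeID,
                    ',empCat=', empCat, ')'));
run;
```

Replacing the %put with the libname and proc import steps would turn this into the full data-driven import.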

How can I include a timestamp in a .xpt filename?

I have a SAS script that outputs a SAS .xpt file. I currently use the PROC COPY method of generating this because the required name includes dashes and is longer than eight characters (which I understand is the name limit when using xport).
My code is roughly as follows:
LIBNAME TempSrc "C:\Temp";
LIBNAME xportout xport 'C:\Temp\1234-AB-FileOut_Name_.xpt';
PROC IMPORT datafile="C:\Temp\FileIn.csv"
out=mydata
dbms=dlm replace;
DELIMITER= ",";
getnames=yes;
options ExtendObsCounter=no;
RUN;
DATA TempSrc.SasFile;
set work.mydata;
RUN;
PROC COPY in=TempSrc out=xportout memtype=data;
select stdy7673;
RUN;
I have recently been required to include a timestamp in the output file name.
I have these macros to generate the date and time as required:
%let today=%sysfunc(date(), date9.);
%let now=%sysfunc(time(), time5.);
%let now=%sysfunc(compress(&now, :));
I have not been able to incorporate into the LIBNAME with any success, though.
Neither of the following has worked:
LIBNAME xportout xport 'C:\Temp\1234-AB-File_Name_&today.&now..xpt';
LIBNAME xportout xport 'C:\Temp\1234-AB-File_Name_' || &today. || &now. '.xpt';
How can I include the datetime in the .xpt filename?
Macro variables won't resolve in 'single quotes'. Use "double quotes" as follows:
LIBNAME xportout xport "C:\Temp\1234-AB-File_Name_&today.&now..xpt";
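The difference between the two quoting styles can be checked without touching any files. This short sketch reuses the macro variables from the question and writes both versions of the string to the log. The variable names single and double are made up for illustration.

```sas
%let today=%sysfunc(date(), date9.);
%let now=%sysfunc(time(), time5.);
%let now=%sysfunc(compress(&now, :));

data _null_;
  /* Single quotes: the text is kept literally, nothing is resolved */
  single = 'File_Name_&today.&now..xpt';
  /* Double quotes: the macro variables are resolved before the step runs */
  double = "File_Name_&today.&now..xpt";
  put single= / double=;
run;
```

Note the two dots before xpt: the first ends the macro variable reference and the second is the literal dot of the file extension.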

SAS proc import .xls with several spreadsheet and append

Situation: I have an .xls workbook with 4 spreadsheets named "SheetA", "SheetB", "SheetC", "SheetD".
To import one spreadsheet I do the following.
proc import
out = outputtableA
datafile = "C:\User\Desktop\excel.xls"
dbms = xls replace;
sheet = 'SheetA';
namerow = 3;
startrow = 5;
run;
All spreadsheets have the same number of variables and the same format. I would like to combine all four outputtableX data sets using a data step:
data combinedata;
set outputtableA outputtableB outputtableC outputtableD;
run;
I am new to SAS, and I am wondering whether an array and a do-loop can help.
I would not use a do loop (as they're almost always overly complicated). Instead, I would make it data driven. I also would use Reese's solution if you can; but if you must use PROC IMPORT due to the namerow/startrow options, this works.
First, create the libname.
libname mylib excel "c:\blah\excelfile.xls";
We won't actually use it, if you prefer the xls options, but this lets us get the sheets.
proc sql;
select cats('%xlsimport(sheet=',substr(memname,1,length(memname)-1),')')
into :importlist separated by ' '
from dictionary.tables
where libname='MYLIB' and substr(memname,length(memname))='$';
quit;
libname mylib clear;
Now we've got a list of macro calls, one per sheet. (A sheet is a dataset but it has a '$' on the end.)
Now we need a macro. Good thing you wrote this already. Let's just substitute a few things in here.
%macro xlsimport(sheet=);
proc import
out = out&sheet.
datafile = "C:\User\Desktop\excel.xls"
dbms = xls replace;
sheet = "&sheet.";
namerow = 3;
startrow = 5;
run;
%mend xlsimport;
And now we call it.
&importlist.
I leave as an exercise for the viewers at home wrapping all of this in another macro that is able to run this given a filename as a macro parameter; once you have done so you have an entire macro that operates with little to no work to import an entire excel libname.
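Once the generated macro calls have run, the imported tables all share the prefix out (outSheetA, outSheetB, and so on), so one way to stack them is a data set name prefix list on the SET statement. The sketch below fakes two such tables so it runs on its own. The variable x and the sheet variable are made up for illustration, and it assumes no other WORK data sets start with outSheet.

```sas
/* Stand-ins for the tables the import macro would create */
data outSheetA; x = 1; run;
data outSheetB; x = 2; run;

data combinedata;
  length sheet $32;
  /* outSheet: matches every WORK data set whose name starts with outSheet */
  set outSheet: indsname=source;
  /* source holds libref.member; keep the member name to record the origin */
  sheet = scan(source, 2, '.');
run;
```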
If you have an xls file and are using a 32-bit version of SAS, something like this would work:
libname inxls excel 'C:\User\Desktop\excel.xls';
proc datasets library=inxls;
copy out=work;
run; quit;
libname inxls clear;
Then you can use your data step above to append the tables together. I'm not sure Proc Import with the excel engine recognizes the namerow and startrow options, so you may need to modify your code to accommodate that, possibly using firstobs and then renaming the variables manually.
What you have will work assuming the variable names are the same. If they are not use the rename statement to make them all the same.
data combinedata;
set outputtableA(rename=(old_name1=new_name1 old_name2=new_name2 ... ))
outputtableB(...)
...
;
run;
Obviously, fill in the ellipses.

SAS - Creating variables from macro variables

I have a SAS dataset which has 20 character variables, all of which are names (e.g. Adam, Bob, Cathy etc..)
I would like dynamic code to create variables called Adam_ref, Bob_ref etc., which will still work if there is a different dataset with different names (i.e. I don't want to define each variable manually).
So far my approach has been to use proc contents to get all variable names and then use a macro to create macro variables Adam_ref, Bob_ref etc..
How do I create actual variables within the dataset from here? Do I need a different approach?
proc contents data=work.names
out=contents noprint;
run;
proc sort data = contents; by varnum; run;
data contents1;
set contents;
Name_Ref = compress(Name||"_Ref");
call symput (NAME, NAME_Ref);
%put _user_;
run;
If you want to create an empty dataset whose variables are named after values you hold in macro variables, you could do something like this.
Save the values into macro variables that are named by some pattern, like v1, v2 ...
proc sql;
select compress(Name||"_Ref") into :v1-:v20 from contents;
quit;
If you don't know how many values there are, you have to count them first, I assumed there are only 20 of them.
Then, if all your variables are character variables of length 100, you create a dataset like this:
%macro create_dataset;
data want;
length %do i=1 %to 20; &&v&i $100 %end;
;
stop;
run;
%mend;
%create_dataset; run;
This is how you can do it if you have the values in macro variable, there is probably a better way to do it in general.
If you don't want to create an empty dataset but only change the variable names, you can do it like this:
proc sql;
select name into :v1-:v20 from contents;
quit;
%macro rename_dataset;
data new_names;
set have(rename=(%do i=1 %to 20; &&v&i = &&v&i.._ref %end;));
run;
%mend;
%rename_dataset; run;
You can use PROC TRANSPOSE with an ID statement.
This step creates an example dataset:
data names;
harry="sally";
dick="gordon";
joe="schmoe";
run;
This step is essentially a copy of your step above that produces a dataset of column names. I will reuse the dataset namerefs throughout.
proc contents data=names out=namerefs noprint;
run;
This step adds the "_Refs" to the names defined before and drops everything else. The variable "name" comes from the column attributes of the dataset output by PROC CONTENTS.
data namerefs;
set namerefs (keep=name);
name=compress(name||"_Ref");
run;
This step produces an empty dataset with the desired columns. The variable "name" is again obtained by looking at column attributes. You might get a harmless warning in the GUI if you try to view the dataset, but you can otherwise use it as you wish and you can confirm that it has the desired output.
proc transpose out=namerefs(drop=_name_) data=namerefs;
id name;
run;
Here is another approach which requires less coding. It does not require running proc contents, does not require knowing the number of variables, nor creating a macro function. It also can be extended to do some additional things.
Step 1 is to use the built-in dictionary views to get the desired variable names. The appropriate view is dictionary.columns, which is also available as sashelp.vcolumn. The dictionary libref can be used only in proc sql, while the sashelp view can be used anywhere. I tend to use the sashelp view since I work in Windows with DMS and can always browse the sashelp library interactively.
proc sql;
select compress(Name||"_Ref") into :name_list
separated by ' '
from sashelp.vcolumn
where libname = 'WORK'
and memname = 'NAMES';
quit;
This produces a space-delimited macro variable with the desired names.
Step 2: to build the empty data set, this code will work:
Data New ;
length &name_list ;
run ;
You can avoid assuming lengths or create populated dataset with new variable names by using a slightly more complicated select statement.
For example
select compress(Name)||"_Ref $"||compress(put(length,best.))
into :name_list
separated by ' '
will generate a macro variable which retains the previous length for each variable. This will work with no changes to step 2 above.
To create populated data set for use with rename dataset option, replace the select statement as follows:
select compress(Name)||" = "||compress(Name||"_Ref")
into :name_list
separated by ' '
Then replace the Step 2 code with the following:
Data New ;
set names (rename = ( &name_list)) ;
run ;
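Put together, the rename variant might look like the following self-contained sketch. It reuses the example data set from the PROC TRANSPOSE answer, and it swaps in catx() and cats() for the compress() concatenation, which is a stylistic assumption rather than part of the original answer.

```sas
/* Example data set with three character variables */
data names;
  harry = "sally";
  dick  = "gordon";
  joe   = "schmoe";
run;

proc sql noprint;
  /* Build old=new pairs such as harry=harry_Ref for every variable */
  select catx('=', name, cats(name, '_Ref'))
    into :name_list separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'NAMES';
quit;

%put &=name_list;

/* Copy the data with every variable renamed */
data new;
  set names (rename=(&name_list));
run;
```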