How to import multiple .xls files into SAS?

I have a folder that contains a number of .xls files. The file names may be arbitrary, and the exact number of files is unknown. How can I import these datasets into SAS knowing only the folder's path? I would have to iterate over the files. I have done this in Java, and I am curious whether SAS can do it too.

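Once you have a list of the Excel files in the folder (using the techniques suggested above), you can put it into a macro variable and loop through it in a macro, assigning each file in turn to a library:
%MACRO import_all; /* %DO loops are only valid inside a macro definition */
/* Assumes &list_of_files is |-delimited: file names contain periods, which are default delimiters */
%DO i=1 %TO %SYSFUNC(COUNTW(&list_of_files, %STR(|)));
LIBNAME xlibr EXCEL "&your_folder\%SCAN(&list_of_files, &i, %STR(|))";
DATA imported_file_&i;
SET xlibr.'Sheet1$'n;
RUN;
LIBNAME xlibr CLEAR; /* release the workbook before the next iteration */
%END;
%MEND import_all;
%import_all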
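If the names and the number of sheets may vary from file to file, add a nested loop that iterates over all sheets of each file. Something like this:
%MACRO import_all_sheets; /* %DO loops are only valid inside a macro definition */
/* Assumes &list_of_files is |-delimited: file names contain periods, which are default delimiters */
%DO i=1 %TO %SYSFUNC(COUNTW(&list_of_files, %STR(|)));
LIBNAME xlibr EXCEL "&your_folder\%SCAN(&list_of_files, &i, %STR(|))";
PROC SQL noprint;
SELECT memname into :sheets separated by ' '
FROM sashelp.vtable
WHERE libname="XLIBR";
QUIT;
/* Default delimiters include $, so %scan drops the trailing $ and it is re-added below; */
/* sheet names that contain blanks would need a different separator */
%DO j=1 %TO %SYSFUNC(COUNTW(&sheets));
DATA imported_file_&i._&j; /* the underscore keeps i=1,j=11 distinct from i=11,j=1 */
SET xlibr."%scan(&sheets,&j)$"n;
RUN;
%END;
LIBNAME xlibr CLEAR; /* release the workbook before the next file */
%END;
%MEND import_all_sheets;
%import_all_sheets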
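The list of files itself is not built anywhere in this answer. As a minimal sketch, assuming a hypothetical folder path and the macro variable names your_folder and list_of_files used above, the directory functions DOPEN, DNUM and DREAD can collect every .xls file into a |-delimited list (so any loop over it should pass | as the delimiter to COUNTW and %SCAN):

```sas
/* Hypothetical folder; replace with your own path */
%let your_folder = C:\data\excel;

data _null_;
    length list_of_files $32767 fname $256;
    rc  = filename('xdir', "&your_folder");   /* assign a fileref to the folder */
    did = dopen('xdir');                      /* open the directory */
    if did > 0 then do;
        do k = 1 to dnum(did);                /* loop over the directory members */
            fname = dread(did, k);
            /* keep only members whose extension is xls */
            if upcase(scan(fname, -1, '.')) = 'XLS' then
                list_of_files = catx('|', list_of_files, fname);
        end;
        rc = dclose(did);
    end;
    rc = filename('xdir');                    /* deassign the fileref */
    call symputx('list_of_files', list_of_files);
run;

%put NOTE: Files found: &list_of_files;
```

File names that contain macro triggers (& or %) would need extra quoting, which this sketch does not attempt.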

Related

How to trigger a macro for every data set containing a keyword

I'm doing a crash course on SAS macros and I'm stuck on one exercise. I have to create a macro that runs PROC CONTENTS for every data set whose name contains a keyword. I know how to do that using CALL EXECUTE, but I need to do it using PROC SQL and a %DO loop.
My attempt:
%macro contents(data=&syslast);
proc contents data=&data;
title "&data";
run;
%mend contents;
%macro ContentsAll(keyword);
select libname||'.'||memname
into :dsn1-
from sashelp.vstabvw
where upcase(memname) like %upcase("%quote(%)&&keyword%")
;
quit;
%do i=1 %to &sqlobs;
%contents(data=&&dsn&i);
%end;
%mend ContentsAll;
options mlogic mprint;
%ContentsAll(class);
options nomlogic nomprint;
I know there is some issue with the select statement, but I have no idea how to fix it. The where clause also has an unprotected variable (my attempts at fixing it just break the where clause altogether).
First of all, good job. It's so good that I'm almost sorry you're only missing the Proc SQL Statement :-)
%macro contents(data=&syslast);
proc contents data=&data;
title "&data";
run;
%mend contents;
%macro ContentsAll(keyword);
proc sql noprint;
select libname||'.'||memname
into :dsn1-
from sashelp.vstabvw
where upcase(memname) like %upcase("%quote(%)&&keyword%")
;
quit;
%do i=1 %to &sqlobs;
%contents(data=&&dsn&i);
%end;
%mend ContentsAll;
options mlogic mprint;
%ContentsAll(class);
options nomlogic nomprint;
There is no need to create all of those macro variables. Just keep the list of names in actual data. You can use CALL EXECUTE() to generate the code you want to run for each member in the list.
Note that the variables LIBNAME and MEMNAME will already be in uppercase when pulled from the DICTIONARY.MEMBERS metadata that the view SASHELP.VSTABVW uses. But the user passing in a value for the KEYWORD parameter might not have entered uppercase letters.
%macro ContentsAll(keyword);
data _null_;
set sashelp.vstabvw ;
where memname contains "%qupcase(&keyword)" ;
call execute(cats('%nrstr(%contents)(data=',libname,'.',memname,')'));
run;
%mend ContentsAll;

SAS - Exporting same dataset to multiple excel files

I would like to export the dataset to multiple excel files based on a certain variable:
proc sql;
create table try as
select distinct make from sashelp.cars;
quit;
proc sql;
create table try2 as
select count(make) as aaa from sashelp.cars;
quit;
data _null_;
set try;
by make;
call symputx ('make',compress(make,' .'),'g');
run;
data _null_;
set try2;
call symputx('n',aaa);
run;
%macro a;
%do i=1 %to &n;
%let var= %scan(&make,&i,"#");
proc export data=testing (where=(make="&make."))
outfile="C:\Users\&make..xlsx"
dbms=xlsx replace;
sheet="&make." ;
run;
%end;
%mend ;
%a;
My goal is to get all the 38 excel files with the maker name as the filename.
However, all I am able to get here is the last maker name's file.
Would you please point out where I am missing out here? Many thanks!!
Your first error is that you count the number of cars that have a make, whereas you should count the distinct makes. Let me also take the opportunity to explain the INTO clause of PROC SQL, so that you don't need that data step anymore:
proc sql;
select count(distinct make)
into :make_count
from sashelp.cars;
quit;
You remove blanks and periods from the make names, but it is better to remove all non-alphabetic characters at once with compress(make, '', 'ka'), in which the modifier k stands for keep and a stands for alphabetic.
Your main error is that you think you append all make names in the macro variable make, but you actually overwrite make time and again: first you write "Cadillac" to it, then "Chevrolet" and by the time you ever use it, it became "Volvo".
I could explain how to correct your data step, but instead I will teach you another option of the INTO clause:
proc sql;
select distinct compress(make, '', 'ka')
into :make_list separated by ' '
from sashelp.cars;
quit;
The rest is easy.
%macro export_by_make;
%do make_nr=1 %to &make_count;
%let make= %scan(&make_list, &make_nr);
proc export data=sashelp.cars (where=(compress(make, '', 'ka')="&make."))
outfile="C:\Users\&make..xlsx"
dbms=xlsx replace;
sheet="&make." ;
run;
%end;
%mend;
%export_by_make;
Note that you don't need to specify a delimiter for the %scan function, because the list is separated by blanks. If you do specify one, don't put quotes around it: %scan is a macro function, so the quotes would be treated as additional delimiter characters.
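To illustrate the point about quotes, here is a small sketch with a hypothetical #-delimited list (the variable demo_list is made up for this example). Both calls return the second word, but in the second call the double quote has silently become a delimiter as well:

```sas
/* Hypothetical list, delimited by # rather than blanks */
%let demo_list = Acura#Audi#BMW;

/* Delimiter given without quotes: only # separates the words */
%put %scan(&demo_list, 2, #);

/* Delimiter given in quotes: both " and # now separate the words */
%put %scan(&demo_list, 2, "#");
```

Both statements write Audi to the log here, but the quoted form would split values that happen to contain a double quote.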

SAS-Writing Multiple Tables to one XLSX Workbook w/ 2 tables per sheet

I am new to SAS and am having some issues exporting data. I have written a macro to generate some summary tables based on a certain ID. The macro creates two tables for each ID identified in a particular proc sql query. I can write out the last two tables, but each run overwrites the tables written before it. I was wondering if there is a way to generate one sheet, containing the two summary tables, for each ID identified in my query. Below is the code I have to date for exporting data:
%macro output(x);
ods tagsets.excelxp file="W:\user\test.xls" options(sheet_interval='none');
proc print data=prov_&x;
run;
proc print data=prov_revcd_&x;
run;
ods tagsets.excelxp close;
%mend;
/*Run a loop for each IDcode. Each code will enter the document generation loop*/
%macro loopit(mylist);
%let else=;
%let n = %sysfunc(countw(&mylist)); /*let n=number of codes in the list*/
data
%do I=1 %to &n;
%let val = %scan(&mylist,&I); /*Let val= the ith code in the list*/
%end;
%do j=1 %to &n;
%let val = %scan(&mylist,&j); /*Let val= the jth code in the list*/
/*Run the macro loop to generate the required tables*/
%runtab(&val);
%output(&val);
%end;
run;
%mend;
/*Run the macro loop over the list of significant procedure code values*/
%loopit(&varlist);
Any help for correcting this issue would be greatly appreciated! Thanks!
I would rewrite %output like so.
%macro output(x);
ods tagsets.excelxp options(sheet_interval='none' sheet_name="&x");
proc print data=prov_&x;
run;
proc print data=prov_revcd_&x;
run;
%mend;
Then, as Reeza suggests, put the original ods tagsets.excelxp file= ... and close statements outside the whole macro.
ods tagsets.excelxp file="c:\temp\test.xml"; /* ExcelXP writes XML, so avoid the .xlsx extension */
%loopit(&varlist)
ods tagsets.excelxp close;
If you use PROC EXPORT, you can append sheets to a workbook without this step (and without ODS at all).
%macro output(x);
proc export data=prov_&x outfile="c:\temp\test.xlsx" dbms=excel replace;
sheet="&x._prov";
run;
%mend;
However, this only allows one dataset per sheet - so either you append them together first as a single dataset, or you use 2 sheets in this solution.
Move the ods tagsets.excelxp file= and ods tagsets.excelxp close statements outside of the macro; otherwise you're recreating the file each time.
You may want to explicitly name the sheets as well.

Set the labels of a SAS Dataset equal to their variable name

I'm working with several rather large datasets that are provided to me as CSV files. When I attempt to import one of the files, the data comes in fine, but the number of variables in the file is too large for SAS, so it stops reading the variable names and starts assigning them sequential numbers. To preserve the variable names, I read in the file with the data row starting at 1, so that the first row is not read as variable names -
proc import file="X:\xxx\xxx\xxx\Extract\Live\Live.xlsx" out=raw_names dbms=xlsx replace;
SHEET="live";
GETNAMES=no;
DATAROW=1;
run;
I then run a macro to start breaking down the dataset and rename the variables based on the first observations in each variable -
%macro raw_sas_datasets(lib,output,start,end);
data raw_names2;
set raw_names;
if _n_ ne 1 then delete;
keep A -- E &start. -- &end.;
run;
proc transpose data=raw_names2 out=raw_names2;
var A -- &end.;
run;
data raw_names2;
set raw_names2;
col1=compress(col1);
run;
data raw_values;
set raw;
keep A -- E &start. -- &end.;
run;
%macro rename(old,new);
data raw_values;
set raw_values;
rename &old.=&new.;
run;
%mend rename;
data _null_;
set raw_names2;
call execute('%rename('||_name_||","||col1||")");
run;
%macro freq(var);
proc freq data=raw_values noprint;
tables &var. / out=&var.;
run;
%mend freq;
data raw_names3;
set raw_names2;
if _n_ < 6 then delete;
run;
data _null_;
set raw_names3;
call execute('%freq('||col1||")");
run;
proc sort data=raw_values;
by StudySubjectID;
run;
data &lib..&output.;
set raw_values;
run;
%mend raw_sas_datasets;
The problem I'm running into is that the variable names are now all set properly and the data is lined up correctly, but the labels are still the original SAS assigned sequential numbers. Is there any way to set all of the labels equal to the variable names?
If you just want to remove the variable labels (at which point they default to the variable name), that's easy. From the SAS Documentation:
proc datasets lib=&lib.;
modify &output.;
attrib _all_ label=' ';
run;
quit; /* PROC DATASETS stays active until QUIT */
I suspect you have a simpler solution than the above, though.
The actual renaming step needs to be done differently. Right now it's rewriting the entire dataset over and over again - for a lot of variables that is a terrible idea. Get your rename statements all into one datastep, or into a PROC DATASETS, or something else. Look up 'list processing SAS' for details on how to do that; on this site or on google you will find lots of solutions.
You likely can get SAS to read in the whole first line. The number of variables isn't the problem; it is probably the length of the line. There's another question that I'll find if I can on this site from a few months ago that deals with this exact problem.
My preferred option is not to use PROC IMPORT for CSVs anyway; I would suggest writing a metadata table that stores the variable names and the length/types for the variables, then using that to write import code. A little more work at first, but only has to be done once per study and you guarantee PROC IMPORT isn't making silly decisions for you.
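The single-pass rename suggested above could look roughly like this. It is a sketch, assuming the datasets raw_names2 (with the columns _name_ and col1 produced by PROC TRANSPOSE) and raw_values from the question, that every value of col1 is a valid SAS variable name, and that at least one variable needs renaming; rename_list is a made-up macro variable name.

```sas
/* Build all old=new pairs once, instead of one DATA step per variable */
proc sql noprint;
    select catx('=', _name_, col1)
        into :rename_list separated by ' '
        from raw_names2
        where upcase(_name_) ne upcase(col1);   /* skip names that are already correct */
quit;

/* Apply every rename in one pass; only the header is changed, the data is not rewritten */
proc datasets lib=work nolist;
    modify raw_values;
    rename &rename_list;
quit;
```

A macro variable holds at most 65,534 characters, so a very wide file may need the pairs split across several macro variables or generated with CALL EXECUTE instead.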
The SASHELP library contains a view, VCOLUMN, that lists the variable names of every table in every library. You could write a macro that puts all your variable names into macro variables and then sets the labels from there.
Here's some code that I put together (not very pretty) but it does what you're looking for:
data test.label_var;
x=1;
y=1;
label x = 'xx';
label y = 'yy';
run;
proc sql noprint;
select count(*) into :cnt
from sashelp.vcolumn
where libname = 'TEST' and memname = 'LABEL_VAR'; /* restrict to the TEST library */
quit;
%let cnt = &cnt; /* strip leading blanks */
proc sql noprint;
select name into :name1 - :name&cnt
from sashelp.vcolumn
where libname = 'TEST' and memname = 'LABEL_VAR';
quit;
%macro test;
%do i = 1 %to &cnt;
proc datasets library=test nolist;
modify label_var;
label &&name&i="&&name&i"; /* quote the label text */
quit;
%end;
%mend test;
%test;
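The same result can probably be reached without a macro loop. The following is a sketch, assuming the TEST library and LABEL_VAR dataset from the example above; label_list is a hypothetical macro variable name. It builds every name="name" pair in one query and applies them in a single PROC DATASETS call:

```sas
/* Build pairs such as x="x" y="y" from the column metadata */
proc sql noprint;
    select catx('=', name, quote(strip(name)))
        into :label_list separated by ' '
        from dictionary.columns
        where libname = 'TEST' and memname = 'LABEL_VAR';
quit;

/* Set all labels in one pass over the dataset header */
proc datasets library=test nolist;
    modify label_var;
    label &label_list;
quit;
```

This avoids starting PROC DATASETS once per variable, which matters when the dataset has thousands of columns.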

Get data filtered by dynamic column list in SAS stored process

My goal is to create a SAS stored process that returns data from a single dataset and filters the columns of that dataset based on a multi-value input parameter passed into the stored process.
Is there a simple way to do this?
Is there a way to do this at all?
Here's what I have so far. I'm using a macro to dynamically generate the KEEP statement to define which columns to return. I've defined macro variables at the top to mimic what gets passed into the stored process when called through SAS BI Web Services, so unfortunately those have to remain as they are. That's why I've tried to use the VVALUEX method to turn the column name strings into variable names.
Note - I'm new to SAS
libname fetchlib meta library="lib01" metaserver="123.12.123.123"
password="password" port=1234
repname="myRepo" user="myUserName";
/* This data represents input parameters to stored process and
* is removed in the actual stored process*/
%let inccol0=3;
%let inccol='STREET';
%let inccol1='STREET';
%let inccol2='ADDRESS';
%let inccol3='POSTAL';
%let inccol_count=3;
%macro keepInputColumns;
%if &INCCOL_COUNT = 1 %then
&inccol;
%else
%do k=1 %to (&INCCOL_COUNT);
var&k = VVALUEX(&&inccol&k);
%end;
KEEP
%do k=1 %to (&INCCOL_COUNT);
var&k
%end;
;
%mend;
data test1;
SET fetchlib.Table1;
%keepInputColumns;
run;
/*I switch this output to _WEBOUT in the actual stored process*/
proc json out='C:\Logs\Log1.txt';
options firstobs=1 obs=10;
export test1 /nosastags;
run;
There are some problems with this. The output uses var1, var2 and var3 as the column names instead of the actual column names. It also doesn't filter any columns when I change the output to _webout and run it using BI Web Services.
OK, I think I have some understanding of what you're doing here.
You can use KEEP and RENAME in conjunction to get your variable names back.
KEEP
%do k=1 %to (&INCCOL_COUNT);
var&k
%end;
;
The matching RENAME statement is
RENAME
%do k=1 %to (&INCCOL_COUNT);
var&k = %sysfunc(dequote(&&inccol&k.)) /* parameter values arrive quoted, e.g. 'STREET' */
%end;
;
and now, as long as the user doesn't separately keep the original variables, you're okay. (If they do, you will get a conflict and an error.)
If this way doesn't work for your needs, and I don't have a solution for the _webout as I don't have a server to play with, you might consider trying this in a slightly different way.
proc format;
value agef
11-13 = '11-13'
14-16 = '14-16';
run;
ods output report=mydata(drop=_BREAK_);
proc report data=sashelp.class nowd;
format age agef.;
columns name age;
run;
ods output close;
The first part is just a proc format to show that this grabs the formatted value not the underlying value. (I assume that's desired, as if it's not this is a LOT easier.)
Now you have the data in a dataset a bit more conveniently, I think, and can put it out to JSON however you want. In your example you'd do something like this (inside a macro, since %do is not allowed in open code):
ods output report=work.mydata(drop=_BREAK_);
proc report data=fetchlib.Table1 nowd;
columns
%do k=1 %to (&INCCOL_COUNT);
%sysfunc(dequote(&&inccol&k.)) /* no semicolon here, or COLUMNS would end after the first name */
%end;
;
run;
ods output close;
And then you can send that dataset to JSON or whatever. You might even be able to go more directly than that, but I know almost nothing about PROC JSON.
Reading more about JSON, you may actually have an easier way to do this.
On the export line, you have the various format options. So, assuming we have a dataset that is just a subset of the original:
proc json out='C:\Logs\Log1.txt';
options firstobs=1 obs=10;
export fetchlib.Table1
(keep=
%do k=1 %to (&INCCOL_COUNT);
%sysfunc(dequote(&&inccol&k.)) /* strip the quotes from values such as 'STREET' */
%end;
)
/ nosastags FMTCHARACTER FMTDATETIME FMTNUMERIC ;
run;
This method doesn't allow for the variable order to be changed; if you need that, you can use an intermediate dataset:
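data intermediate/view=intermediate;
retain
%do k=1 %to (&INCCOL_COUNT);
%sysfunc(dequote(&&inccol&k.))
%end;
; /* RETAIN must come before SET to control the variable order */
set fetchlib.Table1;
keep
%do k=1 %to (&INCCOL_COUNT);
%sysfunc(dequote(&&inccol&k.))
%end;
;
run;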
and then write that out. I'm just guessing that you can use a view in this context.
It turns out that the simplest way to implement this was to change the way that the columns (aka SAS variables) were passed into the stored process. Although Joe's answer was helpful, I ended up solving the problem by passing in the columns to the keep statement as a space-separated column list, which greatly simplified the SAS code because I didn't have to deal with a dynamic list of columns.
libname fetchlib meta library="lib01" metaserver="123.12.123.123"
password="password" port=1234
repname="myRepo" user="myUserName";
proc json out=_webout;
export fetchlib.&tablename(keep=&columns) /nosastags;
run;
Where &columns gets set to something like this:
Column1 Column2 Column3