Effectively read multiple excel sheets from an xlsx excel workbook in SAS 9.4M7 - sas

I am trying to read multiple sheets from an excel workbook (xlsx format) in SAS. Instead of using two separate proc imports, is there a way to simultaneously read multiple excel sheets from an excel workbook? My code thus far is as follows:
proc import datafile= "&loc.\&exid..xlsx"
out=exp
dbms=xlsx replace;
sheet="Sheet1";
run;
proc import datafile= "&loc.\&exid..xlsx"
out=dt
dbms=xlsx replace;
range="'Sheet5'$A2:AB10000";
getnames=yes;
run;
It takes about 1.40 seconds to read both of these sheets from the one workbook. How can I reduce the time it takes to read an xlsx workbook in SAS?

If you have SAS/CONNECT, you can run the two imports in parallel with rsubmit. I tested this on an Excel file and it did not give any simultaneous access errors. Since the two imports together take only about 1.4 seconds, this might take longer overall, because it needs to spin up two new SAS sessions to run the imports.
options autosignon = yes
connectwait = no
sascmd = '!sascmd'
;
libname worklib "%sysfunc(getoption(work))";
/* Send over macro variables to rsubmit sessions */
%syslput _USER_ / remote=session1;
%syslput _USER_ / remote=session2;
rsubmit remote=session1 inheritlib=(worklib);
proc import datafile= "&loc.\&exid..xlsx"
out=worklib.exp
dbms=xlsx replace;
sheet="Sheet1";
run;
endrsubmit;
rsubmit remote=session2 inheritlib=(worklib);
proc import datafile= "&loc.\&exid..xlsx"
out=worklib.dt
dbms=xlsx replace;
range="'Sheet5'$A2:AB10000";
getnames=yes;
run;
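endrsubmit;
/* connectwait=no makes both rsubmits asynchronous, so wait for them to finish before using exp and dt */
waitfor _all_ session1 session2;
signoff _all_;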
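If the goal is mainly to avoid writing one PROC IMPORT per sheet, another option worth trying is the XLSX LIBNAME engine, which exposes every sheet in the workbook through a single libref. The sketch below assumes the same &loc and &exid macro variables as the question and uses a made-up libref name, xl. It is not necessarily faster than two imports, and it reads each sheet from its first row, so a range-based read like the one for Sheet5 would still need PROC IMPORT.

```sas
/* Assumes &loc and &exid are defined as in the question.
   "xl" is a hypothetical libref name chosen for this sketch. */
libname xl xlsx "&loc.\&exid..xlsx";

/* Read a single sheet as if it were a SAS data set */
data exp;
    set xl.Sheet1;
run;

/* Or copy every sheet in the workbook into WORK in one step;
   each sheet becomes a data set named after the sheet */
proc copy inlib=xl outlib=work;
run;

/* Release the workbook when done */
libname xl clear;
```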

Related

SAS Import XLS file without headings

I am importing an XLS file with a dataset that looks like this:
And my code was as below:
%let dirLSB=/folders/myfolders/sasuser.v94/;
proc import datafile="&dirLSB.OnionRing.xls" out=sales replace;
run;
proc print data=sales label;
run;
But the result showed the first row had been treated as headings and the
row data for the first row "Columbia Peaches" was missing.
It should have been four rows but in the end, only three rows were present.
Are there any suggestions?
Thanks a lot!!!
Just add the getnames=no; statement to your proc import step.
proc import datafile="&dirLSB.OnionRing.xls" out=sales replace;
getnames=no;
run;

Import XLSX file in SAS starting from the third row, using other option than RANGE

We can import an XLS file using namerow and startrow, as in this example:
%let dir_n=TheDir_name;
%let fichimp=file_name.xls;
PROC IMPORT DATAFILE= "&dir_n.\&fichimp."
out=want
dbms=xls replace;
sheet=theSheet_name;
getnames=no;
namerow=2;
startrow=3;
run;
I have read that to import an XLSX file, you should use RANGE if the data does not start on the first row.
Is there an option similar to STARTROW for importing an XLSX file starting from a specific row?
No, there is not. dbms=XLSX only has a limited set of options, listed in the documentation: GETNAMES, SHEET, and RANGE.
EXCEL has a few more options (including DBDSOPTS which opens up several database-type options), but still uses range to control what is read in.
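As a rough sketch of the RANGE approach for the layout in the question (names in row 2, data from row 3): starting the range at row 2 with getnames=yes makes the first row of the range supply the variable names. The end cell Z1000 is a placeholder, not something from the question, and should be adjusted to cover the actual data. The file is assumed to be an .xlsx version of the question's workbook.

```sas
%let dir_n=TheDir_name;
%let fichimp=file_name.xlsx;

proc import datafile="&dir_n.\&fichimp."
    out=want
    dbms=xlsx replace;
    /* The range starts at row 2, so row 2 provides the variable names
       and data is read from row 3. Z1000 is a placeholder end cell. */
    range="theSheet_name$A2:Z1000";
    getnames=yes;
run;
```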

export datasets into multiple sheets of one excel file in sas

I'm using this code:
proc export data=goldsheet_invalid outfile="C:\Documents and Settings\sasadm\Desktop\gold.xls" dbms=xls replace;
sheet="gold";
run;
proc export data=platinumsheet_invalid outfile="C:\Documents and Settings\sasadm\Desktop\gold.xls" dbms=xls replace;
sheet="platinum";
run;
proc export data=titaniumsheet_invalid outfile="C:\Documents and Settings\sasadm\Desktop\gold.xls" dbms=xls replace;
sheet="titanium";
run;
Error: Statement is not valid or it is used out of proper order.
Note: I have already tried dbms=xlsx and dbms=EXCELCS, but neither works.
Instead of using a PROC EXPORT this can be accomplished with older versions of SAS using ODS (Output Delivery System) statements. Going this route is not as clean as the PROC EXPORT but if all you want is to get the data from these data sets to a single Excel workbook and have the results of each proc statement on a different worksheet this will do it.
In this case the code to accomplish what you are looking for would be:
ods tagsets.excelxp file='C:\temp\gold.xml' options(sheet_name = 'Gold' sheet_interval='proc');
proc print data=goldsheet_invalid;
run;
ods tagsets.excelxp options(sheet_name = 'Platinum');
proc print data=platinumsheet_invalid;
run;
ods tagsets.excelxp options(sheet_name = 'Titanium');
proc print data=titaniumsheet_invalid;
run;
ods tagsets.excelxp close;
You will notice that the file created has an XML extension; this is a necessity. When you load the file in Excel it will appear as expected, and you can feel free to update the file extension from there.
More details about SAS and ODS can be found at: https://support.sas.com/rnd/base/ods/odsmarkup/TipSheet_ods_xl_xp.pdf
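On SAS 9.4 releases that include the ODS EXCEL destination, the same pattern can write a native .xlsx file directly. This is only a sketch of that alternative, not part of the tagsets.excelxp approach above; the output path is an assumption and the data set names are the ones from the question.

```sas
/* Assumed output path; one worksheet is started per procedure */
ods excel file='C:\temp\gold.xlsx'
    options(sheet_name='Gold' sheet_interval='proc');
proc print data=goldsheet_invalid;
run;

/* Change the sheet name before each subsequent procedure */
ods excel options(sheet_name='Platinum');
proc print data=platinumsheet_invalid;
run;

ods excel options(sheet_name='Titanium');
proc print data=titaniumsheet_invalid;
run;

/* Closing the destination writes the workbook */
ods excel close;
```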

PROC EXPORT outfile row 2

I'm trying to export the column names of a SAS data set to an xlsx file, but I need the data to be copied starting in the 2nd row of the Excel file. What I have right now:
PROC EXPORT DATA= mylib.test
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
sheet = "test1";
range = "test1$A2:BE2000";
run;
However, I get an error indicating that the RANGE statement is not supported and is ignored in the Export procedure.
Any suggestions?
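Try the data set option FIRSTOBS. Note that FIRSTOBS=2 starts reading at the second observation, so the first observation of the data set is skipped.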
PROC EXPORT DATA= mylib.test (firstobs=2)
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
run;
Edit: If by "starting in the 2nd row" you mean to output the data without the variable names, then you have to use PUTNAMES=NO:
PROC EXPORT DATA= mylib.test
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
PUTNAMES=NO;
run;
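Load your table with a blank row as its first row, then write the table to the Excel file. It should work.
proc sql;
/* Insert an all-missing row; the three values must match the columns of test */
insert into test
values('',.,'');
quit;
/* Missing values sort first, so the blank row moves to the top */
proc sort data=test;
by _all_;
run;
/* Display numeric missing values as blanks instead of periods */
options missing=' ';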
proc export data=test outfile='/home/libname/new.xlsx'
dbms=excel replace;
putnames=no;
run;

How to import an excel to sas with getnames = no?

I want to specify new names so I use getnames=no property:
data mylib.test;
infile "C:\Users\test.xlsx" ;
input var1 $ Opened_Date mmddyy8. salary dollar9.2;
DBMS=EXCEL ;
range="Sheet5$";
getnames=no;
mixed=no;
scantext=yes;
usedate=yes;
scantime=yes;
datarow=3;
run;
But this does not import anything.
PS: the following code with getnames=yes works fine, which means there is no problem with the Excel file. But I don't want to use yes; I need getnames=no.
PROC IMPORT OUT= WORK.TEST
DATAFILE= "C:\Users\test.xlsx"
DBMS=EXCEL REPLACE;
RANGE="Sheet5$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
The data step is not helpful in this case. You can't import an Excel file that way (practically speaking). Use PROC IMPORT with GETNAMES=NO instead:
PROC IMPORT OUT= WORK.TEST
DATAFILE= "C:\Users\test.xlsx"
DBMS=EXCEL REPLACE;
RANGE="Sheet5$";
GETNAMES=NO;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
After this, you can create a data step and rename the variables however you want from the generic names initially assigned by PROC IMPORT.
An alternative that looks like the data step method is libname access.
libname myexcel excel "C:\Users\test.xlsx" getnames=no scantext=yes mixed=no usedate=yes scantime=yes;
Then you can access the sheet like this:
data test;
set myexcel.'Sheet5$'n;
rename f1=var1 f2=opened_date (...more...);
run;
I tend to use PROC IMPORT as it's a bit easier to understand for others, but both are equivalent in how they work (PROC IMPORT creates this libname for you, basically).