I am trying to use the DATAROW option:
proc import
datafile = "C:....\Book1.xlsx"
out=test
dbms=excel replace
;
RANGE="'test$'";
datarow=2; /* this line throws an error: "Statement is not valid or it is used out of proper order" */
getnames=no ;
run;
When I use the above code without DATAROW, the file gets imported fine. How can I start importing from the second line? I also tried STARTROW (I have SAS 9.2).
When you use getnames=no, the default behaviour is to read data from line 1. That's why you're getting your error.
Try setting getnames=yes, which reads data from line 2 by default.
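For example, adapting the code from the question (an untested sketch; the path and range are copied as-is from the question):
proc import
    datafile = "C:....\Book1.xlsx"
    out = test
    dbms = excel replace;
    range = "'test$'";
    getnames = yes; /* names come from row 1, so data is read starting at row 2 */
run;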
If you are using a custom range, as you are in that code, you cannot prevent it from reading the first row (unless that row is used for names, but you've ruled that out). You need to re-create the range so that it does not include the first row.
If they don't harm the import (i.e., they don't cause the data types to change), you could always eliminate the first row afterwards by subsetting the dataset (how depends on the circumstances of the data).
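One minimal sketch of that subsetting approach, assuming the stray header row simply ends up as the first observation (dataset name taken from the question):
data test;
    set test(firstobs=2); /* drop observation 1, i.e. the header row that was read as data */
run;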
Below is what worked for me; the data can be either xlsx or xlsm (at least that is what I tested):
PROC IMPORT OUT=Out_Table
DATAFILE=Import_File
DBMS=EXCEL REPLACE;
SHEET="Sheetname$";
RANGE="A2:S";
GETNAMES=YES;
MIXED=YES;
run;
I have imported a dataset from an Excel sheet, and I want to delete some observations. Say I have a variable which tells me whether a student has passed or not (with strings "Passed" and "Failed"). I want to delete all the students who have failed from the dataset.
I do know that usually I would be able to do so with an if statement. However, I don't know how to access the temporary dataset. Do I have to open it after importing, and then check with an if statement?
This is what I have tried:
proc import datafile="C:\Users\User\Desktop\testresults.xlsx"
DBMS=XLSX;
if Status = "failed" then delete;
run;
I know this won't work, as the "if" condition only works when the data resides in the PDV.
Is it possible to delete after importing instead of while importing?
Use a where clause on the output data set:
proc import file="my.xlsx"
out=work.myxlsx(where=(status^="failed"))
dbms=xlsx
replace;
run;
A WHERE= dataset option modifies the output dataset from PROC IMPORT, as DomPazz shows.
Alternately, you can use a data step.
proc import datafile="C:\Users\User\Desktop\testresults.xlsx" out=have DBMS=XLSX;
run;
data want;
set have;
if Status = "failed" then delete;
run;
That, of course, works whether you do it immediately after importing (even in the same submit) or some time later.
I am trying to import a csv using proc import.
proc import datafile='/SourceFiles/UserTable.csv'
out=UserTable dbms=csv replace;
getnames=yes;
run;
The column names are captured correctly except for the last one; the last column always changes to VARx. For testing purposes, I even changed my dataset to have one column and one value, so that it looks like this:
USER
Johnson
But USER changes to Var1 as well. I'm pretty sure I'm not violating any naming conventions.
Anybody have any ideas?
Try this; it worked for me using SAS 9.3:
proc import datafile="C:\Users\OldSalt\Desktop\test.csv"
out=mydata
dbms=csv
replace;
getnames=yes;
run;
So, this looks the same as your script. Mine is running just fine on a PC. It looks like you are running on a UNIX box. Check your input file; it might be corrupted.
I am trying to import an Excel sheet into SAS 9.4. Interestingly, I can import most of the data without any problem, but for the date column there are lots of seemingly random missing values (there are no missing dates in my Excel file). Can anyone tell me how to improve my code, please?
proc import out= sheet
datafile = 'D:\Date.xlsx'
dbms = excelcs replace;
sheet = "abc" ;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=NO;
run;
All dates look like this: 21/06/2010, 22/06/2010.
Change your DBMS to XLSX and USEDATE to No. Then you'll import the field as a text field.
You can then use an input() function to create a new date variable.
Not ideal, but easily accomplished.
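A minimal sketch of that conversion, assuming the column comes in as a character variable; the names date_char and date_num are placeholders, not from the original post:
data sheet_fixed;
    set sheet;
    date_num = input(date_char, ddmmyy10.); /* read dd/mm/yyyy text as a SAS date value */
    format date_num ddmmyy10.;
run;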
More than likely, your problem is that the automatic conversion is treating those as mm/dd/yyyy, when of course they are actually dd/mm/yyyy.
One possible solution is to use the SASDATEFMT option (see the SAS documentation):
proc import file="myfile.xlsx" out=dataset dbms=excel replace;
dbdsopts="sasdatefmt=(varname=DDMMYY10.)";
run;
That sets the SAS format, but is also alleged by the documentation to affect the informat used to convert it.
It's also possible, though, that your data is actually mixed character/numeric (as it would be if the values were typed by hand into Excel while Excel was expecting mm/dd/yy but they were actually dd/mm/yy). In that case, the simplest answer is either to change your registry to tell Microsoft's driver to scan the whole column (there is a SAS tech support note on this), or to simply convert all of the values to character (or at least the first couple) and then add a mixed=yes; line to your proc import statement.
(The registry setting may not have an effect if you're using the PC Files Server, which you may well be, given the excelcs DBMS above. In that case, ignore that particular suggestion.)
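For reference, a sketch of what the MIXED=YES variant mentioned above might look like with the EXCEL engine rather than EXCELCS (untested; it reuses the path and sheet name from the question):
proc import out = sheet
    datafile = 'D:\Date.xlsx'
    dbms = excel replace;
    sheet = "abc";
    mixed = yes; /* scan columns that hold both numeric and character values */
    scantext = yes;
run;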
I have no working knowledge of SAS, but I have an Excel file that I need to import and work with. In the Excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!
Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
datafile = 'C:\...\file.xlsx'
dbms = xlsx
out = xldata
replace;
mixed = YES;
getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement (see the sketch after this explanation).
REPLACE replaces the dataset created the first time you run this step.
MIXED=YES tells SAS that the data may contain mixed numeric and character types.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
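For the two-level-name case mentioned above, a minimal sketch (the folder path and the libref mylib are placeholders, not from the original answer):
libname mylib 'C:\my_sas_data'; /* hypothetical folder for permanent datasets */

proc import
    datafile = 'C:\...\file.xlsx'
    dbms = xlsx
    out = mylib.xldata
    replace;
    getnames = YES;
run;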
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
set xldata;
where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
When in doubt, try browsing the SAS online documentation. It's very thorough.
If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit), you can import the Excel file with this code. The where= option lets you select which rows to keep; in this example, you will keep the rows where the column "name" is not blank:
proc import
DATAFILE="your file.xlsx"
DBMS=XLSX
OUT=resulttabel(where=(name ne ""))
REPLACE;
MIXED=YES;
QUIT;
I have large headers for my columns, like the ones below (this is only a sample, as I have 2000 columns with as many headers).
Each column is separated by a semi-colon.
BAL_RT,ET-CAP,EXT_EA16,SEXL-SA,UK;BAL_RT,ET-CAP,EXT_EA16,IBON-SA,TA;BAL_RT,ET-CAP,EXT_EA16,TARO-SA,XR
1;7.2;3
35;8;0.99
I'm using the following code in SAS to do the import:
options macrogen symbolgen ;
PROC IMPORT OUT= Work.fic38_fic1
DATAFILE= "C:\cygwin\home\appEuro\pot\fic38.csv"
DBMS=DLM REPLACE;
DELIMITER='3B'x;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=32767;
RUN;
proc sort data=Work.fic38_fic1 ; by date ; run ;
However, for some unknown reason, the headers get truncated to:
BAL_RT,ET-CAP,EXT_EA16,SEXL-SA;BAL_RT,ET-CAP,EXT_EA16,IBON-SA;BAL_RT,ET-CAP,EXT_EA16,TARO-SA
I read on the internet that people were talking about the LRECL option.
Does it make sense to anyone?
Any help will be appreciated.
Cheers.
It sounds like the issue is actually that you have 34-50 character wide variable names. SAS has a maximum of 32 characters for variable names, so you will not be able to use the entire header in the variable name. You may be able to use it as a variable label, but you would likely need to code that yourself if PROC IMPORT isn't going to do it for you; you could take the generated DATA step code out of the log and add the label text by hand if you like.
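If you do go the label route, a minimal sketch might look like this (the shortened variable name is only a guess at what PROC IMPORT would generate; check your log or PROC CONTENTS for the real names, and take the label text from your own header row):
proc datasets library=work nolist;
    modify fic38_fic1;
    /* re-attach the full header text as a variable label on the truncated name */
    label BAL_RT_ET_CAP_EXT_EA16_SEXL_SA = "BAL_RT,ET-CAP,EXT_EA16,SEXL-SA,UK";
quit;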