Truncated headers with SAS during proc import

I have long column headers like the ones below (this is only a sample; the real file has 2000 columns, each with its own header).
Each column is separated by a semi-colon.
BAL_RT,ET-CAP,EXT_EA16,SEXL-SA,UK;BAL_RT,ET-CAP,EXT_EA16,IBON-SA,TA;BAL_RT,ET-CAP,EXT_EA16,TARO-SA,XR
1;7.2;3
35;8;0.99
I'm using the following code in SAS to do the import:
options macrogen symbolgen ;
PROC IMPORT OUT= Work.fic38_fic1
DATAFILE= "C:\cygwin\home\appEuro\pot\fic38.csv"
DBMS=DLM REPLACE;
DELIMITER='3B'x;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=32767;
RUN;
proc sort data=Work.fic38_fic1 ; by date ; run ;
However, for some unknown reason, the headers get truncated:
BAL_RT,ET-CAP,EXT_EA16,SEXL-SA;BAL_RT,ET-CAP,EXT_EA16,IBON-SA;BAL_RT,ET-CAP,EXT_EA16,TARO-SA
I searched online and found people mentioning the LRECL option.
Does it make sense to anyone?
Any help will be appreciated.
Cheers.

It sounds like the issue is actually that you have 34-50 character wide variable names. SAS has a maximum of 32 characters for variable names, so you will not be able to use the entire length in the variable name. You may be able to use it as a variable label, but likely you would need to code that yourself if PROC IMPORT isn't going to do it for you. You could take the code out of the log and use that code with the additional text added by hand if you like.
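If you go the label route, a minimal sketch could look like the following. It assumes the file was imported with GETNAMES=NO and DATAROW=2, so that PROC IMPORT names the variables VAR1, VAR2, ... in column order; that setup is my assumption, not something shown in the question. It only labels the three sample columns:

```sas
/* Sketch only: assumes work.fic38_fic1 was imported with GETNAMES=NO, */
/* so the columns are named VAR1, VAR2, VAR3 in file order.            */
proc datasets library=work nolist;
  modify fic38_fic1;
  label VAR1 = "BAL_RT,ET-CAP,EXT_EA16,SEXL-SA,UK"
        VAR2 = "BAL_RT,ET-CAP,EXT_EA16,IBON-SA,TA"
        VAR3 = "BAL_RT,ET-CAP,EXT_EA16,TARO-SA,XR";
quit;
```

Labels can be up to 256 characters, so the full header text fits, and PROC DATASETS changes the labels in place without rewriting the data.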

Related

Proc Import always changes last column name to VARx

I am trying to import a csv using proc import.
proc import datafile='/SourceFiles/UserTable.csv'
out=UserTable dbms=csv replace;
getnames=yes;
run;
The column names are captured correctly except for the last one, which always changes to VARx. For testing purposes, I even changed my dataset to have one column and one value, so that it looks like this:
USER
Johnson
But USER changes to Var1 as well. I'm pretty sure I'm not violating any naming conventions.
Anybody have any ideas?
Try this; it worked for me using SAS 9.3:
proc import datafile="C:\Users\OldSalt\Desktop\test.csv"
out=mydata
dbms=csv
replace;
getnames=yes;
run;
So, this looks the same as your script. Mine runs just fine on a PC. It looks like you are running on a UNIX box. Check your input file; it might be corrupted.
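One thing worth checking on a UNIX host, assuming the CSV was created on Windows: the lines may end in CR+LF, in which case the stray carriage return becomes part of the last column name, which is then no longer a valid SAS name. A hedged sketch that tells SAS to expect Windows line endings (the fileref name src is made up for illustration):

```sas
/* Assumption: the file has Windows (CRLF) line endings. */
filename src '/SourceFiles/UserTable.csv' termstr=crlf;

proc import datafile=src
  out=UserTable dbms=csv replace;
  getnames=yes;
run;
```

If the last column name comes through correctly with this, the line endings were the problem.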

how to overcome missing values when importing date from excel to sas 9.4

I am trying to import an Excel sheet into SAS 9.4. Interestingly, I can import most of the data without any problem. But for the date column, there are lots of seemingly random missing values (there are no missing dates in my Excel file). Can anyone tell me how to improve my code, please?
proc import out= sheet
datafile = 'D:\Date.xlsx'
dbms = excelcs replace;
sheet = "abc" ;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=NO;
run;
All dates look like this: 21/06/2010, 22/06/2010.
Change your DBMS to XLSX and USEDATE to No. Then you'll import the field as a text field.
You can then use an input() function to create a new date variable.
Not ideal, but easily accomplished.
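A minimal sketch of that conversion step, assuming the imported text column is called date_txt (a hypothetical name; use whatever name your import produced) and holds values like 21/06/2010:

```sas
data sheet_dates;
  set sheet;
  /* DDMMYY10. reads day/month/year text such as 21/06/2010 */
  date_num = input(date_txt, ddmmyy10.);
  format date_num ddmmyy10.;
run;
```

The new variable date_num is a numeric SAS date; the FORMAT statement only controls how it is displayed.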
More than likely, your problem is that the automatic conversion is considering those mm/dd/yyyy, but of course they are actually dd/mm/yyyy.
One possible solution is to use the SASDATEFMT option:
proc import file="myfile.xlsx" out=dataset dbms=excel replace;
dbdsopts="sasdatefmt=(varname=DDMMYY10.)";
run;
That sets the SAS format, but is also alleged by the documentation to affect the informat used to convert it.
It's also possible, though, that your data is actually mixed character/numeric (as it would be if the values were entered by hand into an Excel that was expecting mm/dd/yy while they were really dd/mm/yy). In that case, the simplest answer is either to change your registry to tell Microsoft to scan the whole column (there is a SAS tech support note on this), or to convert all of the values to character (or at least the first couple) and then add a mixed=yes; line to your proc import step.
(The registry setting may not have an effect if you're using PC Files Server, which you may be, given the DBMS=EXCELCS above. In that case, ignore that particular suggestion.)

IGNORE DATA IN SAS IMPORT FROM EXCEL

I have no working knowledge of SAS, but I have an excel file that I need to import and work with. In the excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!
Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
datafile = 'C:\...\file.xlsx'
dbms = xlsx
out = xldata
replace;
mixed = YES;
getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement.
REPLACE overwrites the output dataset if it already exists, for example from an earlier run of this step.
MIXED=YES tells SAS that the data may be of mixed types.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
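To illustrate the two-level name mentioned above, here is a small sketch that writes the imported data to a permanent library. The folder C:\sasdata is a hypothetical location chosen for the example, and the Excel path is left elided as in the code above:

```sas
/* Hypothetical folder; point the libref at a directory that exists. */
libname mylib 'C:\sasdata';

proc import
  datafile = 'C:\...\file.xlsx'   /* path elided, as above */
  dbms = xlsx
  out = mylib.xldata              /* two-level name: permanent dataset */
  replace;
  getnames = YES;
run;
```

After this runs, mylib.xldata survives the end of the SAS session, unlike a dataset in WORK.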
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
set xldata;
where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
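If you don't want to type out every column name, one variant (a sketch, assuming you want to check all variables in the dataset) is to use the _ALL_ variable list. Variable lists with OF are not accepted in a WHERE statement, so this version uses a subsetting IF instead:

```sas
data xldata_complete;
  set xldata;
  /* keep the row only if no variable, numeric or character, is missing */
  if cmiss(of _all_) = 0;
run;
```

The output name xldata_complete is just an illustrative choice; writing to a new dataset keeps the original import intact while you experiment.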
When in doubt, try browsing the SAS online documentation. It's very thorough.
If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit), you can import the Excel file with this code. The WHERE= dataset option lets you select which rows to keep; in this example you keep the rows where the column "name" is not blank:
proc import
DATAFILE="your file.xlsx"
DBMS=XLSX
OUT=resulttabel(where=(name ne ""))
REPLACE;
MIXED=YES;
RUN;

How to start importing from second line in sas?

I tried to use the DATAROW option:
proc import
datafile = "C:....\Book1.xlsx"
out=test
dbms=excel replace
;
RANGE="'test$'";
datarow=2; /* this line throws an error Statement is not valid or it is used out of proper order */
getnames=no ;
run;
When I use the above code without DATAROW, the file gets imported fine. How can I start importing from the second line? I also tried STARTROW (I have SAS 9.2).
When you use getnames=no the default behaviour is to read from line 1. That's why you're getting your error.
Try setting getnames=yes which reads from line 2 by default.
If you are using a custom range, as you are in that code, you cannot prevent it from taking the first row (unless that row is used for the variable names, which you rejected with GETNAMES=NO). You need to re-create the range so that it does not include the first row.
If they don't harm the import (ie, they don't cause the data types to change), you could always eliminate the first row via subsetting the dataset (how depends on the circumstances of the data).
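In the simplest case, where the unwanted row is just the first observation of the imported dataset, a sketch of that subsetting could be:

```sas
data test_trimmed;
  /* FIRSTOBS=2 starts reading at the second observation */
  set test(firstobs=2);
run;
```

The name test_trimmed is made up for the example; test is the dataset created by the PROC IMPORT step in the question.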
Below is what worked for me; the data can be either xlsx or xlsm (at least, that is what I tested):
PROC IMPORT OUT=Out_Table
DATAFILE=Import_File
DBMS=EXCEL REPLACE;
SHEET="Sheetname$";
RANGE="A2:S";
GETNAMES=YES;
MIXED=YES;
run;

Import data from European Social Survey

I need to import data from European Social Survey databank to SAS.
I'm not very good at using SAS so I just naively tried importing the text file one gets but it stores it all in one variable.
Can someone maybe help me with what to do? Since there doesn't seem to be a guide on their webpage I reckon it has to be pretty easy.
It's free to register (and takes 5 secs) and I need all possible data for Denmark.
Edit: When downloading what they call a SAS file, what I get is a huge proc format and the same text file as one gets by choosing text.
The data in the text file isn't comma separated and the first row does not contain variable names.
Download it in SAS format. Save the text file in a location you can remember, and open the SAS file. It's not just one big proc format; it's a big proc format followed by a datastep with input code. It was probably created by SPSS (it fits the pattern of an SPSS saved .sas file anyhow). Look for:
DATA OUT.ESS1_4e01_0_F1;
Or something like that (that's what it is when I downloaded it). It's probably about 3/4 of the way down the page. You just need to change the code:
INFILE 'ESS1_4e01_0_F1.txt';
or similar, to be the directory you placed the text file in. Create a LIBNAME for OUT that goes to wherever you want to permanently save this, and do that at the start of the .sas file, replacing the top 3 lines like so.
Originally:
LIBNAME LIBRARY '';
LIBNAME OUT '';
PROC FORMAT LIBRARY=LIBRARY ;
Change these to:
libname out "c:\mystuff\"; *but probably not c:\mystuff :);
options fmtsearch=(out);
proc format lib=out;
Then run the entire thing.
This is the best solution if you want the formatted values (value labels) and variable labels. If you don't care about that, then it might be easier to deal with the CSV like Bob shows.
But the website says you can download the data in SAS format, so why don't you?
You need a delimiter if all goes into one column.
data temp;
length ...;
infile 'file.csv' dlm=',';
input ...;
run;
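Filled in, the skeleton above might look like the sketch below. The three variables (country, respondent_id, age) are invented purely for illustration and are not the actual ESS columns; it also assumes a comma-separated file whose first line is a header:

```sas
data temp;
  /* hypothetical columns: declare lengths before reading */
  length country $2 respondent_id 8 age 8;
  /* DSD handles quoted values and empty fields; FIRSTOBS=2 skips the header */
  infile 'file.csv' dlm=',' dsd firstobs=2 truncover;
  input country $ respondent_id age;
run;
```

The order of variables on the INPUT statement has to match the column order in the file.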
As Dirk says, the web site says you can download a SAS dataset directly. However, if there's some reason you don't want to do that, choose a comma separated file (CSV) and use PROC IMPORT. Here is an example:
proc import out=some_data
datafile='c:\path\somedata.csv'
dbms=csv replace;
getnames=yes;
run;
Of course, this assumes the first row contains column names.