I have a macro that I use to import Excel files from a Windows directory to SAS (version 9.3) on a Linux server. In general the macro has worked fine, but now I'm trying to import an Excel file with a column that contains mostly numeric data with some character records thrown in.
The variable looks something like this:
Var2
1111111
2222222
3333333
4444444
Multiple
5555555
H6666-01
The variable is getting read in as numeric so I'm losing the data in the fifth and seventh records. I've tried a few of the suggestions listed in this answer, but nothing seems to change the variable type.
Here's a portion of the macro I have:
proc import replace
    out=&d_set
    dbms=excelcs
    file="\\path\to\file\&xlsx_nm";
    sheet="&sheet_nm";
    server="Server";
    port=0000;
    serveruser="&sysget_USER";
    serverpassword="&pw";
    range="&rng";
    DBDSOPTS = "DBTYPE=(Var2='CHAR(8)')";
run;
I just added the statement DBDSOPTS = "DBTYPE=(Var2='CHAR(8)')"; based on the suggestion on the link above, but the output in the log did not change.
I have also tried padding the original Excel file with a "dummy" record (which I'd like to avoid) with character data in the column that I'm having issues with, but this also did not work.
I'd like to solve this in the import procedure but I'm open to other suggestions.
Related
I am trying to import an Excel sheet into SAS 9.4. Interestingly, I can import most of the data without any problem, but for the date column there are lots of random missing values (there are no missing dates in my Excel file). Can anyone tell me how to improve my code, please?
proc import out=sheet
    datafile='D:\Date.xlsx'
    dbms=excelcs replace;
    sheet="abc";
    SCANTEXT=YES;
    USEDATE=YES;
    SCANTIME=NO;
run;
All the dates look like this: 21/06/2010, 22/06/2010.
Change your DBMS to XLSX and USEDATE to No. Then you'll import the field as a text field.
You can then use an input() function to create a new date variable.
Not ideal, but easily accomplished.
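For illustration, a minimal sketch of that conversion step, assuming the column came in as a character variable named date_char (a hypothetical name):

data sheet2;
    set sheet;
    date_sas = input(date_char, ddmmyy10.);   /* reads '21/06/2010' as a SAS date */
    format date_sas date9.;                   /* displays as 21JUN2010 */
run;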
More than likely, your problem is that the automatic conversion is treating those dates as mm/dd/yyyy, when of course they are actually dd/mm/yyyy.
One possible solution is to use the SASDATEFMT option, documented here:
proc import file="myfile.xlsx" out=dataset dbms=excel replace;
dbdsopts="sasdatefmt=(varname=DDMMYY10.)";
run;
That sets the SAS format, but is also alleged by the documentation to affect the informat used to convert it.
It's also possible, though, that your data is actually mixed character/numeric (as it would be if the dates were entered by hand into Excel, in an Excel that expected mm/dd/yy, but were typed as dd/mm/yy). In that case, the simplest answer is either to change your registry setting so that Microsoft scans the whole column (see this SAS tech support note for an example), or to convert all of the values to character (or at least the first couple) and then add a mixed=yes; line to your PROC IMPORT statement.
(The registry setting may not have an effect if you're using PC Files Server, which you may be, given the excelcs DBMS above. In that case, ignore that particular suggestion.)
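For reference, a rough sketch of where that mixed=yes; line would go, shown here with DBMS=EXCEL (whether MIXED is honored depends on the engine you end up using):

proc import datafile='D:\Date.xlsx' out=sheet
    dbms=excel replace;
    sheet="abc";
    mixed=yes;   /* scan mixed-type columns and read them as character */
run;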
I have no working knowledge of SAS, but I have an excel file that I need to import and work with. In the excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!
Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
    datafile = 'C:\...\file.xlsx'
    dbms = xlsx
    out = xldata
    replace;
    mixed = YES;
    getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two-level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement (see the short sketch below this explanation).
REPLACE overwrites the output dataset if it already exists from an earlier run of this step.
MIXED=YES tells SAS that the data may be of mixed types.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
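As a small sketch of that two-level name, with a hypothetical libref and folder:

libname mylib "C:\my_sas_data";            /* hypothetical permanent library */

proc import
    datafile = 'C:\my_sas_data\file.xlsx'  /* hypothetical path */
    dbms = xlsx
    out = mylib.xldata                     /* two-level name: saved in MYLIB, not WORK */
    replace;
    getnames = YES;
run;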
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
    set xldata;
    where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
When in doubt, try browsing the SAS online documentation. It's very thorough.
If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit) you can import the Excel file with this code. The where option lets you select which rows you want to keep, in this example you will keep the rows where the column "name" is not blank:
proc import
    DATAFILE="your file.xlsx"
    DBMS=XLSX
    OUT=resulttabel(where=(name ne ""))
    REPLACE;
    MIXED=YES;
RUN;
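To check the license mentioned above ("hint: proc setinit"), you can run the SETINIT procedure and look for "SAS/ACCESS Interface to PC Files" in the log:

proc setinit;
run;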
I have data inside SAS.
I want to save the dataset in SPSS format (*.sav).
I use the following program:
PROC export Data=SASdataToStoreInSPSS
FILE="Path\Filename_%sysfunc(today(),date9.).sav"
dbms=sav replace;
RUN;
This works great, except that when I open the file in SPSS the dates are strangely formatted.
For example:
156405 08:51:00
Should be
3-Jan-2011 08:51
I can manually change the formats in SPSS, so the values are correct date values; they just aren't automatically displayed in a readable format.
I tried changing the format in SAS to DATETIME20. or DATETIME23.3 before saving, but this does not help.
I want this to work without having to open SPSS and run a Syntax there.
The SPSS files that SAS spits out have to be directly mailed to other users of the data.
I think this is either a bug in SAS's export, or an issue with SPSS where some default changed. What's happening is that SAS is storing it as an SPSS date, but with width 16, which is not long enough to hold the complete datetime. I don't think you can use DBDSOPTS with DBMS=SPSS, so I don't know that there is a good workaround short of importing the file into SPSS.
You could do that automatically, though, using the SPSS Production facility; I've written an import script before and asked SAS to run spssprod with the batch file. That's an irritating workaround, but it might be the easiest, unless SAS Tech Support can help you (and certainly try that - they are usually only a few hours' turnaround for initial contact at least).
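A rough sketch of what that call might look like from SAS on Windows; the paths and the production-job file name are hypothetical, and the exact location of spssprod.exe depends on your SPSS installation:

options noxwait;
/* run a pre-built SPSS production job that applies the desired date formats */
x '"C:\Program Files\SPSS\spssprod.exe" "C:\jobs\fix_dates.spp"';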
SAS mentioned it has to do with the SPSS driver they use. Apparently it is not an easy fix so they forwarded the issue to second-line tech support.
The workaround is to split the dates into two columns: one with the date and one with the time.
data SPSS2;
    set SPSS;
    date = put(datepart(DatumSPSS), date9.);
    time = put(timepart(DatumSPSS), time8.);
run;
Or you can tell the end user how to change the format of the date in SPSS.
For an automated approach, try this .NET app. You need SPSS, but SAS is not required to convert a large collection of SAS files automatically.
The page includes code samples for the manual process as well as an application download.
I have a "wide" SAS data sets that must be exported into a new Excel workbook every week. I want to preserve the column widths and other Excel attributes every week, but I'm having problems getting it to work. Here's what I'm attempting.
I used PROC EXPORT to create a new workbook (using sheet="New_TACs").
I manually adjusted the sheet attributes (like filters, column widths, wrap, alignment, and "freeze panes").
I deleted all the data rows (leaving the first row with the column names) and saved it as a new workbook named "template.xlsx".
Using a SAS system call, I copy "template.xlsx" to "this_week.xlsx".
I use PROC EXPORT again to try and update the new workbook, but I get warnings. The result contains a sheet named "New_TACS1".
Here is the SAS log:
720 proc export data=new_tacs
721 outfile="\\server-path\this_week.xlsx"
722 replace;
723 sheet='New_TACs';
724 run;
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: File _IMEX_.New_TACs.DATA does not exist.
WARNING: Table _IMEX_."New_TACs" has not been dropped.
NOTE: "New_TACs" range/sheet was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
real time 23.88 seconds
cpu time 1.80 seconds
I'm at a loss as to what to do and would appreciate any ideas or suggestions.
I think the issue is that with zero rows, SAS isn't properly dealing with the data. I can't get PROC EXPORT to work at all, but with a single dummy row I can at least get it to behave with libname and PROC APPEND. I wouldn't be surprised if the filters are in part responsible for this.
After creating the blank Excel file with the SASHELP.CLASS columns, adding a filter, adding one row of dummy data, and saving/closing, I do the following (SCANTEXT=NO is mandatory here for update access):
libname newtac "c:\temp\test.xlsx" scantext=no getnames=yes;
proc append base=newtac.'New_TACs$_xlnm#_FilterDatabase'n data=sashelp.class force;
run;
libname newtac clear;
That gets close, at least. I'm getting some blank rows for some reason, perhaps due to other things I did in looking at this.
Your best solution may well be to wait for 9.4 TS1M0 and ODS EXCEL, which will let you do all these things from SAS directly; or to use DDE.
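For illustration, a rough sketch of the ODS EXCEL route once it is available; the destination options shown (SHEET_NAME, FROZEN_HEADERS, AUTOFILTER) are documented ODS EXCEL options, but treat this as a sketch rather than a drop-in replacement for the template workflow:

ods excel file="\\server-path\this_week.xlsx"
    options(sheet_name="New_TACs"
            frozen_headers="on"     /* freeze the header row */
            autofilter="all");      /* add filters across all columns */
proc print data=new_tacs noobs;
run;
ods excel close;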
I would recommend checking out SaviCells. http://www.sascommunity.org/wiki/SaviCells. It provides much better SAS to Excel functionality, including creating a template with all your Excel formatting and using that with new data.
Use DDE in SAS to achieve this.
You can create your excel template the way you want it to appear.
Using DDE you would then:
Open Excel
Open the excel file you want to use as the template
Populate it with the updated data
Save the file as a new filename
It's a bit of an antiquated technology but it gets the job done.
Googling for SAS and DDE will find you plenty of code examples and tutorials.
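A rough sketch of the DDE pattern described above; the Excel path, sheet name, range, and variable names are all hypothetical, and SAS and Excel must be running on the same Windows machine:

options noxwait noxsync;
x '"C:\Program Files\Microsoft Office\Office16\EXCEL.EXE"';
data _null_;
    rc = sleep(5);                            /* give Excel time to start */
run;

filename xlcmd dde 'excel|system';
data _null_;
    file xlcmd;
    put '[OPEN("C:\temp\template.xlsx")]';    /* open the template workbook */
run;

filename xlrng dde 'excel|New_TACs!r2c1:r500c3' notab;   /* hypothetical range */
data _null_;
    set new_tacs;                             /* assuming three hypothetical columns */
    file xlrng;
    put var1 '09'x var2 '09'x var3;           /* tab-separated cells across one row */
run;

data _null_;
    file xlcmd;
    put '[SAVE.AS("C:\temp\this_week.xlsx")]';
    put '[QUIT()]';
run;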
I need to import data from European Social Survey databank to SAS.
I'm not very good at using SAS, so I just naively tried importing the text file one gets, but it stores it all in one variable.
Can someone maybe help me with what to do? Since there doesn't seem to be a guide on their webpage I reckon it has to be pretty easy.
It's free to register (and takes 5 secs) and I need all possible data for Denmark.
Edit: When downloading what they call a SAS file, what I get is a huge proc format and the same text file as one gets by choosing text.
The data in the text file isn't comma separated and the first row does not contain variable names.
Download it in SAS format. Save the text file in a location you can remember, and open the SAS file. It's not just one big proc format; it's a big proc format followed by a datastep with input code. It was probably created by SPSS (it fits the pattern of an SPSS saved .sas file anyhow). Look for:
DATA OUT.ESS1_4e01_0_F1;
Or something like that (that's what it is when I downloaded it). It's probably about 3/4 of the way down the page. You just need to change the code:
INFILE 'ESS1_4e01_0_F1.txt';
or similar, to be the directory you placed the text file in. Create a LIBNAME for OUT that goes to wherever you want to permanently save this, and do that at the start of the .sas file, replacing the top 3 lines like so.
Originally:
LIBNAME LIBRARY '';
LIBNAME OUT '';
PROC FORMAT LIBRARY=LIBRARY ;
Change these to:
libname out "c:\mystuff\"; *but probably not c:\mystuff :);
options fmtsearch=(out);
proc format lib=out;
Then run the entire thing.
This is the best solution if you want the formatted values (value labels) and variable labels. If you don't care about that, then it might be easier to deal with the CSV like Bob shows.
But the website says you can download SAS format, so why don't you?
If everything ends up in one column, you need to specify the delimiter.
data temp;
    length ...;
    infile 'file.csv' dlm=',';
    input ...;
run;
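A filled-in version of that skeleton with hypothetical variable names and lengths, just to show how the pieces fit together:

data temp;
    length var1 $20 var2 8 var3 8;           /* hypothetical variables */
    infile 'file.csv' dlm=',' dsd missover;  /* dsd treats consecutive delimiters as missing */
    input var1 $ var2 var3;
run;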
As Dirk says, the web site says you can download a SAS dataset directly. However, if there's some reason you don't want to do that, choose a comma separated file (CSV) and use PROC IMPORT. Here is an example:
proc import out=some_data
    datafile='c:\path\somedata.csv'
    dbms=csv replace;
    getnames=yes;
run;
Of course, this assumes the first row contains column names.
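Since the question notes that the first row of the text file does not contain variable names, a variant with GETNAMES=NO (SAS will then name the variables VAR1, VAR2, and so on, and read data starting at row 1):

proc import out=some_data
    datafile='c:\path\somedata.csv'
    dbms=csv replace;
    getnames=no;   /* first row is data, not column names */
run;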