IGNORE DATA IN SAS IMPORT FROM EXCEL - sas

I have no working knowledge of SAS, but I have an excel file that I need to import and work with. In the excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!

Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
    datafile = 'C:\...\file.xlsx'
    dbms = xlsx
    out = xldata
    replace;
    getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement.
REPLACE overwrites the output dataset if it already exists, so you can rerun the step.
MIXED=YES tells SAS that a column may hold mixed types, but it is only accepted with DBMS=EXCEL, so it is left out here; add it back if you switch to DBMS=EXCEL.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
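As a minimal sketch of the two-level name described above: the folder C:\sasdata is a hypothetical location (any existing directory will do), mylib is just an illustrative libref, and the Excel path stays elided as in the original example.

```sas
/* Hypothetical folder: point the libref at any directory that already exists */
libname mylib 'C:\sasdata';

proc import
    datafile = 'C:\...\file.xlsx'   /* path elided, as in the example above */
    dbms = xlsx
    out = mylib.xldata              /* two-level name: the dataset is stored in mylib */
    replace;
    getnames = YES;
run;
```

After this runs, mylib.xldata survives the end of the session; a later session only needs the same LIBNAME statement to read it again.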
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
set xldata;
where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
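To see the filter in action without an Excel file, here is a small self-contained sketch. The dataset and variable names (demo, name, height, weight) are made up for illustration. The second step shows one of the "fancier" variants, a subsetting IF with the _ALL_ variable list, which saves typing the column names.

```sas
/* Four made-up observations; two of them have a missing value */
data demo;
  infile datalines dsd truncover;
  input name :$8. height weight;
  datalines;
Ann,150,40
Bob,,55
,160,50
Cy,170,60
;
run;

/* Keep only the observations with no missing values in the listed variables */
data complete;
  set demo;
  where cmiss(name, height, weight) = 0;
run;

/* Alternative: check every variable without listing them.
   "of _all_" is allowed in a subsetting IF, but not in a WHERE statement. */
data complete_all;
  set demo;
  if cmiss(of _all_) = 0;
run;
```

Both output datasets contain only the Ann and Cy rows.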
When in doubt, try browsing the SAS online documentation. It's very thorough.

If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit), you can import the Excel file with this code. The WHERE= dataset option lets you select which rows to keep. In this example, you keep the rows where the column "name" is not blank:
proc import
    DATAFILE="your file.xlsx"
    DBMS=XLSX
    OUT=resulttabel(where=(name ne ""))
    REPLACE;
RUN;
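The licensing hint above can be checked with a one-line step. This is only a sketch of how to look; the exact product wording in the log may differ between releases.

```sas
/* Writes the list of licensed products and their expiration dates to the log */
proc setinit;
run;
```

Look in the log for a line mentioning SAS/ACCESS Interface to PC Files.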

Related

SAS Enterprise Guide - Export Temporary "WORK" table Into Multiple Worksheets in Same Workbook Based on Criteria

I've searched and searched, but I can't seem to find a solution.
I'm on 8.2 Update 4 (8.2.4.1261) (32-bit).
Can someone help me export a table into an Excel file with multiple sheets?
Example: Table contains a list of USA professional sports teams (with their city/nickname and league they are in). Each tab would be separated by league (NBA, MLB, NFL, NHL, MLS). How would I go about this?
I'd also like to add the current date to the output Excel file name. From what I've read, I can create a variable with the sysdate. Just not sure how to incorporate it in the code.
Each sheet name is to be based on a group.
In SAS, the BY statement specifies the variable(s) that form a group; in your case, it is the single variable league.
The BY statement makes special tokens available during output: #BYVAR<n> and #BYVAL<n>.
You want the value of league to become the sheet name during output, so use #BYVAL1.
ODS EXCEL sheet name construction is specified in the OPTIONS option. You can use the BY token in the option.
You want ODS EXCEL ... OPTIONS (sheet_name="#BYVAL1") ...;
A BY statement in output procedures produces an extra line <BYVAR>=<BYVAL> which would be noise in your Excel output.
Prevent the noise with OPTIONS NOBYLINE;
Example:
ods excel file='output.xlsx' options(sheet_name="#BYVAL1");
options nobyline;
proc print noobs data=sashelp.cars;
by make;
run;
ods excel close;
This produces a workbook with one worksheet for each value of make.
More
If you want the BY variable to appear in the PROC PRINT output, use a VAR _ALL_; statement, or list the wanted columns explicitly.
Adding the BY variable to the output will make things easier if you need to import and combine data that was edited in Excel, or if you are making various charts and such. In reality you might be better off producing a single worksheet with all the data and then using autofilter (which is an ODS EXCEL option) and Excel pivot tables/charts if you are doing other work in Excel. (Proc TABULATE and REPORT can do a very large subset of what pivot tables can do.)
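The question also asked how to put the current date in the file name. One possible sketch, using the TODAY() function through %SYSFUNC rather than the &SYSDATE variable mentioned in the question; the macro variable name stamp and the output_ prefix are arbitrary choices.

```sas
/* Build a date stamp in yyyymmdd form from today's date */
%let stamp = %sysfunc(today(), yymmddn8.);

options nobyline;

/* Double quotes so that &stamp resolves; the first dot ends the macro
   variable name and the second dot is the one before the extension */
ods excel file="output_&stamp..xlsx" options(sheet_name="#BYVAL1");

proc print noobs data=sashelp.cars;
  by make;
  var _all_;   /* list all variables so the BY variable appears as a column */
run;

ods excel close;
options byline;   /* restore the default BY line for later steps */
```

For your own table, replace sashelp.cars with the table name and make with league, and make sure the table is sorted by league first.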

Assigning Headers from one file to multiple data files

I have a list of ~100 files. The first file contains header information for the other 98 data files. The information should be in table format, however each table is a different size (with regards to column and row number).
My goal is to import these files such that the column headers from the first file are correctly assigned.
Additional information:
I am told this list of files was generated using SAS (however, I am not familiar with the file format). Furthermore, the "CIMPORT" command does not work on these files.
The files are "|" delimited.
Thank you very much for any help.
This was a fun issue. I came up with the following approach:
First, let's load up some data.
proc import datafile = "\\Datadrive\mydata.csv"
out=w_headers;
delimiter=";";
guessingrows=32767;
run;
proc import datafile = "\\Datadrive\no_headers.csv"
out=no_headers;
delimiter=";";
getnames=no; /* no header row, so the columns are named VAR1, VAR2, ... */
guessingrows=32767;
run;
Then I extract the names of the columns and variable number to a dataset.
proc contents data=w_headers out=meta(keep=NAME VARNUM) noprint ; run ;
Then I build rename commands that give the unnamed columns (VAR1, VAR2, ...) the proper names taken from the first dataset.
data meta;
set meta;
cmd = cats('VAR',VARNUM,'=', name);
run;
Here comes the kicker: I put the commands into a macro variable, which is then fed to PROC DATASETS to rename the columns.
proc sql noprint;
select cmd into :cmd_list separated by ' ' from meta;
quit;
proc datasets library = work nolist;
modify no_headers;
rename &cmd_list;
quit;
At this point my two datasets have identical column names. The method is a bit tricky, but it works. I'm sure there is another way, but this was a fun one. :)
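For anyone who wants to try the renaming trick without the files, here is a self-contained sketch. It swaps PROC CONTENTS for a query on DICTIONARY.COLUMNS (same idea, one step fewer), and the two tiny datasets and their column names are invented for the demonstration.

```sas
/* Stand-in for the file that has headers */
data w_headers;
  id = 1; city = 'Oslo'; score = 9.5;
run;

/* Stand-in for a file imported with GETNAMES=NO */
data no_headers;
  VAR1 = 2; VAR2 = 'Rome'; VAR3 = 7.1;
run;

/* Build one old=new pair per column, keyed on the column position */
proc sql noprint;
  select cats('VAR', varnum, '=', name)
    into :cmd_list separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'W_HEADERS';
quit;

/* Show the generated rename list in the log */
%put &=cmd_list;

proc datasets library = work nolist;
  modify no_headers;
  rename &cmd_list;
quit;
```

The %PUT line is worth keeping while testing: if the generated pairs look wrong, the RENAME statement will fail in ways that are harder to read.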

how to overcome missing values when importing date from excel to sas 9.4

I am trying to import an excel sheet into sas9.4. Interestingly, I can import most of the data without any problem. But for the date, there are lots of random missing values (there is no missing date in my excel file). Can anyone tell me how to improve my code please.
proc import out= sheet
datafile = 'D:\Date.xlsx'
dbms = excelcs replace;
sheet = "abc" ;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=NO;
run;
All dates look like this: 21/06/2010, 22/06/2010.
Change your DBMS to XLSX and USEDATE to No. Then you'll import the field as a text field.
You can then use an input() function to create a new date variable.
Not ideal, but easily accomplished.
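A minimal sketch of that conversion step, using made-up data in place of the imported sheet; date_txt is a hypothetical name for the column that arrives as text.

```sas
/* Stand-in for the imported sheet: dates held as dd/mm/yyyy text */
data sheet_txt;
  input date_txt :$10.;
  datalines;
21/06/2010
22/06/2010
;
run;

data sheet_dates;
  set sheet_txt;
  /* Read the text with a day-first informat to get a numeric SAS date */
  date = input(date_txt, ddmmyy10.);
  /* Display the numeric date in the same dd/mm/yyyy layout */
  format date ddmmyy10.;
run;
```

Any value that does not parse as a day-first date comes out missing with a note in the log, which makes stray entries easy to spot.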
More than likely, your problem is that the automatic conversion is considering those mm/dd/yyyy, but of course they are actually dd/mm/yyyy.
One possible solution is to use the SASDATEFMT option, documented here:
proc import file="myfile.xlsx" out=dataset dbms=excel replace;
dbdsopts="sasdatefmt=(varname=DDMMYY10.)";
run;
That sets the SAS format, but is also alleged by the documentation to affect the informat used to convert it.
It's also possible, though, that your data is actually mixed character/numeric (as it would be if the values were entered by hand as dd/mm/yy into an Excel that was expecting mm/dd/yy). In that case, the simplest answer is either to change your registry to tell Microsoft to scan the whole column (see this SAS tech support note, for example), or to simply convert all of the values to character (or at least the first couple), and then add a mixed=yes; line to your proc import step.
(The registry setting may not have an effect if you're using PC Files Server, which you may be, given the excelcs dbms above. In that case, ignore that particular suggestion.)

How to create "standardized" Excel workbooks using SAS

I have a "wide" SAS data set that must be exported into a new Excel workbook every week. I want to preserve the column widths and other Excel attributes every week, but I'm having problems getting it to work. Here's what I'm attempting.
I used PROC EXPORT to create a new workbook (using sheet="New_TACs").
I manually adjusted the column widths and other sheet attributes (like "filters", column widths, wrap, alignment, and "freeze panes").
I deleted all the data rows (leaving the first row with the column names) and saved it as a new workbook named "template.xlsx".
Using a SAS system call, I copy "template.xlsx" to "this_week.xlsx".
I use PROC EXPORT again to try and update the new workbook, but I get warnings. The result contains a sheet named "New_TACS1".
Here is the SAS log:
720 proc export data=new_tacs
721 outfile="\\server-path\this_week.xlsx"
722 replace;
723 sheet='New_TACs';
724 run;
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: File _IMEX_.New_TACs.DATA does not exist.
WARNING: Table _IMEX_."New_TACs" has not been dropped.
NOTE: "New_TACs" range/sheet was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
real time 23.88 seconds
cpu time 1.80 seconds
I'm at a loss as to what to do and would appreciate any ideas or suggestions.
I think the issue is that with zero rows, SAS isn't properly dealing with the data. I can't get PROC EXPORT to work at all, but with a single dummy row I can at least get it to behave with libname and PROC APPEND. I wouldn't be surprised if the filters are in part responsible for this.
After creating the blank excel file with the SASHELP.CLASS columns, adding a filter, adding one row of dummy data, and saving/closing, I do: (SCANTEXT=NO is mandatory here for update access)
libname newtac "c:\temp\test.xlsx" scantext=no getnames=yes;
proc append base=newtac.'New_TACs$_xlnm#_FilterDatabase'n data=sashelp.class force;
run;
libname newtac clear;
That gets close, at least. I'm getting some blank rows for some reason, perhaps due to other things I did in looking at this.
Your best solution may well be to wait for 9.4 TS1M0 and ODS EXCEL, which will let you do all these things from SAS directly; or to use DDE.
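For reference, here is roughly what the ODS EXCEL route looks like once it is available in your SAS release. It is a sketch rather than a drop-in replacement for the template approach: the output path and the width values are arbitrary assumptions, and sashelp.class stands in for the real data.

```sas
/* Hypothetical output path; widths are illustrative, one per column */
ods excel file='C:\temp\this_week.xlsx'
    options(sheet_name='New_TACs'
            autofilter='all'                        /* filter buttons on every column */
            frozen_headers='on'                     /* keep the header row visible when scrolling */
            absolute_column_width='12,6,6,8,8');    /* fixed column widths */

proc print data=sashelp.class noobs;
run;

ods excel close;
```

Because the formatting is declared in the code, the weekly run produces the same layout every time without a template workbook.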
I would recommend checking out SaviCells. http://www.sascommunity.org/wiki/SaviCells. It provides much better SAS to Excel functionality, including creating a template with all your Excel formatting and using that with new data.
Use DDE in SAS to achieve this.
You can create your excel template the way you want it to appear.
Using DDE you would then:
Open Excel
Open the excel file you want to use as the template
Populate it with the updated data
Save the file as a new filename
It's a bit of an antiquated technology but it gets the job done.
Googling for SAS and DDE will find you plenty of code examples and tutorials.

Import data from European Social Survey

I need to import data from European Social Survey databank to SAS.
I'm not very good at using SAS so I just naively tried importing the text file one gets but it stores it all in one variable.
Can someone maybe help me with what to do? Since there doesn't seem to be a guide on their webpage I reckon it has to be pretty easy.
It's free to register (and takes 5 secs) and I need all possible data for Denmark.
Edit: When downloading what they call a SAS file, what I get is a huge proc format and the same text file as one gets by choosing text.
The data in the text file isn't comma separated and the first row does not contain variable names.
Download it in SAS format. Save the text file in a location you can remember, and open the SAS file. It's not just one big proc format; it's a big proc format followed by a datastep with input code. It was probably created by SPSS (it fits the pattern of an SPSS saved .sas file anyhow). Look for:
DATA OUT.ESS1_4e01_0_F1;
Or something like that (that's what it is when I downloaded it). It's probably about 3/4 of the way down the page. You just need to change the code:
INFILE 'ESS1_4e01_0_F1.txt';
or similar, to be the directory you placed the text file in. Create a LIBNAME for OUT that goes to wherever you want to permanently save this, and do that at the start of the .sas file, replacing the top 3 lines like so.
Originally:
LIBNAME LIBRARY '';
LIBNAME OUT '';
PROC FORMAT LIBRARY=LIBRARY ;
Change these to:
libname out "c:\mystuff\"; *but probably not c:\mystuff :);
options fmtsearch=(out);
proc format lib=out;
Then run the entire thing.
This is the best solution if you want the formatted values (value labels) and variable labels. If you don't care about that, then it might be easier to deal with the CSV like Bob shows.
But the website says you can download SAS format, so why don't you?
You need a delimiter if all goes into one column.
data temp;
length ...;
infile 'file.csv' dlm=',';
input ...;
run;
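To make the skeleton above concrete, here is a filled-in sketch that reads comma-separated values from inline data. The variable names (country, idno, happy) and the two rows are invented for illustration and are not the actual ESS layout.

```sas
data temp;
  /* Declare lengths first so character values are not truncated */
  length country $2 idno 8 happy 8;
  /* dlm=',' sets the delimiter; dsd treats two adjacent commas as a missing value */
  infile datalines dlm=',' dsd truncover;
  input country $ idno happy;
  datalines;
DK,1001,8
DK,1002,9
;
run;
```

To read a real file, replace datalines in the INFILE statement with the quoted file path and remove the DATALINES block.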
As Dirk says, the web site says you can download a SAS dataset directly. However, if there's some reason you don't want to do that, choose a comma separated file (CSV) and use PROC IMPORT. Here is an example:
proc import out=some_data
datafile='c:\path\somedata.csv'
dbms=csv replace;
getnames=yes;
run;
Of course, this assumes the first row contains column names.