How can I import this file, in SAS, without error? - sas

I'm trying to import a file, with this code I receive the error message:
The table "WORK.HOTDOGS" cannot be opened because it does not contain
any columns.
Here is the code:
/* Import file, Print Data Table, Plot bar chart of Type vs Type Frequency */
DATA HotDogs;
INFILE '/folders/myfolders/hot_dogs.sas7bdat';
RUN;
PROC PRINT DATA=HotDogs;
RUN;
Thanks for your time.

You need to specify a library that refers to your SAS data set and/or use a SET statement. Since it looks like you're using SAS UE here's a link to their video tutorials on youtube.
https://www.youtube.com/playlist?list=PLVBcK_IpFVi9cajJtRel2uBLbtcLz-WIN
libname mydata '/folders/myfolders/';
data hotdogs;
set mydata.hot_dogs;
run;
OR
data hotdogs;
set '/folder/myfolders/hot_dogs.sas7bdat';
run;

I had the same error. It needs to be imported as follows:
data work.hotdogs;
INFILE '/folders/myfolders/hot_dogs.sas7bdat';
input make $ Model;
run;

I had the same problem in the SAS UE, solved it by deleting the dataset and re-uploading it. Somehow worked.

Related

Import xlsx file into SAS from static url link

I am new to SAS but used to working in python.
In python, I can read an xlsx file from a static Box.com link by calling pandas read excel function on the Box link.
pd.read_excel("https://box.com/shared/static/<url>.xlsx")
I'm hoping to do something similar in SAS. When I try
PROC IMPORT DATAFILE="https://box.com/shared/static/<url>.xlsx"
OUT=WORK.MYEXCEL
DBMS=XLSX
REPLACE;
RUN;
SAS tries to look for a document with the name "https://box.com/shared/static/.xlsx".
When I try
filename xlsxFile http "https://box.com/shared/static/<url>.xlsx";
PROC IMPORT DATAFILE=xlsxFile
OUT=WORK.MYEXCEL
DBMS=XLSX
REPLACE;
RUN;
I get
ERROR: This "filename URL" access method is not supported by "proc import". Please copy the file to local disk before running the
procedure.
Is there an easy way to have SAS access files from this type of URL? I've checked these few threads:
https://communities.sas.com/t5/General-SAS-Programming/import-excel-file-from-the-web/td-p/134158
https://communities.sas.com/t5/SAS-Programming/Importing-XLSX-from-URL-Issue/td-p/446758
https://communities.sas.com/t5/SAS-Programming/proc-import-xlsx-from-url/td-p/635834
And it seems like I may need to do some work to get SAS to create a file object from the URL but I don't quite understand how the code is working and when I naively try something similar:
filename xlsxFile http "https://box.com/shared/static/<url>.xlsx";
data file;
n=-1;
infile xlsxFile recfm=s nbyte=n length=len;
input;
file "file_name.xlsx" recfm=n;
put _infile_ $varying32767. len;
run;
PROC IMPORT OUT= input DATAFILE= "file_name.xlsx"
DBMS=xls REPLACE;
SHEET="sheet_name";
GETNAMES=YES;
RUN;
I get the following error:
Spreadsheet isn't from Excel V5 or later. Please open it in Excel and Save as V5 or later
Requested Input File Is Invalid
ERROR: Import unsuccessful. See SAS Log for details.
Any help would be appreciated. The reason I'd like to do it this way is because the files at the static links are updated daily and I want to avoid having to copy files to the SAS server every day. So if this won't work I am also interested in other work arounds that would accomplish the same thing.
I can obviously write a script that will fetch the updated files and write them to the server where SAS can access them as needed, but wanted to see if I could get this to work first.
Using the Box API from within SAS would also be another option if anyone has successfully gotten that to work. As I mentioned I am new to SAS so trying to access the API seemed like it would be too difficult at the moment.
Thank you!
You'll definitely need to download the file. As Tom notes in comments, make sure to check this is really an xlsx file; but assuming it is, some suggestions for dealing with it.
Copying it over by hand like you're doing is doable, and I'll put some code that works at the bottom of the answer, but it's way overkill nowadays - you're probably reading decade-old papers from before the modern day of SAS. In particular, you should use PROC HTTP to GET the file, rather than using the URL directly - it's just much easier that way.
Another possibility is you could use python for this! Regardless of your SAS setup, you can use python to script SAS, or use SAS to run Python, very easily nowadays. SASPy project (on github) for SAS 9, or the SAS SWAT package (also on github I believe) for connecting to SAS Viya, lets you very easily talk back and forth with Python and SAS, even in production - I do it all the time. And on top of that, you can even write SAS user-written functions in python now - so there're a bunch of ways to get your Python into SAS, if you want. I am a SAS (primary) developer, but I use Python for web-related stuff, since it's just better at it. See my paper for more details, or lots of other resources on the subject.
Assuming you just want to import the file and don't want to keep track of the excel file ultimately, you can just read it from a temp filename, which will clean itself up afterwards:
filename _httpin temp;
proc http method="get"
url="https://github.com/snoopy369/SASL/raw/master/Excel%20Precision.xlsx"
out=_httpin;
run;
proc import file=_httpin out=test dbms=xlsx replace;
run;
If you want the excel file saved somewhere (sounds like you don't, but if you do), then instead of temp assign that httpin filename to a real location.
filename _httpin "c:\temp\whatever.xlsx";
If you really want to do that binary file copy business, do it this way:
filename _httpin temp;
filename _httpout "c:\temp\Excel Precision.xlsx";
proc http method="get"
url="https://github.com/snoopy369/SASL/raw/master/Excel%20Precision.xlsx"
out=_httpin;
run;
data _null_;
_in = fopen('_httpin','I',1,'B');
_out= fopen('_httpout','O',1,'B');
rec='20'x;
do while (Fread(_in) eq 0);
rc = fget(_in,rec,1);
rc = fput(_out,rec);
rc = fwrite(_out);
end;
rc = fclose(_in);
rc = fclose(_out);
run;
proc import file=_httpout out=test dbms=xlsx replace;
run;

Generating new Variable in SAS results in ERROR 180-322

I am very new to SAS, which is why this question has probably a quite easy answer. I use the SAS university edition.
I have dataset containing socio-structural data in 31 variables and 1000000 observations. The data is stored in a Stata .dta file, which is why I used the following code to import in into SAS:
LIBNAME IN '/folders/myfolders/fake_data';
proc import out= fake_2017 datafile = "/folders/myfolders/fake_data/mz_2017.dta" replace;
run;
Now, I want to create new variables. At first, a year variable that takes the value 2017 for all observations. After, I have several other variables that I want to generate from the 31 existing variables. However, running my code I get the same error message for all my steps:
year = 2017;
run;
ERROR 180-322: Statement is not valid or it is used out of proper order.
I found many things online but nothing that'd help me. What am I doing wrong/forgetting? For me, the code looks like in all the SAS tutorial videos that I have already watched.
You cannot have an assignment statement outside of a data step. You used a PROC IMPORT step to create a dataset named fake_2017. So now you need to run a data step to make a new dataset where you can create your new variable. Let's call the new dataset fixed_2017.
data fixed_2017;
set fake_2017;
year=2017;
run;

SAS: how to check if a column, in an imported excel sheet, contains a string?

I have imported a dataset from a an excel sheet, and I want to delete some observations. Say, I have a variable which tells me if a student has passed or not (with strings "Passed" and "Failed"). I want to delete all the students which have failed from the dataset.
I do know that usually I would be able to do so with an if statement. However, I don't know how to access the temporary dataset. Do I have to open after importing it, and then check with an if statement?
This is how I have tried:
proc import datafile="C:\Users\User\Desktop\testresults.xlsx"
DBMS=XLSX;
if Status = "failed" then delete
run;
I know this won't work as the "if" condition only works when the data resides in PDV.
Is it possible to delete after importing instead of while importing?
Use a where clause on the output data set:
proc import file="my.xlsx"
out=work.myxlsx(where=(status^="failed"))
dbms=xlsx
replace;
run;
A where statement would modify the output dataset from PROC IMPORT, as DomPazz shows.
Alternately, you can use a data step.
proc import datafile="C:\Users\User\Desktop\testresults.xlsx" out=have DBMS=XLSX;
run;
data want;
set have;
if Status = "failed" then delete;
run;
That of course would work whether you did it immediately after importing (or in the same submit) or some time later.

IGNORE DATA IN SAS IMPORT FROM EXCEL

I have no working knowledge of SAS, but I have an excel file that I need to import and work with. In the excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!
Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
datafile = 'C:\...\file.xlsx'
dbms = xlsx
out = xldata
replace;
mixed = YES;
getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement.
REPLACE replaces the dataset created the first time you run this step.
MIXED=YES tells SAS that the data may be of mixed types.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
set xldata;
where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
When in doubt, try browsing the SAS online documentation. It's very thorough.
If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit) you can import the Excel file with this code. The where option lets you select which rows you want to keep, in this example you will keep the rows where the column "name" is not blank:
proc import
DATAFILE="your file.xlsx"
DBMS=XLSX
OUT=resulttabel(where=(name ne ""))
REPLACE;
MIXED=YES;
QUIT;

Proc Import from SPSS

Relatively new to SAS and I'm using proc import on an SPSS (.sav) data file and it runs fine but I noticed that it brings in only the SPSS value labels rather than the numeric equivalent. As an example in the Gender column 1='male', 2='female' and in the SAS data set 'male' and 'female' show up rather than 1 or 2.
Any insight would be appreciated. Current code...
proc import datafile = "C:\Data\workload_20130314.sav"
out=library.workload_20130314
dbms = sav
replace;
run;
You probably have the underlying values there, they're probably just formatted. Try opening the dataset and viewing the column properties of one of the columns you're looking at; it probably has a format that's like Q49F. or something that does that. It still works with PROC MEANS or whatever as a numeric variable.
You can run, I think, something like
proc datasets;
modify my_dataset;
format _all_;
quit;
to remove the overlay. You can also do that on a case by case basis.