How to import a zipped ".sas7bdat" file? - sas

I have a sas7bdat format file, but it's zipped.
I could unzip the file and work on it, but that costs me disk space and time.
So I tried this code in SAS:
filename myfile ZIP 'C:\...\data.zip' member="data.sas7bdat" ;
data yoyo;
infile myfile (data.sas7bdat);
input;
put _infile_;
run;
But I get an empty yoyo table in the WORK library.
How can I successfully import the data.sas7bdat ?
Thank you,

You need to uncompress the dataset before SAS can use it. So you need to find a place that has enough space for the fully expanded file.
Note that your code specifies the member name of the file within the ZIP file twice. You should only do that once: either point the fileref to the aggregate location (the ZIP file) and give the member name in the INFILE reference, or point the fileref to the individual member and use just the fileref.
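The two ways of naming the member can be sketched as follows. This is only an illustration: readme.txt is a hypothetical text member (not something from the question), and the approach reads text lines, so it does not make a .sas7bdat member usable as a dataset.

```sas
/* Option 1: the fileref points at the ZIP file as a whole;   */
/* the member is named in parentheses on the INFILE statement. */
filename myzip zip 'C:\...\data.zip';

data _null_;
  infile myzip(readme.txt) truncover;  /* readme.txt is a hypothetical text member */
  input;
  put _infile_;                        /* echo each line to the log */
run;

/* Option 2: the fileref points at one member;        */
/* the INFILE statement then uses the bare fileref.   */
filename mymem zip 'C:\...\data.zip' member='readme.txt';

data _null_;
  infile mymem truncover;
  input;
  put _infile_;
run;
```

Either form names the member exactly once, which is the point of the note above.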
Here is a method to expand the file into your current WORK folder.
%let member=data.sas7bdat;
filename in zip 'C:\...\data.zip' member="&member" recfm=n;
filename out "%sysfunc(pathname(work))/&member" recfm=n;
data _null_;
rc=fcopy('in','out');
run;
You can now work with the file using the name WORK.DATA.
proc print data=work.data(obs=1); run;
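If the copy fails (wrong member name, not enough space), the step above gives little feedback. A minimal sketch of the same copy with the return code checked, assuming the same filerefs and placeholder path as above; the variable msg and the log wording are illustrative:

```sas
%let member=data.sas7bdat;
filename in zip 'C:\...\data.zip' member="&member" recfm=n;
filename out "%sysfunc(pathname(work))/&member" recfm=n;

data _null_;
  length msg $256;
  rc = fcopy('in', 'out');        /* 0 means the copy succeeded */
  if rc ne 0 then do;
    msg = sysmsg();               /* text of the last system message */
    put 'ERROR: FCOPY failed. ' rc= msg=;
  end;
  else put "NOTE: Copied &member to the WORK folder.";
run;

/* Release the filerefs once the copy is done. */
filename in clear;
filename out clear;
```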
If you want to read data from a ZIP file directly then it either needs to be raw (text) data or in a streaming format, like a SAS V5 XPORT file.

Related

How do I get PC SAS to read a list from a text file and perform an action on each element in the list

I have a PowerShell program that searches a folder on my PC for several text files.  If the file is not in the folder, it writes the filename to another text file.  When the procedure finishes, I have a text file with a list of files (one column) that are missing from the folder. 
Next I would like PC SAS to read the list from the text file and launch the corresponding SAS program that I have already written that retrieves each file from our FTP server. 
I am not sure how to go about having SAS read the filenames and launch my FTP programs.  Any suggestions on how to accomplish this task?
Sounds like the first problem is you need to modify the SAS program to use a dataset with the list of filenames to process. One easy way to do that is to create a macro that downloads one file and takes the filename as an input parameter. So convert your code that downloads a file to a macro.
%macro mf_download(file);
* Code that downloads &FILE from mainframe. ;
%mend ;
Then it is easy to write a program that reads the names from a file and calls the macro for each name in the file. So if the file with the list of names is named filelist.txt, that part of the program might look like this:
data names;
infile 'filelist.txt' truncover ;
input filename :$256. ;
call execute(cats('%nrstr(%mf_download)(',filename,')'));
run;
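To try the pattern without a real download, here is a self-contained sketch. The macro body is a stand-in that only writes to the log, and the file names in the datalines are made up. Wrapping the macro name in %NRSTR keeps CALL EXECUTE from running the macro while the data step is still executing; the calls are queued and run after the step finishes.

```sas
/* Stand-in macro: it only reports what it would download. */
%macro mf_download(file);
  %put NOTE: would download &file;
%mend mf_download;

data _null_;
  input filename :$256.;
  /* Queue one macro call per name read from the list. */
  call execute(cats('%nrstr(%mf_download)(', filename, ')'));
datalines;
missing1.txt
missing2.txt
;
```

Once the log shows one NOTE per name, the stand-in body can be replaced with the real download code.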

Correct Syntax for Filename Statement using a dataset variable

I'm a SAS newbie and I don't seem to be able to work out the correct syntax for this. I have a dataset (fileList) with one field (projFile) which holds a filename. I wish to open the file and read the contents into a second field that will be created. The file is zipped (it's a SAS-EG project file) and so I'm told that I should use Filename statement with the zip option to read the file. However, no matter how I reference projFile it doesn't like it.
data fileList;
set fileList;
filename inzip zip "&projFile" member="project.xml";
infile inzip;
input fileContent $char2000.;
output;
run;
I may also have the input statement wrong, but until I can get past this issue, I don't know. Thanks.
If you are always reading the same file (member) from each ZIP file, you can use the FILEVAR= option on the INFILE statement to switch which ZIP file you are reading that member from.
Suppose I have three ZIP files that each contain a file named example.txt, and a dataset like this with the list of ZIP file names.
data fnames ;
input filename $80.;
cards;
c:\downloads\file1.zip
c:\downloads\file2.zip
c:\downloads\file3.zip
;
Then I can use that dataset to drive the creation of a new dataset that has the information from those files.
data test;
set fnames ;
fname=filename;
infile in zip filevar=fname member='example.txt' end=eof truncover;
do while (not eof);
input line $100. ;
output;
end;
run;
If the driving dataset has the list of members in the ZIP file to read then you can use the MEMVAR= option on the INFILE statement also.
data members ;
infile cards dsd dlm='|' truncover ;
input filename :$80. memname :$80.;
cards;
c:\downloads\file1.zip|example.txt
c:\downloads\file2.zip|example.txt
c:\downloads\file3.zip|example.txt
;
data test;
set members ;
filevar=filename;
memvar=memname;
infile in zip filevar=filevar memvar=memvar end=eof truncover;
do while (not eof);
input line $100. ;
output;
end;
run;
There are a few issues here.
First - you probably shouldn't use data filelist; set filelist; if you're doing something like this. Make a new dataset.
Second - filename is not executable. It is declarative. You can place it inside the data step, but you shouldn't, and for precisely this reason: it makes you think it's doing something inside the data step. It's not. It's doing something, period, and then the data step happens, later (even when placed here).
Third - you aren't using infile properly, but that's really a consequence of Second. You need the filevar option on infile to allow it to do something different here.
Fourth - you probably don't really want to just read arbitrarily from the project.xml. Really this whole thing is probably not what you want to do... I've done what you're doing, and it's doable, but not this way. But that's probably a bigger question.
If this were to work, what you'd do is this:
filename a zip "c:\doesntmatter.egp" member="project.xml";
data files;
length fname $255;
infile datalines truncover;
input #1 fname $255.;
datalines;
c:\myfile.egp
c:\myfile2.egp
c:\myfile3.egp
;;;;
run;
data egp;
set files;
infile a filevar=fname pad truncover;
input #1 first_line $512.;
put first_line;
run;
The filename statement doesn't really do anything, but I show you where it would go. You see the filevar on the infile statement - that points to the fname variable on files. Then it reads in from there.
My general suggestion is that you should probably use the xml libname engine here; figure out what you want to do on a per-xml basis, write that out as a macro, then call the macro for each line in the file name dataset (using call execute probably, or if you must, dosubl). You don't have to use the xml libname engine, but it'll simplify things most likely.
If you're only using one file, then you can specify it directly in the filename statement I showed above, and just use infile with that filename (infile a; here, but please call it something more sensible than a). But again, it's silly to read it in this way - use the libname engine, as it'll parse out the xml for you.
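As a rough sketch of the libname-engine route for a single project file: copy project.xml out of the .egp, then let the XMLV2 engine generate a map with AUTOMAP=. The filerefs inegp and outxml, the libref proj, and the project.map file name are all made up for illustration, and how well an automatically generated map fits the structure of a real project.xml is not something this sketch guarantees.

```sas
/* Copy project.xml out of the .egp (a ZIP archive) into WORK. */
filename inegp zip "c:\myfile.egp" member="project.xml" recfm=n;
filename outxml "%sysfunc(pathname(work))/project.xml" recfm=n;

data _null_;
  length msg $256;
  rc = fcopy('inegp', 'outxml');
  if rc ne 0 then do;
    msg = sysmsg();
    put 'ERROR: could not extract project.xml. ' rc= msg=;
  end;
run;

/* Let the XMLV2 engine build a map and expose the XML as tables. */
libname proj xmlv2 "%sysfunc(pathname(work))/project.xml"
        automap=replace
        xmlmap="%sysfunc(pathname(work))/project.map";

/* List the tables the generated map produced. */
proc datasets lib=proj;
quit;
```

For many files, the same steps could be wrapped in a macro that takes the .egp path as a parameter and called once per row of the file name dataset.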
Edits, to remove incorrect information confirmed by Tom's answer. Even though it does work, I don't recommend using infile here - read it with the libname engine, it'll save you loads of time.

read large txt file stored in sas

I have a large txt file stored in SAS Enterprise Guide (SAS is connected to WinSCP, which is where the txt file is stored). How can I read it and convert it to a SAS dataset?
On the SAS community site I found the code sample below for reading a txt file. Is it the same for reading a txt file stored in SAS?
proc import datafile='path'
out=NAME
dbms=dlm
replace;
datarow=5;
delimiter='09'x;
run;
There is another method I have seen, which uses infile.
Which method should I use in my case?
I have not tried either method yet, because I do not understand the parameters. Should the path be the one in SAS (on the server) or the one in WinSCP?
PROC IMPORT works only on operating system file references that deliver the file directly.
WinSCP is an FTP client, so you have two options:
Use WinSCP to copy the remote file to the local operating system, then you can use IMPORT or DATA step with INFILE
Use filename FTP access method and DATA step that reads data lines retrieved by SAS FTP engine
filename offsite ftp 'remote-filename' user=… pass=… host=… cd=… ;
data gotit;
infile offsite;
input var1 var2 var3 etc … ;
run;
The specific input statement might need informats and pointer control options, all dependent on the data file layout. Other infile options might be needed depending on field delimiters and content.
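As an illustration of those INFILE options and informats, here is a self-contained sketch for a tab-delimited file whose data starts on line 5, mirroring the datarow=5 and delimiter='09'x settings in the question. It writes a small temporary file rather than using FTP so it can run anywhere; the fileref demo, the variable names, and the sample values are invented.

```sas
filename demo temp;

/* Write a small tab-delimited file: 4 header lines, then 2 data lines. */
data _null_;
  file demo;
  put 'header line 1';
  put 'header line 2';
  put 'header line 3';
  put 'header line 4';
  put '1' '09'x 'Alice' '09'x '10.5';
  put '2' '09'x 'Bob' '09'x '7';
run;

data gotit;
  /* dlm='09'x : fields are separated by tabs          */
  /* dsd       : two adjacent tabs mean a missing value */
  /* truncover : short lines do not spill to next line  */
  /* firstobs=5: skip the four header lines             */
  infile demo dlm='09'x dsd truncover firstobs=5;
  input id name :$20. amount;
run;

proc print data=gotit;
run;
```

With the FTP access method, only the FILENAME statement changes; the INFILE options and the INPUT statement stay the same.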

SAS: PROC IMPORT on a FILENAME SFTP fileref

In SAS (9.4, if it matters) I would like to grab a CSV file from a remote host via SFTP, parse the CSV, and drop the result into a SAS data table.
I set up SFTP using PuTTY as described in the SAS docs. Binding a fileref to SFTP works okay, something like:
FILENAME mysftpfileref SFTP 'location/on/host/file.csv' HOST='myhost' USER='mysuser';
DATA _null_;
INFILE mysftpfileref TRUNCOVER;
INPUT a $25.;
RUN;
This successfully reads data.
However, I can't figure out how to use PROC IMPORT to actually parse the data. The docs for that proc state
"The IMPORT procedure does not support device types or access methods
for the FILENAME statement except for DISK. For example, the IMPORT
procedure does not support the TEMP device type, which creates a
temporary external file."
Is there a workaround?
You'll need to either:
Write the import code yourself (using the data step)
Download the file in some fashion and then run PROC IMPORT on the downloaded file
If you choose the second option, you can do this a few ways. The easiest is probably to write something like the above data step, read the entire line in or use the _INFILE_ automatic variable, and then write it out locally. Something along these lines (define these filenames or change them, of course):
data _null_;
infile Sftpfile;
file localf;
input;
put _infile_;
run;
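Putting the second option together, a sketch that copies the remote file to disk and then imports it. It assumes the mysftpfileref fileref from the question is already defined; the local file location, the LRECL value, and the output dataset name work.mydata are arbitrary choices for illustration.

```sas
/* Local DISK file that PROC IMPORT can read. */
filename localf "%sysfunc(pathname(work))/file.csv";

/* Copy the remote file line by line to the local file. */
data _null_;
  infile mysftpfileref lrecl=32767;
  file localf lrecl=32767;
  input;
  put _infile_;
run;

/* Parse the local copy. */
proc import datafile=localf
  out=work.mydata
  dbms=csv replace;
  getnames=yes;
run;
```

The generous LRECL is there so that long CSV lines are not truncated during the copy.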

Import data from European Social Survey

I need to import data from European Social Survey databank to SAS.
I'm not very good at using SAS so I just naively tried importing the text file one gets but it stores it all in one variable.
Can someone maybe help me with what to do? Since there doesn't seem to be a guide on their webpage I reckon it has to be pretty easy.
It's free to register (and takes 5 secs) and I need all possible data for Denmark.
Edit: When downloading what they call a SAS file, what I get is a huge proc format and the same text file as one gets by choosing text.
The data in the text file isn't comma separated and the first row does not contain variable names.
Download it in SAS format. Save the text file in a location you can remember, and open the SAS file. It's not just one big proc format; it's a big proc format followed by a datastep with input code. It was probably created by SPSS (it fits the pattern of an SPSS saved .sas file anyhow). Look for:
DATA OUT.ESS1_4e01_0_F1;
Or something like that (that's what it is when I downloaded it). It's probably about 3/4 of the way down the page. You just need to change the code:
INFILE 'ESS1_4e01_0_F1.txt';
or similar, to be the directory you placed the text file in. Create a LIBNAME for OUT that goes to wherever you want to permanently save this, and do that at the start of the .sas file, replacing the top 3 lines like so.
Originally:
LIBNAME LIBRARY '';
LIBNAME OUT '';
PROC FORMAT LIBRARY=LIBRARY ;
Change these to:
libname out "c:\mystuff\"; *but probably not c:\mystuff :);
options fmtsearch=(out);
proc format lib=out;
Then run the entire thing.
This is the best solution if you want the formatted values (value labels) and variable labels. If you don't care about that, then it might be easier to deal with the CSV like Bob shows.
But the website says you can download SAS format, so why don't you?
You need a delimiter if all goes into one column.
data temp;
length ...;
infile 'file.csv' dlm=',';
input ...;
run;
As Dirk says, the web site says you can download a SAS dataset directly. However, if there's some reason you don't want to do that, choose a comma separated file (CSV) and use PROC IMPORT. Here is an example:
proc import out=some_data
datafile='c:\path\somedata.csv'
dbms=csv replace;
getnames=yes;
run;
Of course, this assumes the first row contains column names.