Import data from European Social Survey - sas

I need to import data from European Social Survey databank to SAS.
I'm not very good at using SAS so I just naively tried importing the text file one gets but it stores it all in one variable.
Can someone maybe help me with what to do? Since there doesn't seem to be a guide on their webpage I reckon it has to be pretty easy.
It's free to register (and takes 5 secs) and I need all possible data for Denmark.
Edit: When downloading what they call a SAS file, what i get is a huge proc format and the same text file as one gets by choosing text.
The data in the text file isn't comma separated and the first row does not contain variable names.

Download it in SAS format. Save the text file in a location you can remember, and open the SAS file. It's not just one big proc format; it's a big proc format followed by a datastep with input code. It was probably created by SPSS (it fits the pattern of an SPSS saved .sas file anyhow). Look for:
DATA OUT.ESS1_4e01_0_F1;
Or something like that (that's what it is when I downloaded it). It's probably about 3/4 of the way down the page. You just need to change the code:
INFILE 'ESS1_4e01_0_F1.txt';
or similar, to be the directory you placed the text file in. Create a LIBNAME for OUT that goes to wherever you want to permanently save this, and do that at the start of the .sas file, replacing the top 3 lines like so.
Originally:
LIBNAME LIBRARY '';
LIBNAME OUT '';
PROC FORMAT LIBRARY=LIBRARY ;
Change these to:
libname out "c:\mystuff\"; *but probably not c:\mystuff :);
options fmtsearch=(out);
proc format lib=out;
Then run the entire thing.
This is the best solution if you want the formatted values (value labels) and variable labels. If you don't care about that, then it might be easier to deal with the CSV like Bob shows.

But the website says yu can download SAS format, why don't you?
You need a delimiter if all goes into one column.
data temp;
length ...;
infile 'file.csv' dlm=',';
input ...;
run;

As Dirk says, the web site says you can download a SAS dataset directly. However, if there's some reason you don't want to do that, choose a comma separated file (CSV) and use PROC IMPORT. Here is an example:
proc import out=some_data
datafile='c:\path\somedata.csv'
dbms=csv replace;
getnames=yes;
run;
Of course, this assumes the first row contains column names.

Related

Import xlsx file into SAS from static url link

I am new to SAS but used to working in python.
In python, I can read an xlsx file from a static Box.com link by calling pandas read excel function on the Box link.
pd.read_excel("https://box.com/shared/static/<url>.xlsx")
I'm hoping to do something similar in SAS. When I try
PROC IMPORT DATAFILE="https://box.com/shared/static/<url>.xlsx"
OUT=WORK.MYEXCEL
DBMS=XLSX
REPLACE;
RUN;
SAS tries to look for a document with the name "https://box.com/shared/static/.xlsx".
When I try
filename xlsxFile http "https://box.com/shared/static/<url>.xlsx";
PROC IMPORT DATAFILE=xlsxFile
OUT=WORK.MYEXCEL
DBMS=XLSX
REPLACE;
RUN;
I get
ERROR: This "filename URL" access method is not supported by "proc import". Please copy the file to local disk before running the
procedure.
Is there an easy way to have SAS access files from this type of URL? I've checked these few threads:
https://communities.sas.com/t5/General-SAS-Programming/import-excel-file-from-the-web/td-p/134158
https://communities.sas.com/t5/SAS-Programming/Importing-XLSX-from-URL-Issue/td-p/446758
https://communities.sas.com/t5/SAS-Programming/proc-import-xlsx-from-url/td-p/635834
And it seems like I may need to do some work to get SAS to create a file object from the URL but I don't quite understand how the code is working and when I naively try something similar:
filename xlsxFile http "https://box.com/shared/static/<url>.xlsx";
data file;
n=-1;
infile xlsxFile recfm=s nbyte=n length=len;
input;
file "file_name.xlsx" recfm=n;
put _infile_ $varying32767. len;
run;
PROC IMPORT OUT= input DATAFILE= "file_name.xlsx"
DBMS=xls REPLACE;
SHEET="sheet_name";
GETNAMES=YES;
RUN;
I get the following error:
Spreadsheet isn't from Excel V5 or later. Please open it in Excel and Save as V5 or later
Requested Input File Is Invalid
ERROR: Import unsuccessful. See SAS Log for details.
Any help would be appreciated. The reason I'd like to do it this way is because the files at the static links are updated daily and I want to avoid having to copy files to the SAS server every day. So if this won't work I am also interested in other work arounds that would accomplish the same thing.
I can obviously write a script that will fetch the updated files and write them to the server where SAS can access them as needed, but wanted to see if I could get this to work first.
Using the Box API from within SAS would also be another option if anyone has successfully gotten that to work. As I mentioned I am new to SAS so trying to access the API seemed like it would be too difficult at the moment.
Thank you!
You'll definitely need to download the file. As Tom notes in comments, make sure to check this is really an xlsx file; but assuming it is, some suggestions for dealing with it.
Copying it over by hand like you're doing is doable, and I'll put some code that works at the bottom of the answer, but it's way overkill nowadays - you're probably reading decade-old papers from before the modern day of SAS. In particular, you should use PROC HTTP to GET the file, rather than using the URL directly - it's just much easier that way.
Another possibility is you could use python for this! Regardless of your SAS setup, you can use python to script SAS, or use SAS to run Python, very easily nowadays. SASPy project (on github) for SAS 9, or the SAS SWAT package (also on github I believe) for connecting to SAS Viya, lets you very easily talk back and forth with Python and SAS, even in production - I do it all the time. And on top of that, you can even write SAS user-written functions in python now - so there're a bunch of ways to get your Python into SAS, if you want. I am a SAS (primary) developer, but I use Python for web-related stuff, since it's just better at it. See my paper for more details, or lots of other resources on the subject.
Assuming you just want to import the file and don't want to keep track of the excel file ultimately, you can just read it from a temp filename, which will clean itself up afterwards:
filename _httpin temp;
proc http method="get"
url="https://github.com/snoopy369/SASL/raw/master/Excel%20Precision.xlsx"
out=_httpin;
run;
proc import file=_httpin out=test dbms=xlsx replace;
run;
If you want the excel file saved somewhere (sounds like you don't, but if you do), then instead of temp assign that httpin filename to a real location.
filename _httpin "c:\temp\whatever.xlsx";
If you really want to do that binary file copy business, do it this way:
filename _httpin temp;
filename _httpout "c:\temp\Excel Precision.xlsx";
proc http method="get"
url="https://github.com/snoopy369/SASL/raw/master/Excel%20Precision.xlsx"
out=_httpin;
run;
data _null_;
_in = fopen('_httpin','I',1,'B');
_out= fopen('_httpout','O',1,'B');
rec='20'x;
do while (Fread(_in) eq 0);
rc = fget(_in,rec,1);
rc = fput(_out,rec);
rc = fwrite(_out);
end;
rc = fclose(_in);
rc = fclose(_out);
run;
proc import file=_httpout out=test dbms=xlsx replace;
run;

How to copy data afrer "cards"/"datalines" in SAS

I have to perform statistical analysis on a file with hundreds of observations and 7 variables(columns)on SAS. I know that it is necessary to insert all the observations after "cards" or "datalines". But I can't write them all obviously. How can I do? Moreover, the given data file already is .sas7bdat.
Then, since (in my case) the multiple correspondence analysis requires only six of the seven variables, does this affect what I have to write in INPUT or/and in CARDS?
You only use CARDS when you're trying to manually write a data set. If you already have a SAS data set (sas7bdat) you can usually use that directly (there are some exceptions but likely don't apply here).
First create a libname to the folder where the file is:
libname myFiles 'path to fodler with sas file';
Then load it into your work library - this is a temporary space that is cleaned up when you're done so no files here are saved permanently.
This copies it over to that library - which is often faster.
data myFileName;
set myFiles.myFileName;
run;
You can just work with the file from that library by referencing it as myFiles.myFileName in your code.
proc means data=myFiles.myFileName;
run;
This should get you started, but you should take the SAS free e-course to understand the basics, it will save you time overall.
Just tell SAS to use the dataset. INPUT statement (and CARDS/DATALINES or INFILE statement) are for reading from text files.
proc corresp data='/my directory/mydataset.sas7bdat' .... ;
...
run;
You could also make a libref that points to the directory and use two level name to reference the dataset.
libname myfiles '/my directory/';
proc corresp data=myfiles.mydataset .... ;
...
run;

IGNORE DATA IN SAS IMPORT FROM EXCEL

I have no working knowledge of SAS, but I have an excel file that I need to import and work with. In the excel file there are about 100 rows (observations) and 7 columns (quantities). In some cases, a particular observation may not have any data in one column. I need to completely ignore that observation when reading my data into SAS. I'm wondering what the commands for this would be.
An obvious cheap solution would be to delete the rows in the excel file with missing data, but I want to do this with SAS commands, because I want to learn some SAS.
Thanks!
Import the data however you want, for example with the IMPORT procedure, as Stig Eide mentioned.
proc import
datafile = 'C:\...\file.xlsx'
dbms = xlsx
out = xldata
replace;
mixed = YES;
getnames = YES;
run;
Explanation:
The DBMS= option specifies how SAS will try to read the data. If your file is an Excel 2007+ file, i.e. xlsx, then you can use DBMS=XLSX as shown here. If your file is older, e.g. xls rather than xlsx, try DBMS=EXCEL.
The OUT= option names the output dataset.
If a single level name is specified, the dataset is written to the WORK library. That's the temporary library that's unique to each SAS session. It gets deleted when the session ends.
To create a permanent dataset, specify a two level name, like mylib.xldata, where mylib refers to a SAS library reference (libref) created with a LIBNAME statement.
REPLACE replaces the dataset created the first time you run this step.
MIXED=YES tells SAS that the data may be of mixed types.
GETNAMES=YES will name your SAS dataset variables based on the column names in Excel.
If I understand you correctly, you want to remove every observation in the dataset that has a missing value in any of the seven columns. There are fancier ways to do this, but I recommend a simple approach like this:
data xldata;
set xldata;
where cmiss(col1, col2, ..., col7) = 0;
run;
The CMISS function counts the number of missing values in the variables you specify at each observation, regardless of the data type. Since we're using WHERE CMISS()=0, the resulting dataset will contain only the records with no missing data for any of the seven columns.
When in doubt, try browsing the SAS online documentation. It's very thorough.
If you have "SAS/ACCESS Interface to PC Files" licensed (hint: proc setinit) you can import the Excel file with this code. The where option lets you select which rows you want to keep, in this example you will keep the rows where the column "name" is not blank:
proc import
DATAFILE="your file.xlsx"
DBMS=XLSX
OUT=resulttabel(where=(name ne ""))
REPLACE;
MIXED=YES;
QUIT;

How to create "standardized" Excel workbooks using SAS

I have a "wide" SAS data sets that must be exported into a new Excel workbook every week. I want to preserve the column widths and other Excel attributes every week, but I'm having problems getting it to work. Here's what I'm attempting.
I used PROC EXPORT to create a new workbook (using sheet="New_TACs").
I manually adjusted the column widths and other sheet attributes
(like "filters", column widths, wrap, alignment, and "freeze panes").
I deleted all the data rows (leaving the first row with the column
names) and saved it as a new workbook named "template.xlsx".
Using a SAS system call, I copy "template.xlsx" to "this_week.xlsx".
I use PROC EXPORT again to try and update the new workbook, but I
get warnings. The result contains a sheet named "New_TACS1".
Here is the SAS log:
720 proc export data=new_tacs
721 outfile="\\server-path\this_week.xlsx"
722 replace;
723 sheet='New_TACs';
724 run;
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: The target file may contain unmatched range name and sheet name.
WARNING: File _IMEX_.New_TACs.DATA does not exist.
WARNING: Table _IMEX_."New_TACs" has not been dropped.
NOTE: "New_TACs" range/sheet was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
real time 23.88 seconds
cpu time 1.80 seconds
I'm at a loss as to what to do and would appreciate any ideas or suggestions.
I think the issue is that with zero rows, SAS isn't properly dealing with the data. I can't get PROC EXPORT to work at all, but with a single dummy row I can at least get it to behave with libname and PROC APPEND. I wouldn't be surprise if the filters are in part responsible for this.
After creating the blank excel file with the SASHELP.CLASS columns, adding a filter, adding one row of dummy data, and saving/closing, I do: (SCANTEXT=NO is mandatory here for update access)
libname newtac "c:\temp\test.xlsx" scantext=no getnames=yes;
proc append base=newtac.'New_TACs$_xlnm#_FilterDatabase'n data=sashelp.class force;
run;
libname newtac clear;
That gets close, at least. I'm getting some blank rows for some reason, perhaps due to other things I did in looking at this.
Your best solution may well be to wait for 9.4 TS1M0 and ODS EXCEL, which will let you do all these things from SAS directly; or to use DDE.
I would recommend checking out SaviCells. http://www.sascommunity.org/wiki/SaviCells. It provides much better SAS to Excel functionality, including creating a template with all your Excel formatting and using that with new data.
Use DDE in SAS to achieve this.
You can create your excel template the way you want it to appear.
Using DDE you would then:
Open Excel
Open the excel file you want to use as the template
Populate it with the updated data
Save the file as a new filename
It's a bit of an antiquated technology but it gets the job done.
Googling for SAS and DDE will find you plenty of code exmaples and tutorials.

how to read only specific coulmns from external tab delimited files using proc import into sas dataset

is it possible to read only specific variabiles from text files into sas as sas dataset using proc import? I have very big data in my text file which contains like 1000 observations and more than 42,000 variables. I tried reading this file into sas using proc import but i failed in doing so i thought may be because of size issues. Now i decided to read only specific variables (columns) really i do not need all of them from this big text file so that i can reduce size of file to read into sas system. is there any ideas to read like this by using data step? Suggestion or help would be appreciated
Thanking you very much,
If you're talking about a delimited text file, you can read in specific variables, contingent on them being consecutive from the first variable. You can't use PROC IMPORT to do this, however; you'd have to write the datastep yourself, though you could conceivably have PROC IMPORT help you write it.
For example, if you have 10 variables and only want the first 3, then you can read it in like this:
data want;
infile "mydata.txt" dlm=',' lrecl=255 missover dsd;
input x $ y $ z $;
run;
However, you must read in all variables from the first variable to the last variable you are interested in, even if you aren't interested in all of the intermediate variables. You can't "skip" them with a delimited text file.
If you have a fixed width text file, you can read in whatever columns you want (but you can't use PROC IMPORT to read in a fixed width text file).