Sas data step keep statement from a text file - sas

I have a table cust_base with 1000 variables. And I have a text file contents1 containing the names of 250 variables separated by tab, that I actually need to work with. I want to do something similar to:
%include "/location/contents1.txt";
data new_cust_base(keep = &contents1.txt);
set cust_base;
run;
Is this the correct approach/syntax? Or is there a better way to go about it? I tried digging online, but couldn't find much. Thanks a lot.

You can %include source code as the interior of a keep statement.
set …;
KEEP
%include "/location/contents1.txt";
;
Working example:
data _null_;
file 'c:\temp\keeplist.tab';
put 'name' "09"x 'age' "09"x 'weight';
run;
data work.class;
set sashelp.class;
KEEP
%include 'c:\temp\keeplist.tab';
;
run;

Related

SAS include another SAS script within a macro

I am looking to include one sas program inside the macro written in another sas program.
So:
sas_prog1.sas:
data test;
a=1;
run;
sas_prog2.sas:
%macro m2;
%include sas_prog1.sas;
%mend;
%m2;
Does the data step in sas_prog1.sas also need to be wrapped inside a macro?
No - you don't need to. When you use an %include statement, it just essentially writes out all contents in the included file at that location. In your case it just dumps the datastep code and hence it effectively becomes:
%macro m2;
data test;
a=1;
run;
%mend;
%m2;
So you should be good to go.
you can include a code in another in writting it to a temp file as character.
filename exec_code temp;
data _null_;
file exec_code;
put ' your sas instruction'
put 'your sas instruction'
run;
and in your macro use an include
%macro mymacro();
%include exec_code;
%mend;
Assuming that sas_prog1.sas is a being used as a module and you will have multiple modules for the entire code, you can simply use a %include to execute the program. There is no need to execute it inside of a macro in sas_prog2, but it can be.
contents of file saved as sas_prog1.sas:
data test;
a=1;
run;
contents of sas_prog2.sas:
%include "[prog_dir]\sas_prog2.sas";

Running specified lines of a SAS program

Assume I have this program:
1 data temp;
2 set _null_;
3 run;
4
5 %put Hello world;
and I want to add two lines to it, one that runs lines 1-3 of the program, and another that runs line 5.
The second example here suggests that %include may be what I'm looking for, but %include 1-3 and %include 5 do not work. %include [path] 1-3 gets me into an infinite loop, which is undesirable.
What is the best way to accomplish this? Thanks!
EDIT: this only works for lines that have previously been submitted.
You need SPOOL option. I used RESETLINE statement to reset the line number useful when using SAS/EG. I would like to know how you intend to use it.
options spool=1;
resetline;
data temp;
set _null_;
run;
%put Hello world;
%include 1-3 / source2;
%include 5 / source2;
Macros, perhaps?
%macro one(datasetname= );
data &datasetname;
set _null_;
run;
%mend one;
%macro two(textstring= );
%put &textstring;
%mend two;
%one(datasetname= temp1);
%two(textstring= Hello world);
%one(datasetname= temp2);
%two(textstring= Hello new world);
You could feed macro variables into the process from a dataset rather than multiple macro calls. See the examples starting on p 11 here:
First & Ronk, SGI 130-30, SAS® Macro Variables and Simple Macro Programs
A single extra return could screw up your line references...If you need to add a header, or a new line of code somewhere or something are you willing to go back and fix all your line number references?
I recommend using the macro or %include options.
%macro repeat_code();
***sas code goes here;
%mend;
%repeat_code
For the %include you can create the lines inside a new file and then reference them in your code with a %include.
Depending on what those lines are actually doing, you may have other options. For example if it's a lookup or re-coding a variable I would use a format instead.

Stop sas macro from overwriting different imported csv files as the same sas dataset

I found a macro and have been using it to import datasets that are given to me in csv format. Now I need to edit it because I have datasets that have an id number in them and I want sas datasets with the same name.
THE csvs are named things like IDSTUDY233_first.csv So I want the sas dataset to be IDSTUDY233_first. It should appear in my work folder.
I thought it would just create a sas dataset for each csv named IDSTUDY233_first or something like that. (and so on and so forth for each additional study). However it's naming this way.
IDSTUDY_FIRST
and over rights itself for every ID. I am newer to macros and have been trying to figure out WHY it does this and how to fix it. Suggestions?
%let subdir=Y:\filepath\; *MACRO VARIABLE FOR FILEPATH;
filename dir "&subdir.*.csv "; *give the file the name from the path that your at whatever the csv is named;
data new; *create the dataset new it has all those filepath names csv names;
length filename fname $ 200;
infile dir eof=last filename=fname;
input ;
last: filename=fname;
run;
proc sort data=new nodupkey; *sort but don't keep duplicate files;
by filename;
run;
data null; *create the dataset null;
set new;
call symputx(cats('filename',_n_),filename); *call the file name for this observation n;
call symputx(cats('dsn',_n_),compress(scan(filename,-2,'\.'), ,'ka')); *call the dataset for this file compress then read the file;
call symputx('nobs',_n_); *call for the number of observations;
run;
%put &nobs.; *but each observation in;
%macro import; *start the macro import;
%do i=1 %to &nobs; *Do for each fie to number of observations;
proc import datafile="&&filename&i" out=&&dsn&i dbms=csv replace;
getnames=yes;
run;
%end;
%mend import;
%import
*call import macro;
As you can see I added my comments of my understanding. Like I said macros are new to me. I may be incorrect in my understanding. I am guessing the problem is either in
call symputx(cats('dsn',_n_),compress(scan(filename,-2,'\.'), ,'ka'));
or it is in the import statement probably out=&&dsn&i since it rapidly over writes the previous SAS files until it does every one. It's just I need all the sas files not just the last 1.
My guess is that you are right, it is to do with this line:
call symputx(cats('dsn',_n_),compress(scan(filename,-2,'\.'), ,'ka'));
The gotcha is in the arguments passed to compress. Compress can be used to remove or keep certain characters in a string. In the above example, they are using it to just keep alphabetic characters by passing in the 'ka' modifiers. This is effectively causing files with different names (because they have different numbers) to be treated as the same file.
You can modify this behaviour to keep alphabetic characters, digits, and the underscore character by changing the parameters from ka to kn.
This change does mean that you also need to make sure that none of your file names begin with a number (as SAS datasets can't begin with a number).
The documentation for the compress function is here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212246.htm
An easy way to debug this would be to take the dataset with all of the call symput statements, and in addition to storing these values in macro variables, write them to variables in the dataset. Also change it from a data _null_ to a data tmp statement. You can then see for each file what the destination table name will be.

Importing single value from a CSV file not working in SAS

I'm trying to use a Macro that retrieves a single value from a CSV file. I've written a MACRO that works perfectly fine if there is only 1 CSV file, but does not deliver the expected results when I have to run it against more than one file. If there is more than one file it returns the value of the last file in each iteration.
%macro reporting_import( full_file_route );
%PUT The Source file route is: &full_file_route;
%PUT ##############################################################;
PROC IMPORT datafile = "&full_file_route"
out = file_indicator_tmp
dbms = csv
replace;
datarow = 3;
RUN;
data file_indicator_tmp (KEEP= lbl);
set file_indicator_tmp;
if _N_ = 1;
lbl = "_410 - ACCOUNTS"n;
run;
proc sql noprint ;
select lbl
into :file_indicator
from file_indicator_tmp;
quit;
%PUT The Source Reporting period states: &file_indicator;
%PUT ##############################################################;
%mend;
This is where I execute the Macro. Each excel file's full route exists as a seperate record in a dataset called "HELPERS.RAW_WAITLIST".
data _NULL_;
set HELPERS.RAW_WAITLIST;
call execute('%reporting_import('||filename||')');
run;
In the one example I just ran, The one file contains 01-JUN-2015 and the other 02-JUN-2015. But what the code returns in the LOG file is:
The Source file route is: <route...>\FOO1.csv
##############################################################
The Source Reporting period states: Reporting Date:02-JUN-2015
##############################################################
The Source file route is: <route...>\FOO2.csv
##############################################################
The Source Reporting period states: Reporting Date:02-JUN-2015
##############################################################
Does anybody understand why this is happening? Or is there perhaps a better way to solve this?
UPDATE:
If I remove the code from the MACRO and run it manually for each input file, It works perfectly. So it must have something to do with the MACRO overwriting values.
CALL EXECUTE has tricky timing issues. When it invokes a macro, if that macro generates macro variables from data set variables, it's a good idea to wrap the macro call in %NRSTR(). That way call execute generates the macro call, but doesn't actually execute the macro. So try changing your call execute statement to:
call execute('%nrstr(%%)reporting_import('||filename||')');
I posted a much longer explanation here.
I'm not too clear on the connections between your files. But instead of importing the CSV files and then searching for your string, couldn't you use a pipe command to save the results of a grep search on your CSV files to a dataset and then read just in the results?
Update:
I tried replicating your issue locally and it works for me if I set file_indicator with a call symput as below instead of your into :file_indicator:
data file_indicator_tmp (KEEP= lbl);
set file_indicator_tmp;
if _N_ = 1;
lbl = "_410 - ACCOUNTS"n;
data _null_ ;
set file_indicator_tmp ;
if _n_=1 then call symput('file_indicator',lbl) ;
run;

Using SAS macro to import multiple txt files with sequential names

I have 4 txt files that need to be loaded to SAS and save them as 4 sas files. Here are how the text files look like: cle20130805.txt, cle20130812.txt, cle20130819.txt and cle20130826.txt . I used a % Do loop under % Macro in order to get the 4 files imported with only one invoke of the Macro. So Here is my code:
%macro cle;
%do i=20130805 %to 20130826 %by 7;
Data cleaug.cle&i;
infile "home/abc/cle&i..txt" dlm= '|' dsd firstobs=1 obs=100;
input a_no b_no c_no;
run;
%end;
%mend cle;
%cle
I am expect to have 4 sas file saved with only invoke the marco once. However it just can't run successfully. Any ideas where am I doing wrong in the code?
Thanks,
I don't recommend you try to write one macro to import all four files. Either it will be a specific macro you only ever use once - in which case you could just write this by hand and save the time you've already spent - or it will be something you have to modify every single month or whatever that you use it.
Instead, make the macro something that does precisely one file, but includes the information needed to call it easily. In this case, it sounds like you need one parameter: the date, so 20130805 or whatnot. Then give it a reasonable name that really says what it does.
%macro import_files(date=);
Data cleaug.cle&date.;
infile "home/abc/cle&date..txt" dlm= '|' dsd firstobs=1 obs=100;
input a_no b_no c_no;
run;
%mend import_files;
Now you call it:
%import_files(date=20130805)
%import_files(date=20130812)
%import_files(date=20130819)
%import_files(date=20130826)
Just as easy as the macro you wrote above, even hardcoding the four dates. If the dates are predictable in some fashion, you can generate the macro calls very easily as well (if there are more than 4, for example). You could do a directory listing of the location where the files are, or call the macro from a data step using CALL EXECUTE if you really like looping.