How to pass a string to a SAS macro without special characters

I wrote this SAS macro to collect information about some files:
%macro info_1(cmd);
filename dirList pipe &cmd.;
data work.dirList;
infile dirList length=reclen;
input file $varying200. reclen;
permission=scan(file,1,"");
if input(scan(file,2,""), 8.)=1;
user=scan(file,3,"");
group=scan(file,4,"");
file_size_KB=round(input(scan(file,5,""), 8.)/1024,1);
file_size_MB=round(input(scan(file,5,""), 8.)/1024/1024,1);
modified_time=input(catx(" ",scan(file,6," "),scan(file,7,"")),anydtdtm.);
date_mod = datepart(modified_time);
time_zone=scan(file,8,"");
file_name=scan(file,-1,"");
format modified_time datetime19.;
format file_size_MB comma9.;
format date_mod date9.;
run;
%mend info_1;
Then I declare these macro variables:
%let cmd = ls --full-time;
%let sep1= %quote( );
%let path = /data/projects/flat_file/meteo ;
%let file = *.json ;
%let sep2 = %quote(/) ;
%let fullpath = &cmd.&sep1.&path.&sep2.&file;
Then I try to execute the macro:
%info_1(&fullpath.);
The result is strange. I posted an image because it is hard to describe. I suspect special characters are involved.
How can I fix this?

Inside your macro I believe you just need to add quotes around this line:
filename dirList pipe &cmd.;
Right now, after macro resolution, SAS sees the statement as:
filename dirList pipe ls --full-time;
... which treats everything from ls onwards as options of the FILENAME statement rather than as the command to run. Once fixed, the code should read:
filename dirList pipe "&cmd";
Which will then be resolved as:
filename dirList pipe "ls --full-time";
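As a minimal sketch of the fix, the macro below (a trimmed-down, hypothetical list_files, not the original info_1) quotes the parameter and reads each line of the command output as raw text. It assumes a Unix host and a SAS session in which the XCMD option permits pipes.

```sas
%macro list_files(cmd);
  /* Double quotes let the macro parameter resolve; single quotes would not. */
  filename dirList pipe "&cmd.";

  data work.dirList;
    infile dirList truncover;
    input line $char200.;   /* one raw line of command output per row */
  run;

  filename dirList clear;
%mend list_files;

%list_files(ls --full-time);
```

Parsing the individual fields with SCAN, as in the question, can then be layered on top of the line variable.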

Create a table from one-line CSV data in SAS

I am trying to import data from a CSV file that contains a single line, formatted like this:
CAS$#$#$LLT_CODE$#$#$PT_CODE$#$#$HLT_CODE$#$#$HLGT_CODE$#$#$SOC_CODE$#$#$LLT$#$#$PT$#$#$HLT$#$#$HLGT$#$#$SOC$#$#$SOC_ABB#$#$#DJ20210005-0$#$#$10001896$#$#$10012271$#$#$10001897$#$#$10057167$#$#$10029205$#$#$Maladie d'Alzheimer$#$#$Démence de type Alzheimer$#$#$Maladie d'Alzheimer (incl sous-types)$#$#$Déficiences mentales$#$#$Affections du système nerveux$#$#$Nerv#$#$#DJ20210005-0$#$#$10019308$#$#$10003664$#$#$10007607$#$#$10007510$#$#$10010331$#$#$Communication interauriculaire$#$#$Communication interauriculaire$#$#$Défauts congénitaux du septum cardiaque$#$#$Troubles congénitaux cardiovasculaires$#$#$Affections congénitales, familiales et génétiques$#$#$Cong#$#$#
"#$#$#" marks the end of a record and "$#$#$" separates columns.
How can I import it?
Here's my code:
data a;
infile "C:/Users/Papa Yatma/Documents/My SAS Files/9.4/ATCD.txt" dlm="$" dsd;
input var1 $ var2 $ var3 $ var4 $ var5 $ var6 $ var7 $ var8 $ var9 $ var10 $ var11 $ var12 $ @@;
run;
Thank you for your help.
As long as the actual "records" are not too long, I would process the file twice using the DLMSTR= option: first to split the "records" into lines, then to read the fields from those lines.
So first make a new text file that has one line per record.
filename new temp;
data _null_;
infile have recfm=n lrecl=1000000 dlmstr='#$#$#';
file new ;
input line :$32767. @;
put line ;
run;
Now you can read the file NEW using the other delimiter string.
For example you could convert it to a real CSV file.
filename csv temp;
data _null_;
infile new dlmstr='$#$#$' length=ll column=cc truncover ;
file csv dsd ;
do until(cc>=ll);
input word :$32767. @ ;
put word @;
end;
put;
run;
Results:
CAS,LLT_CODE,PT_CODE,HLT_CODE,HLGT_CODE,SOC_CODE,LLT,PT,HLT,HLGT,SOC,SOC_ABB
DJ20210005-0,10001896,10012271,10001897,10057167,10029205,Maladie d'Alzheimer,Démence de type Alzheimer,Maladie d'Alzheimer (incl sous-types),Déficiences mentales,Affections du système nerveux,Nerv
DJ20210005-0,10019308,10003664,10007607,10007510,10010331,Communication interauriculaire,Communication interauriculaire,Défauts congénitaux du septum cardiaque,Troubles congénitaux cardiovasculaires,"Affections congénitales, familiales et génétiques",Cong
This CSV file is then easy to read:
data test;
infile csv dsd firstobs=2 truncover ;
length CAS LLT_CODE PT_CODE HLT_CODE HLGT_CODE SOC_CODE LLT PT HLT HLGT SOC SOC_ABB $100;
input CAS -- SOC_ABB;
run;
If any of the values might contain end-of-line characters, add code to the first step to replace them. For example, you might add this line to replace CRLF strings with pipe characters.
line = tranwrd(line,'0D0A'x,'|');
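To illustrate what that TRANWRD call does in isolation, here is a small sketch on a made-up value (the strings 'first' and 'second' are hypothetical sample text, not taken from the file):

```sas
data _null_;
  length line $40;
  /* Build a value with an embedded CRLF pair. */
  line = cats('first', '0D0A'x, 'second');
  /* Replace every CRLF pair with a pipe character. */
  line = tranwrd(line, '0D0A'x, '|');
  put line=;   /* expected in the log: line=first|second */
run;
```

Note that this only matches the two-byte CRLF sequence; a lone line feed ('0A'x) would need its own replacement.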

Is there a way to instantly resolve macro variable created in a data step in the same data step?

The background is that I need to use the FILENAME statement to run grep and use the result as input.
Here is my input data set, named test:
firstname lastname filename
<blank> <blank> cus_01.txt
<blank> <blank> cus_02.txt
The filename values are actual files that I need to grep, because certain strings inside those files are needed to fill in firstname and lastname.
Here is the code:
data work.test;
set work.test;
call symputx('file', filename);
filename fname pipe "grep ""Firstname"" <path>/&file.";
filename lname pipe "grep ""Lastname"" <path>/&file.";
infile fname;
input firstname;
infile lname;
input lastname;
run;
However, macro variables created inside a data step can't be used until after the data step has completed. That means &file. can't be resolved and can't be used in the FILENAME statement.
Is there a way to resolve the macro variable?
Thanks!
This is not tested. You need to use the INFILE statement option FILEVAR.
data test;
input (firstname lastname filename) (:$20.);
cards;
<blank> <blank> cus_01.txt
<blank> <blank> cus_02.txt
;;;;
run;
data work.grep;
set work.test;
length cmd $128;
cmd = catx(' ','grep',quote(strip(firstname)),filename);
putlog 'NOTE: ' cmd=;
infile dummy pipe filevar=cmd end=eof;
do while(not eof);
input;
*something;
output;
end;
run;
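On the literal question in the title: a value stored with CALL SYMPUTX can be read back in the same step with the SYMGET function, because SYMGET is evaluated while the step executes rather than when it is compiled. This does not rescue the original code, since FILENAME is a global statement and a macro reference inside it is resolved before the step runs, which is why FILEVAR= is the better fit there. A small sketch with a hypothetical value:

```sas
data _null_;
  length value $50;
  call symputx('file', 'cus_01.txt');
  /* A macro reference typed into this step would already have been
     resolved at compile time, before SYMPUTX ran. SYMGET looks the
     value up at execution time instead. */
  value = symget('file');
  put value=;
run;
```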
If you have many customer files, using a pipe to grep can be an expensive operating system action, and on SAS servers it may be disallowed (pipe, x, system, etc.).
You can read all pattern-named files in a single data step using the wildcard feature of infile and the filename= option to capture the active file being read from.
Sample:
%let sandbox_path = %sysfunc(pathname(WORK));
* create 99 customer files, each with 20 customers;
data _null_;
length outfile $125;
do index = 1 to 99;
outfile = "&sandbox_path./" || 'cust_' || put(index,z2.) || '.txt';
file huzzah filevar=outfile;
putlog outfile=;
do _n_ = 1 to 20;
custid+1;
put custid=;
put "firstname=Joe" custid;
put "lastname=Schmoe" custid;
put "street=";
put "city=";
put "zip=";
put "----------";
end;
end;
run;
* read all the customer files in the path;
* scan each line for 'landmarks' -- either 'lastname' or 'firstname';
data want;
length from_whence source $128;
infile "&sandbox_path./cust_*.txt" filename=from_whence ;
source = from_whence;
input;
select;
when (index(_infile_,"firstname")) topic="firstname";
when (index(_infile_,"lastname")) topic="lastname";
otherwise;
end;
if not missing(topic);
line_read = _infile_;
run;

How to read a file that has the delimiter within double quotes

I have to read a tab-delimited file (x'05'c, dlm='0C'x). For a few records the delimiter appears inside a string that is enclosed in double quotes. When I use '&' in the INPUT statement it works fine, but records with more than one space give an error.
Data I have to read:
1.AIRWORLDWIDE.z1234565
2.MEDICAL.y121546
3."INPUTTTFAM.ILY TRUST"
Output desired:
ID text text_ref
-----------------------------------
1 AIRWORLDWIDE z1234565
2 MEDICAL y121546
3 "INPUTTTFAM ILY TRUST"
My program :
Data Want;
format id $char1.
text $char12.
text_ref $char12.;
informat id $char1.
text $char12.
text_ref $char12.;
length id text text_ref;
infile have dlm='0C'x dsd END=eof missover ;
input id text text_ref;
/* input id (text text_ref) (& $12.); */
run;
thanks in advance
DSD is not the INFILE option you want here.
filename FT15F001 temp;
data want;
infile FT15F001 dlm='.' missover;
informat id $char1. text $char12. text_ref $char12.;
input (_all_)(:);
list;
parmcards;
1.AIRWORLDWIDE.z1234565
2.MEDICAL.y121546
3."INPUTTTFAM.ILY TRUST"
;;;;
run;
proc contents varnum;
run;
proc print;
run;
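To see what DSD changes for this data, the sketch below reads the quoted record twice, once with DSD and once without, using inline data and the same '.' stand-in delimiter as above. The dataset names and variable lengths are illustrative only. With DSD, a delimiter inside double quotes is kept as data and the quotes are removed, so text receives the whole quoted string; without DSD the quotes are ordinary characters and the value splits at every delimiter.

```sas
data with_dsd;
  infile datalines dlm='.' dsd truncover;
  input id :$1. text :$24. text_ref :$12.;
datalines;
3."INPUTTTFAM.ILY TRUST"
;
run;

data without_dsd;
  infile datalines dlm='.' truncover;
  input id :$1. text :$24. text_ref :$12.;
datalines;
3."INPUTTTFAM.ILY TRUST"
;
run;

/* Compare how the quoted value lands in text and text_ref. */
proc print data=with_dsd; run;
proc print data=without_dsd; run;
```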

SAS Input from .txt where input spans multiple lines

Hi everyone, I have a question that is driving me crazy.
Say I have 2 text files that look like this:
File_one.txt:
Name_sample_f1 *spans one line
File_sample_f1 *spans one line
String_sample_f1 *spans multiple, varying lines until the end of the file
String_sample_f1
File_two.txt:
Name_sample_f2 *spans one line
File_sample_f2 *spans one line
String_sample_f2 *spans multiple, varying lines until the end of the file
String_sample_f2
String_sample_f2
String_sample_f2
I would like to read both of them into a dataset named test that takes the following form:
Name File String
---- ---- ------
1 Name_sample_f1 File_sample_f1 String_sample_f1
String_sample_f1
2 Name_sample_f2 File_sample_f2 String_sample_f2
String_sample_f2
String_sample_f2
String_sample_f2
Thanks in advance to anyone who can help!
Thanks
You don't need anything as complicated as three data steps (especially if you're going to read N files). It's pretty easy, really. Use the EOV (End of Volume) indicator to detect when you're at the start of a new file [EOV is set after a volume/file ends], and each time you're at the start of a new file, read the name and filename from the first two lines.
data test;
format name filename $100.;
retain name filename line;
infile '("c:\temp\file1.txt", "c:\temp\file2.txt")' eov=end lrecl=100 pad truncover; *or use wildcards, like infile "c:\temp\file*.txt";
input a $ @;
put _all_;
if (_n_=1) or (end=1) then do;
end=0;
line=1;
end;
else line+1;
if line=1 then do;
input @1 name $100.;
end;
else if line=2 then do;
input @1 filename $100.;
end;
else do;
input @1 string $100.;
output;
end;
run;
filename file1 'testfile1.txt';
filename file2 'testfile2.txt';
DATA file1;
LENGTH thisname thisfile thistext $ 200;
RETAIN thisname thisfile;
linecounter=0;
DO UNTIL(eof);
INFILE file1 end = eof;
INPUT;
linecounter+1;
IF (linecounter eq 1) THEN thisname=_infile_;
ELSE IF (linecounter eq 2) then thisfile=_infile_;
ELSE DO;
thistext=_infile_;
output;
END;
END;
RUN;
DATA file2;
LENGTH thisname thisfile thistext $ 200;
RETAIN thisname thisfile;
linecounter=0;
DO UNTIL(eof);
INFILE file2 end = eof;
INPUT;
linecounter+1;
IF (linecounter eq 1) THEN thisname=_infile_;
ELSE IF (linecounter eq 2) then thisfile=_infile_;
ELSE DO;
thistext=_infile_;
output;
END;
END;
RUN;
DATA all_files;
SET file1 file2;
RUN;
PROC PRINT DATA=all_files; RUN;

In SAS, outside of a data step, what is the best way to replace a character in a macro variable with a blank?

In SAS, outside of a data step, what is the best way to replace a character in a macro variable with a blank?
It seems that TRANSLATE would be a good function to use. However when using %SYSFUNC with this function, the parameters are not surrounded with quotes. How do you indicate a blank should be used as replacement?
%str( ) (with a blank between the parentheses) can be used to indicate a blank for this parameter. Also be careful with the argument order: in TRANSLATE the second argument is the replacement character, whereas in TRANWRD the order is reversed (the text to find comes second and the replacement third).
%macro test ;
%let original= translate_this_var ;
%let replaceWithThis= %str( ) ;
%let findThis= _ ;
%let translated= %sysfunc(translate(&original, &replaceWithThis, &findThis)) ;
%put Original: &original ***** TRANSLATEd: &translated ;
%mend ;
%test;
%macro test2 ;
%let original= translate_this_var ;
%let replaceWithThis= %str( ) ;
%let findThis= _ ;
%let tranwrded= %sysfunc(tranwrd(&original, &findThis, &replaceWithThis)) ;
%put Original: &original ***** TRANWRDed: &tranwrded ;
%mend ;
%test2
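The same call also works in open code, without a %macro wrapper. A minimal sketch (the macro variable name spaced is just illustrative), which should write the value with blanks in place of the underscores:

```sas
%let original = translate_this_var;
/* TRANSLATE(source, to, from): the quoted blank is the "to" argument. */
%let spaced = %sysfunc(translate(&original, %str( ), _));
/* Brackets make any leading or trailing blanks visible in the log. */
%put NOTE: [&spaced];
```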
The macro language does not use quotation marks to delimit text. The only special characters are the macro triggers (&, % and so on), which indicate that the text should be interpreted as a macro "operator". A blank is represented by %str( ), as indicated above in Carolina's post.
You can use a Perl regular expression instead, like:
%put ***%sysfunc(prxchange(s/x/ /, -1, abxcdxxf))***;
/* on log
***ab cd f***
*/