This code works on PC SAS 9.4 but not on SAS Enterprise Guide. Is there a way to make this work on EG?
FILENAME SASCBTBL CATALOG "work.temp.attrfile.source";
DATA _NULL_;
FILE SASCBTBL;
PUT "ROUTINE WNetGetConnectionA MODULE=MPR MINARG=3 MAXARG=3 STACKPOP=CALLED RETURNS=LONG;";
PUT " ARG 1 CHAR INPUT BYADDR FORMAT=$CSTR200.;";
PUT " ARG 2 CHAR UPDATE BYADDR FORMAT=$CSTR200.;";
PUT " ARG 3 NUM UPDATE BYADDR FORMAT=PIB4.;";
RUN;
%MACRO getUNC;
DATA zz1;
length input_dir $200 output_dir $200;
* The input directory can only be a drive letter + colon ONLY e.g. j: ;
input_dir = 'O:';
output_dir = ' ';
output_len = 200;
call module('*IE',"WNetGetConnectionA", input_dir, output_dir, output_len);
call symputx('dir',input_dir,'l');
call symputx('path',output_dir,'l');
RUN;
%put drive letter is &dir;
%put path is &path;
%MEND getunc;
%getunc;
When I try to run it on SAS EG I get the following note:
ERROR: Module MPR could not be loaded.
NOTE: Invalid argument to function MODULE('WNetGetConne'[12 of 18 characters shown],'O: '[12 of 200 characters shown],'
'[12 of 200 characters shown],200) at line 54 column 186.
input_dir=O: output_dir= output_len=200 ERROR=1 N=1
I have a dataset of address standardizations -- take STREET and replace it with ST -- and wish to write code that does the substitution. When testing out the code, it appears as intended in the LOG file but extra spaces are added when I write to the text file. I don't want the extra spaces.
-=-=-=-=-=- SAS CODE
data std ;
length pre $16 post $8 ;
infile datalines delimiter=',' ;
input pre $ post $ ;
pre = strip(pre);
post = strip(post);
datalines;
AVENUES , AVE
AVENUE , AVE
BOULEVARD , BLVD
CIRCLE , CIR
;
run;
data _null_ ;
file "&test.txt";
set std ;
p1 = trim(pre) ;
p2 = trim(post);
put '&var = strip( prxchange("s/(^|\s)' p1 +(-1) '\s/ ' p2 +(-1) ' /i",-1,&var) );' ;
run;
-=-=-=-=-=-=-=- END OF CODE
The SAS code produces the following ...
&var = strip( prxchange("s/(^|\s)AVENUES\s/ AVE /i",-1,&var) );
&var = strip( prxchange("s/(^|\s)AVENUE\s/ AVE /i",-1,&var) );
&var = strip( prxchange("s/(^|\s)BOULEVARD\s/ BLVD /i",-1,&var) );
... in the LOG file when I remove the file statement, but writes ...
&var = strip( prxchange("s/(^|\s)AVENUES \s/ AVE /i",-1,&var) );
&var = strip( prxchange("s/(^|\s)AVENUE \s/ AVE /i",-1,&var) );
&var = strip( prxchange("s/(^|\s)BOULEVARD \s/ BLVD /i",-1,&var) );
... with extra spaces inside the REGEX function in the file test.txt.
This is SAS 9.4 which I'm using through a web-based SAS Studio.
So, your problem is based on how SAS stores character variables.
A character variable always holds its characters followed by as many space ('20'x) characters as needed to fill the variable's storage length. This differs from (mostly newer) languages that have a string terminator character or similar; SAS has no such character, it just pads with spaces. So if the variable is 8 bytes long and contains Avenue, then it actually contains Avenue followed by two spaces.
You cannot change that by assigning the trimmed value to another variable; trimming only has an effect inside the expression where it is used. So, your lines:
p1 = trim(pre) ;
p2 = trim(post);
Are meaningless - they do nothing except waste CPU time (sadly, not optimized away from what I can tell).
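A minimal sketch of that padding behaviour, using made-up variable names (len_stored, len_trimmed, joined_raw and joined_trim are introduced here only for illustration):

```sas
data _null_;
  length pre $16 p1 $16;
  pre = 'AVENUE';
  p1  = trim(pre);              /* p1 is padded back out to 16 bytes on assignment */

  len_stored  = lengthc(p1);    /* storage length, padding included */
  len_trimmed = length(p1);     /* length without trailing blanks   */
  put len_stored= len_trimmed=; /* writes len_stored=16 len_trimmed=6 */

  /* Concatenating the stored value keeps the padding;           */
  /* trimming inside the expression is what actually removes it. */
  joined_raw  = '[' || p1 || ']';
  joined_trim = '[' || trim(p1) || ']';
  put joined_raw= / joined_trim=;
run;
```

The brackets make the padding visible: joined_raw has ten blanks between AVENUE and the closing bracket, while joined_trim has none.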
You need to trim in the expression where you use the value, as that is the only place the padding can be removed. Now, you can't call trim(...) directly in a PUT statement, so you need to compose the line to be written beforehand, or else use the $varying. format.
Here's one example:
filename tempfile temp;
data _null_ ;
file tempfile;
set std ;
result = cats('&var = strip( prxchange("s/(^|\s)',pre,catx(' ','\s/',post,'/i",-1,&var) );'));
put result ;
run;
data _null_;
infile tempfile;
input @;
put _infile_;
run;
Here's an example using $varying.:
data _null_ ;
file tempfile;
set std ;
varlen_pre = length(pre);
varlen_post = length(post);
put '&var = strip( prxchange("s/(^|\s)' pre $varying16. varlen_pre '\s/ ' post $varying8. varlen_post ' /i",-1,&var) );' ;
run;
As to why your log doesn't match the file, that's because SAS has slightly different rules for when it writes to logs than when it writes to files. It's much more exact about what it writes to a file; you say it, it writes it. For logs it has a few places where it removes spaces for you, presumably to make logs more readable, as it's not as necessary to be precise. This can be a pain when you DO want the precision in the log, and of course in your case where you want the log to match what you're seeing...
Finally, a note on what you're doing. I don't highly recommend using a regex the way you're using it. It's very slow. Unless you're only doing a handful of replacements, or only have a small dataset size, or really don't care how long this takes...
If it's just 1:1 replacements, I'd recommend tranwrd, which is much faster. See just this small comparison:
data in_data;
do _n_ = 1 to 1e5;
address = catx(' ',rand('Integer',1,9999),'Avenue');
output;
address = catx(' ',rand('Integer',1,9999),'Street');
output;
address = catx(' ',rand('Integer',1,9999),'Boulevard');
output;
address = catx(' ',rand('Integer',1,9999),'Circle');
output;
address = catx(' ',rand('Integer',1,9999),'Route');
output;
end;
run;
data want;
set in_data;
rx_ave = prxparse('s/(^|\s)Avenue\s/ Ave /ios');
rx_st = prxparse('s/(^|\s)Street\s/ St/ios');
rx_blvd = prxparse('s/(^|\s)Boulevard\s/ Blvd /ios');
rx_cir = prxparse('s/(^|\s)Circle\s/ Cir /ios');
do i = 1 to 4;
address = prxchange(i,-1,address);
end;
run;
data want;
set in_data;
address = tranwrd(Address,'Avenue','Ave');
address = tranwrd(address,'Street','St');
address = tranwrd(address,'Boulevard','Blvd');
address = tranwrd(address,'Circle','Cir');
run;
Both work the same - given you're already using 'words' anyway - but the second works in basically the time it takes to write out the dataset (0.15s for me), while the first takes 20s of CPU time on my SAS server. Loading the regex library is really slow.
I have a sample data set like below.
data d01;
infile datalines dlm='#';
input Name & $15. IdNumber & $4. Salary & $5. Site & $3.;
datalines;
アイ# 2355# 21163# BR1
アイウエオ# 5889# 20976# BR1
カキクケ# 3878# 19571# BR2
;
data _null_ ;
set d01 ;
file "/folders/myfolders/test.csv" lrecl=1000 ;
length filler $3;
filler = ' ';
w_out = ksubstr(Name, 1, 5) || IdNumber || Salary || Site || filler;
put w_out;
run ;
I want to export this data set to csv (fixed-width format), and every line should be 20 bytes long (20 1-byte characters).
But SAS automatically removes my trailing spaces, so each line ends up 17 bytes long (the filler is truncated).
I know I can insert the filler like this.
put w_out filler $3.;
But this won't work when the `site' column is empty: SAS will truncate that column, and the lines still won't be 20 bytes long.
I didn't quite understand what you are trying to do with ksubstr, but if you want to add padding to get the total length to 20 characters, you may have to write some extra logic:
data _null_ ;
set d01 ;
file "/folders/myfolders/test.csv" lrecl=1000 ;
length filler $20;
w_out = ksubstr(Name,1,5) || IdNumber || Salary || Site;
len = 20 - klength(w_out) - 1;
put w_out @;
if len > 0 then do;
filler = repeat(" ", len);
put filler $varying20. len;
end;
else put;
run ;
You probably do not want to write a fixed column file using a multi-byte character set. Instead, see if you can adjust your process to use a delimited file, like you did in your example input data.
If you want the PUT statement to write a specific number of bytes, just use formatted output. To have the number of bytes written vary based on the string's value, you can use the $VARYING format. The syntax when using $VARYING is slightly different from that of normal formats: you add a second variable reference after the format specification that contains the actual number of bytes to write.
You can use the LENGTH() function to calculate how many bytes your name values take. Since it normally ignores the trailing space just add another character to the end and subtract one from the overall length.
To pad the end with three blanks you could just add three to the width used in the format for the last variable.
data d01;
infile datalines dlm='#';
length Name $15 IdNumber $4 Salary $5 Site $3 ;
input Name -- Site;
datalines;
アイ# 2355# 21163# BR1
アイウエオ# 5889# 20976# BR1
カキクケ# 3878# 19571# BR2
Sam#1#2#3
;
filename out temp;
data _null_;
set d01;
file out;
nbytes=length(ksubstr(name,1,5)||'#')-1;
put name $varying15. nbytes IdNumber $4. Salary $5. Site $6. ;
run;
Results:
67 data _null_ ;
68 infile out;
69 input ;
70 list;
71 run;
NOTE: The infile OUT is:
Filename=...\#LN00059,
RECFM=V,LRECL=32767,File Size (bytes)=110,
Last Modified=15Aug2019:09:01:44,
Create Time=15Aug2019:09:01:44
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1 アイ 235521163BR1 24
2 アイウエオ588920976BR1 30
3 カキクケ 387819571BR2 28
4 Sam 1 2 3 20
NOTE: 4 records were read from the infile OUT.
The minimum record length was 20.
The maximum record length was 30.
By default SAS sets the NOPAD option on a FILE statement. It also writes each line in 'variable format', which means line lengths can vary according to the data written. To explicitly ask SAS to pad your records out with spaces, don't use a filler variable, just:
Set the LRECL to the width of file you need (20)
Set the PAD option, or set RECFM=F
Sample code:
data _null_ ;
set d01 ;
file "/folders/myfolders/test.csv" lrecl=20 PAD;
w_out = Name || IdNumber || Salary || Site;
put w_out;
run ;
More info here: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000171874.htm#a000220987
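To check the result, one option is to read the file back and print each record's length. This is only a sketch: it assumes the path from the sample code above, and reclen is a name introduced here for illustration.

```sas
data _null_;
  /* LENGTH= names a variable that receives the length of the current record */
  infile "/folders/myfolders/test.csv" length=reclen;
  input;              /* load the record into the input buffer */
  put _n_= reclen=;   /* one log line per record               */
run;
```

If the padding took effect, every log line should report the same record length.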
I am trying to list the zip files in a folder using a pipe command, but I get an error saying the ls command is not recognized. There are actually 2 zip files (ABC_*.zip) in the folder /PROD/.
Can anybody help me with this?
%let extl_dir=/PROD/ ;
filename zl pipe "ls &extl_dir.ABC_*.zip";
data ziplist_a;
infile zl end=last;
length path $200 zipnm $50 filedt $15;
input path $;
zipnm=scan(path,-1,"/");
filedt=scan(scan(path,-1,"_"),1,".");
call symput('zip'||left(_n_), zipnm);
call symput('path'||left(_n_), path);
call symput('filedt'||left(_n_),filedt);
if last then call symput('num_zip',_n_);
*call symput('flenm',filenm);
run;
SAS has published a convenient macro to list files within a directory that does not rely upon running external commands. It can be found here. I prefer this approach as it does not introduce external sources of possible error such as user permissions, pipe permissions etc.
The macro uses data step functions (through %sysfunc), and the same functions can be called in the same manner from a data step. Below is an example which extracts file information.
%let dir = /some/folder;
%let fType = csv;
data want (drop = _:);
_rc = filename("dRef", "&dir.");
_id = dopen("dRef");
_n = dnum(_id);
do _i = 1 to _n;
name = dread(_id, _i);
if upcase(scan(name, -1, ".")) = upcase("&fType.") then do;
_rc = filename("fRef", "&dir./" || strip(name));
_fid = fopen("fRef");
size = finfo(_fid, "File Size (bytes)");
dateCreate = finfo(_fid, "Create Time");
dateModify = finfo(_fid, "Last Modified");
_rc = fclose(_fid);
output;
end;
end;
_rc = dclose(_id);
run;
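The same functions could be adapted to the zip listing in the question. The following is a rough sketch, not a tested replacement: it assumes the files are named like ABC_<date>.zip, that /PROD/ is readable from the SAS session, and the fileref zdir and the counter _k are names invented here.

```sas
%let extl_dir = /PROD/;   /* directory from the question, ends with a slash */

data ziplist_a (drop = _:);
  length path $200 zipnm $50 filedt $15;
  _rc = filename("zdir", "&extl_dir.");
  _id = dopen("zdir");
  if _id > 0 then do;
    do _i = 1 to dnum(_id);
      zipnm = dread(_id, _i);
      /* keep only names that start with ABC_ and end in .zip */
      if upcase(zipnm) =: "ABC_" and upcase(scan(zipnm, -1, ".")) = "ZIP" then do;
        _k + 1;   /* running count of matching files */
        path   = cats("&extl_dir.", zipnm);
        filedt = scan(scan(zipnm, -1, "_"), 1, ".");
        call symputx(cats("zip", _k), zipnm);
        call symputx(cats("path", _k), path);
        call symputx(cats("filedt", _k), filedt);
        output;
      end;
    end;
    _rc = dclose(_id);
  end;
  call symputx("num_zip", _k);   /* 0 when nothing matched */
run;

%put Number of zip files found: &num_zip;
```

CALL SYMPUTX is used instead of CALL SYMPUT so that leading and trailing blanks are stripped from the macro variable values.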
I have an issue with an unresolved macro variable in the following (part of a) macro:
DATA _NULL_;
SET TempVarFormat END=Last;
LENGTH FormatValues $10000;
RETAIN FormatValues;
IF &OnlyNumeric = 1 THEN
FormatValues = CATX(" ",FormatValues,STRIP(LookUpValue)||
" = "||CATQ("A",TRIM(LookupDescription)));
ELSE
FormatValues = CATX(" ",FormatValues,CATQ("A"
,LookUpValue)||" = "||CATQ("A"
,TRIM(LookupDescription)));
Test = STRIP(FormatValues);
PUT Test /* To test buildup of variable */;
IF Last THEN CALL SYMPUT('FormatValuesM',STRIP(FormatValues));
IF Last THEN CALL SYMPUT('DataCollectionFK',DataCollectionFK);
RUN;
/* Make format with PROC FORMAT*/
%IF &OnlyNumeric = 1 %THEN %DO;
PROC FORMAT LIB=WORK;
VALUE DC&DataCollectionFK.A&AttributeFK.Format &FormatValuesM;
RUN;
%END;
%ELSE %IF &OnlyNumeric = 0 %THEN %DO;
PROC FORMAT LIB=WORK;
VALUE $DC&DataCollectionFK.A&AttributeFK.Format &FormatValuesM;
RUN;
%END;
I get the following warning
Apparent symbolic reference FORMATVALUESM not resolved.
And if I look in the log, &DataCollectionFK is resolved but &FormatValuesM is not.
PROC FORMAT LIB=WORK; VALUE DC170A570Format &FormatValuesM;
Could someone advise? It is driving me nuts.
I tested it also without the STRIP() function and replacing the CALL SYMPUT with PUT to see if the variable is assigned a value. This all works fine.
Log copy (as requested in comment)
4 +
DATA _NULL_; SET TempVarFormat END=Last; LENGTH
5 + FormatValues $10000; RETAIN FormatValues; IF 1 = 1 THEN FormatValues = CATX("
",FormatValues,STRIP(LookUpValue)|| " = "||CATQ("A",TRIM(LookupDescription))); ELSE
FormatValues = CATX(" ",FormatValues,CATQ("A" ,LookUpValue)||" = "||CATQ("A" ,TRIM
6 +(LookupDescription))); Test = STRIP(FormatValues); PUT Test ; IF Last THEN CALL
SYMPUT('DataCollectionFK',DataCollectionFK); IF Last THEN CALL SYMPUT('FormatValuesM',Test);
RUN;
NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
6:107
1 = "Ja"
1 = "Ja" 0 = "Nee"
NOTE: There were 2 observations read from the data set WORK.TEMPVARFORMAT.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
6 +
PROC FORMAT LIB=WORK; VALUE DC170A1483Format &FormatValuesM; RUN;;
NOTE: Format DC170A1483FORMAT is already on the library.
NOTE: Format DC170A1483FORMAT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
MPRINT LOG
MPRINT(CONSTRUCTVARIABLEFORMAT): DATA TestDataSetFormat;
MPRINT(CONSTRUCTVARIABLEFORMAT): SET TempVarFormat END=Last;
MPRINT(CONSTRUCTVARIABLEFORMAT): LENGTH FormatValues $10000;
MPRINT(CONSTRUCTVARIABLEFORMAT): RETAIN FormatValues;
MPRINT(CONSTRUCTVARIABLEFORMAT): IF 1 = 1 THEN FormatValues = CATX("
",FormatValues,STRIP(LookUpValue)|| " = "||CATQ("A",TRIM(LookupDescription)));
MPRINT(CONSTRUCTVARIABLEFORMAT): ELSE FormatValues = CATX(" ",FormatValues,CATQ("A"
,LookUpValue)||" = "||CATQ("A" ,TRIM(LookupDescription)));
MPRINT(CONSTRUCTVARIABLEFORMAT): Test = STRIP(FormatValues);
MPRINT(CONSTRUCTVARIABLEFORMAT): PUT Test ;
MPRINT(CONSTRUCTVARIABLEFORMAT): IF Last THEN CALL
SYMPUT('DataCollectionFK',DataCollectionFK);
MPRINT(CONSTRUCTVARIABLEFORMAT): IF Last THEN CALL SYMPUT('FormatValuesM',Test);
MPRINT(CONSTRUCTVARIABLEFORMAT): RUN;
MPRINT(CONSTRUCTVARIABLEFORMAT): PROC FORMAT LIB=WORK;
WARNING: Apparent symbolic reference FORMATVALUESM not resolved.
MPRINT(CONSTRUCTVARIABLEFORMAT): VALUE DC170A1483Format &FormatValuesM;
MPRINT(CONSTRUCTVARIABLEFORMAT): RUN;
EDIT with some more attempts:
The problem is that the macro variable is not getting a value during the data step, for some reason. Assigning the macro variable an empty value before I run the macro stops the warning, but the variable then resolves to an empty string.
Removing the IF Last THEN parts also does not alter the outcome.
Surely it'll be easier/simpler to use the cntlin= option of PROC FORMAT to pass in a dataset containing the relevant format name, start, end, label values...
A simple example...
/* Create dummy format data */
data formats ;
fmtname = 'MYCHARFMT' ;
type = 'C' ;
do n1 = 'A','B','C','D','E' ;
start = n1 ;
label = repeat(n1,5) ;
output ;
end ;
fmtname = 'MYNUMFMT' ;
type = 'N' ;
do n2 = 1 to 5 ;
start = n2 ;
label = repeat(strip(n2),5) ;
output ;
end ;
drop n1 n2 ;
run ;
/* dummy data looks like this... */
fmtname    type  start  label
MYCHARFMT  C     A      AAAAAA
MYCHARFMT  C     B      BBBBBB
MYCHARFMT  C     C      CCCCCC
MYCHARFMT  C     D      DDDDDD
MYCHARFMT  C     E      EEEEEE
MYNUMFMT   N     1      111111
MYNUMFMT   N     2      222222
MYNUMFMT   N     3      333333
MYNUMFMT   N     4      444444
MYNUMFMT   N     5      555555
/* Build formats from dataset */
proc format cntlin=formats library=work ; run ;
There are several other fields which can be defined in your format dataset to cater for low/high/missing values, ranges, etc.
See the SAS documentation > http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473464.htm
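Applied to the question's data, the idea might look roughly like the sketch below. It is an illustration under assumptions: the TempVarFormat rows are invented to mirror the values seen in the log (1 = "Ja", 0 = "Nee", DataCollectionFK 170, attribute 1483), LookUpValue is assumed to be character, and fmt_cntlin is a hypothetical dataset name.

```sas
/* Stand-ins for the macro parameters used in the question */
%let OnlyNumeric = 1;
%let AttributeFK = 1483;

/* Invented sample input that mirrors the values shown in the log */
data TempVarFormat;
  length LookUpValue $8 LookupDescription $40;
  DataCollectionFK = 170;
  LookUpValue = '1'; LookupDescription = 'Ja';  output;
  LookUpValue = '0'; LookupDescription = 'Nee'; output;
run;

/* One row per value/label pair, in the layout CNTLIN= expects */
data fmt_cntlin;
  set TempVarFormat;
  length fmtname $32 type $1 start $8 label $40;
  fmtname = cats('DC', DataCollectionFK, 'A', "&AttributeFK.", 'Format');
  type    = ifc(&OnlyNumeric. = 1, 'N', 'C');   /* N = numeric, C = character format */
  start   = strip(LookUpValue);
  label   = strip(LookupDescription);
  keep fmtname type start label;
run;

proc format cntlin=fmt_cntlin library=work;
run;
```

Because the format definition travels in a dataset, no long macro variable has to be built, so the unresolved reference does not come up in the first place.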
I have a PROC EXPORT question that I am wondering if you can answer.
I have a SAS dataset with 800+ variables and over 200K observations and I am trying to export a subset of the variables to a CSV file (i.e. I need all records; I just don’t want all 800+ variables). I can always create a temporary dataset “KEEP”ing just the fields I need and run the EXPORT on that temp dataset, but I am trying to avoid the additional step because I have a large number of records.
To demonstrate this, consider a dataset that has three variables named x, y and z. But, I want the text file generated through PROC EXPORT to only contain x and y. My attempt at a solution below does not quite work.
The SAS Code
When I run the following code, I don’t get exactly what I need. If you run this code and look at the text file that was generated, it has a comma at the end of every line and the header includes all variables in the dataset anyway. Also, I get some messages in the log that I shouldn't be getting.
data ds1;
do x = 1 to 100;
y = x * x;
z = x * x * x;
output;
end;
run;
proc export data=ds1(keep=x y)
file='c:\test.csv'
dbms=csv
replace;
quit;
Here are the first few lines of the text file that was generated ("C:\test.csv")
x,y,z
1,1,
2,4,
3,9,
4,16,
The SAS Log
9343 proc export data=ds1(keep=x y)
9344 file='c:\test.csv'
9345 dbms=csv
9346 replace;
9347 quit;
9348 /**********************************************************************
9349 * PRODUCT: SAS
9350 * VERSION: 9.2
9351 * CREATOR: External File Interface
9352 * DATE: 30JUL12
9353 * DESC: Generated SAS Datastep Code
9354 * TEMPLATE SOURCE: (None Specified.)
9355 ***********************************************************************/
9356 data _null_;
9357 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
9358 %let _EFIREC_ = 0; /* clear export record count macro variable */
9359 file 'c:\test.csv' delimiter=',' DSD DROPOVER lrecl=32767;
9360 if _n_ = 1 then /* write column names or labels */
9361 do;
9362 put
9363 "x"
9364 ','
9365 "y"
9366 ','
9367 "z"
9368 ;
9369 end;
9370 set DS1(keep=x y) end=EFIEOD;
9371 format x best12. ;
9372 format y best12. ;
9373 format z best12. ;
9374 do;
9375 EFIOUT + 1;
9376 put x #;
9377 put y #;
9378 put z ;
9379 ;
9380 end;
9381 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
9382 if EFIEOD then call symputx('_EFIREC_',EFIOUT);
9383 run;
NOTE: Variable z is uninitialized.
NOTE: The file 'c:\test.csv' is:
Filename=c:\test.csv,
RECFM=V,LRECL=32767,File Size (bytes)=0,
Last Modified=30Jul2012:12:05:02,
Create Time=30Jul2012:12:05:02
NOTE: 101 records were written to the file 'c:\test.csv'.
The minimum record length was 4.
The maximum record length was 10.
NOTE: There were 100 observations read from the data set WORK.DS1.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds
100 records created in c:\test.csv from DS1.
NOTE: "c:\test.csv" file was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
real time 0.12 seconds
cpu time 0.06 seconds
Any ideas how I can solve this problem? I am running SAS 9.2 on windows 7.
Any help would be appreciated. Thanks.
Karthik
Based on Itzy's comment on my question, here is the answer, and it does exactly what I need.
proc sql;
create view vw_ds1 as
select x, y from ds1;
quit;
proc export data=vw_ds1
file='c:\test.csv'
dbms=csv
replace;
quit;
Thanks for the help!