This issue drives me crazy. In SAS, when I concatenate strings, the variable that receives the result apparently cannot also be used as an input.
DATA test1;
LENGTH x $20;
x = "a";
x = x || "b";
RUN;
Result: x = "a";
DATA test2;
LENGTH x $20;
y = "a";
x = y || "b";
RUN;
Result: x = "ab";
DATA test3;
LENGTH x $20;
x = "a";
y = x;
x = y || "b";
RUN;
Result: x = "a";
The last one is so strange: x is not even directly involved in the concatenation.
This does not make sense, because 1) other operations work this way, e.g. transtrn and substr, and 2) SAS does not give any warning message.
Why?
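It's because the length of X is set to 20, so after x = "a" it holds "a" followed by 19 trailing blanks. x || "b" puts the b after those blanks, in position 21, and it is truncated away when the result is stored back into the 20-character x. In test3, y gets its length from x on first assignment, so it carries the same padding; in test2, y is created with length 1 and has no blanks to get in the way. Either trim x before the concatenation operator or use catt. You can use lengthc to see the storage length of a character variable.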
DATA test1;
LENGTH x $20;
x = "a";
len=lengthc(x);
x = trim(x) || "b";
*x = catt(x, "b");
RUN;
proc print data=test1;
run;
You could also use substr() on the left-hand side of the assignment. Something like:
substr(x,10,1) = 'a';
sets the 10th character to 'a'. To fill in several characters, loop over the position (where the 10 is).
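As a minimal sketch of the left-hand-side SUBSTR approach applied to the original question (the data set name test4 is made up for illustration): LENGTHN gives the position of the last non-blank character, so the next position is where "b" belongs.

```sas
DATA test4;
LENGTH x $20;
x = "a";
/* LENGTHN(x) is 1 here, so this overwrites position 2 in place */
SUBSTR(x, LENGTHN(x) + 1, 1) = "b";
PUT x=;
RUN;
```

This writes into the existing 20 bytes instead of building a longer string, so nothing is truncated. It assumes x still has at least one free position; if x is already full, the target position falls outside the variable.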
I created a SAS function using fcmp to calculate the Jaccard distance between two strings. I do not want to use macros, as I'm going to apply it to multiple variables across a large dataset. The substring lists I get are incomplete.
proc fcmp outlib=work.functions.func;
function distance_jaccard(string1 $, string2 $);
n = length(string1);
m = length(string2);
ngrams1 = "";
do i = 1 to (n-1);
ngrams1 = cats(ngrams1, substr(string1, i, 2) || '*');
end;
/*ngrams1= ngrams1||'*';*/
put ngrams1=;
ngrams2 = "";
do j = 1 to (m-1);
ngrams2 = cats(ngrams2, substr(string2, j, 2) || '*');
end;
endsub;
options cmplib=(work.functions);
data test;
string1 = "joubrel";
string2 = "farjoubrel";
jaccard_distance = distance_jaccard(string1, string2);
run;
I expected ngrams1 and ngrams2 to contain all the substrings of length 2, but instead I got this:
ngrams1=jo*ou*ub
ngrams2=fa*ar*rj
If you want real help with your algorithm you need to explain in words what you want to do.
I suspect your problem is that you never defined how long your new character variables NGRAMS1 and NGRAMS2 should be. From the output you show, it appears that FCMP defaulted them to length $8.
To define a variable you need to use a LENGTH statement (or an ATTRIB statement with the LENGTH= option) before you start referencing the variable.
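A minimal sketch of that fix, assuming 200 characters is enough for your data. The helper name bigram_list, the variable grams, and the length 200 are placeholders I picked for illustration, not part of your code:

```sas
proc fcmp outlib=work.functions.func;
function bigram_list(string1 $) $ 200;
/* declare the result before first use so it does not get a short default length */
length ngrams $ 200;
n = length(string1);
ngrams = "";
do i = 1 to (n-1);
/* append the 2-character substring starting at i, followed by a separator */
ngrams = cats(ngrams, substr(string1, i, 2), '*');
end;
return (ngrams);
endsub;
run;

options cmplib=(work.functions);
data test;
length grams $ 200;
grams = bigram_list("joubrel");
put grams=;
run;
```

With the length declared up front, the list should no longer be cut off after eight characters. Note that cats strips leading and trailing blanks from each argument, so this sketch is only suitable for strings without embedded spaces.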
I wrote this to reverse an array in OCaml, the way I usually do it in Java:
let reversearray array = let len=Array.length array in
for i=0 to (len/2) do
let temp = array.(i) in
array.(i) <- array.(len-i-1);
array.(len-i-1) <- temp done;
array;;
However, this does not always work when the array has an even number of elements.
# let a2 = [|"a"; "b"; "c"; "d"; "e"; "f"|];;
val a2 : string array = [|"a"; "b"; "c"; "d"; "e"; "f"|]
# reversearray a2;;
- : string array = [|"f"; "e"; "c"; "d"; "b"; "a"|]
Could someone explain what's wrong?
Let's say the length is 2. Your for loop executes for i = 0 and i = 1, so it swaps the same pair of elements twice and puts them back where they started.
Commonly one writes loops in C-influenced languages like this:
for (i = 0; i < len/2; i++)
This executes just for i = 0, which does the right thing. An OCaml for loop includes the upper bound, so your loop should stop one step earlier:
for i = 0 to len/2 - 1 do
I am trying to concatenate three string variables.
Data X ;
a = "A" ;
b = "B" ;
c = "C" ;
z = catx ( '0D0A'x, a, b, c ) ;
run;
I am trying to display the string values like this in the final dataset, so that the values appear one below the other -
A
B
C
But with the '0D0A'x delimiter, the string appears as ABC. I have to display the z variable in Excel. If I had to output the same into an HTML file, I would have used "\n" as the delimiter in the CATX function. Is there a way to introduce newline characters?
I adapted your example slightly:
libname test excel "%sysfunc(pathname(work))\text.xls";
Data test.X ;
a = "A" ;
b = "B" ;
c = "C" ;
z = catx ( '0D0A'x, a, b, c ) ;
run;
libname test clear;
x "explorer ""%sysfunc(pathname(work))\text.xls""";
If you use this approach, you get a value in cell D2 which contains the line breaks as expected. However, in order for them to display correctly, you have to enable the 'wrap text' option for the cell formatting.
I'm trying to write code to run a simple perceptron in SAS.
I'd like to print (or store in a table) the result and the target at each iteration, but I get an error when I try to print Y[i,]:
proc iml;
use percept; read all var{x1 X2} into X;
read all var{Y} into Y;
W={0,0}; b=0; k=0; L=nrow(X); eta=.8; o=0;
print w b k L eta;
do step = 1 to 6;
mistakes=0;
do i=1 to L;
o=(X[i, ]*W + b);
if Y[i, ]*o <= 0 then do;
W = W + eta*(Y[i, ]-o)*X[i,]`;
b = b + eta*(Y[i, ]-o)*1;
k=k+1; mistakes=mistakes+1;
print o Y[i, ] W b k mistakes;
end;
end;
end;
I get the error:
Syntax error, expecting one of the following: C, COLNAME, F, FORMAT,
L, LABEL, R,
ROWNAME, ], |). The option or parameter is not recognized and will be ignored.
Is there another way to print the target?
Thanks a lot!
Per the documentation on PRINT, you need to do it like this:
print(Y[i,])
This is because PRINT overloads [ ] to specify formatting, rownames/colnames, etc., which is rather silly (but presumably done to imitate some other language?). So you just need to wrap the expression in parentheses, as in (Y[i,]).
Here's a silly example.
proc iml;
use sashelp.class;
read all var{name,sex} into class;
read all var{height,weight,age} into classN;
y = mean(classN[,2]);
print class;
print (class[1:2,]);
print y (class[1:2,]);
quit;
I may be missing something obvious, but how do you calculate 'powers' in SAS?
Eg X squared, or Y cubed?
what I need is to have variable1 ^ variable2, but cannot find the syntax... (I am using SAS 9.1.3)
Got it! There is no function.
You need to do:
variable1 ** variable2;
data t;
num = 5;
pow = 2;
res = num**pow;
run;
proc print data = t;
run;
Use the POWER function and, if necessary, the CONSTANT function.
nbr_squared = power(nbr, 2);
nbr_cubed = power(nbr, 3);
E_to_the_power_2 = power(constant('E'),2);