SAS Proc IML: Do Loop to Populate a Matrix

I have the following code that works great in MATLAB and that I wish to translate to SAS/PROC IML:
[row col] = size(coeff);
A_temp = zeros(row,col);
for i = 1: row/6
A_temp(6*(i-1)+1:6*i,:) = coeff(6*(i-1)+1:6*i,4:col);
end;
In Proc IML I do the following:
proc iml;
use i.coeff;
read all var {...} into coeff;
print coeff;
row=NROW(coeff);
print row;
col=NCOL(coeff);
print col;
A_temp=J(row,col,0); *create zero matrix;
print A_temp;
Do i=1 TO row/6;
A_temp[(6*(i-1)+1):(6*i),]=coeff[(6*(i-1)+1):(6*i),(4:col)];
END;
quit;
The code breaks down at the DO loop with the error "(execution) Matrices do not conform to the operation."
Why? If I understand correctly, to select all columns in PROC IML (":" in MATLAB) I simply leave the column subscript blank.

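The dimensions on both sides of the assignment must match. A_temp[rows,] means ALL col columns of A_temp, but coeff[rows,(4:col)] has only col-3 columns, so the matrices do not conform. Specify the target columns explicitly. See this simplified example: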
proc iml;
/* use i.coeff;
read all var {...} into coeff;
print coeff;
*/
coeff = J(15,10,3);
row=NROW(coeff);
print row;
col=NCOL(coeff);
print col;
A_temp=J(row,col,0); *create zero matrix;
print A_temp;
Do i=1 TO row;
* does not work; *A_temp[i,]=coeff[i,(4:col)];
A_temp[i,1:col-3]=coeff[i,(4:col)];
END;
quit;
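The same fix can be applied to the original block-wise loop. The following is a minimal sketch, assuming the number of rows is a multiple of 6; the stand-in matrix and the helper name idx are illustrative and not part of the original code.

```sas
proc iml;
coeff = j(12, 10, 3);             /* stand-in data: 12 rows, 10 columns */
row = nrow(coeff);
col = ncol(coeff);
A_temp = j(row, col, 0);          /* zero matrix of the same size */
do i = 1 to row/6;
   idx = (6*(i-1)+1):(6*i);       /* rows of the i-th block of six */
   /* both sides are 6 x (col-3), so the shapes conform */
   A_temp[idx, 1:(col-3)] = coeff[idx, 4:col];
end;
print A_temp;
quit;
```

The last three columns of A_temp stay zero, because only col-3 columns are copied from coeff.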

Related

How to replace a join with a datastep in SAS

I am trying to calculate some statistics for a given variable based on the client id and the time horizon. My current solution is shown below. However, I would like to know if there is a way to rewrite the code as a data step instead of an SQL join, because the join takes a very long time to execute on my real dataset.
data have1(drop=t);
id = 1;
dt = '31dec2020'd;
do t=1 to 10;
dt = dt + 1;
var = rand('uniform');
output;
end;
format dt ddmmyyp10.;
run;
data have2(drop=t);
id = 2;
dt = '31dec2020'd;
do t=1 to 10;
dt = dt + 1;
var = rand('uniform');
output;
end;
format dt ddmmyyp10.;
run;
data have_fin;
set have1 have2;
run;
Proc sql;
create table want1 as
select a.id, a.dt,a.var, mean(b.var) as mean_var_3d
from have_fin as a
left join have_fin as b
on a.id = b.id and intnx('day',a.dt,-3,'S') < b.dt <= a.dt
group by 1,2,3;
Quit;
Proc sql;
create table want2 as
select a.id, a.dt,a.var, mean(b.var) as mean_var_6d
from have_fin as a
left join have_fin as b
on a.id = b.id and intnx('day',a.dt,-6,'S') < b.dt <= a.dt
group by 1,2,3;
Quit;
Use temporary arrays and a single data step instead. For data with one observation per ID per day, as in the example, this gives the same result in one pass.
1. Sort the data to ensure the order is correct.
2. Declare a temporary array for each moving average you want to calculate.
3. Ensure the arrays are empty at the start of each ID.
4. Assign values to the array at the correct index. MOD() lets you index the array dynamically without having to include a separate counter variable.
5. Take the average of the array. If you want to skip the first rows of each ID, where the window holds only 1 or 2 data points, you can calculate the mean conditionally.
*sort to ensure data is in correct order (Step 1);
proc sort data=have_fin;
by id dt;
run;
data want;
*Step 2;
array p3{0:2} _temporary_;
array p6(0:5) _temporary_;
set have_fin;
by ID;
*clear values at the start of each ID for the array Step3;
if first.ID then call missing(of p3{*}, of p6(*));
*assign the value to the array, the mod function indexes the array so it's continuously the most recent 3/6 values;
*Step 4;
p3{mod(_n_,3)} = var;
p6{mod(_n_,6)} = var;
*Step 5 - calculates statistic of interest, average in this case;
mean3d = mean(of p3(*));
mean6d = mean(of p6(*));
run;
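The conditional calculation mentioned in step 5 could look like the sketch below. It assumes have_fin is already sorted by id and dt, and it uses the N() function to count the non-missing values in the window; the output data set name want_full is hypothetical.

```sas
data want_full;
   array p3{0:2} _temporary_;
   set have_fin;
   by id;
   /* empty the window at the start of each ID */
   if first.id then call missing(of p3{*});
   p3{mod(_n_, 3)} = var;
   /* report a mean only once the window holds 3 non-missing values */
   if n(of p3{*}) = 3 then mean3d = mean(of p3{*});
   else mean3d = .;
run;
```

Note that a missing value of var also lowers the count, so rows whose window contains a missing value get a missing mean3d as well.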
And if you have SAS/ETS licensed this is super trivial.
*prints product to log - check if you have SAS/ETS licensed;
proc product_status;run;
*sorts data;
proc sort data=have_fin;
by id dt;
run;
*calculates moving average;
proc expand data=have_fin out=want_expand;
by ID;
id dt;
convert var = mean_3d / method=none transformout= (movave 3);
convert var = mean_6d / method=none transformout= (movave 6);
run;

Error when assign value for matrix in iml procedure

I simply want to assign a vector to a row of a matrix in the IML procedure, but it returns an error. The code is as follows. How can I fix it?
proc iml;
za=repeat(0,4,3);
a=123;
b=321;
c=222;
za[1,]={a,b,c};
run;
print(za);
proc print;run;
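There are a few problems with your code. {x, y, z} is a column vector, while {x y z} is a row vector, which means you are trying to insert a column into a row. Also, names inside braces are read as character literals, not as variables, so {a,b,c} does not contain the values 123, 321 and 222. Finally, a PROC IML step ends with QUIT, not RUN.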
Using your own code, you can do this.
proc iml;
za=repeat(0,4,3);
a=123;
b=321;
c=222;
za[1,] = a || b || c;
print(za);
quit;
A simpler way would be
proc iml;
za = j(4, 3, 0);
v = {123 321 222};
za[1, ] = v;
print za;
quit;
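To illustrate the row versus column distinction, here is a small sketch; the names r and c are only for illustration. A column vector can still be assigned to a row if it is transposed first with T().

```sas
proc iml;
za = j(4, 3, 0);
r = {123 321 222};     /* spaces separate columns: 1 x 3 row vector */
c = {123, 321, 222};   /* commas separate rows: 3 x 1 column vector */
za[1, ] = r;           /* shapes match: 1 x 3 into 1 x 3 */
za[2, ] = t(c);        /* transpose the column vector into a row first */
print za;
quit;
```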

How to reuse SAS output in other procedures

How do I use the output variable of one PROC into another PROC.
I'm new to SAS and have spent many hours trying to solve this problem in the program below.
DATA FA2;
SET FA2;
proc iml;
start main;
use FA2;
read all var {Close};
s = Close;
u = j(nrow(s)-1,1,0);
do i=2 to nrow(s);
u[i-1]=log(s[i]/s[i-1]);
end;
n=nrow(s)-1;
rsigma=sqrt(252/n*(u'*u));
mu = mean(u);
call qntl(q, u);
print q[rowname={"P05", "P95"}];
s = quartile(u);
PRINT mu, rsigma, s;
finish;
run;
PROC UNIVARIATE DATA = FA2;
var u;
run;
ERROR
PROC UNIVARIATE DATA = FA2;
var u;
Variable U not found.
run;
Your first two lines accomplish nothing and are not useful - remove them.
data fa2;
set fa2;
You are not creating any output data set within the IML step, and you really shouldn't reuse the same data set name (FA2) so many times, because it makes your code harder to debug. The example below shows how to write IML vectors to a data set.
proc iml;
/** create SAS data set from vectors **/
y = {1,0,3,2,0,3}; /** 6 x 1 vector **/
z = {8,7,6,5,6}; /** 5 x 1 vector **/
c = {A A A B B B}; /** 1 x 6 character vector **/
create MyData var {y z c}; /** create data set **/
append; /** write data in vectors **/
close MyData; /** close the data set **/
quit;
A PROC IML step ends with QUIT; rather than RUN;.
PROC UNIVARIATE is referencing your FA2 data set, which does not contain U: you never saved the vector to a data set, so the variable does not exist there and the error is correct. Fixing the above issues should resolve this.
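Applied to the code in the question, the fix might look like the sketch below. It assumes FA2 exists and has a numeric variable Close; the output data set name Returns is hypothetical, and the loop is replaced by an equivalent vectorized expression.

```sas
proc iml;
use FA2;
read all var {Close} into s;
close FA2;
n = nrow(s) - 1;
/* log returns: elementwise division of consecutive prices */
u = log( s[2:nrow(s)] / s[1:n] );
create Returns var {u};   /* new data set with one variable, U */
append;                   /* write the vector to the data set */
close Returns;
quit;

proc univariate data=Returns;
   var u;
run;
```

Writing to a separate data set keeps FA2 unchanged, which makes it easier to see which step produced which variable.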

Extracting column names in SAS IML

I was hoping to write a macro using IML that would be able to extract the column names of the dataset to use as names later.
Some pseudocode:
proc iml;
read all dataset into matrix_a [colname = varnames];
(...)
names = varnames;
create new_data_set [rownames = names];
quit;
Is this possible?
Sure is.
data test;
array x[10];
do i=1 to 10;
do j=1 to 10;
x[j] = (i-1)*10 + j;
end;
output;
end;
drop i j;
run;
proc iml;
use test;
read all var _num_ into test[colname=names];
close test;
test = test`;
names = names`;
create test_t from test[rowname=names];
append from test[rowname=names];
close test_t;
quit;

proc transpose with duplicate ID values

I need help with proc transpose procedure in SAS. My code initially was:
proc transpose data=temp out=temp1;
by patid;
var text;
Id datanumber;
run;
This gave me error "The ID value " " occurs twice in the same BY group". I modified the code to this:
proc sort data = temp;
by patid text datanumber;
run;
data temp;
set temp by patid text datanumber;
if first.datanunmber then n = 0;
n+1;
run;
proc sort data = temp;
by patid text datanumber n;
run;
proc transpose out=temp1 (drop=n) let;
by patid;
var text;
id datanumber;
run;
This gives me the error "variable n is not recognized". Adding the let option produces many "occurs twice in the same BY group" errors. I want to keep all id values.
Please help me in this.
Data Example:
Patid Text
That error tells you that there are multiple data points for one or more of the variables you are trying to create. SAS can force the transpose and drop the extra data points if you add "let" to the proc transpose statement.
Is your data possibly not unique? I created a dataset with unique values of patid and datanumber, and the transpose works:
data temp (drop=x y);
do x=1 to 4;
PATID='PATID'||left(x);
do y=1 to 3;
DATANUMBER='DATA'||left(y);
TEXT='TEXT'||left(x*y);
output;
end;
end;
proc sort; by _all_;
proc transpose out=temp2 (drop=_name_);
by patid;
var text;
id datanumber;
run;
My recommendation would be to forget the 'n' fix and focus on making the data unique for patid and datanumber. A dirty approach would be:
proc sort data = temp nodupkey;
by patid datanumber;
run;
at the start of your code.
Try sorting your dataset by patid text n datanumber (n before datanumber).
Try sorting your dataset by patid n datanumber (n before datanumber), and use "by patid n;" in proc transpose.
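The last suggestion can be sketched as follows. This is one possible reading of it, not a tested solution for the asker's data: it assumes temp contains patid, datanumber and text, and the intermediate data set names temp_sorted and temp_numbered are hypothetical.

```sas
proc sort data=temp out=temp_sorted;
   by patid datanumber;
run;

/* number repeated datanumber values within each patid */
data temp_numbered;
   set temp_sorted;
   by patid datanumber;
   if first.datanumber then n = 0;
   n + 1;                  /* sum statement: n is retained across rows */
run;

proc sort data=temp_numbered;
   by patid n datanumber;
run;

/* each (patid, n) group now holds at most one row per datanumber */
proc transpose data=temp_numbered out=temp1(drop=_name_);
   by patid n;
   var text;
   id datanumber;
run;
```

This keeps every value: a patid with duplicated datanumber values simply gets more than one output row, distinguished by n.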