Transform Incanter matrix to nested vector - clojure

Consider a function that outputs an Incanter matrix.
Here is an example matrix containing output from the function:
A 6x4 matrix
-4.77e-01 8.45e-01 1.39e-01 -9.83e-18
8.55e-01 2.49e-01 1.33e-01 2.57e-17
-2.94e-03 6.60e-03 -9.63e-01 1.16e-16
...
6.64e-09 2.55e-08 1.16e-07 -1.11e-16
-1.44e-01 -3.33e-01 1.32e-01 -7.07e-01
-1.44e-01 -3.33e-01 1.32e-01 7.07e-01
I'd like to continue analyzing the rows of the matrix, which represent points. The function that I want to feed the Incanter matrix to takes nested vectors as inputs.
So the function would need the above data in the form
[[-4.77e-01 8.45e-01 1.39e-01 -9.83e-18] [8.55e-01 2.49e-01 1.33e-01 2.57e-17]
[-2.94e-03 6.60e-03 -9.63e-01 1.16e-16] [6.64e-09 2.55e-08 1.16e-07 -1.11e-16]
[-1.44e-01 -3.33e-01 1.32e-01 -7.07e-01] [-1.44e-01 -3.33e-01 1.32e-01 7.07e-01]]
It is the transformation from the Incanter matrix representation to the nested vector structure that I am unsure how to perform. Is there a simple way to convert the data's representation?

You can do it with the built-in to-vect function:
(to-vect m)
or with the built-in to-list function:
(to-list m)
Both functions will produce a vector of vectors when given a matrix:
=> (def m (matrix [[1 2] [3 4]]))
A 2x2 matrix
-------------
1.00e+00 2.00e+00
3.00e+00 4.00e+00
=> (to-vect m)
[[1.0 2.0] [3.0 4.0]]
=> (to-list m)
[[1.0 2.0] [3.0 4.0]]

Related

Are there any functions in BLAS that can perform skew-symmetric matrix-vector products?

I'm thinking of performing some calculations with Intel MKL, specifically the matrix-vector Sparse BLAS functions, for a program in Fortran.
I can express my calculations in matrices that happen to be sparse and skew-symmetric.
From what I can see, Sparse BLAS has sparse functions for general and symmetric matrices, so I wanted to know if there was a way to work with a sparse skew-symmetric matrix instead, because I imagine it would reduce the memory footprint.
TL;DR: MKL Sparse BLAS can do matrix-vector multiplications with a skew-symmetric sparse matrix expressed as its upper/lower triangle, using the mkl_scsrmv subroutine and supplying 'A' as the first element of the matrix descriptor array.
OK, I managed to find the answer to my question when I started testing the general MKL Sparse BLAS matrix-vector multiplication in CSR format (mkl_?csrmv).
I learnt that there is a character array used to describe the input matrix (matdescra). The first character in this array can be set to 'A', which causes the subroutine to interpret the input matrix as skew-symmetric. For example (not necessarily a good one),
given a matrix A and a vector x:
    [ 0  1  2 ]        [ 1 ]
A = [-1  0  3 ]    x = [ 2 ]
    [-2 -3  0 ]        [ 3 ]
The strict upper triangle of A can be represented in (one-based, four-array) CSR form as
val      = [1, 2, 3]
col      = [2, 3, 3]
rowstart = [1, 3, 4]
rowend   = [3, 4, 4]
and with the character array matdescra = ['A', 'U', 'N', 'F'],
the matrix-vector product is obtained by
call mkl_scsrmv('n', 3, 3, 1., matdescra, val, col, rowstart, rowend, x, 1., y)
where the output (a vector) is added to the vector-array, y.
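For intuition, here is a small NumPy/SciPy sketch (my own illustration, not MKL code) of why storing only the strict upper triangle suffices: with A = U - U^T, the product A*x can be computed as U*x - U^T*x.

import numpy as np
from scipy.sparse import csr_matrix

# Strict upper triangle U of the example matrix, stored sparsely;
# the full skew-symmetric matrix is A = U - U.T.
U = csr_matrix(np.array([[0., 1., 2.],
                         [0., 0., 3.],
                         [0., 0., 0.]]))
x = np.array([1., 2., 3.])

# Skew-symmetric matrix-vector product without ever forming A.
y = U @ x - U.T @ x
print(y)  # [ 8.  8. -8.], equal to A @ x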

What does lu_factorize return?

boost::numeric::ublas contains the function M::size_type lu_factorize(M& m). Its name suggests that it performs the LU decomposition of a given matrix m, i.e. it should produce two matrices L and U such that m = L*U. There seems to be no documentation provided for this function.
It is easy to deduce that it returns 0 to indicate successful decomposition, and a non-zero value when the matrix is singular. However, it is completely unclear where the result is. Taking the matrix by reference suggests that it works in-place; however, it should produce two matrices (L and U), not one. So what does it do?
There is no documentation in Boost, but looking at the documentation of SciPy's lu_factor one can see that it's not uncommon to return a single result for the LU decomposition.
This is enough because, in a typical approach to LU decomposition, L's diagonal consists of ones only, as presented in this answer from Mathematics, for example.
So, it is possible to fit both L and U into one matrix, putting L in the result's lower part, omitting the diagonal (which is assumed to contain only ones), and U in the upper part. For example, for a 3x3 problem the result is:
    [ u11 u12 u13 ]
m = [ l21 u22 u23 ]
    [ l31 l32 u33 ]
which implies:
    [ 1   0   0 ]
L = [ l21 1   0 ]
    [ l31 l32 1 ]
and
    [ u11 u12 u13 ]
U = [ 0   u22 u23 ]
    [ 0   0   u33 ]
Inspecting Boost's void lu_substitute(const M& m, vector_expression<E>& e) function from the same namespace seems to confirm this. It solves the equation LUx = e, where both L and U are contained in its m argument, in two steps.
First it solves Lz = e for z, where z = Ux, using the lower part of m:
inplace_solve(m, e, unit_lower_tag ());
then, having computed z = Ux (with e modified in place), Ux = z can be solved using the upper part of m:
inplace_solve(m, e, upper_tag ());
inplace_solve is mentioned in the documentation, and it:
Solves a system of linear equations with triangular form, i.e. A is triangular.
So everything seems to make sense.
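As a cross-check, here is a minimal SciPy sketch (my own illustration) of the same packed convention, using the lu_factor mentioned above:

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[3., 1.],
              [1., 2.]])
b = np.array([9., 8.])

lu, piv = lu_factor(A)      # L and U packed into one matrix, plus pivot indices
x = lu_solve((lu, piv), b)  # two triangular solves, analogous to lu_substitute

# Unpack: L has an implicit unit diagonal, U sits in the upper triangle.
L = np.tril(lu, k=-1) + np.eye(2)
U = np.triu(lu)
# L @ U reconstructs A up to the row permutation recorded in piv.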
Boost doesn't document its LU factorization (into a lower triangular matrix L and an upper triangular matrix U), but the source code is shared with the public.
If the code is hard to follow, please check the webpage by Nick Higham. It has a detailed explanation. Here is an example from the link:
Let's say we need to solve Ax = b.
  (1) Make L and U from the input matrix A:
    [ 3 -1  1  1 ]
A = [-1  3  1 -1 ]
    [-1 -1  3  1 ]
    [ 1  1  1  3 ]

    [   1     0   0  0 ]
L = [ -1/3    1   0  0 ]
    [ -1/3  -1/2  1  0 ]
    [  1/3   1/2  0  1 ]

    [ 3   -1    1     1  ]
U = [ 0   8/3  4/3  -2/3 ]
    [ 0    0    4     1  ]
    [ 0    0    0     3  ]
This example looks straightforward to a human, but algorithm-wise it can take numerous steps, which is why methodical LU factorization matters; its relation to Gaussian elimination, Schur complements, and block implementations are some of the topics covered there.
  (2) Solve the triangular systems Ly = b and Ux = y, since then b = L(Ux).
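As a concrete check, here is a small NumPy verification (my own illustration; the right-hand side b is made up) that these factors reproduce A, and that the two triangular solves then recover x:

import numpy as np

A = np.array([[ 3., -1., 1.,  1.],
              [-1.,  3., 1., -1.],
              [-1., -1., 3.,  1.],
              [ 1.,  1., 1.,  3.]])
L = np.array([[ 1.,     0.,    0., 0.],
              [-1./3.,  1.,    0., 0.],
              [-1./3., -1./2., 1., 0.],
              [ 1./3.,  1./2., 0., 1.]])
U = np.array([[ 3., -1.,    1.,     1.],
              [ 0.,  8./3., 4./3., -2./3.],
              [ 0.,  0.,    4.,     1.],
              [ 0.,  0.,    0.,     3.]])

assert np.allclose(L @ U, A)    # the factors reproduce A

b = np.array([4., 2., 2., 6.])  # an arbitrary right-hand side
yv = np.linalg.solve(L, b)      # forward substitution: L y = b
xv = np.linalg.solve(U, yv)     # back substitution:    U x = y
assert np.allclose(A @ xv, b)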

Python: Solving equation system (coefficients are arrays)

I can solve a system of equations (using NumPy) like this:
>>> a = np.array([[3,1], [1,2]])
>>> b = np.array([9,8])
>>> y = np.linalg.solve(a, b)
>>> y
array([ 2., 3.])
But, if I got something like this:
>>> x = np.linspace(1,10)
>>> a = np.array([[3*x,1-x], [1/x,2]])
>>> b = np.array([x**2,8*x])
>>> y = np.linalg.solve(a, b)
It doesn't work here, where the matrix's coefficients are arrays and I want to calculate the solution array "y" for each element of the array "x". Also, I can't calculate
>>> det(a)
The question is: how can I do that?
Check out the docs page. If you want to solve multiple systems of linear equations you can send in multiple arrays but they have to have shape (N,M,M). That will be considered a stack of N MxM arrays. A quote from the docs page below,
Several of the linear algebra routines listed above are able to compute results for several matrices at once, if they are stacked into the same array. This is indicated in the documentation via input parameter specifications such as a : (..., M, M) array_like. This means that if for instance given an input array a.shape == (N, M, M), it is interpreted as a “stack” of N matrices, each of size M-by-M. Similar specification applies to return values, for instance the determinant has det : (...) and will in this case return an array of shape det(a).shape == (N,). This generalizes to linear algebra operations on higher-dimensional arrays: the last 1 or 2 dimensions of a multidimensional array are interpreted as vectors or matrices, as appropriate for each operation.
When I run your code I get,
>>> a.shape
(2, 2)
>>> b.shape
(2, 50)
Not sure exactly what problem you're trying to solve, but you need to rethink your inputs. You want a to have shape (N,M,M) and b to have shape (N,M). You will then get back an array of shape (N,M) (i.e. N solution vectors).
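Applied to the code in the question, the reshaping might look like this (my own sketch; variable names follow the question):

import numpy as np

x = np.linspace(1, 10)             # 50 sample points

# Stack of 50 2x2 coefficient matrices: a[i] is the system at x[i].
a = np.empty((x.size, 2, 2))
a[:, 0, 0] = 3 * x
a[:, 0, 1] = 1 - x
a[:, 1, 0] = 1 / x
a[:, 1, 1] = 2

# Matching stack of 50 right-hand sides, shape (N, M).
b = np.stack([x**2, 8 * x], axis=-1)

y = np.linalg.solve(a, b)          # shape (50, 2): one solution per x
dets = np.linalg.det(a)            # shape (50,): one determinant per x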

Struggling with BFGS minimization algorithm for Logistic regression in Clojure with Incanter

I'm trying to implement a simple logistic regression example in Clojure using the Incanter data analysis library. I've successfully coded the Sigmoid and Cost functions, but Incanter's BFGS minimization function seems to be causing me quite some trouble.
(ns ml-clj.logistic
  (:require [incanter.core :refer :all]
            [incanter.optimize :refer :all]))

(defn sigmoid
  "compute the inverse logit function, large positive numbers should be
   close to 1, large negative numbers near 0,
   z can be a scalar, vector or matrix.
   sanity check: (sigmoid 0) should always evaluate to 0.5"
  [z]
  (div 1 (plus 1 (exp (minus z)))))

(defn cost-func
  "computes the cost function (J) that will be minimized
   inputs: params theta, X matrix and y vector"
  [X y]
  (let [m         (nrow X)
        init-vals (matrix (take (ncol X) (repeat 0)))
        z         (mmult X init-vals)
        h         (sigmoid z)
        f-half    (mult (matrix (map - y)) (log (sigmoid (mmult X init-vals))))
        s-half    (mult (minus 1 y) (log (minus 1 (sigmoid (mmult X init-vals)))))
        sub-tmp   (minus f-half s-half)
        J         (mmult (/ 1 m) (reduce + sub-tmp))]
    J))
When I try (minimize (cost-func X y) (matrix [0 0])), giving minimize a function and starting params, the REPL throws an error.
ArityException Wrong number of args (2) passed to: optimize$minimize clojure.lang.AFn.throwArity (AFn.java:437)
I'm very confused as to what exactly the minimize function is expecting.
For reference, I rewrote it all in Python, and all of the code runs as expected, using the same minimization algorithm.
import numpy as np
import scipy as sp
import scipy.optimize  # needed so that sp.optimize is available

data = np.loadtxt('testSet.txt', delimiter='\t')
X = data[:, 0:2]
y = data[:, 2]

def sigmoid(X):
    return 1.0 / (1.0 + np.e**(-1.0 * X))

def compute_cost(theta, X, y):
    m = y.shape[0]
    h = sigmoid(X.dot(theta.T))
    J = y.T.dot(np.log(h)) + (1.0 - y.T).dot(np.log(1.0 - h))
    cost = (-1.0 / m) * J.sum()
    return cost

def fit_logistic(X, y):
    initial_thetas = np.zeros((len(X[0]), 1))
    myargs = (X, y)
    theta = sp.optimize.fmin_bfgs(compute_cost, x0=initial_thetas,
                                  args=myargs)
    return theta
outputting
Current function value: 0.594902
Iterations: 6
Function evaluations: 36
Gradient evaluations: 9
array([ 0.08108673, -0.12334958])
I don't understand why the Python code can run successfully, but my Clojure implementation fails. Any suggestions?
Update
Rereading the docstring for minimize, I've been trying to calculate the derivative of cost-func, which throws a new error.
(def grad (gradient cost-func (matrix [0 0])))
(minimize cost-func (matrix [0 0]) (grad (matrix [0 0]) X))
ExceptionInfo throw+: {:exception "Matrices of different sizes cannot be differenced.", :asize [2 1], :bsize [1 2]} clatrix.core/- (core.clj:950)
Using trans to convert the nx1 column matrix to a 1xn row matrix just yields the same error with the sizes swapped.
:asize [1 2], :bsize [2 1]}
I'm pretty lost here.
I can't say anything about your implementation, but incanter.optimize/minimize expects (at least) three parameters and you're giving it only two:
Arguments:
f -- Objective function. Takes a collection of values and returns a scalar
of the value of the function.
start -- Collection of initial guesses for the minimum
f-prime -- partial derivative of the objective function. Takes
a collection of values and returns a collection of partial
derivatives with respect to each variable. If this is not
provided it will be estimated using gradient-fn.
Unfortunately, I'm not able to tell you directly what to supply (for f-prime?) here, but maybe someone else is. Btw, I think the ArityException Wrong number of args (2) passed to [...] is actually quite helpful here.
Edit: Actually, I think the docstring above is not correct, since the source code does not use gradient-fn to estimate f-prime. Maybe you can use incanter.optimize/gradient to generate your own?
First, your cost function should have a parameter for theta, like your Python implementation does; your implementation instead has it fixed as init-vals.
Second, if your cost-func is correct, then you can call minimize like this:
(minimize (fn [theta] (cost-func theta X y)) [0 0 0])
Hope this helps.

OpenCV estimateAffine3D breaks for coplanar points

I am trying to use OpenCV's estimateAffine3D() function to get the affine transformation between two sets of coplanar points in 3D. If I hold one variable constant, I find there is a constant error in the translation component of that variable.
My test code is:
std::vector<cv::Point3f> first, second;
std::vector<uchar> inliers;
cv::Mat aff(3, 4, CV_64F);

for (int i = 0; i < 6; i++)
{
    first.push_back(cv::Point3f(i, i % 3, 1));
    second.push_back(cv::Point3f(i, i % 3, 1));
}

int ret = cv::estimateAffine3D(first, second, aff, inliers);
std::cout << aff << std::endl;
The output I expect is:
[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
Edit: My expectation is incorrect. The matrix does not decompose into [R|t] for the case of constant z-coordinates.
but what I get (with some rounding for readability) is:
[1 0 0 0]
[0 1 0 0]
[0 0 0.5 0.5]
Is there a way to fix this behavior? Is there a function which does the same on sets of 2D points?
No matter how I run your code, I get fine output. For example, when I run it exactly as you posted it, I get
[1,0,0 ,0]
[0,1,0 ,0]
[0,0,.5,.5]
which is correct because the 4th element of a homogeneous coordinate is assumed to be 1 (.5*1 + .5 = 1). When I run it with 2 as the z value I get
[1,0,0 ,0]
[0,1,0 ,0]
[0,0,.8,.4]
which also works (.8*2+.4 = 2). Are you sure you didn't just read aff(2,2) wrong?
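To see why, here is a tiny NumPy check (my own illustration) of what that third row does to one of the input points in homogeneous form:

import numpy as np

third_row = np.array([0., 0., 0.5, 0.5])  # [a31 a32 a33 t3] from aff
point_h   = np.array([3., 0., 1., 1.])    # one input point, homogeneous
print(third_row @ point_h)                # 1.0: z = 1 is preserved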
The key problem is:
Your purpose is to estimate the rotation and translation between two sets of 3D points, but OpenCV's estimateAffine3D() is not meant for that. As its name suggests, this function computes the affine transformation between two sets of 3D points. When computing the affine transformation, the orthogonality constraints on the rotation matrix are not considered, so of course the result is not what you expect. To obtain the rotation and translation, you need to implement an SVD-based algorithm. You may search for "absolute orientation" on Google; it is a classic, closed-form algorithm.
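For reference, here is a minimal NumPy sketch of such an SVD-based absolute-orientation (Kabsch) solver; the function name and the test points are my own illustration, not an OpenCV API:

import numpy as np

def rigid_transform(P, Q):
    """Least-squares rotation R and translation t with Q ~ P @ R.T + t."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1., 1., d]) @ U.T    # proper rotation, det(R) = +1
    t = q_mean - R @ p_mean
    return R, t

# Identical coplanar point sets should give R = I and t = 0.
pts = np.array([[i, i % 3, 1.] for i in range(6)], dtype=float)
R, t = rigid_transform(pts, pts)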