Armadillo - calculating null space - C++

this is my first post here...
Is there any way to calculate a vector in the null space of a matrix? I don't need the full basis; just one vector in the null space will do.
I already tried using the solve() method -
colvec x(3);
x = solve(A,B);
where A is a 3x3 matrix of type mat -
2 2 2
3 3 3
4 4 4
and B is the zero vector of type colvec -
0
0
0
But the program terminates throwing the following error -
error: solve(): solution not found
terminate called after throwing an instance of 'std::runtime_error'
what():
I have used the solve() method before and got perfect results, but it doesn't seem to work in this simple case. Is this because the equation has multiple solutions? If so, is there any workaround to this, any other way that I could get a vector in the null space?
Any help would be appreciated.
Edit :
I tried the svd(mat U, vec s, mat V, mat X, method = "standard") overload and got the null space of X from the columns of V that correspond to zero singular values. I was just wondering if there is any way to improve the precision of the answer.
Thanks!

In recent versions of the Armadillo library you can find an orthonormal basis of the null space of a matrix using the null() function. See the documentation at http://arma.sourceforge.net/docs.html#null. This functionality was added in version 5.400 (August 2015).
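On versions that predate null(), the SVD route from the edit is the standard trick. For illustration, here is the same idea sketched in NumPy rather than Armadillo (the matrix is the one from the question; the tolerance choice is my own assumption): the right-singular vectors whose singular values are numerically zero span the null space.

```python
import numpy as np

# The 3x3 rank-1 matrix from the question: every row is a multiple of (1, 1, 1).
A = np.array([[2., 2., 2.],
              [3., 3., 3.],
              [4., 4., 4.]])

# A = U @ diag(s) @ Vh; rows of Vh whose singular value is numerically zero
# form an orthonormal basis of the null space of A.
U, s, Vh = np.linalg.svd(A)
tol = s[0] * 1e-10          # relative tolerance (an arbitrary but safe choice)
null_basis = Vh[s < tol].T  # columns span null(A); here the null space is 2-D

print(np.allclose(A @ null_basis, 0))   # True, up to round-off
```

Armadillo's svd() returns the same V, so the selection of columns by singular value carries over directly.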

Related

Problem with value assignment in ArrayFire

I'm using ArrayFire and Flashlight to evaluate a network.
auto tmp = output(af::seq(2, 10), af::span, af::span, af::span);
auto softmax_tmp = fl::softmax(tmp, 0);
output(af::seq(2,10),af::span,af::span,af::span)=softmax_tmp;
output is a tensor with shape (12,100,1,1). I want to pull out indices 2..10 along the first dimension, apply a softmax to each of the 100 resulting 9-dim vectors, and put them back, as in the code above.
The problem is that the 3rd line doesn't work. softmax_tmp has the right values, but the assignment in the 3rd line silently fails: it compiles, yet output keeps the old values from the 1st line.
Any help would be much appreciated.
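For reference, this is what the intended extract/softmax/write-back looks like in NumPy (an illustration of the semantics, not ArrayFire itself; note that af::seq(2, 10) is inclusive, i.e. nine rows, and the random data here is made up):

```python
import numpy as np

# Made-up stand-in for the question's output tensor of shape (12, 100, 1, 1).
output = np.random.default_rng(2).standard_normal((12, 100, 1, 1))

tmp = output[2:11]   # af::seq(2, 10) is inclusive, i.e. nine rows: 2..10
e = np.exp(tmp - tmp.max(axis=0, keepdims=True))   # numerically stable softmax
softmax_tmp = e / e.sum(axis=0, keepdims=True)
output[2:11] = softmax_tmp   # in NumPy this write-back sticks

print(np.allclose(output[2:11].sum(axis=0), 1.0))   # True: each 9-vector sums to 1
```

If the ArrayFire indexed assignment really leaves output unchanged, the sums over the written slice would not be 1; checking that is a quick way to confirm the symptom.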

K-Means Algorithm not working properly

I was trying to write my own K-Means clustering algorithm, but it is not working. Can someone take a look and help me find the mistake I am making? I am fairly new to this.
I expect the data to be clustered into 2 groups since K=2, but I am not getting the expected result. I think the mean assignment is not working properly. Can someone take a look?
https://github.com/DivJ/Robo_Lab/blob/master/K_Means.py
dist=[]
lab=[]
x_sum,y_sum=0,0
x_sum1,y_sum1=0,0
k=2
mean=pt[:k]

def assignment():
    global dist
    global lab
    for i in range(0,100):
        for j in range(0,k):
            dist.append(math.hypot(pt[i,0]-mean[j,0],pt[i,1]-mean[j,1]))
        lab.append(dist.index(min(dist)))
        dist=[]

def mean_shift():
    global x_sum,x_sum1,y_sum,y_sum1,lab
    for i in range(0,100):
        if(lab[i]==0):
            plt.scatter(pt[i,0],pt[i,1],c='r')
            x_sum=pt[i,0]+x_sum
            y_sum=pt[i,1]+y_sum
        elif(lab[i]==1):
            plt.scatter(pt[i,0],pt[i,1],c='b')
            x_sum1=pt[i,0]+x_sum1
            y_sum1=pt[i,1]+y_sum1
    mean[0,0]=x_sum/lab.count(0)
    mean[0,1]=y_sum/lab.count(0)
    mean[1,0]=x_sum1/lab.count(1)
    mean[1,1]=y_sum1/lab.count(1)
    lab=[]

def k_means(itr):
    for z in range(0,itr):
        assignment()
        mean_shift()

k_means(100)
Here's what's wrong with your code:
1) You initialize mean as pt[:k], but later you assign into mean, which unintentionally overwrites the first two points, since mean is merely a view into pt. You need to create a copy of the first two points to avoid changing them:
import copy
mean=copy.copy(pt[:k])
2) You initialize x_sum, y_sum, x_sum1 and y_sum1 outside of mean_shift(), which causes the sums to keep growing across iterations. Reset them to 0 at the start of every call to mean_shift().
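Putting both fixes together, here is a minimal self-contained sketch (the synthetic pt data is my own stand-in, and the plotting calls are dropped so it runs standalone; pt in the question is presumably a NumPy array of 100 2-D points):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the question's pt: 100 points in two well-separated blobs.
pt = np.vstack([rng.normal(0, 0.5, (50, 2)),
                rng.normal(5, 0.5, (50, 2))])
k = 2
mean = pt[:k].copy()   # fix 1: a real copy, so updating mean never mutates pt

def assignment():
    """Label each point with the index of its nearest mean."""
    lab = []
    for i in range(len(pt)):
        d = [np.hypot(pt[i, 0] - mean[j, 0], pt[i, 1] - mean[j, 1])
             for j in range(k)]
        lab.append(d.index(min(d)))
    return lab

def mean_shift(lab):
    # fix 2: the sums restart from 0 on every call instead of accumulating globally
    for j in range(k):
        members = [i for i in range(len(pt)) if lab[i] == j]
        if members:  # guard against an empty cluster
            mean[j, 0] = sum(pt[i, 0] for i in members) / len(members)
            mean[j, 1] = sum(pt[i, 1] for i in members) / len(members)

def k_means(itr):
    for _ in range(itr):
        mean_shift(assignment())

k_means(100)
print(mean)   # the two centroids end up near (0, 0) and (5, 5)
```

Passing lab as a return value instead of a global also makes the reset bug impossible to reintroduce.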

vision.internal.disparityParser in MATLAB

I am working with the Computer Vision Toolbox in MATLAB 2014b,
which has a function for semi-global matching (SGM).
I am trying to generate a disparity map from a pair of stereo images. However, the disparity range needs to be quite large for some experiments.
Here is the function call:
Dmap = disparity(I1, I2, 'BlockSize', 15, 'DisparityRange', [-2466, 2466]);
The problem is that DisparityRange is limited to the range [-2464, 2464], so I am getting an error message like the one below.
Error using disparity
The value of 'DisparityRange' is invalid. Expected DisparityRange to be an array with all of the values >
-2466.
Error in vision.internal.disparityParser (line 38)
parser.parse(varargin{:});
Error in disparity>parseOptionalInputs (line 264)
r = vision.internal.disparityParser(imageSize, getDefaultParameters(),...
Error in disparity>parseInputs (line 244)
r = parseOptionalInputs(imageSize, varargin{:});
Error in disparity (line 137)
r = parseInputs(I1, I2, varargin{:});
My questions:
1. I could not find the function vision.internal.disparityParser. Where should it be located?
2. I would like to modify the code to work for ranges beyond the specified limit. Is that possible?
3. For anyone who has worked with the C++ version of the SGM function (OpenCV): does the same disparity range limit exist there?
Thank you!
:)
I can only answer the first question: the function vision.internal.disparityParser is located at $MATLAB/toolbox/vision/vision/+vision/+internal/disparityParser.m.

CUDA: least-squares solving, poor speed

Recently I used CUDA to write an algorithm called orthogonal matching pursuit. In my ugly CUDA code the entire iteration takes 60 s, while the Eigen library takes just 3 s...
In my code the matrix A is [640,1024] and y is [640,1]; in each step I select some columns from A to compose a new matrix called A_temp of size [640,itera], itera=1:500. I allocate an array MaxDex_Host[] on the CPU to record which columns to select.
I want to get x_temp[itera,1] from A_temp*x_temp=y by least squares, using the CULA API culaDeviceSgels and the cuBLAS matrix-vector multiplication API.
So culaDeviceSgels is called 500 times, and I thought this would be faster than Eigen's QR solver.
Checking the Nsight performance analysis, I found that cuStreamDestroy takes a long time. I initialize cuBLAS before the iteration and destroy it after I get the result, so I want to know what cuStreamDestroy is and how it differs from cublasDestroy.
The main cost is the memcpys and the function gemm_kernel1x1val, which I think comes from culaDeviceSgels.
while(itera<500): I use cublasSgemv and cublasIsamax to get MaxDex_Host[itera], then
MaxDex_Host[itera]=pos;
itera++;
float* A_temp_cpu=new float[M*itera]; // matrix, all in column-major order
for (int j=0;j<itera;j++) // build A_temp [M,itera]; MaxDex_Host[] gives the index of the column of A to choose
{
    for (int i=0;i<M;i++) // M=640, A is 640*1024, itera grows by 1 each step
    {
        A_temp_cpu[j*M+i]=A[MaxDex_Host[j]*M+i];
    }
}
// I must allocate one more array because culaDeviceSgels overwrites its input array, and I want to use A_temp after the least-squares solve.
float* A_temp_gpu;
float* A_temp2_gpu;
cudaMalloc((void**)&A_temp_gpu,Size_float*M*itera);
cudaMalloc((void**)&A_temp2_gpu,Size_float*M*itera);
cudaMemcpy(A_temp_gpu,A_temp_cpu,Size_float*M*itera,cudaMemcpyHostToDevice);
cudaMemcpy(A_temp2_gpu,A_temp_gpu,Size_float*M*itera,cudaMemcpyDeviceToDevice);
culaDeviceSgels('N',M,itera,1,A_temp_gpu,M,y_Gpu_temp,M); // the x_temp I want is returned in y_Gpu_temp, stored in y_Gpu_temp[0] .. y_Gpu_temp[itera-1]
float* x_temp;
cudaMalloc((void**)&x_temp,Size_float*itera);
cudaMemcpy(x_temp,y_Gpu_temp,Size_float*itera,cudaMemcpyDeviceToDevice);
CUDA's memory management seems too complex; is there any more convenient way to solve the least-squares problem?
I think that cuStreamDestroy and gemm_kernel1x1val are called internally by the APIs you are using, so there is not much you can do about them.
To improve your code, I would suggest the following.
You can get rid of A_temp_cpu by keeping a device copy of the matrix A. Then you can copy the selected columns of A into A_temp_gpu and A_temp2_gpu with a kernel, which avoids performing the first two cudaMemcpys.
You can preallocate A_temp_gpu and A_temp2_gpu outside the while loop by using the maximum possible value of itera instead of the current itera. This avoids the two cudaMallocs inside the loop; the same applies to x_temp.
As far as I know, culaDeviceSgels solves a linear system of equations. I think you can do the same using cuBLAS APIs only: for example, perform an LU factorization first with cublasSgetrfBatched() and then call cublasStrsv() twice to solve the two resulting triangular systems. You may wish to check whether this leads to a faster algorithm.
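Independently of CULA/cuBLAS, the per-iteration least-squares step A_temp*x_temp=y is easy to sanity-check on the CPU. Here is a NumPy sketch of that single step (the sizes and data are made up, and np.linalg.lstsq merely stands in for culaDeviceSgels):

```python
import numpy as np

rng = np.random.default_rng(1)
M, itera = 640, 5                          # sizes made up to mirror the question
A_temp = rng.standard_normal((M, itera))   # the selected columns of A
x_true = rng.standard_normal(itera)
y = A_temp @ x_true                        # a consistent right-hand side

# CPU analogue of the culaDeviceSgels('N', M, itera, 1, ...) call
x_temp, residuals, rank, s = np.linalg.lstsq(A_temp, y, rcond=None)

print(np.allclose(x_temp, x_true))   # True: the consistent system is recovered
```

Comparing the GPU result against such a reference for a few values of itera makes it easy to separate correctness issues from the performance issues discussed above.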

arrayfire flip throws exception

I'm trying to flip a matrix of size [249 1 50 20]; this is the code:
array flipped_delta = flip(delta, 0);
I get the following exception:
Unhandled exception at 0x00000001801FCA92 (libafcu.dll) in r.exe: 0xC0000094: Integer division by zero.
If I instead try flip(delta, 2), I get:
c:\var\lib\hudson\workspace\build-win64-master\jacket\src\cuda\../common/flip.cpp:47: CUDA runtime error: invalid configuration argument (9)
What am I doing wrong?
thanks.
I don't know ArrayFire, but a quick peek at the documentation suggests that dimension 0 is along the vertical axis, but you have only one row so there's nothing to flip. Consequently this could be a bug in handling that case, where I'd expect a no-op instead.
Try with dimension 1 (horizontal):
array flipped_delta = flip(delta, 1);
Disclaimer: this may or may not actually be how dimension indexes work in ArrayFire.
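For what it's worth, the size-1-axis point is easy to check in NumPy (an illustration of flip semantics in general, not of ArrayFire itself): flipping along an axis of length 1 is well defined and simply a no-op, so an exception there does look like a library bug.

```python
import numpy as np

# Same shape as the question's array: [249 1 50 20].
a = np.arange(249 * 1 * 50 * 20, dtype=np.float32).reshape(249, 1, 50, 20)

flipped0 = np.flip(a, axis=0)   # axis 0 has length 249: rows are reversed
flipped1 = np.flip(a, axis=1)   # axis 1 has length 1: a well-defined no-op

print(np.array_equal(flipped1, a))           # True: nothing to reverse
print(np.array_equal(flipped0[0], a[-1]))    # True: first row is the old last row
```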