Eigen code fails in release mode but works in debug mode - c++

Hi everyone who uses Eigen, I have run into a strange problem.
I implemented an Unscented Kalman Filter with Eigen.
It works very well on my PC, but the same piece of code causes a segmentation fault on my embedded system, an Odroid XU (ARMv7 architecture).
After hours of debugging, I found the problem was with this part:
qrSolver.compute(OS.transpose());
m_q=qrSolver.householderQ();
m_r = qrSolver.matrixQR().triangularView<Upper>();
S_pre = m_r.block(0,0,n,n).transpose();
if (w_c0 < 0)
    internal::llt_inplace<float,Upper>::rankUpdate(S_pre,
        sqrt(-w_c0)*(sigmaPoints.col(0) - state_pre),
        -1);
else
    internal::llt_inplace<float,Upper>::rankUpdate(S_pre,
        sqrt(w_c0)*(sigmaPoints.col(0) - state_pre),
        1);
where I first compute the QR decomposition of the transpose of matrix OS (OS has dimension n-by-3n), and then perform a rank update of its R component (dimension n-by-n). internal::llt_inplace::rankUpdate is an undocumented function in the Eigen library; it performs a rank-1 update on its first argument. It can be found in ~/path_to_Eigen/Cholesky/LLT.h
The strangest thing about this code is that it works perfectly with -DCMAKE_BUILD_TYPE=Debug, but fails when I compile with -DCMAKE_BUILD_TYPE=Release.
Does anyone understand this, or has anyone had a similar issue before? Please help, thanks a lot.
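For readers unfamiliar with the operation, here is a minimal sketch of what a rank-1 update (sign +1) or downdate (sign -1) of a Cholesky factor does. It is not Eigen's implementation and says nothing about the cause of the crash: it assumes a dense lower-triangular factor stored as nested vectors, whereas the code above uses the upper factor, and the names Matrix and choleskyRankOneUpdate are made up for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper, not part of Eigen.
// Given a lower-triangular L with A = L * L^T, modifies L in place so that
// afterwards L * L^T = A + sign * x * x^T, where sign is +1 or -1.
// x is taken by value because it is used as scratch space.
// Returns false if the updated matrix would not be positive definite
// (possible for sign == -1); L is then left partially modified.
bool choleskyRankOneUpdate(Matrix& L, std::vector<double> x, int sign) {
    const std::size_t n = L.size();
    for (std::size_t k = 0; k < n; ++k) {
        const double d = L[k][k] * L[k][k] + sign * x[k] * x[k];
        if (d <= 0.0) {
            return false;
        }
        const double r = std::sqrt(d);
        const double c = r / L[k][k];
        const double s = x[k] / L[k][k];
        L[k][k] = r;
        for (std::size_t i = k + 1; i < n; ++i) {
            L[i][k] = (L[i][k] + sign * s * x[i]) / c;
            x[i] = c * x[i] - s * L[i][k];
        }
    }
    return true;
}
```

Starting from the 2-by-2 identity and updating with x = (1, 1) gives the factor of [[2, 1], [1, 2]]; a downdate with the same vector brings it back to the identity. The boolean result matters mostly for the downdate branch, which can fail when the vector is too large relative to the factor.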

Related

ArrayFire convolution issue with Cuda backend

I've been having an issue with a certain function call:
dphaseWeighted = af::convolve(dphaseWeighted, m_slowTimeFilter);
which seems to produce nothing but NaNs.
The background is that we recently switched from the AF OpenCL backend to AF CUDA, and the problem we are seeing happens in this call:
dphaseWeighted = af::convolve(dphaseWeighted, m_slowTimeFilter);
This seems to work well when using OpenCL.
Unfortunately, I can't give you the whole function because of IP, only a couple of snippets.
This convolve lies deep within a phase-extraction piece of code, and is actually the second place in that code that uses the af::convolve function.
The first call seems to behave as expected, with sensible floating-point data coming out.
However, when it comes to the second call, all I see coming out is NaNs (I check this with af_print and by dumping the data to a file).
In the CMakeLists.txt I include
include_directories(${ArrayFire_INCLUDE_DIRS})
and
target_link_libraries(DASPhaseInternalLib ${ArrayFire_CUDA_LIBRARIES})
and it builds as expected.
Has anyone experienced anything like this before?

Assertion sv_count !=0 failed - Function train_auto, SVM type - EPS_SVR

The question is related to the OpenCV library, version 2.4.13.2.
I am using n-dimensional feature vectors extracted from images for training and performing regression. The output values range between 0 and 255.
The function CvSVM::train works without an error, but requires a manual setting of parameters. So, I would prefer using the function CvSVM::train_auto to perform cross validation and determine the best parameters for the situation.
But I am facing the error:
OpenCV Error: Assertion failed (sv_count != 0) in CvSVM::do_train.
On changing the type to NU_SVR, it works well. The problem is only with type EPS_SVR.
I would appreciate any help I could receive to fix this.
EDIT: I was able to pinpoint the problem to line number 1786 in the file
opencv-master\sources\modules\ml\src\svm.cpp
FOR_IN_GRID(p, p_grid)
After commenting it out, the code runs without errors. I do not know what the possible reasons are.
I am facing the same bug. I found that it was caused by svm.setP(x) and svm.setTermCriteria((cv2.TERM_CRITERIA_EPS, y)) when the x and y values are greater than 0.1 (10^-1).

CMake Release made my code stop working properly

I have a C++ program which works well when I compile with no additional options. However, whenever I use cmake -DCMAKE_BUILD_TYPE=Release there is a very specific part of the code which stops working.
Concretely, I have an interface for a Boost Fibonacci Heap. I am calling this function:
narrow_band_.push(myObject);
And this function does the following:
inline void myHeap::push (myStruct & c) {
    handles_[c.getIndex()] = heap_.push(c);
}
where heap_ is:
boost::heap::fibonacci_heap<myStruct, boost::heap::compare<compare_func>> heap_;
For some reason the heap size is not being modified, so the rest of the code does not work: the next step is to extract the minimum from the heap, and it is always empty.
In Debug mode it works ok. Thank you for your help.
EDIT - Additional info
I have also found this problem: I have some code which does simple math operations. In Release mode the results are incorrect. If I just print a couple of variables with cout to check their values, the result changes; it is still incorrect, but the change is quite strange.
I solved the problem. Funnily enough, in this case it was not one of the initialization/timing issues common in Release mode: the return statement was missing in a couple of functions. In Debug mode the output was perfect, but in Release mode it failed. I had warnings deactivated, which is why I did not see it before.
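To illustrate the kind of bug described here: flowing off the end of a non-void function without a return statement is undefined behavior in C++, so an unoptimized build may appear to work while an optimized build does not. The sketch below uses a made-up function, countPositive, which is not from the original program; the broken shape is shown only in a comment, followed by the corrected version.

```cpp
#include <cstddef>
#include <vector>

// Broken shape (kept as a comment on purpose): the function promises a
// std::size_t but never returns one, which is undefined behavior.
//
//   std::size_t countPositive(const std::vector<int>& values) {
//       std::size_t count = 0;
//       for (int v : values) {
//           if (v > 0) ++count;
//       }
//   }   // <- missing "return count;"

// Corrected version: every path through the function returns a value.
std::size_t countPositive(const std::vector<int>& values) {
    std::size_t count = 0;
    for (int v : values) {
        if (v > 0) {
            ++count;
        }
    }
    return count;
}
```

With GCC or Clang, -Wall enables -Wreturn-type, and -Werror=return-type turns this warning into a hard error; MSVC reports the same situation as warning C4715. Keeping these warnings on is the cheapest way to catch it.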

FFT 2D kernel runtime =0 in OpenCL

I'm working on a homework project to compare the performance of the Fast Fourier Transform on CPU vs GPU. I'm done with the CPU part, but I have a problem with the GPU part.
The trouble is that the kernel runtime is zero and the output image is the same as the input. I use VS2010 on Win7 with the AMD APP SDK. Here are the host code, the kernel, and an additional header to handle the image; they can be found in The OpenCL Programming Book (Ryoji Tsuchiyama…)
My guess is that the error is in the phase where we pass values from the image pixels to the cl_float2 *xm (lines 169-174 in the host code). I can't access the vector components to check it either: the compiler doesn't accept .sX or .xy and throws an error. The other parts (kernel, header, ...) look fine to me.
for (i=0; i < n; i++) {
    for (j=0; j < n; j++) {
        ((float*)xm)[(2*n*j)+2*i+0] = (float)ipgm.buf[n*j+i]; //real
        ((float*)xm)[(2*n*j)+2*i+1] = (float)0; //imag
    }
}
I hope you can help me out. Any ideas will be appreciated.
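On the component-access point, a small sketch may help. As far as I know, in the Khronos cl_platform.h header the host-side cl_float2 is a union whose array member s is always present, while named members such as .x and .y depend on compiler support for anonymous structs, so xm[k].s[0] and xm[k].s[1] are the portable spelling on the host. The code below uses a stand-in type Float2 so it builds without the OpenCL SDK; packRealImage is a made-up helper, and the unsigned char pixel type is an assumption about ipgm.buf.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the host-side cl_float2 (assumption: only its array member
// `s` is mirrored here). In real host code, include <CL/cl.h> instead.
struct Float2 {
    float s[2];
};

// Hypothetical helper: turns an n-by-n grayscale image into complex samples,
// pixel value as the real part and zero as the imaginary part.
// Assumes pixels.size() >= n * n.
std::vector<Float2> packRealImage(const std::vector<unsigned char>& pixels,
                                  std::size_t n) {
    std::vector<Float2> xm(n * n);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            xm[n * j + i].s[0] = static_cast<float>(pixels[n * j + i]);  // real
            xm[n * j + i].s[1] = 0.0f;                                   // imag
        }
    }
    return xm;
}
```

This writes the same memory layout as the cast-to-float loop in the question (interleaved real and imaginary parts), but each component can be read back by name, which makes it easier to print a few samples and confirm the input buffer is what you expect.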
OpenCL provides a lot of different error codes.
You already retrieve them by doing ret = clInstruction(); on each call, but you are not analysing them.
Please check on each call whether this value is equal to CL_SUCCESS.
It can always happen that there is not enough memory, that the hardware is already in use, or that there is a simple error in your source code. The return value will tell you.
Also: Please check your cl_context, cl_program, etc. for NULL values.
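A minimal sketch of such a check, under stated assumptions: checkCl is a made-up helper name, and status_t and kSuccess are stand-ins so the snippet compiles without the OpenCL SDK. In real host code you would include <CL/cl.h> and use cl_int and CL_SUCCESS (which is defined as 0) instead.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins (assumptions) for cl_int and CL_SUCCESS from <CL/cl.h>.
using status_t = int;
constexpr status_t kSuccess = 0;

// Hypothetical helper: throws if an OpenCL call reported an error, and
// includes the name of the call and the numeric error code in the message.
inline void checkCl(status_t status, const char* what) {
    if (status != kSuccess) {
        throw std::runtime_error(std::string(what) +
                                 " failed with error code " +
                                 std::to_string(status));
    }
}

// Intended use in host code (not compiled here):
//   ret = clEnqueueNDRangeKernel(/* ... */);
//   checkCl(ret, "clEnqueueNDRangeKernel");
```

For calls that return an object and report the status through an output parameter, such as clCreateBuffer, pass that status variable to the helper right after the call and also check the returned handle for NULL.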

SFML getFullscreenModes

Have you ever run into an issue where the SFML 2 function that gets the available modes returns this:
availiableVideoModes [3]({width=3131961357 height=3131961357 bitsPerPixel=3131961357 },{width=3131961357 height=3131961357 bitsPerPixel=3131961357 },{width=3131961357 height=3131961357 bitsPerPixel=3131961357 }) std::vector >
Max int values in the vector? Also interesting: why 3? I tried some quick debugging without luck, so in parallel I thought I would raise the question here.
Code:
std::vector<sf::VideoMode> availiableVideoModes;
availiableVideoModes = sf::VideoMode::getFullscreenModes();
Interestingly,
desktopVideoMode = sf::VideoMode::getDesktopMode();
returns the correct value.
The issue was in the library linking: I had linked the 32-bit libraries instead of the 64-bit ones.
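A side note on the numbers themselves: 3131961357 is not the maximum 32-bit unsigned value (that would be 4294967295). Printed in hexadecimal it is 0xBAADF00D, a fill pattern commonly documented as marking uninitialized heap memory on Windows when running under a debugger. That would be consistent with the vector elements never having been written by the library, although this is only an interpretation. The helper toHex below is a made-up name used to show the conversion.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a 32-bit value as 0x followed by eight
// upper-case hexadecimal digits.
std::string toHex(std::uint32_t value) {
    std::ostringstream os;
    os << "0x" << std::uppercase << std::hex
       << std::setw(8) << std::setfill('0') << value;
    return os.str();
}
```

Calling toHex(3131961357u) yields the string 0xBAADF00D. Converting suspicious values in a debugger watch window to hex like this is a quick way to tell garbage memory apart from genuinely large numbers.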