Memory management using MEX files with MATLAB - C++

I have a MEX file that communicates with a uEye USB3 camera, captures a length of video at the resolution and frame rate specified by the input arguments, and exports it as a stack of .tif images. Before the MEX file returns, I free the buffers I was using to store the images, but this memory is not made available to the system again; it appears to remain allocated to MATLAB. Further calls to MEX functions appear to re-use this memory, but I can't release it to the wider system without restarting MATLAB. Running clear mex or clear all appears to have no effect on this; indeed, clear all occasionally crashes MATLAB outright with a segmentation fault. I'm fairly sure it isn't a fault in my C++, because if I rewrite the same function as a standalone C++ program I see the memory being released as it should be.
My memory is de-allocated using
is_UnlockSeqBuf(hCam, nMemID, pBuffer);
is_ClearSequence(hCam);

int k = 0;
while (k < frame_count) {
    int fMem = is_FreeImageMem(hCam, pFrameBuffer[k], pFrame[k]);
    k++;
}
having previously been allocated with
for (int i = 0; i < frame_count; i++) {
    int nAlloc = is_AllocImageMem(hCam, width, height, 24, &pFrameBuffer[i], &pFrame[i]);
    int nSeq = is_AddToSequence(hCam, pFrameBuffer[i], pFrame[i]);
    pFrame[i] = i + 1;
}
Does anyone have any ideas on how to release the memory without restarting MATLAB?
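One approach I am considering (an untested sketch only; it assumes the camera handle and buffer arrays live at file scope in the MEX file, and MAX_FRAMES is just a placeholder capacity) is to register a cleanup routine with mexAtExit, so that clear mex explicitly drives the same deallocation path:

#include "mex.h"
#include "uEye.h"

enum { MAX_FRAMES = 256 };            /* placeholder capacity */

static HIDS  hCam;                    /* camera handle (assumed file scope) */
static char *pFrameBuffer[MAX_FRAMES];
static INT   pFrame[MAX_FRAMES];
static int   frame_count;

/* called when the MEX file is cleared (clear mex) or MATLAB exits */
static void cleanup(void)
{
    is_ClearSequence(hCam);
    for (int k = 0; k < frame_count; k++)
        is_FreeImageMem(hCam, pFrameBuffer[k], pFrame[k]);
    is_ExitCamera(hCam);
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mexAtExit(cleanup);               /* safe to register on every call */
    /* ... capture and export code as before ... */
}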

Related

My program crashes on a Windows machine yet works fine on Linux

I tried to program the Ackermann function on my notebook (Win10), but the program crashed at higher values instead of continuing to calculate for a few minutes or hours.
My friend tried the same code on his machine (SUSE) and it worked just fine; then we tried it on the school server (CentOS) and it crashed yet again.
EDIT: It worked on the server too, it just needed a second try. It also worked on the other server we tried... All of it is on Linux.
We suspect a stack overflow is behind it, but that seems odd because the values aren't THAT high yet. How am I able to perform recursive functions on this system then?
Thanks for all the answers. I'm just curious why it happens and how to make it work on my machine.
I tried both C and C++ with no change.
#include <stdio.h>

// naive recursive Ackermann function
int ackermann(int m, int n);

int main() {
    int m = 4;
    int n = 1;
    return ackermann(m, n);
}

int ackermann(int m, int n)
{
    if (m == 0) return n + 1;
    else if (m > 0 && n == 0) return ackermann(m - 1, 1);
    else return ackermann(m - 1, ackermann(m, n - 1));
}
Sounds like stack smashing, or more accurately a stack buffer overflow, which is the error message you got. The function has a very large recursion depth and keeps pushing stack frames; push enough of them and the stack overflows. The computer doesn't have infinite memory for the stack and, from the sounds of it, not a huge amount either.
Given big enough input it will smash the stack anyway if the recursion isn't optimized well (and it might even then). Without knowing more about your compiler I can't give a comprehensive answer on how to disable the check, but if you're using Microsoft Visual Studio's compiler suite, you can disable the protection, if I understand this doc correctly:
In Solution Explorer, right-click the project and then click Properties
In the Property Pages dialog box, click the C/C++ folder.
Click the Code Generation property page.
Modify the Buffer Security Check property.
As a warning, this is not safe and bad things can happen, so use this at your own risk.
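For the command-line equivalent (my reading of the same documentation, so treat this as a sketch rather than a verified fix), the Buffer Security Check property corresponds to the /GS switch, which can be turned off explicitly:
rem compile with the buffer security check disabled
cl /GS- ackermann.c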
In Visual Studio the default stack size is 1 MB. With a recursion depth of roughly 65,535 and, I believe, a minimum caller/callee stack frame on x64 of about 72 bytes for a function of this type, you will run out of available stack space (I compute almost 4.5 MB of stack needed for this scenario). This also produces the stack buffer overflow error, but it has nothing to do with stack smashing other than that you went beyond the maximum stack size available to your program when it was compiled.
Most compilers, including Visual Studio's, let you specify the stack size.
More details:
https://learn.microsoft.com/en-us/cpp/build/reference/f-set-stack-size?view=vs-2017
[ Edited to reflect 65,535 frames, and not 1.4 billion ]
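As an illustration (a sketch based on that documentation, not tested against this exact program), the stack reserve size can be raised to, say, 8 MB either with the /F compiler option or the /STACK linker option:
rem reserve an 8 MB stack instead of the default 1 MB
cl /F 8388608 ackermann.c
rem or, equivalently, via the linker
cl ackermann.c /link /STACK:8388608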

Malloc Error: OpenCV/C++ while push_back Vector

I'm trying to create a descriptor using FAST for the point detection and SIFT for building the descriptor; for that purpose I use OpenCV. I use OpenCV's FAST directly, but only parts of the SIFT code, because I only need the descriptor. Now I have a really nasty malloc error and I don't know how to solve it. I posted my code on GitHub because it is big and I don't really know where the error comes from. I only know that it occurs at the end of the do-while loop:
        features2d.push_back(features);
        features.clear();
        candidates2d.push_back(candidates);
        candidates.clear();
    }
} while (candidates.size() > 100);
As you can see in the code on GitHub, I have already tried to release the application's memory. Xcode's analysis says that my application uses 9 MB of memory. I tried to debug the error, but it was very complicated and I haven't found any clue where it comes from.
EDIT
I wondered whether this error could occur because I try to access the image pixel values passed to calcOrientationHist(...) with img.at<sift_wt>(...), where typedef float sift_wt, at lines 56 and 57 in my code, because normally the patch I pass reports type 0, which means it is CV_8UC1. But then, I copied this part from sift.cpp at lines 330 and 331. Shouldn't the SIFT descriptor be working on a grayscale image anyway?
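To illustrate what I mean about the type mismatch (a minimal sketch, not my actual code): at<float>() on a CV_8UC1 matrix strides 4 bytes per element over a 1-byte-per-element row, so the patch would have to be converted first for that accessor to be valid:

#include <opencv2/core/core.hpp>

int main()
{
    // sketch: convert the 8-bit patch to 32-bit float before using a float accessor
    cv::Mat patch8u = cv::Mat::zeros(16, 16, CV_8UC1);
    cv::Mat patch32f;
    patch8u.convertTo(patch32f, CV_32F);  // now at<float>(y, x) reads valid data
    float v = patch32f.at<float>(0, 0);
    return (int)v;
}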
EDIT2
After changing the type in the img.at<sift_wt>(...) call, nothing changed. So I googled for solutions and came across Xcode's GuardMalloc feature. Enabling it showed me a new error, which is probably the reason I get the malloc error: at line 77 of my code it reports EXC_BAD_ACCESS (Code=1, address=....). These are the lines in question:
for (k = 0; k < len; k++) {
    int bin = cvRound((n/360.f) + Ori[k]);
    if (bin >= n)
        bin -= n;
    if (bin < 0)
        bin += n;
    temphist[bin] += W[k]*Mag[k];
}
The values of the variables mentioned are the following:
bin = 52, len = 169, n = 36, k = 0; W, Mag, Ori and temphist are not shown.
Here is the GuardMalloc output (sorry, but I don't really understand what exactly it is telling me):
GuardMalloc[Test-1935]: Allocations will be placed on 16 byte boundaries.
GuardMalloc[Test-1935]: - Some buffer overruns may not be noticed.
GuardMalloc[Test-1935]: - Applications using vector instructions (e.g., SSE) should work.
GuardMalloc[Test-1935]: version 108
Test(1935,0x102524000) malloc: protecting edges
Test(1935,0x102524000) malloc: enabling scribbling to detect mods to free blocks
The answer is simpler than I thought...
The problem was that the calculation of bin in the for loop produced the wrong value: instead of adding Ori[k], it should be a multiplication with Ori[k].
That mistake resulted in a bin value of 52, but the array that temphist points to only has a length of 38.
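For reference, here is the corrected body of the loop as I understand the fix (same variable names as above):
for (k = 0; k < len; k++) {
    // multiply by Ori[k] instead of adding it, so bin stays within [0, n)
    int bin = cvRound((n/360.f) * Ori[k]);
    if (bin >= n)
        bin -= n;
    if (bin < 0)
        bin += n;
    temphist[bin] += W[k]*Mag[k];
}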
For anyone with similar errors I really recommend using GuardMalloc or Valgrind to debug malloc errors.

Function call causes C++ program to freeze unless stepped-through in debugger

I have this short C++ program which takes snapshot images from a camera in a loop and displays them:
void GenericPGRTest::execute()
{
    // connect camera
    Camera *cam = Camera::Connect();

    // query resolution and create view window
    const Resolution res = cam->GetResolution();
    cv::namedWindow("View");

    int c = 0;
    // keep taking snapshots until escape hit
    while (c != 27)
    {
        const uchar *buf = cam->SnapshotMono();

        // create image from buffer and display it
        cv::Mat image(res.height, res.width, CV_8UC1, (void*)buf);
        cv::imshow("Camera", image);
        c = cv::waitKey(1000);
    }
}
This uses a class (Camera) for camera control that I created using the Point Grey SDK, together with functions from the OpenCV library to display the images. I'm not necessarily looking for answers relating to the usage of either of these libraries, but rather for some insight into how to debug a bizarre problem like this in general. The problem is that the application freezes (not crashes) on the cam->SnapshotMono() line. Naturally, I ran through the function with a debugger. Here are its contents:
const uchar* Camera::SnapshotMono()
{
    cam_.StartCapture();

    // get a frame
    Image image;
    cam_.RetrieveBuffer(&image);

    cam_.StopCapture();

    grey_buffer_.DeepCopy(&image);
    return grey_buffer_.GetData();
}
Now, every time I step through the function in the debugger, everything works fine. But the first time I do a "step over" instead of "step into" on SnapshotMono(), bam, the program freezes. When I pause it at that point, I see that it's stuck inside SnapshotMono() at the RetrieveBuffer() line. I know it's a blocking call, so it can theoretically freeze (no idea why, but it's possible), but why does it block when running normally and not when being debugged? This is one of the weirdest kinds of behaviour under debugging I've seen so far. Any idea why this could happen?
For those familiar with FlyCapture, the code above doesn't break as is, but rather only when I use StartCapture() in callback mode and then terminate it with StopCapture() beforehand.
Compiled with MSVC2010, OpenCV 2.4.5 and PGR FlyCapture 2.4R10.
Wild guess... but could it be that StartCapture() already starts the process that ends up delivering the buffer into image, and if you step through you leave it some time before you get to RetrieveBuffer()? That's not the case if you run it all at once...
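If that guess is right, one way to test it (a sketch only; BeginSession/EndSession are made-up names and this has not been checked against the FlyCapture API) would be to start the stream once per session instead of once per snapshot, so the camera is already delivering frames by the time RetrieveBuffer() is reached:

// sketch: start/stop the capture stream once per session, not per snapshot
void Camera::BeginSession() { cam_.StartCapture(); }
void Camera::EndSession()   { cam_.StopCapture();  }

const uchar* Camera::SnapshotMono()
{
    // the stream is already running, so this call only waits for the next frame
    Image image;
    cam_.RetrieveBuffer(&image);
    grey_buffer_.DeepCopy(&image);
    return grey_buffer_.GetData();
}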

FFT 2D kernel runtime =0 in OpenCL

I'm working on a homework project comparing the performance of the Fast Fourier Transform on the CPU vs the GPU. I'm done with the CPU part, but with the GPU I have a problem.
The trouble is that the kernel runtime is zero and the output image is the same as the input image. I use VS2010 on Win7 with the AMD APP SDK. The host code, the kernel, and an additional header to handle the image can be found in The OpenCL Programming Book (Ryoji Tsuchiyama...).
My guess is that the error is in the phase where we pass values from the image pixels to the cl_float2 *xm (lines 169-174 in the host code). I can't access the vector components to check it either; the compiler doesn't accept .sX or .xy and throws an error about it (see the note after the loop below). The other parts (kernel, header, ...) look fine to me.
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        ((float*)xm)[(2*n*j)+2*i+0] = (float)ipgm.buf[n*j+i]; // real
        ((float*)xm)[(2*n*j)+2*i+1] = (float)0;               // imag
    }
}
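On the component access: my understanding (a sketch, assuming the cl_platform.h shipped with the AMD APP SDK) is that on the host side only the .s[] array member of cl_float2 is always available, while .x/.y and .s0/.s1 exist only when the header detects anonymous struct/union support, so the loop could equivalently be written as:

/* portable host-side access to cl_float2 components via the .s[] member */
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        xm[n*j + i].s[0] = (float)ipgm.buf[n*j + i];  /* real */
        xm[n*j + i].s[1] = 0.0f;                      /* imag */
    }
}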
So I hope you guys can help me out. Any ideas will be appreciated.
OpenCL provides a lot of different error codes.
You already retrieve them by doing ret = clInstruction(); on each call, but you are not analysing them.
Please check on each call whether this value is equal to CL_SUCCESS.
It can always happen that the memory is not sufficient, the hardware is already in use, or there is a simple error in your source code. The return value will tell you.
Also: Please check your cl_context, cl_program, etc. for NULL values.
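A minimal sketch of what that checking could look like, slotted into the existing host code (context, queue, kernel, global_work_size and n are assumed to already be defined there; the error messages are just examples):

/* check every OpenCL call against CL_SUCCESS instead of discarding ret */
cl_int ret;
cl_mem xmobj = clCreateBuffer(context, CL_MEM_READ_WRITE,
                              n * n * sizeof(cl_float2), NULL, &ret);
if (ret != CL_SUCCESS) {
    fprintf(stderr, "clCreateBuffer failed: %d\n", ret);
    return 1;
}

ret = clEnqueueNDRangeKernel(queue, kernel, 2, NULL,
                             global_work_size, NULL, 0, NULL, NULL);
if (ret != CL_SUCCESS) {
    fprintf(stderr, "clEnqueueNDRangeKernel failed: %d\n", ret);
    return 1;
}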

cvHaarDetectObjects(): "Stack around the variable 'seq_thread' was corrupted."

I have been looking into writing my own implementation of Haar cascade face detection for some time now, and have begun by diving into the OpenCV 2.0 implementation.
Right out of the box, running in debug mode, Visual Studio breaks on cvhaar.cpp:1518, informing me:
Run-Time Check Failure #2 - Stack around the variable 'seq_thread' was corrupted.
It seems odd to me that OpenCV would ship with a simple array out-of-bounds problem. The release build runs without any problems, but I suspect that it is merely not performing the check while the array still exceeds its bounds.
Why am I receiving this error message? Is it a bug in OpenCV?
A little debugging revealed the culprit, I believe. I "fixed" it, but this all still seems odd to me.
An array of size CV_MAX_THREADS is created on cvhaar.cpp:868:
CvSeq* seq_thread[CV_MAX_THREADS] = {0};
On line 918 it proceeds to specify max_threads:
max_threads = cvGetNumThreads();
In various places, seq_thread is looped using the following for statement:
for (i = 0; i < max_threads; i++) {
    CvSeq* s = seq_thread[i];
    // ...
}
However, cxmisc.h:108 declares CV_MAX_THREADS:
#define CV_MAX_THREADS 1
Hence seq_thread can never hold more than one element, yet cvGetNumThreads() returns 2 (I assume this reflects the number of cores in my machine), so the loops above index past the end of the array.
I resolved the problem by adding the following simple little statement:
if (max_threads > CV_MAX_THREADS) max_threads = CV_MAX_THREADS;
Does any of this make sense?