Alright, so I'm seeing an odd memory leak when using cvCreateMat to make room for my soon-to-be-filled mat. Below is what I'm attempting to do; adaptiveThreshold didn't like it when I passed in the 3-channel image, so I wanted to split it into its separate channels. It works! But every time we go through this particular function we gain another ~3MB of memory. Since this function is expected to run a few hundred times, that becomes a rather noticeable problem.
So here's the code:
void adaptiveColorThreshold(Mat *inputShot, int adaptiveMethod, int blockSize, int cSubtraction)
{
    Mat newInputShot = (*inputShot).clone();
    Mat inputBlue = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    Mat inputGreen = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    Mat inputRed = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    for(int rows = 0; rows < newInputShot.rows; rows++)
    {
        for(int cols = 0; cols < newInputShot.cols; cols++)
        {
            inputBlue.data[inputBlue.step[0]*rows + inputBlue.step[1]*cols] = newInputShot.data[newInputShot.step[0]*rows + newInputShot.step[1]*cols + 0];
            inputGreen.data[inputGreen.step[0]*rows + inputGreen.step[1]*cols] = newInputShot.data[newInputShot.step[0]*rows + newInputShot.step[1]*cols + 1];
            inputRed.data[inputRed.step[0]*rows + inputRed.step[1]*cols] = newInputShot.data[newInputShot.step[0]*rows + newInputShot.step[1]*cols + 2];
        }
    }
    adaptiveThreshold(inputBlue, inputBlue, 255, adaptiveMethod, THRESH_BINARY, blockSize, cSubtraction);
    adaptiveThreshold(inputGreen, inputGreen, 255, adaptiveMethod, THRESH_BINARY, blockSize, cSubtraction);
    adaptiveThreshold(inputRed, inputRed, 255, adaptiveMethod, THRESH_BINARY, blockSize, cSubtraction);
    for(int rows = 0; rows < (*inputShot).rows; rows++)
    {
        for(int cols = 0; cols < (*inputShot).cols; cols++)
        {
            (*inputShot).data[(*inputShot).step[0]*rows + (*inputShot).step[1]*cols + 0] = inputBlue.data[inputBlue.step[0]*rows + inputBlue.step[1]*cols];
            (*inputShot).data[(*inputShot).step[0]*rows + (*inputShot).step[1]*cols + 1] = inputGreen.data[inputGreen.step[0]*rows + inputGreen.step[1]*cols];
            (*inputShot).data[(*inputShot).step[0]*rows + (*inputShot).step[1]*cols + 2] = inputRed.data[inputRed.step[0]*rows + inputRed.step[1]*cols];
        }
    }
    inputBlue.release();
    inputGreen.release();
    inputRed.release();
    newInputShot.release();
    return;
}
So going through it one line at a time...
newInputShot adds ~3MB
inputBlue adds ~1MB
inputGreen adds ~1MB
and inputRed adds ~1MB
So far, so good - we need memory to hold the data. newInputShot gets its data right off the bat, but the inputRGBs need to get theirs from newInputShot - so we just allocate the space to be filled in the upcoming for-loop, which (as expected) allocates no new memory; it just fills in the space already claimed.
The adaptiveThresholds don't add any new memory either, since they're simply supposed to overwrite what is already there, and the next for-loop writes straight to inputShot; no new memory needed there. So now we get around to (manually) releasing the memory.
Releasing inputBlue frees up 0MB
Releasing inputGreen frees up 0MB
Releasing inputRed frees up 0MB
Releasing newInputShot frees up ~3MB
Now, according to the OpenCV documentation site: "OpenCV handles all the memory automatically."
First of all, std::vector, Mat, and other data structures used by the
functions and methods have destructors that deallocate the underlying
memory buffers when needed. This means that the destructors do not
always deallocate the buffers as in case of Mat. They take into
account possible data sharing. A destructor decrements the reference
counter associated with the matrix data buffer. The buffer is
deallocated if and only if the reference counter reaches zero, that
is, when no other structures refer to the same buffer. Similarly, when
a Mat instance is copied, no actual data is really copied. Instead,
the reference counter is incremented to memorize that there is another
owner of the same data. There is also the Mat::clone method that
creates a full copy of the matrix data.
TLDR the quote: Related mats get clumped together in a super-mat that gets released all at once when nothing is left using it.
This is why I created newInputShot as a clone (so it doesn't get clumped with inputShot) - to see whether this was happening with the inputRGBs. Well... nope! The inputRGBs are their own beast and refuse to be deallocated. I know it isn't any of the intermediate functions, because this snippet does the exact same thing:
void adaptiveColorThreshold(Mat *inputShot, int adaptiveMethod, int blockSize, int cSubtraction)
{
    Mat newInputShot = (*inputShot).clone();
    Mat inputBlue = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    Mat inputGreen = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    Mat inputRed = cvCreateMat(newInputShot.rows, newInputShot.cols, CV_8UC1);
    inputBlue.release();
    inputGreen.release();
    inputRed.release();
    newInputShot.release();
    return;
}
That's about as simple as it gets. Allocate - fail to Deallocate. So what's going on with cvCreateMat?
I would suggest not using cvCreateMat at all - it belongs to the legacy C API, and when its CvMat* return value is converted to a cv::Mat the resulting header does not take ownership of the buffer, so release() never frees it; that is most likely your leak. You don't need to clone the original Mat either.
Look into the split() and merge() functions. They will do the dirty work for you and return Mats that manage their own memory. I don't have OpenCV installed right now so I can't test any code, but I'm sure that's the route you want to take.
Related
I am rewriting some legacy code that does matrix operations on doubles using a raw C-style array. Since the code already has a dependency on OpenCV elsewhere, I want to use the cv::Mat class instead.
The specific code that bothers me works on square matrices from size 1*1 to N*N. It does so by allocating an N*N buffer and using a subset of it for smaller matrices.
double* buf = new double[N*N];
for (int i = 1; i < N; ++i) {
    // Reuse buf to create an i*i matrix and perform matrix operations
    ...
}
delete[] buf;
Basically, I want to replace that code to use cv::Mat objects in the loop instead. Problem is, the code requires a lot of loop iterations (there are nested loops and so on) and there are too many allocations/deallocations if I just use the naïve and clean approach. Therefore, I want to reserve the size of my matrix object beforehand and resize it for each iteration. This would ideally look like this:
cv::Mat m;
m.reserveBuffer(N * N * sizeof(double));
for (int i = 1; i < N; ++i) {
    m = cv::Mat(i, i, CV_64F);
    // Perform matrix operations on m
    ...
}
But in my understanding this would simply drop the previous instance of m and then allocate a i*i matrix. What would the right approach be?
You can create a submatrix header for your buffer using cv::Mat::operator(). Pass a cv::Rect for the ROI you want to process in the current loop iteration ({0, 0, i, i} in your case) and it will return a view of that buffer region as another cv::Mat instance. It will not allocate a new buffer; it refers to the original buffer data instead.
cv::Mat m(N, N, CV_64FC1);
for (int i = 1; i < N; ++i) {
    cv::Mat subM = m(cv::Rect(0, 0, i, i));
    // Perform matrix operations on "subM"
    // Modifying "subM" will modify the "m" buffer region that "subM" represents
}
Note that subM will not be continuous, so you will need to process it row by row if you do any raw buffer processing.
I am having an issue when trying to store a sequence of image data with Qt.
Here is a piece of code that shows the problem:
#include <vector>
#include <iostream>
#include <QImage>
...
const int nFrames = 1000;
std::vector<int> sizes(nFrames);
std::vector<uchar*> images(nFrames);
for (int k = 0; k < nFrames; k++)
{
    QImage *img = new QImage("/.../sample.png");
    uchar *data = img->bits();
    sizes.at(k) = img->width() * img->height();
    images.at(k) = data;
}
std::cout << "Data loaded \"successfully\"." << std::endl;
for (int k = 0; k < nFrames; k++)
{
    std::cout << k << ": " << (int) (images.at(k)[0]) << std::endl;
}
In the first loop, the program loads QImage objects and puts the bitmaps in the images vector of pointers. In the second loop, we just read a pixel of each frame.
The problem is that the program proceeds through the first loop without complaining, even if the heap memory becomes full. As a result, I get a crash in the second loop, as shown by the output of the program:
Data loaded "successfully".
0: 128
1: 128
2: 128
...
192: 128
[crash before hitting 1000]
To reproduce the problem, you can use the grayscale image below, and you may need to change the value of nFrames, depending on how much memory you have.
My question is: how can I load the data in the first loop in a way that lets me detect when memory becomes full? I don't necessarily need to keep the QImage objects in memory, only the data in the images vector.
First of all, the first loop has a memory leak because the img objects are never deleted.
From Qt documentation:
uchar * QImage::bits()
Returns a pointer to the first pixel data. This
is equivalent to scanLine(0).
Note that QImage uses implicit data sharing. This function performs a
deep copy of the shared pixel data, thus ensuring that this QImage is
the only one using the current return value.
So you can safely delete img at the end of the loop. (Keep in mind, though, that deleting img also frees the buffer that bits() returned, so copy the pixel data out before the delete if you still need it.)
....
    images.at(k) = data;
    delete img;
}
To detect when the memory becomes full, note that a plain new throws std::bad_alloc on failure rather than returning a null pointer, so a check like if(!img) will never fire. Use the nothrow form instead:
QImage *img = new (std::nothrow) QImage("/.../sample.png");
if (!img) {
    //out of memory
}
Even then, QImage can fail internally to allocate its pixel buffer, in which case img->isNull() returns true, so check that as well.
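For a pointer check like this to work, the allocation must use new (std::nothrow), since a plain new throws std::bad_alloc instead of returning nullptr. A minimal sketch with a raw byte buffer standing in for QImage (the helper is mine, for illustration):

```cpp
#include <cstddef>
#include <iostream>
#include <new>

// Returns nullptr instead of throwing when the allocation fails, so the
// caller can stop loading further frames gracefully.
unsigned char *allocateFrame(std::size_t bytes)
{
    unsigned char *data = new (std::nothrow) unsigned char[bytes];
    if (!data)
        std::cerr << "out of memory while allocating a frame\n";
    return data;
}
```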
Partial answer:
The first loop can be replaced by the following:
for (int k = 0; k < nFrames; k++)
{
    QImage *img = new QImage("/.../sample.png");
    sizes.at(k) = img->width() * img->height();
    uchar *data = new uchar[sizes.at(k)];
    std::copy(img->bits(), img->bits() + sizes.at(k), data);
    images.at(k) = data;
    delete img;
}
This creates in images.at(k) a copy of the data that img->bits() points to. (Btw, this now also makes it safe to delete the QImage at the end of the first for loop.) A std::bad_alloc exception is thrown in the loop if it runs out of memory.
However, this is not good enough. I suspect possible issues when nFrames is set to a value such that the maximum memory taken by the program is close to the limit (or when another program frees memory while this is running). My concern is that I still have no guarantee that img.bits() returns a pointer to accurate data.
I have a for loop that takes an OpenCV Mat object of n x n dimensions and returns a Mat object of n^2 x 1 dimensions. It works, but when I time the method it takes between 1 and 2 milliseconds. Since I am calling this method 3 or 4 million times, my program takes about an hour to run. A research paper I'm referencing suggests the author was able to produce a program with the same function that ran in only a few minutes, without running any threads in parallel. After timing each section of code, the only portion taking >1 ms is the following method.
static Mat mat2vec(Mat mat)
{
    Mat toReturn = Mat(mat.rows*mat.cols, 1, mat.type());
    float* matPt;
    float* retPt;
    for (int i = 0; i < mat.rows; i++) //rows
    {
        matPt = mat.ptr<float>(i);
        for (int j = 0; j < mat.row(i).cols; j++) //col
        {
            retPt = toReturn.ptr<float>(i*mat.cols + j);
            retPt[0] = matPt[j];
        }
    }
    return toReturn;
}
Is there any way that I can increase the speed at which this method converts an n x n matrix into an n^2 x 1 matrix (or cv::Mat representing a vector)?
That solved most of the problem, #berak - it's running a lot faster now. However, in some cases like below, the mat is not continuous. Any idea how I can get an ROI in a continuous mat?
My method now looks like this:
static Mat mat2vec(Mat mat)
{
    if ( ! mat.isContinuous() )
    {
        mat = mat.clone();
    }
    return mat.reshape(1, 2500);
}
Problems occur at:
Mat patch = Mat(inputSource, Rect((inputPoint.x - (patchSize / 2)), (inputPoint.y - (patchSize / 2)), patchSize, patchSize));
Mat puVec = mat2vec(patch);
Assuming that the data in your Mat is continuous, Mat::reshape() for the win - and it's almost free: only rows/cols get adjusted, no memory is moved. E.g., mat = mat.reshape(1,1) would make a 1-d float array of it.
Seeing this in OpenCV 3.2, but the function is now mat.reshape(1).
I have a function that looks like this:
void foo(){
    Mat mat(50000, 200, CV_32FC1);
    /* some manipulation using mat */
}
Then after several loops (in each loop, I call foo() once), it gives an error:
OpenCV Error: insufficient memory when allocating (about 1G) memory.
In my understanding, the Mat is local, and once foo() returns it is automatically deallocated, so I am wondering why it leaks.
And it leaks on some data, but not all of it.
Here is my actual code:
bool VidBOW::readFeatPoints(int sidx, int eidx, cv::Mat &keys, cv::Mat &descs, cv::Mat &codes, int &barrier) {
    // initialize buffers for keys and descriptors
    int num = 50000; /// a large number
    int nDims = 0; /// feature dimensions
    if (featName == "STIP")
        nDims = 162;
    Mat descsBuff(num, nDims, CV_32FC1);
    Mat keysBuff(num, 3, CV_32FC1);
    Mat codesBuff(num, 3000, CV_64FC1);
    // move overlapping codes from a previous window to buffer
    int idxPre = -1;
    int numPre = keys.rows;
    int numMov = 0; /// number of overlapping points to move
    for (int i = 0; i < numPre; ++i) {
        if (keys.at<float>(i, 0) >= sidx) {
            idxPre = i;
            break;
        }
    }
    if (idxPre > 0) {
        numMov = numPre - idxPre;
        keys.rowRange(idxPre, numPre).copyTo(keysBuff.rowRange(0, numMov));
        codes.rowRange(idxPre, numPre).copyTo(codesBuff.rowRange(0, numMov));
    }
    // the starting row in code matrix where new codes from the updated features to add in
    barrier = numMov;
    // read keys and descriptors from feature file
    int count = 0; /// number of new points that are read in buffers
    if (featName == "STIP")
        count = readSTIPFeatPoints(numMov, eidx, keysBuff, descsBuff);
    // update keys, descriptors and codes matrix
    descsBuff.rowRange(0, count).copyTo(descs);
    keysBuff.rowRange(0, numMov+count).copyTo(keys);
    codesBuff.rowRange(0, numMov+count).copyTo(codes);
    // see if reaching the end of a feature file
    bool flag = false;
    if (feof(fpfeat))
        flag = true;
    return flag;
}
You don't post the code that calls your function, so I can't tell whether this is a true memory leak. The Mat objects that you allocate inside readFeatPoints() will be deallocated correctly, so there are no memory leaks that I can see.
You declare Mat codesBuff(num, 3000, CV_64FC1);. With num = 50000, this means you're trying to allocate 1.2 gigabytes of memory in one big block. You also copy some of this data to codes with the line:
codesBuff.rowRange(0, numMov+count).copyTo(codes);
If the value of numMov + count changes between iterations, this will cause reallocation of the data buffer in codes. If the value is large enough, you may also be eating up a significant amount of memory that persists across iterations of your loop. Both of these things may lead to heap fragmentation. If at any point there isn't a contiguous 1.2 GB chunk of memory available, an insufficient-memory error occurs, which is what you have experienced.
Please help me handle this problem:
OpenCV Error: Insufficient memory (Failed to allocate 921604 bytes) in
unknown function, file
........\ocv\opencv\modules\core\src\alloc.cpp, line 52
One of my methods uses cv::clone and a pointer.
The code is:
There is a timer firing every 100 ms; in the timer event, I call this method:
void DialogApplication::filterhijau(const Mat &image, Mat &result) {
    cv::Mat resultfilter = image.clone();
    int nlhijau = image.rows;
    int nchijau = image.cols*image.channels();
    for(int j=0; j<nlhijau; j++) {
        uchar *data2=resultfilter.ptr<uchar> (j); //address of each line in result
        for(int i=0; i<nchijau; i+=3) { //3 bytes (B, G, R) per pixel
            *data2++ = 0;   //element B
            *data2++ = 255; //element G
            *data2++ = 0;   //element R
        }
        // free(data2); //I added this line but the program hung up
    }
    cv::addWeighted(resultfilter,0.3,image,0.5,0,resultfilter);
    result=resultfilter;
}
The clone() method of a cv::Mat performs a deep copy of the data. So the problem is that each call to filterhijau() allocates a new image, and after hundreds of calls to this method your application will have occupied hundreds of MBs (if not GBs), eventually throwing the Insufficient Memory error.
It seems like you need to redesign your current approach so it uses less RAM.
I faced this error before; I solved it by reducing the size of the images while reading them, sacrificing some resolution.
It was something like this in Python:
# Open the Video
cap = cv2.VideoCapture(videoName + '.mp4')
i = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (900, 900))
    # append the frames to the list
    images.append(frame)
    i += 1
cap.release()
N.B. I know it's not the most optimal solution to the problem, but it was enough for me.