OpenCV 3 SVM train with large Data - c++

I recently switched from OpenCV 2.4.6 to 3.0.
My Code looks like this:
Ptr<ml::SVM> pSVM = ml::SVM::create();
pSVM->setType(cv::ml::SVM::C_SVC);
pSVM->setKernel(cv::ml::SVM::LINEAR);
pSVM->setC(1);
cv::Ptr<cv::ml::TrainData> TrainData = cv::ml::TrainData::create(TrainMatrix, cv::ml::ROW_SAMPLE, Labels);
//TrainMatrix is a cv::Mat with 35000 rows and 1900 cols and float values in it. One Feature per row.
//Labels is a std::vector<int> with 35000 Elements with 1 and -1 in it.
pSVM->trainAuto(TrainData, 10, cv::ml::SVM::getDefaultGrid(cv::ml::SVM::C), cv::ml::SVM::getDefaultGrid(cv::ml::SVM::GAMMA), cv::ml::SVM::getDefaultGrid(cv::ml::SVM::P),
cv::ml::SVM::getDefaultGrid(cv::ml::SVM::NU), cv::ml::SVM::getDefaultGrid(cv::ml::SVM::COEF), cv::ml::SVM::getDefaultGrid(cv::ml::SVM::DEGREE), false);
When my program reaches the trainAuto method it crashes, and the error message says it cannot allocate 524395968 bytes. That number seems rather high. Before the crash the program consumes about 400 MB in Debug mode.
If I put a smaller matrix (about 500 rows) in the method everything runs normally.
Has anyone had the same problem and found a solution?
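Not an answer, but a hedged sketch that might help narrow it down: with a LINEAR kernel and a fixed C, a plain train() call skips trainAuto's k-fold parameter search, which trains on several data splits; if the plain call fits in memory, the extra allocations come from the cross-validation step. The variable names mirror the question, the wrapper function is only for illustration.
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <vector>

void trainWithFixedC(const cv::Mat& TrainMatrix, const std::vector<int>& Labels)
{
    cv::Ptr<cv::ml::SVM> pSVM = cv::ml::SVM::create();
    pSVM->setType(cv::ml::SVM::C_SVC);
    pSVM->setKernel(cv::ml::SVM::LINEAR);
    pSVM->setC(1);

    cv::Ptr<cv::ml::TrainData> trainData =
        cv::ml::TrainData::create(TrainMatrix, cv::ml::ROW_SAMPLE, Labels);

    // train() uses the parameters above as-is; trainAuto() additionally runs
    // k-fold cross-validation over the parameter grids, which costs more memory.
    pSVM->train(trainData);
}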

Related

MPEG 2 and 2.5 - problems calculating frame sizes in bytes

I have a console program which I have used for years, for (among other things) displaying info about certain audio-file formats, including mp3. I used data from the mpeghdr site to calculate the frame sizes, in order to further calculate playing time for the tracks. The equation that I got from mpeghdr was:
// Read the BitRate, SampleRate and Padding of the frame header.
// For Layer I files use this formula:
//
// FrameLengthInBytes = (12 * BitRate / SampleRate + Padding) * 4
//
// For Layer II & III files use this formula:
//
// FrameLengthInBytes = 144 * BitRate / SampleRate + Padding
This works well for most mp3 files, but there has always been a small subset for which this equation failed. Recently I've been looking at a set of very small mp3 files, and have found that for these files the formula fails much more often, so I'm trying to finally nail down what is going on. All of these mp3 files were generated using Lame V3.100, with default settings, on Windows 7 64-bit.
In all cases I can successfully find the first frame header, but when I use the above formula to calculate the offset to the next frame header, it is sometimes not correct.
As an example, I have a file 'wolf howl.mp3'; analysis tools such as MPEGAudioInfo show its frame size as 288 bytes. When I run my program, though, it shows the length of the first frame as 576 bytes (2 * 288). When I look at the mp3 file in a hex editor, with the first frame at 0x154, I can see that the next frame is at 0x154 + 208 bytes, yet the calculation does in fact come out to 576 bytes...
File info:
mpegV2.5, layer III
frame: bitrate=32, sample_rate=8000, pad=0, bytes=576
mtemp->frame_length_in_bytes =
(144 * (mtemp->bitrate * 1000) / mtemp->sample_rate) + mtemp->padding_bit;
which equals 576
I've looked at numerous other references, and they all show this equation...
At first I thought it was an issue with MPEG 2.5, which is an unofficial standard, but I have seen this with MPEG 2 files as well. It only happens with small files, though.
Does anyone have any insights on what I am missing here??
//**************************************
Later notes:
I thought maybe the audio format would be relevant to this issue, so I dumped channel_mode and mode_extension for each of my test files (3 calculate properly, 2 don't). Sadly, all of them are cmode=3, mode_ext=0
(i.e., last byte of the header is 0xC4)... so that doesn't help...
Okay, I found the answer to this question... it was in the MPEGAudioInfo program on the CodeProject site. Here is the vital key:
//*************************************************************************************
// This reference data is from MPEGAudioInfo app
// Samples per Frame / 8
static const u32 m_dwCoefficients[2][3] =
{
{ // MPEG 1
12, // Layer1 (must be multiplied with 4, because of slot size)
144, // Layer2
144 // Layer3
},
{ // MPEG 2, 2.5
12, // Layer1 (must be multiplied with 4, because of slot size)
144, // Layer2
72 // Layer3
}
};
It is unfortunate that none of the reference pages mention this detail!
My program now successfully calculates frame sizes for all of my mp3 files, including the small ones.
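To make the fix concrete, here is a minimal, self-contained sketch of the corrected calculation built around the coefficient table above. The function name and argument layout are my own invention; only the coefficients and the Layer I slot-size rule come from the reference data, so treat it as an illustration, not the MPEGAudioInfo code.
#include <cstdint>
#include <iostream>

// Samples per frame / 8, indexed by [version][layer - 1]
// version index: 0 = MPEG 1, 1 = MPEG 2 / 2.5
static const uint32_t kCoefficients[2][3] = {
    { 12, 144, 144 },   // MPEG 1:      Layer I, Layer II, Layer III
    { 12, 144,  72 }    // MPEG 2/2.5:  Layer I, Layer II, Layer III
};

// Slot size in bytes: Layer I uses 4-byte slots, Layers II/III use 1-byte slots
static const uint32_t kSlotSize[3] = { 4, 1, 1 };

uint32_t FrameLengthInBytes(int versionIdx, int layer,
                            uint32_t bitrate, uint32_t sampleRate,
                            uint32_t paddingBit)
{
    const uint32_t coeff = kCoefficients[versionIdx][layer - 1];
    const uint32_t slot  = kSlotSize[layer - 1];
    return (coeff * bitrate / sampleRate + paddingBit) * slot;
}

int main()
{
    // The 'wolf howl.mp3' example: MPEG 2.5, Layer III, 32 kbit/s, 8000 Hz, no padding
    std::cout << FrameLengthInBytes(1, 3, 32000, 8000, 0) << std::endl;  // prints 288
    return 0;
}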
I had the same problem. Some documents I've read don't include the division by 2 in the frame-size formula for MPEG 2.5 Layer III, but some source code I encountered does.
It's hard to find any authoritative proof.
I have nothing better than this link:
https://link.springer.com/chapter/10.1007/978-1-4615-0327-9_12
(it would be better to share that link as a comment, but I don't have enough reputation)

Audio manipulation and delete some part of the audio

I'm new to audio coding. I have succeeded in recording from the microphone and saving each 10 seconds to a file with a SaveRecordtoFile function (that part works with no problem).
Now I want to delete, for example, 2 seconds from the recorded data so my output will be 8 seconds instead of 10. In the randomTime array, 0 marks the seconds that I want to delete...
In a for loop I copy the data from waveHeader->lpData into a new buffer if (randomTime[i] == '1').
The algorithm seems correct and should work, but the problem is the outputs: some of them are good (about 70% or more) but some are corrupted.
I think I have a mistake in the code, but I have been debugging it for days and can't find the problem.
And since 70% or more of the outputs are good, I don't think it's a byte or sample issue.
Your code can break a sample apart; after that the stream is out of sync and you hear loud noise.
How does that happen? Your sample size is 4 bytes, so you must never copy anything that is not a multiple of 4. 10 seconds of audio take 10 x 48000 x 4 = 1920000 bytes. However, Sleep(10000) will always be near 10 seconds but not exactly 10 seconds, so you can get, say, 1920012 bytes. Then you do:
dwSamplePerSec = waveHeader->dwBytesRecorded / 10; // 10 Secs
which returns 192001 (not a multiple of 4), and the stream gets out of sync. If you're lucky you receive 1920040 bytes for 10 seconds, which is still a multiple of 4 after division by 10, and you're OK.
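A small sketch of that difference, assuming a 4-byte block align (16-bit stereo at 48 kHz); dwBytesRecorded comes from the question, everything else is illustrative:
#include <cstdint>
#include <iostream>

int main()
{
    const uint32_t blockAlign      = 4;        // bytes per sample frame (16-bit stereo)
    const uint32_t dwBytesRecorded = 1920012;  // slightly more than 10 s at 48 kHz

    // Naive split: not necessarily a multiple of blockAlign, so the stream can desync
    const uint32_t naivePerSec = dwBytesRecorded / 10;

    // Safe split: round down to a whole number of sample frames
    const uint32_t safePerSec = (dwBytesRecorded / 10) / blockAlign * blockAlign;

    std::cout << naivePerSec << " vs " << safePerSec << std::endl;  // 192001 vs 192000
    return 0;
}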

Tensorflow RNN slice error

I am attempting to create a multilayered RNN using LSTMs in tensorflow. I am using Tensorflow version 0.9.0 and python 2.7 on Ubuntu 14.04.
However, I keep getting the following error:
tensorflow.python.framework.errors.InvalidArgumentError: Expected begin[1] in [0, 2000], but got 4000
when I use
rnn_cell.MultiRNNCell([cell]*num_layers)
if num_layers is greater than 1.
My code:
size = 1000
config.forget_bias = 1
config.num_layers = 3
cell = rnn_cell.LSTMCell(size, forget_bias=config.forget_bias)
cell_layers = rnn_cell.MultiRNNCell([cell] * config.num_layers)
I would also like to be able to switch to using GRU cells but this gives me the same error:
Expected begin[1] in [0, 1000], but got 2000
I have tried explicitly setting
num_proj = 1000
which also did not help.
Is this something to do with my use of concatenated states? I have attempted to set
state_is_tuple=True
which gives:
`ValueError: Some cells return tuples of states, but the flag state_is_tuple is not set. State sizes are: [LSTMStateTuple(c=1000, h=1000), LSTMStateTuple(c=1000, h=1000), LSTMStateTuple(c=1000, h=1000)]`
Any help would be much appreciated!
I'm not sure why this worked, but I added a dropout wrapper, i.e.:
if Training:
    cell = rnn_cell.DropoutWrapper(cell, output_keep_prob=config.keep_prob)
And now it works.
This works for both LSTM and GRU cells.
This problem occurs because you have increased the number of layers of your GRU cell, but your initial state vector is not enlarged to match. If your initial_vector size is [batch_size, 50], then use
initial_vector = tf.concat(1, [initial_vector] * num_layers)
and feed this to the decoder as the initial state.

OpenCV matchTemplate throws memory fault but only first time

I am processing a set of >3000 images of the same size, changing the template every 300 images.
code snippet:
cv::Mat inTplate, cFrame, Cresult;
Cresult.create(resultH, resultW, IPL_DEPTH_32F);
cFrame(rect).copyTo(inTplate);
...
// this part executed for every frame
matchTemplate(cFrame, inTplate, Cresult, CV_TM_CCORR_NORMED);
minMaxLoc(Cresult, &minVal, &maxVal, &minLoc, &maxLoc, Mat());
rect = cv::Rect(250, 20, 1420, 1040); resultH = 41; resultW = 501;
The very first time through the code, the call to matchTemplate throws a memory fault that I believe comes from combase.dll and references an address that is not within any of the three matrices: cFrame, inTplate or Cresult.
Also, the sizes of the three matrices are consistent: cFrame is 1080 rows x 1920 cols, inTplate is 1040 rows x 1420 cols, and Cresult is 41 rows x 501 cols. Yes, the first time inTplate is a region of cFrame; thereafter cFrame is the next image read in.
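(For reference, matchTemplate's result should be (W_image - W_template + 1) x (H_image - H_template + 1) = (1920 - 1420 + 1) x (1080 - 1040 + 1) = 501 x 41, which matches the values above.)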
I verified that the answers coming back from matchTemplate are correct -- the matching works. And the memory fault occurs ONLY on the very first call, not on any of the subsequent frames.
Am I doing something wrong, or am I looking at a bug in OpenCV?
thanks for taking the time.
A sort of answer:
I modified my call to matchTemplate to use a try ... catch block, but it wouldn't catch the exception.
Then I went into Debug | Windows | Exception Settings and turned off the checkbox for cv::Exception.
Now the program runs without stopping on a memory exception. It would seem that I have now allowed OpenCV to catch the exception and deal with it. So the underlying issue still exists, but OpenCV is taking care of it. I would still like to understand why the exception is being thrown in the first place, though.
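For what it's worth, here is a minimal, self-contained sketch of the try/catch wrapper described above, with synthetic data and the sizes from the question. (One hedged observation: IPL_DEPTH_32F is an IplImage depth constant rather than a cv::Mat type such as CV_32FC1, so the sketch simply lets matchTemplate allocate the result; whether that is related to the fault is not established here.)
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Synthetic frame with the dimensions from the question
    cv::Mat cFrame(1080, 1920, CV_8UC1, cv::Scalar(0));

    cv::Rect rect(250, 20, 1420, 1040);
    cv::Mat inTplate;
    cFrame(rect).copyTo(inTplate);

    cv::Mat Cresult;  // let matchTemplate allocate it as CV_32FC1 (here 41 x 501)
    double minVal = 0, maxVal = 0;
    cv::Point minLoc, maxLoc;

    try {
        cv::matchTemplate(cFrame, inTplate, Cresult, cv::TM_CCORR_NORMED);
        cv::minMaxLoc(Cresult, &minVal, &maxVal, &minLoc, &maxLoc);
        std::cout << "best match at " << maxLoc << ", score " << maxVal << std::endl;
    } catch (const cv::Exception& e) {
        std::cerr << "OpenCV exception in matchTemplate: " << e.what() << std::endl;
    }
    return 0;
}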

haar training OpenCV assertion failed

I am trying to train a Haar-like classifier for pedestrians in OpenCV using 3340 positive images and 1224 negative images. (In one .txt file I keep the negative image names, i.e. negatives(1).bmp, and in another .txt file I keep the positives, i.e. picture(1).bmp 1 0 0 64 128.
The positive examples are already cropped images of pedestrians, so I only need to specify one positive sample per image.)
At some point during the training process it stops and says :
"Opencv Error: Assertion failed (elements_read==1)in unknown function, file c:\path\cvhaartraining.cpp, line 1858"
Any ideas as to what is causing this ?
This issue was answered by the creator of the utility on the OpenCV DevZone site in June 2012.
To quote Maria:
The problem is that your vec-file has exactly the same samples count that you passed in the command line: -numPos 979. The training application used all samples from the vec-file to train stage 0 and cannot get new positive samples for the next stage of training because the vec-file is over. The bug in traincascade is that it had an assert() in such cases, but it should throw an exception with an error message for the user. This was fixed in r8913.
-numPos is the samples count that is used to train each stage. Some already-used samples can be filtered out by each previous stage (i.e. recognized as background), but no more than (1 - minHitRate) * numPos per stage. So the vec-file has to contain >= (numPos + (numStages - 1) * (1 - minHitRate) * numPos) + S samples, where S is the count of samples from the vec-file that can be recognized as background right away. I hope this helps you to create a vec-file of the correct size and choose the right numPos value.
It worked for me. I also had the same problem; I was following the famous tutorial on HAAR training but wanted to try the newer training utility with
-npos 7000 -nneg 2973
so I did the following calculation:
vec-file has to contain >= (numPos + (numStages-1) * (1 - minHitRate) * numPos) + S
7000 >= (numPos + (20-1) * (1 - 0.999) * numPos) + 2973
(7000 - 2973)/(1 + 19*0.001) >= numPos
numPos <= 4027/1.019
numPos <= 3951 ~~ 3950
and used:
-npos 3950 -nneg 2973
It works. I also noticed that others have had success with reducing numPos.
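For anyone repeating this calculation with different settings, here is a tiny sketch of the same bound; the variable names are mine, the values are the ones used above.
#include <cmath>
#include <iostream>

int main()
{
    const int    vecCount   = 7000;    // samples in the vec-file
    const int    numStages  = 20;
    const double minHitRate = 0.999;
    const int    S          = 2973;    // samples expected to be rejected as background

    // vecCount >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + S
    // =>  numPos <= (vecCount - S) / (1 + (numStages - 1) * (1 - minHitRate))
    const double maxNumPos =
        (vecCount - S) / (1.0 + (numStages - 1) * (1.0 - minHitRate));

    std::cout << "numPos <= " << std::floor(maxNumPos) << std::endl;  // 3951
    return 0;
}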