I have been looking into writing my own implementation of Haar cascade face detection for some time now, and have begun by diving into the OpenCV 2.0 implementation.
Right out of the box, running in debug mode, Visual Studio breaks on cvhaar.cpp:1518, informing me:
Run-Time Check Failure #2 - Stack around the variable seq_thread was corrupted.
It seems odd to me that OpenCV ships with a simple array out-of-bounds problem. Running the release build works without any problems, but I suspect that it simply does not perform the check and the array is still being accessed out of bounds.
Why am I receiving this error message? Is it a bug in OpenCV?
A little debugging revealed the culprit, I believe. I "fixed" it, but this all still seems odd to me.
An array of size CV_MAX_THREADS is created on cvhaar.cpp:868:
CvSeq* seq_thread[CV_MAX_THREADS] = {0};
On line 918 it proceeds to specify max_threads:
max_threads = cvGetNumThreads();
In various places, seq_thread is looped using the following for statement:
for( i = 0; i < max_threads; i++ ) {
CvSeq* s = seq_thread[i];
// ...
}
However, cxmisc.h:108 declares CV_MAX_THREADS:
#define CV_MAX_THREADS 1
Hence seq_thread has exactly one element, yet cvGetNumThreads() returns 2 (I assume this reflects the number of cores in my machine), so the loop reads seq_thread[1], which is out of bounds.
I resolved the problem by adding the following simple little statement:
if (max_threads > CV_MAX_THREADS) max_threads = CV_MAX_THREADS;
Does any of this make sense?
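The clamp described above can be sketched in isolation. This is a minimal illustration, not OpenCV code: kMaxThreads is a hypothetical stand-in for CV_MAX_THREADS, and the reportedThreads parameter plays the role of whatever cvGetNumThreads() returns.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for CV_MAX_THREADS (1 in the build discussed above).
constexpr int kMaxThreads = 1;

// Clamp a runtime thread count to the compile-time array capacity so that
// loops over a fixed-size per-thread array can never index past its end.
int clampThreadCount(int reportedThreads) {
    return std::max(1, std::min(reportedThreads, kMaxThreads));
}

// Walk a per-thread array using the clamped count and return how many
// slots were visited.
int visitPerThreadSlots(int reportedThreads) {
    int slots[kMaxThreads] = {0};
    const int n = clampThreadCount(reportedThreads);
    assert(n <= kMaxThreads);  // the invariant the original loop was missing
    int visited = 0;
    for (int i = 0; i < n; i++) {
        slots[i] = 1;  // always in bounds because n <= kMaxThreads
        visited++;
    }
    return visited;
}
```

With kMaxThreads at 1, a reported count of 2 (or 8) still touches only one slot, which is the behaviour the one-line fix is after.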
I recently upgraded my ray tracing renderer from Vulkan SDK version 1.2.148.0 to 1.2.162.1.
This was necessary because the ray tracing extension went out of beta and thus now works with non-beta
graphics drivers (am on version 461.40 for my RTX 2070 SUPER). It required me to make quite a few changes to the ray tracing side of my renderer which
I managed thanks to the nvidia tutorial.
Unfortunately, code that used to work started to cause errors now.
In many situations, submitting a single time command causes vkQueueWaitIdle to fail with VK_ERROR_DEVICE_LOST which results in a validation error, saying I'm trying to free the command buffer while it's still in use. This happens for a variety of uses: transitioning an image layout(undef to general it seems), building acceleration structures, copying buffers but not every time (e.g. from a staging to a device buffer, after which freeing the staging buffer also throws an error, since it's still in use, the copy not having finished)... But for other uses, it works fine. I can't currently identify a common denominator...
Finally, the program crashes because presenting the first frame fails, because its layout is undefined - I assume this is caused by one or more of the previously mentioned errors.
Did something change about this since last I used it? This is the offending code (endSingleTimeCommands):
vkEndCommandBuffer(commandBuffer);
VkSubmitInfo submitInfo{};
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &commandBuffer;
vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
switch (vkQueueWaitIdle(graphicsQueue)) {
//debug output removed for brevity
};
vkFreeCommandBuffers(device, commandPool, 1, &commandBuffer);
One of the places where it fails is this:
//[fill the structs with info...]
//function pointer grabbed via vkGetDeviceProcAddr
vk::vkCmdBuildAccelerationStructuresKHR(cmd, 1, &buildInfo, &buildOffset);
//[call to the above code here]
But also code unrelated to extensions fails (sometimes!) such as this one:
VkCommandBuffer commandBuffer = beginSingleTimeCommands();
VkBufferCopy copyRegion{};
copyRegion.srcOffset = 0; // Optional
copyRegion.dstOffset = 0; // Optional
copyRegion.size = size;
vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1, &copyRegion);
endSingleTimeCommands(commandBuffer);
Perhaps beginSingleTimeCommands is also relevant:
VkCommandBufferAllocateInfo allocInfo{};
allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
allocInfo.commandPool = commandPool;
allocInfo.commandBufferCount = 1;
VkCommandBuffer commandBuffer;
if (vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer) != VK_SUCCESS) {
std::cout << "beginSingleTimeCommands: could not allocate command buffer!\n";
}
VkCommandBufferBeginInfo beginInfo{};
beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
if (vkBeginCommandBuffer(commandBuffer, &beginInfo) != VK_SUCCESS) {
std::cout << "beginSingleTimeCommands: could not begin command buffer!\n";
}
return commandBuffer;
Some additional info I think I gathered:
I used the nvidia pipeline checkpoint system to add a checkpoint before and after the call to vkCmdBuildAccelerationStructuresKHR and both checkpoints are at TOP_OF_PIPE. After the first call to this function, no more checkpoint output is generated, leading me to believe that the first call to the build somehow ruins everything. I will triplecheck my AS building I guess, I'll get back to you if I find anything.
Turns out, the actual error can occur before the command buffer whose vkQueueWaitIdle returns the DEVICE_LOST error. I've had and continue to have a variety of errors in my acceleration structure building code. I can't easily debug it, because apparently the validation layers don't flag subtle mistakes in the structs fed to vkCmdBuildAccelerationStructuresKHR; instead it's a lot of trial and error.
One notable example which I'm certain would've been caught by the validation layers pre-upgrade is forgetting to set the VkAccelerationStructureBuildGeometryInfoKHR::scratchData field, the last mistake I had to fix to finally get everything to run.
The answer to my question is thus: Don't look at the commands that trigger the DEVICE_LOST, look at what you do with the queue before that command, there's a chance the error is there, instead. In fact, once the first DEVICE_LOST error occurred, (almost?) all further vkQueueWaitIdle failed with the same error (same with the vkQueueSubmit). In cases such as my copy buffer code being the first to fail, the error was always found in the queue usage before that one.
I can't post the exact solution to my problem as - like I've said - there's more than one cause and I've only fixed some of them so far, there's still some left. I think the details are not relevant to future people who come across my question but if there's anything I can add to help other people, please let me know.
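The advice above (look at what was submitted before the call that reports DEVICE_LOST) can be sketched as a small bookkeeping helper. This is only an illustration under stated assumptions: Result is a stand-in for VkResult, and SubmitLog and its labels are hypothetical names, not part of Vulkan or of my renderer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for VkResult with only the two outcomes this sketch needs.
enum class Result { Success, DeviceLost };

// Records every queue submission in order and remembers the first failure.
class SubmitLog {
public:
    void record(const std::string& label, Result result) {
        labels_.push_back(label);
        if (result != Result::Success && firstFailure_ == kNone) {
            firstFailure_ = labels_.size() - 1;
        }
    }

    bool failed() const { return firstFailure_ != kNone; }

    // Submissions worth inspecting: everything up to and including the first
    // one that reported a failure. Later failures are likely just fallout.
    std::vector<std::string> suspects() const {
        if (!failed()) return {};
        const auto count = static_cast<std::ptrdiff_t>(firstFailure_) + 1;
        return std::vector<std::string>(labels_.begin(), labels_.begin() + count);
    }

private:
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);
    std::vector<std::string> labels_;
    std::size_t firstFailure_ = kNone;
};
```

In real code the label would be recorded next to each vkQueueSubmit / vkQueueWaitIdle pair; the point is only that submissions which "succeeded" before the first failure stay on the suspect list.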
This is so true! I was stuck with this issue for a couple of days, only to figure out that the VkAccelerationStructureBuildGeometryInfoKHR flags did not match between querying the size with vkGetAccelerationStructureBuildSizesKHR() and actually building the BLAS. In my case, I was using VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR | VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR while querying the size and only FAST_TRACE while actually creating the AS, and this caused the same issue!
I tried to program Ackermann function on my notebook (Win10), however the program crashed at higher values instead of continuing to calculate for a few minutes or hours.
My friend tried the same code on his machine (SUSE) and it worked just fine, then we tried it on the school server (CentOS) and it crashed yet again.
EDIT: It worked on server too, it just needed a second try. It also worked on the other server we had tried... All of it is on Linux.
We suspect a stack overflow is behind it, but it's weird, because the values aren't THAT HIGH yet. How am I able to perform recursive functions on this system then?
Thanks for all the answers. I'm just curious why it happens and how to make it work on my machine.
I tried to use both C and C++ to no change.
#include <stdio.h>
int ackermann (int m, int n);
int main () {
int m = 4;
int n = 1;
return ackermann(m,n);
}
int ackermann (int m, int n)
{
if (m == 0) return n=n+1;
else if (m > 0 && n == 0) return ackermann(m-1,1);
else if (m > 0 && n > 0) return ackermann(m-1,ackermann(m,n - 1));
}
Sounds like it's stack smashing, or more accurately a stack buffer overflow, which is the error message you got. The function has a very large depth, and will keep on pushing variables on the stack. If you push enough variables onto the stack, it'll smash. The computer doesn't have infinite memory for the stack and from the sounds of it, not a crazy amount either.
Given big enough input, it should smash any stack if not optimized well (and might still anyway). Without knowing more about your compiler, I can't give a comprehensive way to disable the check, but if you're using Microsoft Visual Studio's compiler suite, you can disable the protection if I understand this doc correctly.
In Solution Explorer, right-click the project and then click Properties
In the Property Pages dialog box, click the C/C++ folder.
Click the Code Generation property page.
Modify the Buffer Security Check property.
As a warning, this is not safe and bad things can happen, so use this at your own risk.
In Visual Studio the default stack size is 1 MB. With a recursion depth of 65,535 and, I believe, a minimum stack frame of 72 bytes per call for a function of this type on x64, you need 65,535 × 72 = 4,718,520 bytes (about 4.5 MB) of stack, so the program runs out of stack space. That is what produced the stack buffer overflow error; it has nothing to do with stack smashing, other than that you went beyond the maximum stack size available to your program as compiled.
Most compilers including Visual Studio let you specify the stack size.
More details:
https://learn.microsoft.com/en-us/cpp/build/reference/f-set-stack-size?view=vs-2017
[ Edited to reflect 65,535 frames, and not 1.4 billion ]
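Besides raising the stack size, another way to sidestep the limit is to keep the pending calls in heap memory instead of on the call stack. The following is a minimal sketch of that idea, not something from the question; the name ackermannIterative is my own, and it uses unsigned 64-bit values rather than int.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Ackermann function with the pending calls kept in a heap-allocated vector
// instead of on the call stack, so the depth is limited by available memory
// rather than by the thread's stack size.
std::uint64_t ackermannIterative(std::uint64_t m, std::uint64_t n) {
    std::vector<std::uint64_t> pending;  // values of m still waiting for an n
    pending.push_back(m);
    while (!pending.empty()) {
        m = pending.back();
        pending.pop_back();
        if (m == 0) {
            // Guard against unsigned wraparound before incrementing.
            assert(n != std::numeric_limits<std::uint64_t>::max());
            n = n + 1;                 // A(0, n) = n + 1
        } else if (n == 0) {
            pending.push_back(m - 1);  // A(m, 0) = A(m - 1, 1)
            n = 1;
        } else {
            pending.push_back(m - 1);  // outer call, resumed later
            pending.push_back(m);      // inner call A(m, n - 1) runs first
            n = n - 1;
        }
    }
    return n;
}
```

Small inputs such as ackermannIterative(2, 3) or ackermannIterative(3, 3) return immediately. The question's input (4, 1) should also complete without a stack overflow, but it still needs billions of loop iterations, so expect it to take a while.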
I am trying to create a descriptor using FAST for point detection and SIFT for building the descriptor, using OpenCV. I use OpenCV's FAST as is, but only parts of the SIFT code, because I only need the descriptor. Now I have a really nasty malloc error and I don't know how to solve it. I posted my code on GitHub because it is big and I don't really know where the error comes from. I just know that it is raised at the end of the do-while loop:
features2d.push_back(features);
features.clear();
candidates2d.push_back(candidates);
candidates.clear();
}
}while(candidates.size() > 100);
As you can see in the code of GitHub I already tried to release Memory of the Application. Xcode Analysis says, that my Application uses 9 Mb memory. I tried to debug the Error but It was very complicated and I haven't found any clue where the Error comes from.
EDIT
I wondered if this error could occur because I access the image pixel values passed to calcOrientationHist(...) with img.at<sift_wt>(...), where typedef float sift_wt, at lines 56 and 57 in my code. Normally the patch I pass reports type 0, which means it is CV_8UC1. But I copied this part from sift.cpp, lines 330 and 331. Shouldn't the SIFT descriptor normally also take a grayscale image?
EDIT2
After changing the type in the img.at<sift_wt>(...) position, nothing changed. So I googled for solutions and landed on the GuardMalloc feature of Xcode. Enabling it showed me a new error, which is probably the reason I get the malloc error. It occurs at line 77 of my code: EXC_BAD_ACCESS (Code=1, address=....). These are the lines in question:
for( k = 0; k < len; k ++){
int bin = cvRound((n/360.f)+Ori[k]);
if(bin >= n)
bin -=n;
if(bin < 0 )
bin +=n;
temphist[bin] += W[k]*Mag[k];
}
The Values of the mentioned Variables are the following:
bin = 52, len = 169, n = 36, k = 0, W, Mag, Ori and temphist are not shown.
Here is the GuardMalloc output (sorry, but I don't really understand what exactly it wants):
GuardMalloc[Test-1935]: Allocations will be placed on 16 byte boundaries.
GuardMalloc[Test-1935]: - Some buffer overruns may not be noticed.
GuardMalloc[Test-1935]: - Applications using vector instructions (e.g., SSE) should work.
GuardMalloc[Test-1935]: version 108
Test(1935,0x102524000) malloc: protecting edges
Test(1935,0x102524000) malloc: enabling scribbling to detect mods to free blocks
The answer is simpler than I thought...
The problem was that the calculation of bin in the for loop produced the wrong value. Instead of adding Ori[k], it should multiply by Ori[k].
That mistake resulted in a bin value of 52, but the array that temphist points to has a length of 38.
To everyone with similar errors: I really recommend using GuardMalloc or Valgrind to debug malloc errors.
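The corrected bin calculation can be sketched on its own. This is a minimal illustration rather than the code from my repository: std::lround is used as a stand-in for OpenCV's cvRound, and orientationBin is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Map an orientation in degrees to one of n histogram bins.
// The scale factor n/360 is MULTIPLIED by the angle; adding it, as in the
// buggy version, leaves the angle almost unchanged and far outside [0, n).
int orientationBin(float angleDegrees, int n) {
    int bin = static_cast<int>(std::lround((n / 360.f) * angleDegrees));
    if (bin >= n) bin -= n;  // wrap 360 degrees back to bin 0
    if (bin < 0) bin += n;   // wrap negative angles into range
    assert(bin >= 0 && bin < n);  // holds for angles in [-360, 360]
    return bin;
}
```

With n = 36, every angle between -360 and 360 degrees lands in a bin from 0 to 35, so an index such as 52 can no longer reach temphist.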
I am working in a project where I must interface between managed code and unmanaged code. I am currently having a strange issue with math.h.
Some functions operating on floating-point numbers randomly return values that are effectively 0 (e.g. 2.1219957934356005e-314).
For example:
int error = 0;
int success = 0;
for (int i = 0; i < 1000; ++i)
{
double test = std::sqrt(9.01);
if (test < 2 || test > 4)
{
++error;
}
else
{
++success;
}
}
With breakpoints set I usually get error = 1000. If I retry a few times, I sometimes get 1000 errors again and sometimes 1000 successes...
I see nothing wrong in the disassembly or in the registers (except the bad result).
For context: this code is compiled into a dll for 64 bits and is used by C#. This app is indeed multithreaded.
Any idea?
I think I solved a part of my problem (I must do more tests).
The main problem was caused by the fact that my project switched to clr (managed code) for everything. With more research I found that I can compile specific files of my project with the clr property set to no. Now all my old native source code is compiled as native. But that still doesn't explain why functions like sqrt(), ceil(), etc. do not work every time in a managed context.
Now I must resolve bugs with OpenGL, but that's another story :D (This project is an abomination!)
I have a function that creates and inserts some numbers in a vector.
if(Enemy2.dEnemy==true)
{
pt.y=4;
pt.x=90;
pt2.y=4;
pt2.x=125;
for(int i=0; i<6; i++)
{
Enemy2.vS1Enemy.push_back(pt);
Enemy2.vS2Enemy.push_back(pt2);
y-=70;
pt.y=y;
pt2.y=y;
}
Enemy2.dEnemy=false;
Enemy3.cEnemy=0;
}
It should insert 6 numbers in two vectors, the only problem is that it doesn't - it actually inserts more.
I don't think the snippet will run unless Enemy2.dEnemy == true, and it won't stay true forever.
The first time the snippet runs, then Enemy2.dEnemy is set to false and it shouldn't run again.
I don't set Enemy2.dEnemy to true anywhere except when the window is created.
If I insert a breakpoint anywhere in the snippet, the program works fine - it inserts ONLY 6 numbers in the two vectors.
Any ideas what's wrong here?
ok so i did some debugging.
i found that Enemy2.dEnemy=false; is being skipped for some reason.
i tried to do this to see if it was.
if(Enemy2.dEnemy)
{
pt.y=4;
pt.x=90;
pt2.y=4;
pt2.x=125;
for(int i=0; i<6; i++)
{
Enemy2.vS1Enemy.push_back(pt);
Enemy2.vS2Enemy.push_back(pt2);
y-=70;
pt.y=y;
pt2.y=y;
}
TCHAR s[244];
Enemy2.dEnemy=false;
if(Enemy2.dEnemy)
{
MessageBox(hWnd, _T("0"), _T(""), MB_OK);
}
else
{
MessageBox(hWnd, _T("1"), _T(""), MB_OK);
}
Enemy3.cEnemy=0;
}
well the message box popped saying 1 and my code worked fine. it seems that Enemy2.dEnemy=false; doesn't have time to run ;/
ok i found the real problem that was causing more than 6 numbers to be inserted..
it was where i was assigning Enemy2.dEnemy=true;
if(Enemy2.e1)
{
Enemy2.now=time(NULL);
Enemy2.tEnemy=Enemy2.now+4;
Enemy2.e1=false;
}
if(Enemy2.tEnemy==time(NULL))
{
check=1;
Enemy2.aEnemy=0;
Enemy2.dEnemy=true;
}
the problem seems that the second if runs more than one time, which is weird!
First things first: get rid of that abominable if (Enemy2.dEnemy == true) - it should be:
if (Enemy2.dEnemy)
(I also prefer to name my booleans as readable sentence segments like Enemy2.isABerserker or Enemy3.hasHadLeftLegCutOffThreeInchesBelowTheKnee but that's just personal preference).
Other than that, the only thing I can suggest is a threading problem. There's nothing wrong with that code per se, but there is a window in which two threads could enter the if statement and both start pushing values into your vector.
In other words, if thread 1 is doing the pushing when thread 2 encounters the if statement, thread 2 will also start pushing values, since thread 1 has yet to set dEnemy to false. And don't think you can just move the assignment to the top of the if block - that will reduce but not remove the window.
My advice is to print out the contents of the vectors in the situation where they have more than six entries and that may give a clue as to what's happened (post the output here if you wish).
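If threading does turn out to be the cause, one way to close the window described above is to test and clear the flag in a single atomic step. The following is a minimal sketch under that assumption; the Enemy struct here is a cut-down, hypothetical version of yours (ints instead of points), and fillOnce is a name I made up.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

// Cut-down, hypothetical version of the Enemy data from the question.
struct Enemy {
    std::atomic<bool> dEnemy{true};
    std::mutex vectorMutex;
    std::vector<int> vS1Enemy;
};

// Fills the vector at most once, no matter how many threads call this.
// Returns true only for the single call that did the fill.
bool fillOnce(Enemy& e) {
    // exchange() returns the old value and stores false in one atomic step,
    // so at most one caller can ever observe true.
    if (!e.dEnemy.exchange(false)) return false;

    std::lock_guard<std::mutex> lock(e.vectorMutex);
    assert(e.vS1Enemy.empty());  // nothing else should have filled it
    for (int i = 0; i < 6; i++) {
        e.vS1Enemy.push_back(i);
    }
    return true;
}
```

Unlike a plain bool that is read in the if and written later, there is no gap here between the test and the assignment for a second thread to slip through.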
Re your update that the second if below is running twice:
if(Enemy2.e1)
{
Enemy2.now=time(NULL);
Enemy2.tEnemy=Enemy2.now+4;
Enemy2.e1=false;
}
if(Enemy2.tEnemy==time(NULL))
{
check=1;
Enemy2.aEnemy=0;
Enemy2.dEnemy=true;
}
If this code is executed twice in the same second (and that's not beyond the bounds of possibility), the second if statement will run twice.
That's because time(NULL) gives you the number of seconds since the epoch so, until that second is over, you may well be executing the contents of that if thousands of times (or more).
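One way around this is to make the trigger fire exactly once, and to compare with >= rather than == so that a skipped second cannot make it miss. This is a minimal sketch of that idea; OneShotTimer is a hypothetical helper, and the current time is passed in as a parameter so the logic can be checked without waiting (real code would pass time(NULL)).

```cpp
#include <ctime>

// One-shot timer: fires exactly once, on the first poll at or after the
// deadline, however many times it is polled within the same second.
struct OneShotTimer {
    std::time_t deadline;
    bool fired = false;

    explicit OneShotTimer(std::time_t d) : deadline(d) {}

    // Returns true only on the single poll that triggers the timer.
    bool poll(std::time_t now) {
        if (fired || now < deadline) return false;
        fired = true;
        return true;
    }
};
```

In your code this would replace the if(Enemy2.tEnemy==time(NULL)) test: set the deadline to now + 4 where e1 is handled, and set dEnemy to true only when poll returns true.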
If this problem disappears when you put in a breakpoint or a diagnostic output message, that's a strong clue that the problem is undefined behavior, which is usually caused by something like dereferencing an uninitialized pointer or careless use of const_cast.
The cause of the problem probably has nothing to do with the code you're looking at. It's caused somewhere else and just happens to show up here. It's like someone being hit by a falling brick: the obvious symptom is a man lying unconscious on the sidewalk, but the real problem has nothing to do with the man or the sidewalk, it's several stories up.
If you want to find the cause of the error, remove your diagnostics until the problem reappears, then start removing everything else. Prune away all of the other code. Whenever the error stops, back up until it starts again; if you don't see the cause of the error, start pruning somewhere else. Eventually the bug will have nowhere to hide.