What is the best way to debug OpenGL? [closed]

I find that a lot of the time, OpenGL will show you it failed by not drawing anything. I'm trying to find ways to debug OpenGL programs, by inspecting the transformation matrix stack and so on. What is the best way to debug OpenGL? If the code looks and feels like the vertices are in the right place, how can you be sure they are?

There is no single answer. It all depends on what you are trying to understand. Since OpenGL is a state machine, it sometimes does not do what you expect because some required state is not set, or something similar.
In general, use tools like glTrace / glIntercept (to look at the OpenGL call trace), gDebugger (to visualize textures, shaders, OGL state, etc.) and paper/pencil :). Sometimes it helps to work out how you have set up the camera, where it is looking, what is being clipped, and so on. I have personally relied more on the last approach than on the previous two. But when I can argue that, say, the depth is wrong, it helps to look at the trace. gDebugger is also the only one of these tools that can be used effectively for profiling and optimizing your OpenGL app.
Apart from these tools, most of the time it is the math that people get wrong, and that can't be caught with any tool. Post on the OpenGL.org forum for code-specific comments; you will never be disappointed.

What is the best way to debug OpenGL?
Setting aside additional and external tools (which other answers already cover), the general way is to call glGetError() extensively. A better alternative, however, is Debug Output (KHR_debug, ARB_debug_output), which lets you register a callback that receives messages of varying severity levels.
In order to use debug output, the context must be created with the WGL/GLX_DEBUG_CONTEXT_BIT flag. With GLFW this can be set with the GLFW_OPENGL_DEBUG_CONTEXT window hint.
glfwWindowHint(GLFW_OPENGL_DEBUG_CONTEXT, GL_TRUE);
Note that if the context isn't a debug context, then you aren't guaranteed to receive all, or even any, messages.
Whether you have a debug context or not can be detected by checking GL_CONTEXT_FLAGS:
GLint flags;
glGetIntegerv(GL_CONTEXT_FLAGS, &flags);
if (flags & GL_CONTEXT_FLAG_DEBUG_BIT)
    ; // It's a debug context
You would then go ahead and specify a callback:
// APIENTRY matters on Windows: the calling convention must match GLDEBUGPROC
void APIENTRY debugMessage(GLenum source, GLenum type, GLuint id, GLenum severity,
                           GLsizei length, const GLchar *message, const void *userParam)
{
    // Print, log, whatever based on the enums and message
}
Each possible value for these enums can be found in the Debug Output documentation. Especially remember to check the severity, as some messages might just be notifications and not errors.
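For instance, a filled-in version of the callback above might look like this sketch (the filtering policy and log format are just one choice):
void APIENTRY debugMessage(GLenum source, GLenum type, GLuint id, GLenum severity,
                           GLsizei length, const GLchar *message, const void *userParam)
{
    // Skip chatty notifications and report everything else
    if (severity == GL_DEBUG_SEVERITY_NOTIFICATION)
        return;
    fprintf(stderr, "GL debug [source=0x%X, type=0x%X, id=%u]: %s\n",
            source, type, id, message);
}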
You can now go ahead and register the callback.
glEnable(GL_DEBUG_OUTPUT);
glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
glDebugMessageCallback(debugMessage, NULL);
glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DONT_CARE, 0, NULL, GL_TRUE);
You can even inject your own messages using glDebugMessageInsert().
glDebugMessageInsert(GL_DEBUG_SOURCE_APPLICATION, GL_DEBUG_TYPE_ERROR, 0,
                     GL_DEBUG_SEVERITY_NOTIFICATION, -1, "Very dangerous error");
When it comes to shaders and programs you always want to be checking GL_COMPILE_STATUS, GL_LINK_STATUS and GL_VALIDATE_STATUS. If any of them reflects that something is wrong, then additionally always check glGetShaderInfoLog() / glGetProgramInfoLog().
GLint linkStatus;
glGetProgramiv(program, GL_LINK_STATUS, &linkStatus);
if (!linkStatus)
{
    GLint infoLogLength;
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &infoLogLength);
    GLchar *infoLog = new GLchar[infoLogLength + 1];
    glGetProgramInfoLog(program, infoLogLength, NULL, infoLog);
    ...
    delete[] infoLog;
}
The string returned by glGetProgramInfoLog() will be null terminated.
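The same pattern applies to GL_COMPILE_STATUS on individual shader objects; a minimal sketch, assuming shader holds a valid shader object:
GLint compileStatus;
glGetShaderiv(shader, GL_COMPILE_STATUS, &compileStatus);
if (!compileStatus)
{
    GLint infoLogLength;
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &infoLogLength);
    GLchar *infoLog = new GLchar[infoLogLength + 1];
    glGetShaderInfoLog(shader, infoLogLength, NULL, infoLog);
    fprintf(stderr, "Shader compile error: %s\n", infoLog);
    delete[] infoLog;
}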
You can also go a bit more extreme and utilize a few debug macros in a debug build, for instance using the glIs*() functions to check that an object actually is of the expected type.
assert(glIsProgram(program) == GL_TRUE);
glUseProgram(program);
If debug output isn't available and you just want to use glGetError(), then you're of course free to do so.
GLenum err;
while ((err = glGetError()) != GL_NO_ERROR)
    printf("OpenGL Error: %u\n", err);
Since a numeric error code isn't that helpful, we could make it a bit more human readable by mapping the numeric error codes to a message.
const char* glGetErrorString(GLenum error)
{
    switch (error)
    {
    case GL_NO_ERROR:                      return "No Error";
    case GL_INVALID_ENUM:                  return "Invalid Enum";
    case GL_INVALID_VALUE:                 return "Invalid Value";
    case GL_INVALID_OPERATION:             return "Invalid Operation";
    case GL_INVALID_FRAMEBUFFER_OPERATION: return "Invalid Framebuffer Operation";
    case GL_OUT_OF_MEMORY:                 return "Out of Memory";
    case GL_STACK_UNDERFLOW:               return "Stack Underflow";
    case GL_STACK_OVERFLOW:                return "Stack Overflow";
    case GL_CONTEXT_LOST:                  return "Context Lost";
    default:                               return "Unknown Error";
    }
}
Then checking it like this:
printf("OpenGL Error: [%u] %s\n", err, glGetErrorString(err));
That still isn't very helpful, or rather not very intuitive: if you have sprinkled glGetError() calls here and there, locating which one actually logged an error can be troublesome.
Again macros come to the rescue.
void _glCheckErrors(const char *filename, int line)
{
    GLenum err;
    while ((err = glGetError()) != GL_NO_ERROR)
        printf("OpenGL Error: %s (%d) [%u] %s\n", filename, line, err, glGetErrorString(err));
}
Now simply define a macro like this:
#define glCheckErrors() _glCheckErrors(__FILE__, __LINE__)
and voila, you can now call glCheckErrors() after anything you want; in case of an error it will tell you the exact file and line where it was detected.
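For example (texture, width, height and pixels are placeholders here):
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, pixels);
glCheckErrors(); // on error: prints this file and this line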

GLIntercept is your best bet. From their web page:
Save all OpenGL function calls to text or XML format with the option to log individual frames.
Free camera. Fly around the geometry sent to the graphics card and enable/disable wireframe, backface culling, and view-frustum rendering.
Save and track display lists.
Saving of the OpenGL frame buffer (color/depth/stencil) pre and post render calls. The ability to save the "diff" of pre and post images is also available.

Apitrace is a relatively new tool from some folks at Valve, but it works great! Give it a try: https://github.com/apitrace/apitrace

I found you can call glGetError after every line of code you suspect might be wrong. The code doesn't look very clean afterwards, but it works.

For those on Mac, the built-in OpenGL debugger is great as well. It lets you inspect buffers and state, and it helps in finding performance issues.

gDebugger is an excellent free tool, but it is no longer supported. However, AMD has picked up its development, and the debugger is now known as CodeXL. It is available both as a standalone application and as a Visual Studio plugin; it works for native C++ applications as well as Java/Python applications using OpenGL bindings, on both NVIDIA and AMD GPUs. It's one hell of a tool.

There is also the free glslDevil: http://www.vis.uni-stuttgart.de/glsldevil/
It allows you to debug GLSL shaders extensively, and it also shows failed OpenGL calls.
However, it's missing features to inspect textures and off-screen buffers.

Nsight is a good debugging tool if you have an NVIDIA card.

Download and install RenderDoc.
It will give you a timeline where you can inspect the details of every object.
Use glObjectLabel to give your OpenGL objects names; these will show up in RenderDoc.
Enable GL_DEBUG_OUTPUT and use glDebugMessageCallback to install a callback function. Very verbose but you won't miss anything.
Check glGetError at the end of every function scope. This way you get a traceback to the function scope in which an OpenGL error occurred. Very maintainable. Preferably, use a scope guard (a minimal sketch follows at the end of this answer).
I don't check errors after each OpenGL call unless there's a good reason for it. For example, if glBindProgramPipeline fails that is a hard stop for me.
You can find out even more about debugging OpenGL here.
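As a minimal sketch of such a scope guard (the class name, the example function, and the logging are my own; it simply drains the GL error queue when the scope ends):
struct GLErrorGuard
{
    const char *scope;
    explicit GLErrorGuard(const char *s) : scope(s) {}
    ~GLErrorGuard()
    {
        // Drain the error queue so the report points at this function scope
        GLenum err;
        while ((err = glGetError()) != GL_NO_ERROR)
            fprintf(stderr, "GL error 0x%X in %s\n", err, scope);
    }
};

void uploadTexture(GLuint tex) // hypothetical function using the guard
{
    GLErrorGuard guard("uploadTexture");
    glObjectLabel(GL_TEXTURE, tex, -1, "Diffuse map"); // this label shows up in RenderDoc
    // ... more GL calls ...
} // the guard's destructor reports any errors raised in this scope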

Updating the window title dynamically is convenient for me.
Example (using GLFW, C++11):
glfwSetWindowTitle(window, ("Now Time is " + to_string(glfwGetTime())).c_str());

Related

After upgrading to 1.2.162.1: vkQueueWaitIdle == VK_ERROR_DEVICE_LOST

I recently upgraded my ray tracing renderer from Vulkan SDK version 1.2.148.0 to 1.2.162.1. This was necessary because the ray tracing extension went out of beta and thus now works with non-beta graphics drivers (I am on version 461.40 for my RTX 2070 SUPER). It required me to make quite a few changes to the ray tracing side of my renderer, which I managed thanks to the NVIDIA tutorial.
Unfortunately, code that used to work has now started to cause errors.
In many situations, submitting a single-time command causes vkQueueWaitIdle to fail with VK_ERROR_DEVICE_LOST, which results in a validation error saying I'm trying to free the command buffer while it's still in use. This happens for a variety of uses: transitioning an image layout (undefined to general, it seems), building acceleration structures, and copying buffers, though not every time (e.g. copying from a staging to a device buffer, after which freeing the staging buffer also throws an error, since the copy hasn't finished and the buffer is still in use). For other uses it works fine; I can't currently identify a common denominator.
Finally, the program crashes because presenting the first frame fails due to its layout being undefined. I assume this is caused by one or more of the previously mentioned errors.
Did something change about this since last I used it? This is the offending code (endSingleTimeCommands):
vkEndCommandBuffer(commandBuffer);
VkSubmitInfo submitInfo{};
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &commandBuffer;
vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
switch (vkQueueWaitIdle(graphicsQueue)) {
//debug output removed for brevity
};
vkFreeCommandBuffers(device, commandPool, 1, &commandBuffer);
One of the places where it fails is this:
//[fill the structs with info...]
//function pointer grabbed via vkGetDeviceProcAddr
vk::vkCmdBuildAccelerationStructuresKHR(cmd, 1, &buildInfo, &buildOffset);
//[call to the above code here]
But code unrelated to extensions also fails (sometimes!), such as this:
VkCommandBuffer commandBuffer = beginSingleTimeCommands();
VkBufferCopy copyRegion{};
copyRegion.srcOffset = 0; // Optional
copyRegion.dstOffset = 0; // Optional
copyRegion.size = size;
vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1, &copyRegion);
endSingleTimeCommands(commandBuffer);
Perhaps beginSingleTimeCommands is also relevant:
VkCommandBufferAllocateInfo allocInfo{};
allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
allocInfo.commandPool = commandPool;
allocInfo.commandBufferCount = 1;
VkCommandBuffer commandBuffer;
if (vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer) != VK_SUCCESS) {
    std::cout << "beginSingleTimeCommands: could not allocate command buffer!\n";
}
VkCommandBufferBeginInfo beginInfo{};
beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
if (vkBeginCommandBuffer(commandBuffer, &beginInfo) != VK_SUCCESS) {
    std::cout << "beginSingleTimeCommands: could not begin command buffer!\n";
}
return commandBuffer;
Some additional info I gathered:
I used the NVIDIA pipeline checkpoint system to add a checkpoint before and after the call to vkCmdBuildAccelerationStructuresKHR, and both checkpoints are at TOP_OF_PIPE. After the first call to this function, no more checkpoint output is generated, leading me to believe that the first call to the build somehow ruins everything. I will triple-check my AS building, I guess; I'll get back to you if I find anything.
Turns out, the actual error can occur before the command buffer whose vkQueueWaitIdle returns the DEVICE_LOST error. I've had, and continue to have, a variety of errors in my acceleration structure building code. I can't easily debug it, because the validation layers apparently don't flag subtle mistakes in the structs fed to vkCmdBuildAccelerationStructuresKHR; instead it's a lot of trial and error.
One notable example, which I'm certain would have been caught by the validation layers pre-upgrade, is forgetting to set the VkAccelerationStructureBuildGeometryInfoKHR::scratchData field - the last mistake I had to fix to finally get everything to run.
The answer to my question is thus: don't look at the command that triggers the DEVICE_LOST; look at what you do with the queue before that command - there's a chance the error is there instead. In fact, once the first DEVICE_LOST error occurred, (almost?) all further vkQueueWaitIdle calls failed with the same error (same with vkQueueSubmit). In cases such as my copy-buffer code being the first to fail, the error was always found in the queue usage before that one.
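For instance, a minimal helper that surfaces the first failing queue operation, rather than a much later one, might look like this (the helper name is mine; the error handling is just a sketch):
// Hypothetical helper: report queue errors at the submit that caused them
void checkedSubmitAndWait(VkQueue queue, const VkSubmitInfo *submitInfo)
{
    VkResult res = vkQueueSubmit(queue, 1, submitInfo, VK_NULL_HANDLE);
    if (res != VK_SUCCESS)
        std::cout << "vkQueueSubmit failed: " << res << "\n";
    res = vkQueueWaitIdle(queue);
    if (res != VK_SUCCESS)
        std::cout << "vkQueueWaitIdle failed: " << res << "\n"; // e.g. VK_ERROR_DEVICE_LOST
}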
I can't post the exact solution to my problem as - like I've said - there's more than one cause and I've only fixed some of them so far, there's still some left. I think the details are not relevant to future people who come across my question but if there's anything I can add to help other people, please let me know.
This is so true! I was stuck on this issue for a couple of days, only to figure out that the VkAccelerationStructureBuildGeometryInfoKHR flags were mismatched between when I queried the size using vkGetAccelerationStructureBuildSizesKHR() and when I used it to actually build the BLAS! In my case, I was using VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR | VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR while querying the size and only FAST_TRACE while actually creating the AS; this was causing the same issue!
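One way to rule this class of mistake out is to keep the flags, and ideally the whole geometry info struct, in one place for both calls so they cannot silently diverge; a sketch (device, cmd, primitiveCount, sizeInfo and buildRangePtr are assumed to exist):
// One flags value shared by the size query and the actual build
VkBuildAccelerationStructureFlagsKHR buildFlags =
    VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR |
    VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR;

VkAccelerationStructureBuildGeometryInfoKHR buildInfo{};
buildInfo.sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR;
buildInfo.flags = buildFlags;
// ... fill type, mode, geometryCount, pGeometries ...

// Both calls see the very same struct:
vkGetAccelerationStructureBuildSizesKHR(device,
    VK_ACCELERATION_STRUCTURE_BUILD_TYPE_DEVICE_KHR,
    &buildInfo, &primitiveCount, &sizeInfo);
// ... allocate scratch and the AS, set buildInfo.dstAccelerationStructure
//     and buildInfo.scratchData, then:
vkCmdBuildAccelerationStructuresKHR(cmd, 1, &buildInfo, &buildRangePtr);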

_com_error exception on DirectX Create* methods

I have been trying to get into DirectX programming lately (with C++/Win32), and I encountered some issues.
A friend programmed a very simple random terrain generator that works with Perlin noise, so I tried to grab his code and make a DirectX 11 renderer for it, but my program throws a Microsoft C++ exception, _com_error, when I run it.
Even weirder, the program breaks at different lines of code depending on the build type...
On Debug, the program breaks on this chunk of code:
// Load the pixel shader
std::ifstream pixelFile("pixel.cso");
if (!pixelFile)
return false;
std::string pixelContents((std::istreambuf_iterator<char>(pixelFile)),
std::istreambuf_iterator<char>());
pixelFile.close();
if (FAILED(mpDevice->CreatePixelShader(pixelContents.c_str(),
pixelContents.size(),
NULL,
&mpPixelShader)))
return false;
And on Release:
// Create and set the input layout
if (FAILED(mpDevice->CreateInputLayout(layoutDescription,
layoutDescriptionSize,
vertexContents.c_str(),
vertexContents.length(),
&mpVertexLayout)))
return false;
mpPixelShader is a member of type ID3D11PixelShader*, and mpVertexLayout is a member of type ID3D11InputLayout*.
Of course, the first thing I thought of when I saw that the error differed was to check for memory corruption, and sure enough I found one instance in my friend's code after running Application Verifier and setting the _crtDbgFlag; but that is fixed now, and I still get this error...
The application also ran fine when the rendering initialization function was filled with a modified version of the one found in the MSDN Win32 DirectX triangle tutorial.
The .cso file is not text, it is a binary file. A std::string read in text mode will not work; the file will contain bytes that are 0.
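A sketch of the fix, reusing the member names from the question; the important parts are std::ios::binary and a byte buffer:
// Read the compiled shader as raw bytes, not as text
std::ifstream pixelFile("pixel.cso", std::ios::binary);
if (!pixelFile)
    return false;
std::vector<char> pixelContents((std::istreambuf_iterator<char>(pixelFile)),
                                std::istreambuf_iterator<char>());
if (FAILED(mpDevice->CreatePixelShader(pixelContents.data(),
                                       pixelContents.size(),
                                       NULL,
                                       &mpPixelShader)))
    return false;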

glewInit() crashing (segfault) after creating osmesa (off-screen mesa) context

I'm trying to run an OpenGL application on a remote computing cluster. I'm using OSMesa as I intend to do off-screen software rendering (no X11 forwarding etc.). I want to use GLEW (to make life dealing with shaders and other extension-related calls easier), and I seem to have built and linked both Mesa and GLEW fine.
When I call OSMesaCreateContext, glewInit() gives an "OpenGL Version not available" output, which probably means the context has not been created. When I call glGetString(GL_EXTENSIONS) I don't get any output, which confirms this. It also shows that GLEW is working fine on its own (other GLEW calls, such as querying the GLEW version, also work).
Now when I add the OSMesaMakeCurrent call (as shown below), glewInit() crashes with a segfault. However, running glGetString(GL_EXTENSIONS) now gives me a list of extensions (which means context creation is successful!).
I've spent hours trying to figure this out and tried tinkering, but nothing works. I would greatly appreciate any help on this. Maybe some of you have experienced something similar before? Thanks again!
int Height = 1; int Width = 1;
OSMesaContext ctx; void *buffer;
ctx = OSMesaCreateContext( OSMESA_RGBA, NULL );
buffer = malloc( Width * Height * 4 * sizeof(GLfloat) );
if (!OSMesaMakeCurrent( ctx, buffer, GL_UNSIGNED_BYTE, Width, Height )) {
    printf("OSMesaMakeCurrent failed!\n");
    return 0;
}
-- glewInit() crashes after this.
Just to add: OSMesa and GLEW actually did not compile together initially. Because glew.h undefines GLAPI on its last line, and osmesa.h does not include gl.h again, GLAPI remains undefined and causes an error in osmesa.h (line 119). I got around this by adding an extern to GLAPI; not sure if this is relevant, though.
Looking at the source of glewInit in glew.c: if glewContextInit succeeds it returns GLEW_OK, and since GLEW_OK is defined as 0, on Linux systems glewInit will then always call glxewContextInit, which calls glX functions that, in the case of OSMesa, will likely not be ready for use. This causes the segfault you see, and it seems the glewInit function has no way to handle this case without patching the C source and recompiling the library.
If others have already solved this, I would be interested; I have seen some patched versions of the glew.c source that work around this. It isn't clear whether there is any energy in the GLEW community to merge changes that address this use case.
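If patching glew.c is not an option, one workaround sketch is to pull the handful of entry points you need straight from OSMesa via OSMesaGetProcAddress, bypassing GLEW entirely (the typedef and error handling here are illustrative):
/* Fetch a single extension entry point directly from OSMesa */
typedef void (*PFNGENBUFFERS)(GLsizei n, GLuint *buffers);
PFNGENBUFFERS myGenBuffers =
    (PFNGENBUFFERS) OSMesaGetProcAddress("glGenBuffers");
if (!myGenBuffers) {
    printf("glGenBuffers is not available in this context!\n");
    return 0;
}
GLuint vbo;
myGenBuffers(1, &vbo); /* use it like the normal entry point */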

(Define a macro to) facilitate OpenGL command debugging?

Sometimes it takes a long time of inserting conditional prints and glGetError() checks to narrow down, by a form of binary search, the function call at which an error is first reported by OpenGL.
I think it would be cool if there were a way to build a macro which I can wrap around all GL calls that may fail, and which conditionally calls glGetError immediately after each one. When compiling for a special target I could have it check glGetError with very high granularity, while in a typical release or debug build this wouldn't be enabled (I'd check only once a frame).
Does this make sense to do? Searching for this a bit I find a few people recommending calling glGetError after every non-draw gl-call which is basically the same thing I'm describing.
So in this case, is there anything clever I can do (context: I am using GLEW) to simplify the process of instrumenting my GL calls this way? It would be a significant amount of work at this point to convert my code to wrap a macro around each OpenGL function call. What would be great is if I could do something clever and get all of this going without manually determining which sections of code to instrument (though that also has potential advantages... but not really; I don't care about performance by the time I'm debugging the source of an error).
Try this:
void CheckOpenGLError(const char* stmt, const char* fname, int line)
{
    GLenum err = glGetError();
    if (err != GL_NO_ERROR)
    {
        printf("OpenGL error %08x, at %s:%i - for %s\n", err, fname, line, stmt);
        abort();
    }
}

#ifdef _DEBUG
#define GL_CHECK(stmt) do { \
        stmt; \
        CheckOpenGLError(#stmt, __FILE__, __LINE__); \
    } while (0)
#else
#define GL_CHECK(stmt) stmt
#endif
Use it like this:
GL_CHECK( glBindTexture(GL_TEXTURE_2D, id) );
If an OpenGL function returns a value, remember to declare the variable outside of GL_CHECK (note that glGetString returns const GLubyte*):
const GLubyte* vendor;
GL_CHECK( vendor = glGetString(GL_VENDOR) );
This way you'll have debug checking when the _DEBUG preprocessor symbol is defined, but in "Release" builds you'll have just the raw OpenGL calls.
BuGLe sounds like it will do what you want:
Dump a textual log of all GL calls made.
Take a screenshot or capture a video.
Call glGetError after each call to check for errors, and wrap glGetError so that this checking is transparent to your program.
Capture and display statistics (such as frame rate)
Force a wireframe mode
Recover a backtrace from segmentation faults inside the driver, even if the driver is compiled without symbols.

Passing D3DFMT_UNKNOWN into IDirect3DDevice9::CreateTexture()

I'm kind of wondering about this, if you create a texture in memory in DirectX with the CreateTexture function:
HRESULT CreateTexture(
    UINT Width,
    UINT Height,
    UINT Levels,
    DWORD Usage,
    D3DFORMAT Format,
    D3DPOOL Pool,
    IDirect3DTexture9** ppTexture,
    HANDLE* pSharedHandle
);
...and pass in the D3DFMT_UNKNOWN format, what is supposed to happen exactly? If I try to get the surface of the first or second level, will it cause an error? Can it fail? Will the graphics device just choose a random format of its choosing? Could this cause problems between different graphics card models/brands?
I just tried it out, and it does not fail - mostly.
When Usage is set to D3DUSAGE_RENDERTARGET or D3DUSAGE_DYNAMIC, it consistently came out as D3DFMT_A8R8G8B8, no matter what I did to the back buffer format or other settings. I don't know if that has to do with my graphics card or not. My guess is that specifying unknown means, "pick for me", and that the 32-bit format is easiest for my card.
When the usage was D3DUSAGE_DEPTHSTENCIL, it failed consistently.
So my best conclusion is that specifying D3DFMT_UNKNOWN as the format gives DirectX the choice of what it should be. Or perhaps it always just defaults to D3DFMT_A8R8G8B8.
Sadly, I can't confirm any of this in any documentation anywhere. :|
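One way to confirm this on a given card is to ask the created texture which format it actually received; a sketch, assuming a valid device pointer:
// Create with D3DFMT_UNKNOWN, then query the format the runtime picked
IDirect3DTexture9 *texture = NULL;
if (SUCCEEDED(device->CreateTexture(256, 256, 1, D3DUSAGE_RENDERTARGET,
                                    D3DFMT_UNKNOWN, D3DPOOL_DEFAULT,
                                    &texture, NULL)))
{
    D3DSURFACE_DESC desc;
    texture->GetLevelDesc(0, &desc);
    printf("Chosen format: %d\n", desc.Format); // 21 == D3DFMT_A8R8G8B8
    texture->Release();
}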
MSDN doesn't say. But I'm pretty sure you'd get "D3DERR_INVALIDCALL" as a result.
If the method succeeds, the return value is D3D_OK. If the method fails, the return value can be one of the following: D3DERR_INVALIDCALL, D3DERR_OUTOFVIDEOMEMORY, E_OUTOFMEMORY.
I think this falls into the "undefined" category. Some drivers will fail the allocations, while others may default to something. I've never seen anything in the WDK that says that this condition needs to be handled. I'm guessing if you enable the debug DX runtime you will see an error message.