In C, I can debug code like:
fprintf(stderr, "blah: %f", some_var);
in GLSL ... is there any way for me to just dump out a value in a vertex or fragment shader? I don't care if it's slow; I just want to dump out the value. Ideally, I want a setup like the following:
normal state = run GLSL shader normally
press key 'd' = next frame is generated in ULTRA slow mode, where the "printfs" in the vertex/fragment shader are executed and dumped out.
Is this feasible? (I don't care about performance; I just want to do this for one frame).
Thanks!
Unfortunately it's not possible directly. One possible solution, though, that I end up using a lot (and I'm sure it's pretty common among GLSL developers) is to "print" values as colors, in place of your intended final result.
Of course this has many limitations; for one, you have to make sure that your value maps into the [0, 1] range. Functions such as mod, fract, etc. turn out to be useful in these cases. But, in general, this is what I see as the "printf" equivalent in GLSL.
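For example, here is a minimal sketch of the idea (the varying name debugValue is an assumption for illustration): a fragment shader that writes the value you want to inspect as a grey level instead of the real shading result.

const char* debugFragSrc = R"(
#version 120
varying float debugValue;              // the value you want to "print"
void main() {
    float v = fract(debugValue);       // fract()/mod() keep it in the displayable [0, 1] range
    gl_FragColor = vec4(v, v, v, 1.0); // dump the value as a grey level
}
)";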
Instead of printing values, have you thought of trying a GLSL debugger?
For example, glslDevil will let you step through your shader's execution and examine the variables at each step.
Check out AMD CodeXL. It will let you step frame by frame to inspect OpenGL state values, shader code, and texture memory.
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/codexl/
You can see the variable you want to check by copying its value into a uniform and then reading that uniform back with glGetUniformfv.
I am not a graphics programmer; I use C++ and C mainly. Every time I try to get into OpenGL, every book and every resource starts like this:
GLfloat Vertices[] = {
some, numbers, here,
some, more, numbers,
numbers, numbers, numbers
};
Or they may even be vec4.
But then you do something like this:
for(int i = 0; i < 10000; i++)
for(int j = 0; j < 10000; j++)
make_vertex();
And you get a problem. That loop is going to take a significant amount of time to finish, and if the make_vertex() function is anything like a saxpy or something of the sort, it is not just a problem... it is a big problem. For example, let us assume I wish to create fractal terrain. For any modern graphics card this would be trivial.
I understand the paradigm goes like this: write the vertices manually -> send them over to the GPU -> the GPU does vertex processing, geometry, rasterization, all the good stuff. I am sure it all makes sense. But why do I have to do the entire 'send it over' step? Is there no way to skip that entire intermediary step and just create the vertices on the GPU, and draw them, without the obvious bottleneck?
I would very much appreciate at least a point in the right direction.
I also wonder if there is a possible solution without delving into compute shaders or CUDA. Does OpenGL or GLSL not provide a suitable random function which can be executed in parallel?
I think what you're asking for could work by generating height maps with a compute shader and mapping them onto a grid with fixed spacing, which can be generated trivially. That's a possible solution off the top of my head. You can use GL compute shaders, OpenCL, or CUDA. Details can be added with geometry and tessellation shaders.
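For instance, a rough sketch of that idea (heightMapProgram, heightTexture, width and height are assumed names, and the hash is a stand-in for whatever fractal noise you actually want): a compute shader fills an r32f image with heights, and the fixed-spacing grid mesh just samples it.

const char* heightMapComputeSrc = R"(
#version 430
layout(local_size_x = 16, local_size_y = 16) in;
layout(r32f, binding = 0) writeonly uniform image2D heightMap;

// cheap hash-based pseudo-random value, purely illustrative
float hash(vec2 p) {
    return fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453);
}

void main() {
    ivec2 texel = ivec2(gl_GlobalInvocationID.xy);
    float h = hash(vec2(texel));            // replace with fBm / fractal noise
    imageStore(heightMap, texel, vec4(h));
}
)";

// On the C++ side, after compiling/linking the compute program:
// glUseProgram(heightMapProgram);
// glBindImageTexture(0, heightTexture, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_R32F);
// glDispatchCompute(width / 16, height / 16, 1);
// glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT);   // before sampling it in the vertex shader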
As for preventing the camera from clipping, you'd probably have to use transform feedback and do a check per frame to see if the direction you're moving in will intersect the geometry.
Your entire question seems to be built on a huge misconception, that vertices are the only things which need to be "crunched" by the GPU.
First, you should understand that GPUs are far superior to CPUs when it comes to parallelism (heck, GPUs sacrifice sophisticated flow control for the sake of parallelism). Second, shaders and the buffers you create are all stored on the GPU after being uploaded by the CPU. Why don't you just create all vertices on the GPU? For the same reason you load an image from the hard drive instead of declaring a raw 2D array and filling it with pixel data inline: even then, your image would be baked into the executable program file, which is stored on the hard disk and only loaded into memory when you run it. In an actual application, you'll want to load your graphics from assets stored somewhere (usually the hard drive). Why not let the GPU load the assets from the hard drive by itself? The GPU isn't connected to the machine's storage directly; it only talks to the system's main memory over a bus. To read from storage it would have to go through the file system, which is managed by the OS, and that kind of serialized work is exactly what the CPU is better at.
Now, what shaders deal with is the data you upload to the GPU (vertices, texture coordinates, textures, etc.). In ancient OpenGL, no one had to write any shaders: graphics drivers came with a built-in pipeline that handled regular rendering requests for you. You'd provide it with 4 vertices, 4 texture coordinates and a texture, among other things (transformation matrices, etc.), and it would draw your graphics on the screen. You could go a bit further and add some lights to your scene and maybe customize a few things, but things were still pretty rigid. Newer OpenGL specifications give the developer more freedom by allowing them to rewrite parts of the pipeline with shaders. The developer becomes responsible for transforming vertices into place and for all the other calculations related to lighting, etc.
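To make that last point concrete, here is a minimal sketch (the uniform name is an assumption) of the kind of vertex shader you become responsible for writing; it does nothing but move each vertex into place.

const char* vertexSrc = R"(
#version 330 core
layout(location = 0) in vec3 position;
uniform mat4 modelViewProjection;   // uploaded by your application
void main() {
    gl_Position = modelViewProjection * vec4(position, 1.0);
}
)";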
I would very much appreciate at least a point in the right direction.
I am guessing it has something to do with uniforms, but really, with me skipping pages, I really cannot understand how a shader program runs or what the lifetime of the variables is.
Uniforms are variables you can send to the shaders from the CPU, typically each frame, before you use the program to render graphics. When you use the saturation slider in Photoshop or GIMP, it (probably) sends the saturation factor to the shader as a uniform of type float. Uniforms are what you use to communicate little settings like these to your shaders from your application.
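As a minimal sketch of how your application might feed such a setting to a shader (programId and the uniform name "saturation" are assumptions here; the program setup itself is sketched a bit further below):

GLint saturationLoc = glGetUniformLocation(programId, "saturation");
glUseProgram(programId);
glUniform1f(saturationLoc, 0.75f);   // every draw call issued from now on sees this value
// ... glDrawArrays(...) / glDrawElements(...)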
To use a shader program, you first have to set it up. A shader program consists of at least two shaders linked together: a fragment shader and a vertex shader. You use a few OpenGL functions to upload your shader sources to the GPU, compile them and link them, and you get back the program's ID. To use this program, you simply call glUseProgram(programId), and everything drawn after this call will use it. The vertex shader is the code that runs on the vertices you send, to position them on the screen correctly. This is where you can apply transformations to your geometry, such as scaling and rotation. A fragment shader runs at a later stage, using interpolated values output from the vertex shader to define the color and depth of every fragment of what you're drawing. This is where you can do post-processing effects on your pixels.
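Sketched in code, assuming vertexSrc and fragmentSrc hold your GLSL sources (vertexSrc as in the earlier sketch) and a GL context already exists, the setup described above looks roughly like this (error checking omitted for brevity):

GLuint vs = glCreateShader(GL_VERTEX_SHADER);
glShaderSource(vs, 1, &vertexSrc, nullptr);
glCompileShader(vs);

GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
glShaderSource(fs, 1, &fragmentSrc, nullptr);
glCompileShader(fs);

GLuint programId = glCreateProgram();
glAttachShader(programId, vs);
glAttachShader(programId, fs);
glLinkProgram(programId);

glUseProgram(programId);   // everything drawn after this call uses the program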
Anyway, I hope I've helped make a few things clearer to you, but I can only tell you that there are no shortcuts. OpenGL has quite a steep learning curve, but it all connects, and things start to make sense after a while. If you're getting bored of books and such, then consider taking the code snippets from every lesson, compiling them, and messing around with them while trying to rationalize as you go. You'll have to resort to written documents eventually, but hopefully things will fit into your head more easily once you have some experience with the implementation components. Good luck.
Edit:
If you're trying to generate vertices on the fly using some algorithm, then try looking into Geometry Shaders. They may give you what you want.
You probably want to use CUDA for the things you are used to doing in C or C++, and let OpenGL access the rasterizer and the other graphics stuff.
OpenGL and CUDA interact quite nicely. A good entry point for customizing the contents of a buffer object is the cudaGraphicsGLRegisterBuffer method: http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__OPENGL.html#group__CUDART__OPENGL_1g0fd33bea77ca7b1e69d1619caf44214b
You may also want to have a look at the nbody sample from the NVIDIA GPU SDK samples that come with current CUDA installs.
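As a rough sketch of the flow those docs describe (the kernel contents and helper function are just placeholders; registration would normally happen once at startup), filling an existing OpenGL VBO from a CUDA kernel looks roughly like this:

#include <cuda_runtime.h>
#include <cuda_gl_interop.h>   // for cudaGraphicsGLRegisterBuffer; GL headers/context assumed

__global__ void fillVertices(float4* vertices, unsigned int count) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        vertices[i] = make_float4(i * 0.01f, 0.0f, 0.0f, 1.0f);   // placeholder data
}

void fillVboWithCuda(GLuint vbo, unsigned int vertexCount) {
    cudaGraphicsResource* resource = nullptr;
    cudaGraphicsGLRegisterBuffer(&resource, vbo, cudaGraphicsMapFlagsWriteDiscard);

    cudaGraphicsMapResources(1, &resource, 0);
    float4* devPtr = nullptr;
    size_t numBytes = 0;
    cudaGraphicsResourceGetMappedPointer((void**)&devPtr, &numBytes, resource);

    fillVertices<<<(vertexCount + 255) / 256, 256>>>(devPtr, vertexCount);

    cudaGraphicsUnmapResources(1, &resource, 0);   // OpenGL can now draw from the VBO
    cudaGraphicsUnregisterResource(resource);
}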
I have an HLSL shader that defines some resources, say a constant buffer:
cbuffer MyCB : register(b0);
If I compile my shader, I will then be able to query the register through the reflection API. But is it possible to change the register (for instance, to b3) in a compiled shader blob in a similar manner you can assign bind points to resources in a compiled OpenGL program?
There is no API to change the shader bindings at runtime in a compiled shader.
If you jumped through many hoops, you might be able to achieve this with dynamic shader linking in Shader Model 5.0, although it would be a lot of work and not really worth it when there is a very easy alternative: simply create a new compiled shader with the bindings you want.
You can accomplish this in Direct3D 12 by specifying a BaseShaderRegister other than zero, or by using different RegisterSpace values, in the D3D12_DESCRIPTOR_RANGE struct. If code changes are not feasible, you can isolate each set of registers implicitly by setting the root parameter's ShaderVisibility property. This will isolate, for example, VS b0 from PS b0. For more details, you can check out the developer video on the topic.
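For illustration, a hedged sketch of the struct fields in question (this range would then go into a root parameter's descriptor table), mapping a single CBV to b3:

#include <d3d12.h>

D3D12_DESCRIPTOR_RANGE range = {};
range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
range.NumDescriptors = 1;
range.BaseShaderRegister = 3;   // bind the constant buffer at b3 instead of b0
range.RegisterSpace = 0;        // or pick a different space to keep sets of registers apart
range.OffsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND;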
The only time you will run into trouble is if you've actually explicitly bound two resources to the same slot and register space (by explicitly specifying it using shader model 5.1 syntax). In this case, you are expected to understand that in D3D12, registers are shared cross-stage, and it's up to you to make sure you have no collisions.
In D3D11, this problem does not occur, as each stage has its own register space (VS b0 is not the same as PS b0) and you can't share even if you wanted to. Still, if for some reason you have a component hard-coded to attach data to VS b0 but your vertex shader has already been compiled to expect it at b1, there's not much you can do.
I have a software renderer that is designed similarly to the OpenGL 2.0+ rendering pipeline; however, my software renderer is quite static in its functionality. I would like to design it so I can plug in custom vertex and fragment "shaders" (written as C++ "functions", not in GLSL), but I'm not sure how to implement a good, reusable, extensible solution.
Basically I want to be able to choose a custom "function" that is then called in my renderer to process every vertex (or fragment). So maybe I could work with a function object passed to the renderer, work out some inheritance-based solution, or this could be a case for a template-based solution.
I imagine it like this:
for every vertex
// call the vertex-shading function given by the user, with the standard
// arguments plus the custom ones given in user code. May produce some custom
// output that has to be fed into the fragment shader below
end
// do some generic rendering-stuff like clipping etc.
for every triangle
for every pixel in the triangle
// call the fragment-shading function given by the user, with the standard
// arguments, plus the custom ones from the vertex shader and the ones
// given in user code
end
end
I can program C++ quite well; however, I don't have much practical experience with templates and the more advanced stuff, though I have read a lot and watched a lot of videos.
There are a few requirements, such as that one of these "shader functions" can have multiple (different) input and output variables. There are 2-3 parameters that are not optional and always the same (for example, the input to a vertex shader is obviously the vertex, and the output is the transformed position), but one shader could, e.g., also require an additional weight parameter or barycentric coordinates as input. Also, it should be possible to feed such a custom output of the vertex shader into a corresponding fragment shader (like in OpenGL, where an output variable of the vertex shader is fed into the fragment shader).
At the same time, I would also prefer a simple solution; it shouldn't be too advanced (I don't want to mimic the GLSL compiler or have my own DSL). It should just be something like: write VertexShaderA and VertexShaderB and be able to plug them both into my renderer, along with some parameters depending on the shader.
I would like the solution to use "modern" C++, as in basically everything that compiles with VS2013 and gcc 4.8.
So to rephrase my main question:
How can I accomplish this "passing of custom functions to my renderer", with the additional functionality mentioned?
If possible, I would welcome a bit of example code to help get me started.
TinyRenderer is a very simple but rather elegant implementation of around 500 lines, and it has a wiki with a tutorial. See https://github.com/ssloy/tinyrenderer and the actual shaders in https://github.com/ssloy/tinyrenderer/blob/master/main.cpp
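As a starting point, here is one minimal sketch of the template-based approach asked about (all type and member names are invented): the renderer is parameterized on a shader type, and custom vertex-to-fragment data travels through a user-defined Varyings struct, much like out/in variables in GLSL.

#include <array>
#include <cstddef>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Color { float r, g, b, a; };

// One user-written "shader": the renderer only requires the nested Varyings type
// and the vertex()/fragment() member functions.
struct FlatShader {
    struct Varyings { Vec3 normal; };                    // custom vertex->fragment data
    Vec4 vertex(const Vec3& position, Varyings& out) const {
        out.normal = position;                           // toy pass-through
        return Vec4{position.x, position.y, position.z, 1.0f};
    }
    Color fragment(const Varyings& in) const {
        return Color{in.normal.x, in.normal.y, in.normal.z, 1.0f};
    }
};

template <typename Shader>
void renderTriangle(const Shader& shader, const std::array<Vec3, 3>& tri) {
    typename Shader::Varyings varyings[3];
    Vec4 clipPositions[3];
    for (std::size_t i = 0; i < 3; ++i) {
        clipPositions[i] = shader.vertex(tri[i], varyings[i]);   // per-vertex stage
    }
    // ... clipping, rasterization and varying interpolation would go here, then per pixel:
    // Color c = shader.fragment(interpolatedVaryings);
    (void)clipPositions;   // silence "unused" warnings in this sketch
}

// Usage: renderTriangle(FlatShader{}, {{{0, 0, 0}, {1, 0, 0}, {0, 1, 0}}});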
Using image load and store, I would like to do the following in GLSL 4.2:
vec3 someColor = ...;
vec4 currentPixel = imageLoad(myImage, uv);
float a = currentPixel.a/(currentPixel.a+1.0f);
vec4 newPixel = vec4(currentPixel.rgb*a+someColor*(1.0f-a),currentPixel.a+1.0f);
imageStore(myImage, uv, newPixel);
the value for 'uv' can be the same for multiple rasterized pixels. In order to get the proper result, I of course want no other shader execution to write into my pixel in between the calls to imageLoad() and imageStore().
Is this possible to do somehow with memoryBarrier? If so, how does it have to be used in this code?
the value for 'uv' can be the same for multiple rasterized pixels.
Then you can't do it.
memoryBarrier is not a way to create an atomic operation. It only guarantees ordering within a single shader invocation's operations. So if a particular shader invocation reads an image, writes to it, and then reads it again, you need a memoryBarrier to ensure that what is read is what was written before. If some other shader invocation wrote to it, then you're out of luck (unless it was a dependent invocation; the rules for this stuff are complex).
If you're trying to do programmatic blending, then you need to make certain that each fragment shader invocation reads/writes to a unique value. Otherwise, it's not going to work.
You don't say what it is you're trying to actually achieve, so it's not possible to provide a better way of getting what you want. All I can say is that this way is not going to work.
You would need to implement a locking system (lock/mutex).
For this purpose, it is good to use imageAtomicCompSwap or, if a buffer is used, atomicCompSwap. Of course, you would need to use a global variable (say, a texture), not a local one.
For implementation purposes, I think this question answers a big part of your problem: Is my spin lock implementation correct and optimal?
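As a hedged sketch of that idea (not a drop-in solution: per-pixel spin locks in fragment shaders are easy to get wrong, as the linked question discusses, and the image formats and names here are assumptions), a separate r32ui lock image guarding the color image could look roughly like this:

const char* lockedBlendFragSrc = R"(
#version 420
layout(rgba16f, binding = 0) coherent uniform image2D myImage;   // the image from the question
layout(r32ui,   binding = 1) coherent uniform uimage2D lockImage; // 0 = free, 1 = taken

void main() {
    ivec2 uv = ivec2(gl_FragCoord.xy);      // placeholder; use your real uv mapping
    vec3 someColor = vec3(1.0);             // placeholder for however you compute it
    bool done = false;
    while (!done) {
        // try to take the lock for this texel
        if (imageAtomicCompSwap(lockImage, uv, 0u, 1u) == 0u) {
            vec4 currentPixel = imageLoad(myImage, uv);
            float a = currentPixel.a / (currentPixel.a + 1.0);
            vec4 newPixel = vec4(currentPixel.rgb * a + someColor * (1.0 - a),
                                 currentPixel.a + 1.0);
            imageStore(myImage, uv, newPixel);
            memoryBarrier();                          // make the store visible
            imageAtomicExchange(lockImage, uv, 0u);   // release the lock
            done = true;
        }
    }
}
)";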
I'm trying to use the transform feedback functionality of OpenGL. I've written a minimalistic vertex shader and created a program with it (there's no fragment shader). I've also made a call to glTransformFeedbackVaryings with a single output varying name, and I've set the buffer mode to GL_INTERLEAVED_ATTRIBS. The shader program compiles and links fine (I also make sure I link after the glTransformFeedbackVaryings call).
I've enabled a single vertex attrib array using glEnableVertexAttribArray, allocated a VBO for the generic vertex attributes and made a call to glVertexAttribPointer for the attribute.
I've bound the TRANSFORM_FEEDBACK_BUFFER to another buffer which I've generated and created a data store which should be plenty big enough to be written to during transform feedback.
I then enable transform feedback and call glDrawArrays(GL_POINTS, 0, 1000). I don't get any crashes throughout the running of the program.
The problem is that I'm getting no indication that the transform feedback is writing anything to the TRANSFORM_FEEDBACK_BUFFER during the glDrawArrays call. I set up a query which monitors GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN and this always returns 0. No matter what I try I can't seem to get the transform feedback to write ANYTHING (never mind anything meaningful!)
If anyone has any suggestions as to how I could get the transform feedback to write anything, or things that I should check for please let me know!
Note: I can't use transform feedback objects and I'm not using vertex array objects.
I think the problem ended up being how I was calling glBindBufferBase. Given that I can't see this function call in the original question, it may be that I omitted it altogether.
Certainly I didn't realise that the correct buffer object also has to be bound to GL_TRANSFORM_FEEDBACK_BUFFER with a call to glBindBuffer before calling glBindBufferBase.
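For reference, a minimal sketch of the binding and draw sequence described above (tfBuffer is assumed to be the buffer created earlier to receive the captured varyings):

glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, tfBuffer);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, tfBuffer);   // index 0 matches the first captured varying

glEnable(GL_RASTERIZER_DISCARD);       // optional, since there is no fragment shader
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, 1000);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);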