How to make non limits lighting system - c++

i am writing a game engine with directx 11.
I want to creating a lighting system, but, non light limits!
I sending this structure to my pixel shader:
struct LightingConstantBufferData {
DirectionalLight directionalLights[???];
};
Problem is 'directionalLights' number of array. C++ required const. But non limits lighting for dynamic array required. I am not know make this.
And my pixel shader code getting part:
cbuffer LightingConstantBufferData: register(b3) {
DirectionalLight directionalLights[SH_TOTAL_LIGHTS]; // 'SH_TOTAL_LIGHTS' is defined from cpp code(shader macro)
};

I come from OpenGL, so I may be wrong here, but as far as I know, you can send compute shaders a buffer of data, where the last element of the buffer can be an array of an unknown size, I'm not sure if this is the way in DirectX, but definetly worth a try, if you can do your calculation somehow in a compute shader.

Each 'pass' of rendering is going to be limited to a certain number of lights. Generally graphics shaders and constant buffers are not designed for dynamic arrays.
There are of course other approaches to achieve the effect of 'as many lights as you'd like'. Multi-pass rendering was the common approach in the old days where you'd render the same frame multiple times and blend the results together. Many engines used a mix of a few 'important' direct lights (3 or 4) for dynamic characters and objects, and static lightmaps for background.
Many modern game engines use 'deferred rendering' or a hybrid of deferred and forward techniques (called forward tiled rendering or "forward+") largely because it allows them to linearly scale lighting based on the desires of designers and artists.
A quick search of the Internet will bring up numerous articles on these topics. This one is a good overview.
Forward vs Deferred vs Forward+ Rendering with DirectX 11
My advice is start small and don't worry about hundreds of lights just yet... The DirectX Tool Kit may be of use to you as well.

Related

OpenGL constructing and using data on the GPU

I am not a graphics programmer, I use C++ and C mainly, and every time I try to go into OpenGL, every book, and every resource starts like this:
GLfloat Vertices[] = {
some, numbers, here,
some, more, numbers,
numbers, numbers, numbers
};
Or they may even be vec4.
But then you do something like this:
for(int i = 0; i < 10000; i++)
for(int j = 0; j < 10000; j++)
make_vertex();
And you get a problem. That loop is going to take a significant amount of time to finish- and if the make_vertex() function is anything like a saxpy or something of the sort, it is not just a problem... it is a big problem. For example, let us assume I wish to create fractal terrain. For any modern graphic card this would be trivial.
I understand the paradigm goes like this: Write the vertices manually -> Send them over to the GPU -> GPU does vertex processing, geometry, rasterization all the good stuff. I am sure it all makes sense. But why do I have to do the entire 'Send it over' step? Is there no way to skip that entire intermediary step, and just create vertices on the GPU, and draw them, without the obvious bottleneck?
I would very much appreciate at least a point in the right direction.
I also wonder if there is a possible solution without delving into compute shaders or CUDA? Does openGL or GLSL not provide a suitable random function which can be executed in parallel?
I think what you're asking for could work by generating height maps with a compute shader, and mapping that onto a grid with fixed spacing which can be generated trivially. That's a possible solution off the top of my head. You can use GL Compute shaders, OpenCL, or CUDA. Details can be generated with geometry and tessellation shaders.
As for preventing the camera from clipping, you'd probably have to use transform feedback and do a check per frame to see if the direction you're moving in will intersect the geometry.
Your entire question seems to be built on a huge misconception, that vertices are the only things which need to be "crunched" by the GPU.
First, you should understand that GPUs are far more superior than CPUs when it comes to parallelism (heck, GPUs sacrifice conditional control jumping for the sake of parallelism). Second, shaders and these buffers you make are all stored on the GPU after being uploaded by the CPU. The reason you don't just create all vertices on the GPU? It's the same reason for why you load an image from the hard drive instead of creating a raw 2D array and start filling it up with your pixel data inline. Even then, your image would be stored in the executable program file, which is stored on the hard disk and only loaded to memory when you run it. In an actual application, you'll want to load your graphics off assets stored somewhere (usually the hard drive). Why not let the GPU load the assets from the hard drive by itself? The GPU isn't connected to a hardware's storage directly, but barely to the system's main memory via some BUS. That's because to connect to any storage directly, the GPU will have to deal with the file system which is managed by the OS. That's one of the things the CPU would be faster at doing since we're dealing with serialized data.
Now what shaders deal with is this data you upload to the GPU (vertices, texture coordinates, textures..etc). In ancient OpenGL, no one had to write any shaders. Graphics drivers came with a builtin pipeline which handles regular rendering requests for you. You'd provide it with 4 vertices, 4 texture coordinates and a texture among other things (transformation matrices..etc), and it'd draw your graphics for you on the screen. You could go a bit farther and add some lights to your scene and maybe customize a few things about it, but things were still pretty tight. New OpenGL specifications gave more freedom to the developer by allowing them to rewrite parts of the pipeline with shaders. The developer becomes responsible for transforming vertices into place and doing all sort of other calculations related to lighting etc.
I would very much appreciate at least a point in the right direction.
I am guessing it has something to do with uniforms, but really, with
me skipping pages, I really cannot understand how a shader program
runs or what the lifetime of the variables is.
uniforms are variables you can send to the shaders from the CPU every frame before you use it to render graphics. When you use the saturation slider in Photoshop or Gimp, it (probably) sends the saturation factor value to the shader as a uniform of type float. uniforms are what you use to communicate little settings like these to your shaders from your application.
To use a shader program, you first have to set it up. A shader program consists of at least 2 types of shaders linked together, a fragment shader and a vertex shader. You use some OpenGL functions to upload your shader sources to the GPU, issue an order of compilation followed by linking, and it'll give you the program's ID. To use this program, you simply glUseProgram(programId) and everything following this call will use it for drawing. The vertex shader is the code that runs on the vertices you send to position them on the screen correctly. This is where you can do transformations on your geometry like scaling, rotation etc. A fragment shader runs at some stage afterwards using interpolated (transitioned) values outputted from the vertex shader to define the color and the depth of every unit fragment on what you're drawing. This is where you can do post-processing effects on your pixels.
Anyway, I hope I've helped making a few things clearer to you, but I can only tell you that there are no shortcuts. OpenGL has quite a steep learning curve, but it all connects and things start to make sense after a while. If you're getting so bored of books and such, then consider maybe taking code snippets of every lesson, compile them, and start messing around with them while trying to rationalize as you go. You'll have to resort to written documents eventually, but hopefully then things will fit easier into your head when you have some experience with the implementation components. Good luck.
Edit:
If you're trying to generate vertices on the fly using some algorithm, then try looking into Geometry Shaders. They may give you what you want.
You probably want to use CUDA for the things you are used to do in C or C++, and let OpenGL access the rasterizer and other graphics stuff.
OpenGL an CUDA interact somehow nicely. A good entry point to customize the contents of a buffer object is here: http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__OPENGL.html#group__CUDART__OPENGL_1g0fd33bea77ca7b1e69d1619caf44214b , with cudaGraphicsGLRegisterBuffer method.
You may also want to have a look at the nbody sample from NVIDIA GPU SDK samples the come with current CUDA installs.

best way to wrap opengl models

In short: What is the "preferred" way to wrap OpenGL's buffers, shaders and/or matrices required for a more high level "model" object?
I am trying to write this tiny graphics engine in C++ built on core OpenGL 3.3 and I would like to implement an as clean as possible solution to wrapping a higher level "model" object, which would contain its vertex buffer, global position/rotation, textures (and also a shader maybe?) and potentially other information.
I have looked into this open source engine, called GamePlay3D and don't quite agree with many aspects of its solution to this problem. Is there any good resource that discusses this topic for modern OpenGL? Or is there some simple and clean way to do this?
That depends a lot on what you want to be able to do with your engine. Also note that these concepts are the same with DirectX (or any other graphic API), so don't focus too much your search on OpenGL. Here are a few points that are very common in a 3D engine (names can differ):
Mesh:
A mesh contains submeshes, each submesh contains a vertex buffer and an index buffer. The idea being that each submesh will use a different material (for example, in the mesh of a character, there could be a submesh for the body and one for the clothes.)
Instance:
An instance (or mesh instance) references a mesh, a list of materials (one for each submesh in the mesh), and contains the "per instance" shader uniforms (world matrix etc.), usually grouped in a uniform buffer.
Material: (This part changes a lot depending on the complexity of the engine). A basic version would contain some textures, some render states (blend state, depth state), a shader program, and some shader uniforms that are common to all instances (for example a color, but that could also be in the instance depending on what you want to do.)
More complex versions usually separates the materials in passes (or sometimes techniques that contain passes) that contain everything that's in the previous paragraph. You can check Ogre3D documentation for more info about that and to take a look at one possible implementation. There's also a very good article called Designing a Data-Driven Renderer in GPU PRO 3 that describes an even more flexible system based on the same idea (but also more complex).
Scene: (I call it a scene here, but it could really be called anything). It provides the shader parameters and textures from the environment (lighting values, environment maps, this kind of things).
And I thinks that's it for the basics. With that in mind, you should be able to find your way around the code of any open-source 3D engine if you want the implementation details.
This is in addition to Jerem's excellent answer.
At a low level, there is no such thing as a "model", there is only buffer data and the code used to process it. At a high level, the concept of a "model" will differ from application to application. A chess game would have a static mesh for each chess piece, with shared textures and materials, but a first-person shooter could have complicated models with multiple parts, swappable skins, hit boxes, rigging, animations, et cetera.
Case study: chess
For chess, there are six pieces and two colors. Let's over-engineer the graphics engine to show how it could be done if you needed to draw, say, thousands of simultaneous chess games in the same screen, instead of just one game. Here is how you might do it.
Store all models in one big buffer. This buffer has all of the vertex and index data for all six models clumped together. This means that you never have to switch buffers / VAOs when you're drawing pieces. Also, this buffer never changes, except when the user goes into settings and chooses a different style for the chess pieces.
Create another buffer containing the current location of each piece in the game, the color of each piece, and a reference to the model for that piece. This buffer is updated every frame.
Load the necessary textures. Maybe the normals would be in one texture, and the diffuse map would be an array texture with one layer for white and another for black. The textures are designed so you don't have to change them while you're drawing chess pieces.
To draw all the pieces, you just have to update one buffer, and then call glMultiDrawElementsIndirect()... once per frame, and it draws all of the chess pieces. If that's not available, you can fall back to glDrawElements() or something else.
Analysis
You can see how this kind of design won't work for everything.
What if you have to stream new models into memory, and remove old ones?
What if the models have different size textures?
What if the models are more complex, with animations or forward kinematics?
What about translucent models?
What about hit boxes and physics data?
What about different LODs?
The problem here is that your solution, and even the very concept of what a "model" is, will be very different depending on what your needs are.

(rendering particles) Should I learn shader or OpenCL?

I am trying to run 100000 and more particles.
I've been watching many tutorials and other examples that demonstrate the power of shaders and OpenCL.
In one example that I watched, particle's position was calculated based on the position of your mouse pointer(physical device that you hold with one hand and cursor on the screen).
The position of each particle was stored as RGB. R being x, G y, and B, z. And passed to pixel shader.And then each color pixel was drawn as position of particle afterward.
However I felt absurd towards this approach.
Isn't this approach or coding style rather to be avoided?
Shoudn't I learn how to use OpenCL and use the power of GPU's multithreading to directly state and pass my intended code?
Isn't this approach or coding style rather to be avoided?
Why?
The entire point of shaders is for you to be able to do what you want, to more effectively express what you want to do, and to allow yourself greater control over the hardware.
You should never, ever be afraid of re-purposing something for a different functionality. Textures do not store colors; they store data, which can be color, but it can also be other stuff. The sooner you stop thinking of textures as pictures, the better off you will be as a graphics programmer.
The GPU and API exist to be used. Use it as you see fit; do not allow how you think the API should be used to limit you.
Shoudn't I learn how to use OpenCL and use the power of GPU's multithreading to directly state and pass my intended code?
Yesterday, I would have said "yes". However, today this was released: OpenGL compute shaders.
The fact that the OpenGL ARB and Khronos created this shader type and so forth is a tacit admission that OpenCL/OpenGL interop is not the most efficient way to generate data for rendering purposes. After all, if it was, there would be no need for OpenGL to have generalized compute functionality. There were 3 versions of GL 4.x that didn't provide this. The fact that it's here now is basically the ARB saying, "Yeah, OK, we need this."
If the ARB, staffed by many people who make the hardware, think that CL/GL interop is not the fastest way to go, then it's pretty clear that you should use compute shaders.
Of course, if you're trying to do something right now, that won't help; only NVIDIA has compute shader support. And even that's only in beta drivers. It will take many months before AMD gets support for them, and many more before that support becomes solid and stable enough to use.
Even so, you don't need compute shaders to generate data. People have used transform feedback and geometry shaders to do LOD and frustum culling for instanced rendering. Do not be afraid to think outside of the "OpenGL draws stuff" box.
To simulate particles in OpenCL, you should try out "Yet Another Shader Editor" / http://yase.chnk.us/ - it takes away all the tricky parts and lets you get down to the meat of coding the particle control algorithms. IN YOUR BROWSER. Nothing to download, no accounts to create, just alter whatever examples you find. It's a blast.
https://lotsacode.wordpress.com/2013/04/16/fun-with-particles-yet-another-shader-editor/
I'm not affiliated with yase in any way.

OpenGL Picking from a large set

I'm trying to, in JOGL, pick from a large set of rendered quads (several thousands). Does anyone have any recommendations?
To give you more detail, I'm plotting a large set of data as billboards with procedurally created textures.
I've seen this post OpenGL GL_SELECT or manual collision detection? and have found it helpful. However it can take my program up to several minutes to complete a rendering of the full set, so I don't think drawing 2x (for color picking) is an option.
I'm currently drawing with calls to glBegin/glVertex.../glEnd. Given that I made the switch to batch rendering on the GPU with vao's and vbo's, do you think I would receive a speedup large enough to facilitate color picking?
If not, given all of the recommendations against using GL_SELECT, do you think it would be worth me using it?
I've investigated multithreaded CPU approaches to picking these quads that completely sidestep OpenGL all together. Do you think a OpenGL-less CPU solution is the way to go?
Sorry for all the questions. My main question remains to be, whats a good way that one can pick from a large set of quads using OpenGL (JOGL)?
The best way to pick from a large number of quad cannot be easily defined. I don't like color picking or similar techniques very much, because they seem to be to impractical for most situations. I never understood why there are so many tutorials that focus on people that are new to OpenGl or even programming focus on picking that is just useless for nearly everything. For exmaple: Try to get a pixel you clicked on in a heightmap: Not possible. Try to locate the exact mesh in a model you clicked on: Impractical.
If you have a large number of quads you will probably need a good spatial partitioning or at least (better also) a scene graph. Ok, you don't need this, but it helps A LOT. Look at some tutorials for scene graphs for further information's, it's a good thing to know if you start with 3D programming, because you get to know a lot of concepts and not only OpenGl code.
So what to do now to start with some picking? Take the inverse of your modelview matrix (iirc with glUnproject(...)) on the position where your mouse cursor is. With the orientation of your camera you can now cast a ray into your spatial structure (or your scene graph that holds a spatial structure). Now check for collisions with your quads. I currently have no link, but if you search for inverse modelview matrix you should find some pages that explain this better and in more detail than it would be practical to do here.
With this raycasting based technique you will be able to find your quad in O(log n), where n is the number of quads you have. With some heuristics based on the exact layout of your application (your question is too generic to be more specific) you can improve this a lot for most cases.
An easy spatial structure for this is for example a quadtree. However you should start with they raycasting first to fully understand this technique.
Never faced such problem, but in my opinion, I think the CPU based picking is the best way to try.
If you have a large set of quads, maybe you can group quads by space to avoid testing all quads. For example, you can group the quads in two boxes and firtly test which box you
I just implemented color picking but glReadPixels is slow here (I've read somehere that it might be bad for asynchron behaviour between GL and CPU).
Another possibility seems to me using transform feedback and a geometry shader that does the scissor test. The GS can then discard all faces that do not contain the mouse position. The transform feedback buffer contains then exactly the information about hovered meshes.
You probably want to write the depth to the transform feedback buffer too, so that you can find the topmost hovered mesh.
This approach works also nice with instancing (additionally write the instance id to the buffer)
I haven't tried it yet but I guess it will be a lot faster then using glReadPixels.
I only found this reference for this approach.
I'm using the solution that I've borrowed from DirectX SDK, there's a nice example how to detect the selected polygon in a vertext buffer object.
The same algorithm works nice with OpenGL.

How to do ray tracing in modern OpenGL?

So I'm at a point that I should begin lighting my flatly colored models. The test application is a test case for the implementation of only latest methods so I realized that ideally it should be implementing ray tracing (since theoretically, it might be ideal for real time graphics in a few years).
But where do I start?
Assume I have never done lighting in old OpenGL, so I would be going directly to non-deprecated methods.
The application has currently properly set up vertex buffer objects, vertex, normal and color input and it correctly draws and transforms models in space, in a flat color.
Is there a source of information that would take one from flat colored vertices to all that is needed for a proper end result via GLSL? Ideally with any other additional lighting methods that might be required to complement it.
I would not advise to try actual ray tracing in OpenGL because you need a lot hacks and tricks for that and, if you ask me, there is not a point in doing this anymore at all.
If you want to do ray tracing on GPU, you should go with any GPGPU language, such as CUDA or OpenCL because it makes things a lot easier (but still, far from trivial).
To illustrate the problem a bit further:
For raytracing, you need to trace the secondary rays and test for intersection with the geometry. Therefore, you need access to the geometry in some clever way inside your shader, however inside a fragment shader, you cannot access the geometry, if you do not store it "coded" into some texture. The vertex shader also does not provide you with this geometry information natively, and geometry shaders only know the neighbors so here the trouble already starts.
Next, you need acceleration data-structures to get any reasonable frame-rates. However, traversing e.g. a Kd-Tree inside a shader is quite difficult and if I recall correctly, there are several papers solely on this problem.
If you really want to go this route, though, there are a lot papers on this topic, it should not be too hard to find them.
A ray tracer requires extremely well designed access patterns and caching to reach a good performance. However, you have only little control over these inside GLSL and optimizing the performance can get really tough.
Another point to note is that, at least to my knowledge, real time ray tracing on GPUs is mostly limited to static scenes because e.g. kd-trees only work (well) for static scenes. If you want to have dynamic scenes, you need other data-structures (e.g. BVHs, iirc?) but you constantly need to maintain those. If I haven't missed anything, there is still a lot of research currently going on just on this issue.
You may be confusing some things.
OpenGL is a rasterizer. Forcing it to do raytracing is possible, but difficult. This is why raytracing is not "ideal for real time graphics in a few years". In a few years, only hybrid systems will be viable.
So, you have three possibities.
Pure raytracing. Render only a fullscreen quad, and in your fragment shader, read your scene description packed in a buffer (like a texture), traverse the hierarchy, and compute ray-triangles intersections.
Hybrid raytracing. Rasterize your scene the normal way, and use raytracing in your shader on some parts of the scene that really requires it (refraction, ... but it can be simultated in rasterisation)
Pure rasterization. The fragment shader does its normal job.
What exactly do you want to achieve ? I can improve the answer depending on your needs.
Anyway, this SO question is highly related. Even if this particular implementation has a bug, it's definetely the way to go. Another possibility is openCL, but the concept is the same.
As for 2019 ray tracing is an option for real time rendering but requires high end GPUs most users don't have.
Some of these GPUs are designed specifically for ray tracing.
OpenGL currently does not support hardware accelerated ray tracing.
DirectX 12 on windows does have support for it. It is recommended to wait a few more years before creating a ray tracing only renderer although it is possible using DirectX 12 with current desktop and laptop hardware. Support from mobile may take a while.
Opengl (glsl) can be used for ray (path) tracing. however there are few better options: Nvidia OptiX (Cuda toolkit -- cross platform), directx 12 (with Nvidia ray tracing extension DXR -- windows only), vulkan (nvidia ray tracing extension VKR -- cross platform, and widely used), metal (only works on MacOS), falcor (DXR, VKR, OptiX based framework), Intel Embree (CPU ray tracing only).
I found some of the other answers to be verbose and wordy. For visual examples that YES, functional ray tracers absolutely CAN be built using the OpenGL API, I highly recommend checking out some of the projects people are making on https://www.shadertoy.com/ (Warning: lag)
To answer the topic: OpenGL has no RTX extension, but Vulkan has, and interop is possible. Example here: https://github.com/nvpro-samples/gl_vk_raytrace_interop
As for the actual question: To light the triangles, there are tons of techniques, look up for "forward", "forward+" or "deferred" renderers. The technique to be used depends on you goal. The simplest and most good-looking these days, is image based lighting (IBL) with physically based shading (PBS). Basically, you use a cubemap and blur it more or less depending on the glossiness of the object. For a simple object viewer you don't need more.