I'm using SDL2 for my game.
I have an std::vector of SDL_Rects (that is, rectangle objects) that holds the solid platforms (i.e. platforms that the player can't go through) of a level in my game.
When checking for collision, my current code does the following:
for (SDL_Rect rect : rects) {
if (player.collides(rect)) {
// handle collision
}
}
Consider I have a level with numerous (e.g. 500) solid platform rectangles, is it inefficient to go through all of them and check for collision? Is there a better way to do this?
The collides() function only checks for AABB collision (4 simple conditions).
I think this is reasonable. You have simple shapes and are doing simple collision checking. Imagine a more graphically intense game. Even then, they may have a complex skeletal mesh for the character, but just do the collision checking against an easy-to-calculate bounding shape, and they probably have a lot more than 500 things going on at once.
In a more complex game engine, different types may be blocking against some types and non-blocking against others, so not only will it be checking for simple overlap events, it has to know if the overlapping objects should interact or not. Or there might be different interactions for different objects.
With games, by and large your bottleneck is rendering and the associated calculations, so unless you know you're in danger of doing something incredibly slowly with the game logic (complex path finding or AI or something like that), I would concentrate my optimizing efforts on the rendering.
Related
Is there a "standard" method for 3d picking? What do most game companies do? (for accurate picking)
I thought the fastest way is to use the gpu and render every object with an "color index", and then to use glReadPixels(), but then I heard that it's considered slow because of glFlush(), glFinish() calls.
There's also this ray casting approach, which is nice but isn't accurate because of the spheres/AABBs approximations.
Any question about what is "standard" is probably going to invoke some opinionated responses, but I would suggest that the closest to "standard" here is raycasting.
Take your watertight ray/triangle intersection function and test a ray that is unprojected from your mouse cursor position against the triangles in your scene.
Normally this would be quite slow, requiring linear complexity. So the next step is to accelerate it to something better, like logarithmic time. This is typically achieved with a data structure such as an octree, BVH, K-D tree, or BSP. Sometimes people skip this step and just try to make the ray/tri intersection really fast and really parallel, possibly even using GPGPU.
It takes a lot more work upfront than framebuffer-based solutions, but complex applications tend to go this route probably because:
Portability: it's decoupled from the rendering engine. It doesn't have to be tied to OpenGL or DirectX, e.g., and that improves portability.
Generality: typically the accelerator and associated queries are needed for other things. For example, an FPS game might have players and enemies constantly shooting at each other. Figuring out what projectiles hit what tends to require these kinds of intersection queries occurring constantly, and not just from a uniform viewing angle.
Simplicity: the developers can afford the extra work upfront to simplify things later on.
There's also this ray casting approach, which is nice but isn't
accurate because of the spheres/AABBs approximations.
There should be nothing inaccurate about using AABBs or bounding spheres for acceleration purposes. Those are purely to accelerate the tests and quickly reduce the number of the more costly ray/triangle intersections that need to occur by doing cheaper tests and ones that eliminate large batches of triangles to check in bulk. Normally they should be constructed to encompass the elements in the scene. If you do a ray/AABB intersection first, e.g., and if that hits, test the elements encompassed within the AABB. Any acceleration structure that doesn't give the same results without the accelerator would typically be a glitchy one.
For example, a very basic form of acceleration is just put a bounding box around one mesh element in a scene, like a character, and sometimes this basic form without involving a full-blown accelerator might be useful for very dynamic elements in the scene (to avoid the cost of constantly updating the accelerator). If the ray intersects the character's bounding box, then check all the triangles making up the character. As long as you check the triangles within the AABB afterwards, it becomes about acceleration rather than approximation. Of course if you only checked the AABB and nothing else, then it would be a crude approximation.
I am trying to implement a sphere sphere collision. I understand the math behind it. However, I am still looking around tutorials to see if there are better and faster approaches. I came across nehe's collision detection tutorial ( http://nehe.gamedev.net/tutorial/collision_detection/17005/ ). In this tutorial, if I understood correctly, he is trying to check if two spheres collides within a frame, and he tries not to miss it, by first checking if their paths intersect. And then tries to simulate it in a way.
My approach was to check every frame, whether spheres are colliding and be done with it. I didnt consider checking intersection paths etc. Now I am kinda confused how to approach to the problem.
My question is, is it really necessary trying to be that safe and check if we missed the collision by one frame ?
When writing collision detection algorithms, it is important to recognize objects are moving at discrete time steps, unlike the real world. In a typical modern game, objects will be moving with a time step of 0.016 seconds per frame (often with smaller fixed or variable time steps).
It is possible that two spheres moving with very high velocities could pass through each other during a single frame and not be within each other's bounding spheres after integration is performed. This problem is called Tunneling and there are multiple ways, each with varying levels of complexity and cost, to approach the problem. Some options are swept volumes and Minkowski Addition.
Choosing the right algorithm depends on your application. How precise does the collision need to be? Is it vital to your application or can you get away with some false negatives/positives. Typically, the more precise the collision detection, the more you pay in performance.
Similar question here
I am designing a class that manages a collection of rectangles.
Each rectangle represents a button, so contains the following properties:
x position
y position
width
height
a callback function for what happens when it is pressed.
The concept itself is fairly straightforward and is managed through a command line interface. In particular.
If I type "100, 125" it looks up whether there is a rectangle that contains this point (or multiple) and performs their callback functions.
My proposal is to iterate over all rectangles in this collection and perform the callback of each individual rectangle which contains this point, or stop upon the first rectangle that matches (simulating z order).
I fear however that this solution is sloppy, as this iteration becomes longer the more rectangles I have.
This is fine for the console application, since it can easily go over 10,000 rectangles and find which ones match, expensive computation, but time wise it is not particularly an issue.
The issue is that if I were to implement this algorithm in a GUI application which needs to perform this check every time a mouse is moved (to simulate mouse over effect), moving your mouse 10 pixels over a panel with 10,000 object would require checking of 100,000 objects, that's not even 1000 pixels or over people tend to move mouse over at.
Is there a more elegant solution to this issue, or will such programs always need to be so expensive?
Note: I understand that most GUIs do not have to deal with 10,000 active objects at once, but that is not my objective.
The reason I choose to explain this issue in terms of buttons is because they are simpler. Ideally, I would like a solution which would be able to work in GUIs as well as particle systems which interact with the mouse, and other demanding systems.
In a GUI, I can easily use indirection to reduce amount of checks drastically, but this does not alleviate the issue of
needing to perform checks every single time a mouse is moved, which can be quite demanding even if there are 25 buttons, as moving over 400 pixels with 25 objects(in ideal conditions) would be as bad as moving 1 pixel with 10,000 objects.
In a nutshell, this issue is twofold:
What can I do to reduce the amount of checks from 10,000 (granted there are 10,000 objects).
Is it possible to reduce amount of checks required in such a GUI application from every mouse move, to something more reasonable.
Any help is appreciated!
There are any number of 2D-intersection acceleration structures you could apply. For example, you could use a Quadtree (http://en.wikipedia.org/wiki/Quadtree) to recursively divide the viewport into quadrants. Subdivide each quadrant that doesn't fall entirely within or entirely outside every rectangle and place a pointer to either the top or the list of rectangles at each leaf (or NULL if no rectangles land there). The structure isn't trivial, but it's fairly straightforward conceptually.
Is there a more elegant solution to this issue, or will such programs
always need to be so expensive?
Instead of doing a linear search through all the objects you could use a data structure like a quad tree that lets you efficiently find the nearest object.
Or, you could come up with a more realistic set of requirements based on the intended use for your algorithm. A GUI with 10,000 buttons visible at once is a poor design for many reasons, the main one being that the poor user will have a very hard time finding the right button. A linear search through a number of rectangles more typical of a UI, say somewhere between 2 and 100, will be no problem from a performance point of view.
I am quickly finding that one of the organisational considerations you must make when preparing rendering in OpenGL is the type of topography and the arrangement of vertices.
Now there are some interesting methods out there for organising vertices into very long arrays, with nice uses of interleaved arrays, indexes, etc, so that you can pour a lot of geometry into one OpenGL call.
But it's much easier in some cases to simply iterate and perform multiple calls with smaller vertex arrays.
While I agree with the notion that premature optimization is somewhat wasteful, just how important of a consideration should it be to minimize OpenGL calls, especially if multiple calls would actually involve far fewer vertexes per call?
I can see that this is one of those decisions that is important early in the development process, since it forms a lot of the structure of how vertexes get created and organized.
There is an overhead for each command you send down to the GPU. By batching the vertices you minimize that overhead and also allows the driver to make small optimizations in you data before sending it to the hardware. It can make quite a difference and is the reason the glBegin and glEnd was completely removed from newer iterations of OpenGL.
You should try to avoid making many driver states changes and many drawing calls.
EDIT: Consider using degenerated vertices in you triangle strips (also helps in minimizing the number of vertices processed) so that you can just use one drawing call and render all your topology (unless you need to change some driver state between parts of the topology).
You can find a balance for your specific needs. But the thing is that there're many variables in the equation. And there's no simple solution (like "always make scene as one big single batch!"). TraxNet gave you a good advice though - always try to minimize api calls(whether drawing or state changes). But it hasn't to be just a few calls. On modern PC it could be thousands per frame, not so modern mobile phone, maybe, just a few hundred.
Also TraxNet mentioned degenerate triangles(helping form strips) - though they're still triangles(kinda add to 'total' triangle count rendered) - they cost almost nothing still helping to minimize amount of draw calls.
I've worked on a variety of demo projects with OpenGL and C++, but they've all involved simply rendering a single cube (or similarly simple mesh) with some interesting effects. For a simple scene like this, the vertex data for the cube could be stored in an inelegant global array. I'm now looking into rendering more complex scenes, with multiple objects of different types.
I think it makes sense to have different classes for different types of objects (Rock, Tree, Character, etc), but I'm wondering how to cleanly break up the data and rendering functionality for objects in the scene. Each class will store its own array of vertex positions, texture coordinates, normals, etc. However I'm not sure where to put the OpenGL calls. I'm thinking that I will have a loop (in a World or Scene class) that iterates over all the objects in the scene and renders them.
Should rendering them involve calling a render method in each object (Rock::render(), Tree::render(),...) or a single render method that takes in an object as a parameter (render(Rock), render(Tree),...)? The latter seems cleaner, since I won't have duplicate code in each class (although that could be mitigated by inheriting from a single RenderableObject class), and it allows the render() method to be easily replaced if I want to later port to DirectX. On the other hand, I'm not sure if I can keep them separate, since I might need OpenGL specific types stored in the objects anyway (vertex buffers, for example). In addition, it seems a bit cumbersome to have the render functionality separate from the object, since it will have to call lots of Get() methods to get the data from the objects. Finally, I'm not sure how this system would handle objects that have to be drawn in different ways (different shaders, different variables to pass in to the shaders, etc).
Is one of these designs clearly better than the other? In what ways can I improve upon them to keep my code well-organised and efficient?
Firstly, dont even bother with platform independence right now. wait until you have a much better idea of your architecture.
Doing a lot of draw calls/state changes is slow. The way that you do it in an engine is you generally will want to have a renderable class that can draw itself. This renderable will associated to whatever buffers it needs (e.g. vertex buffers) and other information (like vertex format, topology, index buffers etc). Shader input layouts can be associated to vertex formats.
You will want to have some primitive geo classes, but defer anything complex to some type of mesh class which handles indexed tris. For a performant app, you will want to batch up calls (and potentially data) for similar input types in your shading pipeline to minimise unneccesary state changes and pipeline flushes.
Shaders parameters and textures are generally controlled via some material class that is associated to the renderable.
Each renderable in a scene itself is usually a component of a node in a hierarchical scene graph, where each node usually inherits the transform of its ancestors through some mechanism. You will probably want a scene culler that uses a spatial partitioning scheme to do fast visibility determination and avoid draw call overhead for things out of view.
The scripting/behaviour part of most interactive 3d apps is tightly connected or hooked into its scene graph node framework and an event/messaging system.
This all fits together in a high level loop where you update each subsystem based on time and draw the scene at current frame.
Obviously there are tonnes of little details left out but it can become very complex depending on how generalised and performant you want to be and what kind of visual complexity you are aiming for.
Your question of draw(renderable), vs renderable.draw() is more or less irrelevant until you determine how all the parts fit together.
[Update] After working in this space a bit more, some added insight:
Having said that, in commercial engines, its usually more like draw(renderBatch) where each render batch is an aggregation of objects that are homogenous in some meaningful way to the GPU, since iterating over heterogeneous objects (in a "pure" OOP scene graph via polymorphism) and calling obj.draw() one-by-one has horrible cache locality and is generally an inefficient use of GPU resources. It is very useful to take a data-oriented approach to designing how an engine talks to its underlying graphics API(s) in the most efficient way possible, batching up things as much as possible without negatively effecting the code structure/readability.
A practical suggestion is to write a first engine using a naive/"pure" approach to get really familiar with the domain space. Then on a second pass (or probably rewrite), focus on the hardware: things like memory representation, cache locality, pipeline state, bandwidth, batching, and parallelism. Once you really start considering these things, you will realise that most of your initial design goes out the window. Good fun.
I think OpenSceneGraph is kind of an answer. Take a look at it and its implementation. It should provide you with some interesting insights on how to use OpenGL, C++ and OOP.
Here is what I have implemented for a physical simulation and what worked pretty well and was on a good level of abstraction. First I'd separate the functionality into classes such as:
Object - container that holds all the necessary object information
AssetManager - loads the models and textures, owns them (unique_ptr), returns a raw pointer to the resources to the object
Renderer - handles all OpenGL calls etc., allocates the buffers on GPU and returns render handles of the resources to the object (when wanting the renderer to draw the object I call the renderer giving it model render handle, texture handle and model matrix), renderer should aggregate such information o be able to draw them in batches
Physics - calculations that use the object along with it's resources (vertices especially)
Scene - connects all the above, can also hold some scene graph, depends on the nature of the application (can have multiple graphs, BVH for collisions, other representations for draw optimization etc.)
The problem is that GPU is now GPGPU (general purpose gpu) so OpenGL or Vulkan is not only a rendering framework anymore. For example physical calculations are being performed on the GPU. Therefore the renderer might now transform into something like GPUManager and other abstractions above it. Also the most optimal way to draw is in one call. In other words, one big buffer for the whole scene that can be also edited via compute shaders to prevent excessive CPU<->GPU communication.