Calculating new velocities between objects (AABB) - C++

Lately I have been trying to create a 2D platformer engine in C++ with Direct2D. The problem I am currently having is getting objects that are resting against each other to interact correctly after accelerations like gravity have been applied to them.
Right now I can detect collisions and respond to them correctly (I think) and when objects collide they remember what other objects they're resting against so objects can be pushed by other objects (note that there is no bounce in any collisions so when objects collide they are guaranteed to become resting until something else happens). Every time the simulation advances, the acceleration for objects is applied to their velocities (for example vx += ax * t, where t is time elapsed since last advancement).
After these accelerations are applied, I want to check if any objects that are resting against each other are moving at different speeds than their counterparts (as different objects can have different accelerations) and depending on that difference either unlink the two objects so they are no longer resting, or even out their velocities so they are moving at the same speed once again. I am having trouble creating an algorithm that can do this across many resting objects.
Here's a diagram to help explain my problem
http://i.imgur.com/cYYsWdE.png
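One possible way to approach this, as a rough sketch only: treat the resting objects along one axis as an ordered chain and merge neighbours into groups whenever the object behind is moving at least as fast as the one in front of it (it is pushing). Groups that end up slower than the group behind them are left separate, which corresponds to unlinking. The Body and Group types, the mass field, and the mass-weighted average used to even out velocities are all assumptions for illustration; the question does not say how the shared velocity should be chosen, and immovable objects such as the floor are not handled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical body for this sketch: 1D motion along the contact axis only.
struct Body {
    double mass;   // assumption: the question does not mention masses
    double v;      // velocity along the axis after accelerations were applied
};

// A run of neighbouring bodies that currently move together.
struct Group {
    std::size_t first;   // index of the first body in the run
    std::size_t last;    // index of the last body in the run (inclusive)
    double mass;         // total mass of the run
    double momentum;     // total momentum of the run
    double velocity() const { return momentum / mass; }
};

// `chain` holds bodies that were resting against each other, ordered along
// the axis (chain[i] touches chain[i + 1] on its positive side).
// Returns linked[i] == true when chain[i] and chain[i + 1] stay in contact.
std::vector<bool> resolveChain(std::vector<Body>& chain) {
    std::vector<Group> groups;
    for (std::size_t i = 0; i < chain.size(); ++i) {
        groups.push_back({i, i, chain[i].mass, chain[i].mass * chain[i].v});
        // While the group behind is at least as fast as the group in front,
        // it is pushing (or still resting), so merge the two groups.
        while (groups.size() >= 2 &&
               groups[groups.size() - 2].velocity() >= groups.back().velocity()) {
            Group front = groups.back();
            groups.pop_back();
            Group& back = groups.back();
            back.last = front.last;
            back.mass += front.mass;
            back.momentum += front.momentum;
        }
    }

    std::vector<bool> linked(chain.empty() ? 0 : chain.size() - 1, false);
    for (const Group& g : groups) {
        for (std::size_t i = g.first; i <= g.last; ++i) {
            chain[i].v = g.velocity();       // everyone in a group shares one velocity
            if (i < g.last) linked[i] = true;
        }
    }
    return linked;
}
```

For example, three bodies of equal mass with velocities 2, 0 and 5 end up as one linked pair moving at 1 and a third body that separates and keeps moving at 5. The same pass would have to be run once per axis and per chain of resting objects.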

Related

Update index variables in threads

I have objects like balls. These objects are dynamically created and stored in a vector. For each of these balls, a separate thread is created that updates its coordinates. Each of these threads has a reference to the vector of balls and knows the index of its ball. Now, let's say I need to delete several balls and the threads associated with them.
I did it like this:
The ball has a bool killMe variable that becomes true when the ball needs to be removed. The thread that updates the coordinates notices that the ball needs to be removed, removes the ball, and terminates on its own. But when the ball is removed from the vector, the indices of the subsequent balls change, and their threads, trying to access them the next time, cause the program to crash.
How can I keep the ball index in each thread up to date?
Rather than each thread having an index into the vector, why not pass a reference to the object being worked on?
Note that this may still be problematic if your vector is vector<Ball>: erasing an element shifts every later element down, which invalidates references to the erased element and to everything after it, and a reallocation when the vector grows invalidates all of them.
But you could store vector<std::shared_ptr<Ball>> and then you're golden.
Another choice if you really want to use indexes is still to use a vector of shared pointers but then you can nullify the pointers you need to delete -- leaving holes in your vector, but at least you aren't moving things around.
The other choice involves mutexes, and you'll be mutex-locked A LOT. This seems less useful.
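A minimal sketch of the shared-pointer variant, under some assumptions: the Ball layout, updateBall and removeDeadBalls are made-up names, the coordinate update is a placeholder, and only the thread that owns the vector ever erases from it (worker threads just flag and exit), so the vector itself is never touched concurrently.

```cpp
#include <algorithm>
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical Ball type; only killMe comes from the question.
struct Ball {
    std::atomic<float> x{0.0f};
    std::atomic<bool> killMe{false};
};

// Worker body: it holds its own shared_ptr, so it never needs an index into
// the vector and the ball stays alive until the worker has finished with it.
void updateBall(std::shared_ptr<Ball> ball) {
    while (!ball->killMe.load()) {
        ball->x.store(ball->x.load() + 1.0f);   // stand-in for the real coordinate update
        std::this_thread::yield();
    }
}

// Called by the owning thread only: drop every ball flagged for removal.
// Other balls may shift inside the vector, but no worker depends on positions.
void removeDeadBalls(std::vector<std::shared_ptr<Ball>>& balls) {
    balls.erase(std::remove_if(balls.begin(), balls.end(),
                               [](const std::shared_ptr<Ball>& b) {
                                   return b->killMe.load();
                               }),
                balls.end());
}
```

A worker would be started with something like std::thread(updateBall, balls[i]), which copies the shared_ptr into the thread. Setting killMe makes that worker return, and the owning thread can call removeDeadBalls whenever it likes.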

Faster to create map of pointers or if statement

I'm creating a game engine using C++ and SFML. I have a class called character that will be the base for entities within the game. The physics class is also going to handle character movement.
My question is: is it faster to create a vector of pointers to the characters that move in a frame, so that whenever a function moves a character it places it inside that vector, and the vector gets cleared after the physics class is done handling it?
Or is it faster to have a bool variable that gets set to true whenever a function moves a character and then have an if statement inside my physics class that tests every character for movement?
EDIT:
OK, I've gone with a different approach where a function inside the Physics class is responsible for dealing with character movement. Immediately upon movement, it tests for collisions. If a collision happens, it stops the movement in that direction.
Thanks for your help, guys.
Compared to all the other stuff that is going on in your program (physics, graphics), this will not make a difference. Use the method that makes programming easier because you will not notice a runtime difference at all.
If the total number of characters is relatively small, then you won't notice the difference between the approaches.
Else (the number of characters is large), if most of the characters move during a frame, then the approach with a flag inside each character seems more appropriate, because even with a vector of moved characters you'll traverse nearly all of them, and on top of that you get the additional overhead of maintaining the vector.
Else (the number of characters is large, but only a few of them move during a frame), it may be better to use the vector because it can save you time by not traversing characters that didn't move.
What counts as a small or large number depends on your application. You should test under which conditions you get better performance with either approach.
This would be the right time to quote Hoare, but I'll abstain. Generally, however, you should profile before you optimize (if, and only if, the time budget is not enough on the minimum spec hardware -- if your game runs at 60fps on the target hardware you will do nothing whatsoever).
It is much more likely that the actual physics calculations will be the limiting factor, not doing the "is this unit moving?" check. Also, it is much more likely that submitting draw calls will bite you rather than checking a few hundred or so units.
As an "obvious" thing, it appears to be faster to hold a vector of object pointers and only process the units that are actually moving. However, the "obvious" is not always correct. Iterating linearly over a greater number of elements can very well be faster than jumping around (due to cache). Again, if this part of your game is identified as the bottleneck (very unlikely) then you will have to measure which is better.
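To make the two options concrete, here is a small sketch of both, assuming a stripped-down hypothetical Character with only a position, a pending displacement and the flag. It is meant as something to put under a profiler, not as a recommendation of either one.

```cpp
#include <vector>

// Hypothetical Character; the real class would carry much more state.
struct Character {
    float x = 0.0f;
    float dx = 0.0f;      // pending movement for this frame
    bool moved = false;   // used by the flag approach only
};

// Flag approach: every mover sets `moved`; physics scans all characters linearly.
// Returns how many characters were actually processed.
int processFlagged(std::vector<Character>& all) {
    int handled = 0;
    for (Character& c : all) {
        if (!c.moved) continue;
        c.x += c.dx;
        c.moved = false;
        ++handled;
    }
    return handled;
}

// List approach: every mover appends a pointer; physics touches only those
// characters and clears the list at the end of the frame.
int processMovedList(std::vector<Character*>& movedThisFrame) {
    int handled = 0;
    for (Character* c : movedThisFrame) {
        c->x += c->dx;
        ++handled;
    }
    movedThisFrame.clear();
    return handled;
}
```

Wrapping each call in a std::chrono::steady_clock measurement with realistic character counts is the simplest way to see which one wins for a given game, if it matters at all.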

Are sparse AABB trees made with pointers?

I'm using an octree of axis aligned bounding boxes to segment the space in my scene where I do a physics simulation. The problem is, the scene is very large (space) and I need to detect collision of large objects at large distances as well as small objects at close distances. The thing is, there are only a few of them on the scene, but kilometers apart, so this means a lot of empty space. So basically I'm wasting 2 gigs of RAM to store bounding boxes for empty sectors. I'd like to only allocate memory for the sectors that actually contain something (to have them be pointers to AABBs), but that would mean thousands of allocations each frame to re-create the octree. If I use a pool to counter the slowdown from allocations it would still mean I'm allocating 2 gigs of RAM for my application. Is there any other way to achieve this?
Look into Loose Octrees (for dealing with many objects) or a more adaptive system such as AABB-trees built around each object rather than one for the entire space. You can perform general distance/collision using the overall AABB (the root) and get finer collisions using the tree under each object (and eventually a ray-triangle intersection test if you need that fine a resolution). The only disadvantage with AABB-trees is that if the object rotates you need to rebuild the tree (you can adaptively scale and translate the AABB-tree).
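As a rough illustration of the per-object AABB-tree idea, the sketch below keeps one small tree per object, rejects pairs on the root boxes first, and only descends where boxes overlap. Aabb, AabbNode, treesOverlap and translate are names invented for this example, and memory use scales with the number of objects rather than with the size of the empty space between them.

```cpp
#include <vector>

struct Aabb {
    float min[3];
    float max[3];
};

// Standard interval test on each axis.
bool overlaps(const Aabb& a, const Aabb& b) {
    for (int i = 0; i < 3; ++i) {
        if (a.max[i] < b.min[i] || b.max[i] < a.min[i]) return false;
    }
    return true;
}

// One tree per object: the root bounds the whole object, children bound parts of it.
struct AabbNode {
    Aabb box;
    std::vector<AabbNode> children;   // empty for a leaf
};

// Coarse-to-fine test: reject on the current boxes first, descend only on overlap.
bool treesOverlap(const AabbNode& a, const AabbNode& b) {
    if (!overlaps(a.box, b.box)) return false;
    if (a.children.empty() && b.children.empty()) return true;   // two leaves touch
    if (!a.children.empty()) {
        for (const AabbNode& child : a.children) {
            if (treesOverlap(child, b)) return true;
        }
        return false;
    }
    for (const AabbNode& child : b.children) {
        if (treesOverlap(a, child)) return true;
    }
    return false;
}

// Moving an object only shifts its boxes; translation needs no rebuild.
void translate(AabbNode& node, const float offset[3]) {
    for (int i = 0; i < 3; ++i) {
        node.box.min[i] += offset[i];
        node.box.max[i] += offset[i];
    }
    for (AabbNode& child : node.children) translate(child, offset);
}
```

Leaf hits would then be handed to whatever narrow-phase test is needed, such as the ray-triangle test mentioned above.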

Dealing with dead objects in OpenGL VBOs

Imagine a typical game where objects in the simulated world are created and destroyed. When these objects are created, their vertex data is stored in a VBO. This VBO is rendered once per frame.
Is there a best practice for dealing with dead objects? I.e. when the object is destroyed and thus no longer needs to be rendered, what should happen to its corresponding VBO data?
It seems like you'd want to "free" that memory up for future use by other objects. Otherwise, your VBO would eventually be filled almost entirely with dead data.
I have one possible idea for implementing this: a map of VBO memory wherein individual bytes are marked as free or in use. (This map would live on the CPU as a normal array, not on the GPU.) When an object is created, we buffer its data to a free region as determined by the map. We mark that region as used on the map. Then when the object is destroyed, we mark that same region as free. I'm thinking you would store the map either as an array of booleans if you're lazy, or pack it in as one map bit per VBO byte if you want to do it right.
So far, does this sound like the best approach? Is there a more common approach that I'm not seeing?
I know a lot of these questions hinge on the characteristics of the scene you're rendering, so here's the context. My scene consists of several hundred objects. Each object has about eight vertices. Each vertex has a position and texture coordinate stored as floats. So, we're looking at approximately:
4 bytes per float * 6 floats per vert * 8 verts per object * 500 objects
= 96,000 bytes of vertex data
Sounds like you're thinking of using a pool allocator. There's a lot of existing work done on those, which should apply quite well to allocations inside a VBO also.
It will be pretty straightforward if all elements are the same size. Otherwise, you need to be concerned about fragmentation, but heap managers are quite well known.
The simplest improvement I would offer is to start your scan for a free slot from the last slot filled, instead of always from the beginning.
You can trade space for speed by using a deque-style data structure to store a list of free locations, which eliminates the need to scan for a free spot.
The size of the data stored in the VBO really has no impact on the manager. Only the number of slots that can be individually repurposed matters.
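A minimal sketch of the free-list idea for equal-sized objects, assuming a made-up SlotAllocator class that does CPU-side bookkeeping only. It uses a std::vector as a stack of free slot indices instead of a deque, and leaves the actual upload into the VBO to the caller.

```cpp
#include <cstddef>
#include <vector>

// CPU-side bookkeeping only: hands out fixed-size slots inside one VBO.
// Uploading the vertex data into the returned offset is left to the caller.
class SlotAllocator {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    SlotAllocator(std::size_t slotCount, std::size_t bytesPerSlot)
        : bytesPerSlot_(bytesPerSlot) {
        // Push in reverse so the lowest slot is handed out first.
        for (std::size_t i = slotCount; i > 0; --i) freeSlots_.push_back(i - 1);
    }

    // Returns the byte offset of a free slot, or npos when the buffer is full.
    // No scanning is needed: the next free slot is always on top of the stack.
    std::size_t allocate() {
        if (freeSlots_.empty()) return npos;
        std::size_t slot = freeSlots_.back();
        freeSlots_.pop_back();
        return slot * bytesPerSlot_;
    }

    // Marks the slot at byteOffset as reusable; the old bytes stay in the VBO
    // until a later object overwrites them.
    void release(std::size_t byteOffset) {
        freeSlots_.push_back(byteOffset / bytesPerSlot_);
    }

private:
    std::size_t bytesPerSlot_;
    std::vector<std::size_t> freeSlots_;   // stack of free slot indices
};
```

With the numbers from the question, each object takes 4 * 6 * 8 = 192 bytes, so SlotAllocator(500, 192) would cover the whole 96,000-byte buffer.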

Threading access to various buffers

I'm trying to figure out the best way to do this, but I'm getting a bit stuck in figuring out exactly what it is that I'm trying to do, so I'm going to explain what it is, what I'm thinking I want to do, and where I'm getting stuck.
I am working on a program that has a single array (an image, really), onto which a large number of objects can be placed per frame. Each object is completely independent of all other objects. The only dependency is the output: in theory, two of these objects can be placed at the same location in the array. I'm trying to increase the efficiency of placing the objects on the image so that I can place more of them. To do that, I want to thread the problem.
The first step that I have taken towards threading it involves simply mutex protecting the array. All operations which place an object on the array will call the same function, so I only have to put the mutex lock in one place. So far, it is working, but it is not seeing the improvements that I would hope to have. I am hypothesizing that this is because most of the time, the limiting factor is the image write statement.
What I'm thinking I need to do next is to have multiple image buffers that I'm writing to, and to combine them when all of the operations are done. I should say that obscuration is not a problem, all that needs to be done is to simply add the pixel counts together. However, I'm struggling to figure out what mechanism I need to use in order to do this. I have looked at semaphores, but while I can see that they would limit a number of buffers, I can envision a situation in which two or more programs would be trying to write to the same buffer at the same time, potentially leading to inaccuracies.
I need a solution that does not involve any new non-standard libraries. I am more than willing to build the solution, but I would very much appreciate a few pointers in the right direction, as I'm currently just wandering around in the dark...
To help visualize this, imagine that I am told to place, say, balls at various locations on the image array. I am told to place the balls each frame, with a given brightness, location, and size. The exact location of the balls is dependent on the physics from the previous frame. All of the balls must be placed on a final image array, as quickly as they possibly can be. For the purpose of this example, if two balls are on top of each other, the brightness can simply be added together, thus there is no need to figure out if one is blocking the other. Also, no using GPU cards;-)
Pseudo-code would look like this (assuming that some logical object is given for location, brightness, and size). Also assume that isValidPoint simply determines whether the point lies on the circle, given the location and radius of said circle.

    global output_array[x_arrLimit * y_arrLimit]

    void update_ball(int ball_num)
    {
        // location, brightness, and size are all set inside calc_ball_location
        calc_ball_location(ball_num, &location, &brightness, &size);
        place_ball(location, brightness, size);
    }

    void place_ball(location, brightness, size)
    {
        get_bounds(location, size, &xlims, &ylims);
        for (int x = xlims.min; x < xlims.max; x++)
        {
            for (int y = ylims.min; y < ylims.max; y++)
            {
                if (isValidPoint(location, size, x, y))
                {
                    // overlapping balls simply add their brightness
                    output_array(x, y) += brightness;
                }
            }
        }
    }

The reason you're not seeing any speed up with the current design is that, with a single mutex for the entire buffer, you might as well not bother with threading, as all the objects have to be added serially anyway (unless there's significant processing being done to determine what to add, but it doesn't sound like that's the case). Depending on what it takes to "add an object to the buffer" (do you use scan-line algorithms, flood fill, or something else), you might consider having one mutex per row or a range of rows, or divide the image into rectangular tiles with one mutex per region or something. That would allow multiple threads to add to the image at the same time as long as they're not trying to update the same regions.
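A sketch of the "one mutex per range of rows" variant, with the usual caveats: StripedImage, placeBall and rowsPerStripe are names made up for this example, and a plain distance check stands in for isValidPoint. A thread only holds the lock for the stripe whose rows it is currently writing, so balls in different parts of the image do not block each other.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <vector>

class StripedImage {
public:
    StripedImage(int width, int height, int rowsPerStripe)
        : width_(width), height_(height), rowsPerStripe_(rowsPerStripe),
          pixels_(static_cast<std::size_t>(width) * height, 0.0f),
          locks_((height + rowsPerStripe - 1) / rowsPerStripe) {}

    // Adds brightness to every pixel within `radius` of (cx, cy).
    // Safe to call from several threads at once.
    void placeBall(int cx, int cy, int radius, float brightness) {
        int y0 = std::max(cy - radius, 0), y1 = std::min(cy + radius, height_ - 1);
        int x0 = std::max(cx - radius, 0), x1 = std::min(cx + radius, width_ - 1);
        int y = y0;
        while (y <= y1) {
            int stripe = y / rowsPerStripe_;
            int stripeEnd = std::min((stripe + 1) * rowsPerStripe_ - 1, y1);
            // Lock only the stripe that owns rows y..stripeEnd.
            std::lock_guard<std::mutex> guard(locks_[stripe]);
            for (; y <= stripeEnd; ++y) {
                for (int x = x0; x <= x1; ++x) {
                    int dx = x - cx, dy = y - cy;
                    if (dx * dx + dy * dy <= radius * radius) {
                        pixels_[static_cast<std::size_t>(y) * width_ + x] += brightness;
                    }
                }
            }
        }
    }

    // Read a pixel; only meaningful once all writers have finished.
    float at(int x, int y) const {
        return pixels_[static_cast<std::size_t>(y) * width_ + x];
    }

private:
    int width_, height_, rowsPerStripe_;
    std::vector<float> pixels_;
    std::vector<std::mutex> locks_;   // one mutex per stripe of rows
};
```

How many rows to put in a stripe is a tuning knob: fewer rows per stripe means less contention but more lock operations per ball.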
OK, you have an image member in some object. Add the, no doubt complex, code to add other images/objects to it, manipulate it, whatever. Aggregate in all the other objects that may be involved, add some command enum to tell threads what op to do and an 'OnCompletion' event to call when done.
Queue it to a pool of threads hanging on the end of a producer-consumer queue. Some thread will get the *object, perform the operation on the image/set and then call the event, (pass the completed *object as a parameter). In the event, you can do what you like, according to the needs of your app. Maybe you will add the processed images into a (thread-safe!!), vector or other container or queue them off to some other thread - whatever.
If the order of processing the images must be preserved (e.g. a video stream), you could add an incrementing sequence number to each object that is submitted to the pool, so enabling your 'OnCompletion' handler to queue up 'later' images until all earlier ones have come in.
Since no two threads ever work on the same image, you need no locking while processing. The only locks you should (may) need are those internal to the queues, and they only lock for the time taken to push/pop object pointers to/from the queue - contention will be very rare.
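The producer-consumer queue at the centre of this design can be built from the standard library alone. The sketch below is one minimal way to do it; BlockingQueue and its close() method are names chosen for this example, and the job type is left as a template parameter.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Minimal blocking queue; pool threads wait in pop() until a job arrives.
template <typename T>
class BlockingQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();   // wake one waiting worker
    }

    // Blocks until an item is available or close() was called.
    // Returns false only when the queue is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    // Lets workers finish the remaining jobs and then leave their loops.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};
```

Each pool thread would then run a loop along the lines of "while (queue.pop(job)) { process job; call its completion handler; }", and the lock is held only for the push or pop itself.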