2D Tower Defense - Units stacking on top of each other

I'm currently implementing a 2D top-down tower defense game. For pathfinding I use a breadth-first search running backwards from the goal. Everything works fine, except that all my units follow exactly the same line and therefore can stack on top of each other.
For units of the same type, I can of course release them one after another, but if faster and slower units are mixed, the faster ones will "walk over" the slower ones, which looks quite weird.
In Fieldrunners 2, units walk around each other when they need to pass, which looks quite cool, though I imagine this is quite complex to implement.
Do you have any idea how I can solve these issues or improve my game?

You could look into steering behaviours. Use collision checks to detect when a unit is about to collide with something it cannot pass through on a node it should otherwise be able to traverse, and use a steering behaviour to avoid it.
The benefit is that you don't have to constantly recalculate paths for all units, so it is far more scalable.
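A minimal sketch of one such steering behaviour (separation) might look like the following. The `Vec2` and `Unit` types, the `margin` parameter and the `separationWeight` blend factor are all assumptions made for illustration; they are not taken from the question's code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical unit: a position plus a collision radius.
struct Unit {
    Vec2 pos;
    float radius;
};

// Sums a push away from every neighbour that is closer than the two radii
// plus `margin`. Units at exactly the same position are skipped (this also
// skips `self` if it is part of `others`).
Vec2 separation(const Unit& self, const std::vector<Unit>& others, float margin) {
    assert(margin >= 0.0f);
    Vec2 push{0.0f, 0.0f};
    for (const Unit& o : others) {
        const float dx = self.pos.x - o.pos.x;
        const float dy = self.pos.y - o.pos.y;
        const float dist = std::sqrt(dx * dx + dy * dy);
        const float minDist = self.radius + o.radius + margin;
        if (dist > 0.0f && dist < minDist) {
            // Closer neighbours push harder (strength is in (0, 1)).
            const float strength = (minDist - dist) / minDist;
            push.x += dx / dist * strength;
            push.y += dy / dist * strength;
        }
    }
    return push;
}

// Blends the direction given by the precomputed path with the separation
// push and normalises the result.
Vec2 steer(Vec2 pathDir, Vec2 push, float separationWeight) {
    Vec2 v{pathDir.x + push.x * separationWeight,
           pathDir.y + push.y * separationWeight};
    const float len = std::sqrt(v.x * v.x + v.y * v.y);
    if (len > 0.0f) {
        v.x /= len;
        v.y /= len;
    }
    return v;
}
```

Each frame a unit would take its direction from the breadth-first-search flow field, call `separation` with its nearby units, and move along the result of `steer`. The path itself is never recomputed.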

Related

How to fetch patterns from a game board in a fast way?

For my current project I'm looking for an efficient way to structure and store the board information, with its use for pattern matching in mind.
I have a square board, and for pattern matching I'm using bitfields with 2 bits representing one field of the board. The patterns to match have a diamond shape that can be centered on any field of the board (so the center is not static; I need to be able to do it for any center).
Example of diamond area around O:
..X..
.XXX.
XXOXX
.XXX.
..X..
If parts of the diamond are outside the playing area, the bits will be set to 11. The diamonds can have different radii; the example above has a radius of 2.
Another important thing for the efficiency of the system is that I have to be able to quickly rotate/mirror the pattern into all 8 possible symmetries.
For this, it may be beneficial NOT to store the central point in the pattern. It is not required for my algorithm anyway, and leaving it out makes some bit-shifting magic possible to quickly rotate/mirror the patterns, which may be a valuable time saver.
As this kind of pattern matching has to be done at a high frequency, it can become a severe bottleneck of my overall project if implemented badly.
While trying to find a good model for all this work, I figured there are 3 key areas that need thought, which are of course tightly connected.
A. How is the data stored in the board implementation.
Currently this is done in a rather complicated manner, which would be too slow to read from at such a high frequency. But it would cost no real time to additionally store and update the 2-bit data for the entire board in whatever layout is needed.
The easiest option would be to store the entire board in one bitset with twice as many bits as the board has fields, so that each pair of bits represents the value of a single field. But there is no necessity to do it in a particular sequence or in only one bitset, even though that may look natural at first.
Anyway, this is the part I'm most flexible about, as it can be done without performance issues in whatever way serves the other 2 critical parts of the problem best.
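As a rough illustration of the "one bitset, two bits per field" option, here is a sketch that packs the board into 64-bit words instead of a `std::bitset`. The class name `PackedBoard`, the row-major layout and the meaning of the values 0 to 2 are assumptions; only the off-board value 11 comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Binary 11: the value reported for anything outside the playing area.
const std::uint8_t kOffBoard = 3;

class PackedBoard {
public:
    explicit PackedBoard(int size)
        : size_(size),
          words_((static_cast<std::size_t>(size) * size * 2 + 63) / 64, 0) {
        assert(size > 0);
    }

    // Returns the 2-bit value of a field, or kOffBoard outside the board.
    std::uint8_t get(int x, int y) const {
        if (x < 0 || y < 0 || x >= size_ || y >= size_) return kOffBoard;
        const std::size_t bit = bitIndex(x, y);
        return static_cast<std::uint8_t>((words_[bit / 64] >> (bit % 64)) & 3u);
    }

    // Stores a 2-bit value (0..2) in an on-board field.
    void set(int x, int y, std::uint8_t value) {
        assert(x >= 0 && y >= 0 && x < size_ && y < size_);
        assert(value < kOffBoard);
        const std::size_t bit = bitIndex(x, y);
        std::uint64_t& word = words_[bit / 64];
        word = (word & ~(std::uint64_t(3) << (bit % 64))) |
               (std::uint64_t(value) << (bit % 64));
    }

private:
    // Row-major layout; the bit index is always even, so a field never
    // straddles two 64-bit words.
    std::size_t bitIndex(int x, int y) const {
        return (static_cast<std::size_t>(y) * size_ + x) * 2;
    }

    int size_;
    std::vector<std::uint64_t> words_;
};
```

Reading single fields like this is still one lookup per cell, so on its own it does not solve the extraction cost discussed under C.; it only shows the storage layout.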
B. How is the data stored in the pattern.
This is already more difficult. As said, my intention is to store them in a bitset of the appropriate size, but the question is in what order.
Two ways quickly come to mind:
a) (this could be done with or without the central point C)
...0...
..123..
.45678.
9ABCDEF
.GHIJK.
..LMN..
...O...
b)
...0...
..N14..
.ML235.
KJI.678
.HFC9A.
..GDB..
...E...
If we are just talking about the patterns, b) seems clearly superior. A rotation of the pattern is done by a simple rotate-shift (3 bit ops total per rotation), and even mirroring the pattern can be done with about a dozen bit ops. These operations are much more time-consuming with a).
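The rotate-shift can be sketched as follows for radius 3 without the central point. This assumes the cells are numbered quadrant by quadrant and that every quadrant lists its cells in the same relative order, so that cell k of one quadrant lands on cell k of the next under a quarter turn. The constant names and the use of a single `std::uint64_t` are illustrative choices; whether the rotation is clockwise or counter-clockwise depends on the bit order you pick.

```cpp
#include <cassert>
#include <cstdint>

const unsigned kRadius = 3;
// A quadrant holds r(r+1)/2 cells at 2 bits each, i.e. r(r+1) bits.
const unsigned kBitsPerQuadrant = kRadius * (kRadius + 1);  // 12
const unsigned kTotalBits = 4 * kBitsPerQuadrant;           // 48
const std::uint64_t kMask = (std::uint64_t(1) << kTotalBits) - 1;

// Quarter turn: rotate the 48-bit pattern left by one quadrant.
// Two shifts and one OR do the work; the AND keeps the result in 48 bits.
std::uint64_t rotate90(std::uint64_t pattern) {
    assert((pattern & ~kMask) == 0);
    return ((pattern << kBitsPerQuadrant) |
            (pattern >> (kTotalBits - kBitsPerQuadrant))) & kMask;
}
```

Applying `rotate90` four times returns the original pattern. Larger radii no longer fit into one 64-bit word, so the same rotation would have to be applied per ring or on a wider bitset.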
But b) also has a severe drawback, which leads to:
C. How is the data read from the board implementation to the pattern.
Looking at the 2 potential ways to order the pattern bits above, a) is now clearly superior. a) can be read with a handful of bit ops from an array like the one discussed in A.: you isolate each line (AND with a mask that zeroes all other bits), bitshift it to the appropriate place and combine the lines with some OR operations. Even near the board edges this is very quick.
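One way this row-wise read could look is sketched below for radius 3, with the central point included. It assumes one 64-bit word per row and a border of off-board fields (value 11) around the board, so that edges need no special casing. The helper names and the exact bit order inside the result are assumptions, not the question's actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

const int kRadius = 3;

// Builds an empty size x size board, one word per row, surrounded by a
// kRadius-wide border of 11 fields. Field x of a row sits in bits
// 2*(x + kRadius) and 2*(x + kRadius) + 1.
std::vector<std::uint64_t> makePaddedRows(int size) {
    const int padded = size + 2 * kRadius;
    assert(size > 0 && 2 * padded <= 64);
    const std::uint64_t full =
        (2 * padded == 64) ? ~std::uint64_t(0)
                           : (std::uint64_t(1) << (2 * padded)) - 1;
    const std::uint64_t interior =
        ((std::uint64_t(1) << (2 * size)) - 1) << (2 * kRadius);
    std::vector<std::uint64_t> rows(padded, full);
    for (int y = kRadius; y < kRadius + size; ++y) rows[y] = full & ~interior;
    return rows;
}

// Writes a 2-bit value (0..2) into an on-board field.
void setField(std::vector<std::uint64_t>& rows, int x, int y, std::uint64_t value) {
    assert(value < 3);
    const int shift = 2 * (x + kRadius);
    std::uint64_t& row = rows[y + kRadius];
    row = (row & ~(std::uint64_t(3) << shift)) | (value << shift);
}

// Reads the diamond around (x, y) row by row, top to bottom: one shift,
// one AND and one OR per row. The result uses 50 bits (25 cells).
std::uint64_t readDiamond(const std::vector<std::uint64_t>& rows, int x, int y) {
    std::uint64_t out = 0;
    for (int dy = -kRadius; dy <= kRadius; ++dy) {
        const int half = kRadius - std::abs(dy);   // cells left/right of the axis
        const int width = 2 * half + 1;            // cells in this diamond row
        const int start = x + kRadius - half;      // first cell, padded coordinates
        const std::uint64_t mask = (std::uint64_t(1) << (2 * width)) - 1;
        const std::uint64_t chunk = (rows[y + kRadius + dy] >> (2 * start)) & mask;
        out = (out << (2 * width)) | chunk;
    }
    return out;
}
```

With a 19 x 19 board the padded row is 25 fields wide, which fits into 50 of the 64 bits. Radius 15 would need wider rows and a wider result than a single word.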
The problem, of course, is that this still only gets me one symmetry of the pattern, and rotations/mirrors are not that easily done. This could be circumvented by storing each pattern to match against 8 times, but that looks very crude and may cause trouble elsewhere.
With b) this is much more difficult... Honestly, I don't see how it can be done quickly without checking every single bit individually. With larger patterns (like radius 15) that takes forever when done very often, especially as the [] operator of bitsets is rather slow.
One possible solution I thought of is writing it in CUDA, with each thread generating a pattern around one field, and each block of the thread checking one fixed position around this center. As I haven't used CUDA before, I don't know how reasonable this is, but done in parallel it sounds more reasonable than iterating over all positions serially.
As I still haven't found a satisfying solution to the problem, I wanted to ask here whether someone knows how it can be done better:
- either rotate/mirror patterns of type a)
- or quickly read pattern of type b) (possibly by arranging the data in a better way in step A., I'm flexible here)
- or if the CUDA idea may actually solve that problem
- or maybe some completely different way, I didn't think of, as I'm sure this has been done before by smarter people
If it matters: I'm coding with VS Pro 2013 and don't mind using boost. If CUDA could solve this effectively, I would also use it.
EDIT:
Okay... So I continued thinking about the whole thing. Maybe there are some other ways to make the whole thing more efficient, by doing some work in more efficient batches.
First of all, what I usually need: for a given board position (and we are talking about 10k positions per second) I need, for a large set of fields (every empty field of the board, so most fields), all patterns from size 15 down to size 3. I only need the biggest pattern matched by my database, but in any case I may often need most of them. So there are 2 things that could make some time savings possible:
1) Some efficient way to use the larger pattern to generate the pattern one size smaller. This should actually be possible with the bit ordering from b), if it is done the proper way... Then it would only need a few bit ops to cut off the outer ring...
2) As neighboring fields often need their specific patterns, there may be some way to create their patterns in some sort of batch operation... I admit I don't see how this could be done well, but there may be some time savings.
Oh, and an additional comment, as I had this discussion earlier today with a friend: no, it is not an option to reverse the matching and, instead of matching the board position against the pattern database, check whether a DB pattern matches some board position. I have way too many patterns for that. Doing it the first way, I can just look up whether the bitstring exists in my database and be done.
Edit2:
Another update... First, I looked into CUDA, and as it seems incompatible with VS2013, this is a severe blow to that idea. Second, I thought about the process of how patterns are matched. It seems possible to do it in reverse: instead of going from the large patterns down to the small ones, start small and grow. Now suddenly my pattern library is less of a dictionary and more of a search tree, as larger patterns certainly have their inner core saved as a pattern as well. This should speed up any lookups, but sadly still does not solve my problem of pattern generation.
Edit3:
As I felt it is more worthy of an answer than an edit, I just posted my own new idea (which is different from what I had in mind when posting this question) below.

Okay, as I was thinking about this more and more, I now think that the following solution may be the best way to tackle the problem. This is certainly not final, but it is currently my best idea, so any criticism is welcome and improvements can surely be made.
As the discussion in the comments led me to believe that the approach imagined in the question is not practical for the problem at hand, I have drastically changed my idea. Instead of trying to read the pattern around each empty intersection after each move, I will now update the stored pattern of each empty intersection after each move.
This can be done efficiently, as we can use 2 very important features of our patterns:
1) each larger pattern's core (the pattern reduced to a lower radius) is guaranteed to be in the database
2) most patterns have a rather low radius, and in most cases not many positions on the board change with each move, so not too many positions need a recheck of their patterns.
My idea is to store the currently largest pattern, its radius and its evaluation with each empty intersection. While a move is made, I generate a list of all positions changed during that move (usually one). Once the move is finished, I iterate over all empty positions on the board and look at their distance to the closest change. There are 3 possible cases:
a) the distance is smaller than or equal to the radius of the currently matched pattern. Now we have to recheck the pattern.
b) the distance is one bigger than the radius of the currently matched pattern. Now we have to check whether an (r+1) size pattern exists that matches the surroundings. If it does, we have to check r+2 etc., until we have found the largest.
c) the distance is even bigger: we can keep everything as it is.
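The three cases can be written down as a small classification function. The sketch assumes the distance is the Manhattan distance (which matches the diamond shape of the patterns); the names `Update`, `diamondDistance` and `classify` are made up for this example.

```cpp
#include <cassert>
#include <cstdlib>

// What has to happen to the stored pattern of one empty intersection.
enum class Update {
    Recheck,  // case a): the change lies inside the matched pattern
    TryGrow,  // case b): the change lies on the next ring, try r+1, r+2, ...
    Keep      // case c): the change is too far away to matter
};

// Manhattan distance: a diamond of radius r contains exactly the fields
// with distance <= r from its center.
int diamondDistance(int x0, int y0, int x1, int y1) {
    return std::abs(x0 - x1) + std::abs(y0 - y1);
}

Update classify(int distanceToClosestChange, int matchedRadius) {
    assert(distanceToClosestChange >= 0 && matchedRadius >= 0);
    if (distanceToClosestChange <= matchedRadius) return Update::Recheck;
    if (distanceToClosestChange == matchedRadius + 1) return Update::TryGrow;
    return Update::Keep;
}
```

After each move, the loop over the empty intersections would compute the distance to the closest changed position and only touch the pattern database for `Recheck` and `TryGrow`.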
As we now basically have a tree of patterns, with each pattern having lots of child patterns with an incremented radius, it is practical to store the pattern information in a series of bitsets, each representing a ring of a certain radius around the center.
I hope that this system maximizes the reuse of all the information at hand and is fast enough for my needs. As mentioned before, I welcome criticism and suggestions for improvement, and if no better solution is found, I will probably implement it in the near future. Once done, I can report back on the results.

C++ - fastest sorting algorithm for objects based on distance

I'm trying to make a game or 3D application using OpenGL. The program will have many objects drawn to the screen (around 7000 of them). When I render them, I need to calculate the distance between the camera and each object and sort the objects in order to render the scene correctly. Knowing this, what is the best way to sort them? I want the sorting to be really fast, but I've heard there are trade-offs, so what algorithm should I use to get the best performance?
Any help would be greatly appreciated.
Edit: a lot of people are talking about the z-buffer/depth buffer. As a few people pointed out, this doesn't work in some cases. This is why I asked this question.

Sorting by distance doesn't solve the transparency problem perfectly. Consider the situation where two transparent surfaces intersect and each has a part which is closer to you. Perhaps rare in games, but still something to consider if you don't want an occasional glitched look to your renderer.
The better solution is order-independent transparency. With the latest graphics hardware supporting atomic operations, you can use an A-buffer to do this with little memory overhead and in a single pass so it is pretty efficient. See for example this article.
The issue of sorting your scene is still a valid one, though, even if it isn't for transparency: it is still useful to sort opaque objects front to back to allow depth testing to discard unseen fragments. For this, Vaughn provided the great solution of BSP trees, which have been used for this purpose for as long as 3D games have been around.

Use insertion sort (http://en.wikipedia.org/wiki/Insertion_sort), which has O(n) complexity for nearly sorted arrays.
In your case, by exploiting temporal coherence, insertion sort gives the fastest results.
It is used for sweep and prune (http://en.wikipedia.org/wiki/Sweep_and_prune).
From the link above:
In many applications, the configuration of physical bodies from one time step to the next changes very little. Many of the objects may not move at all. Algorithms have been designed so that the calculations done in a preceding time step can be reused in the current time step, resulting in faster completion of the calculation.
So in such cases insertion sort is best (or similar sorts with an O(n) best case).
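A minimal sketch of this idea: keep the list of objects from the previous frame, update the distances, and run an insertion sort over the (nearly sorted) list. The `Drawable` struct and the back-to-front order are assumptions for illustration; flip the comparison for front-to-back.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical render item: an object id plus its distance to the camera.
struct Drawable {
    int id;
    float distance;
};

// Sorts back to front (largest distance first). If the order is almost
// right already, the inner loop rarely runs and the cost is close to O(n).
void insertionSortByDistance(std::vector<Drawable>& items) {
    for (std::size_t i = 1; i < items.size(); ++i) {
        const Drawable current = items[i];
        assert(!std::isnan(current.distance));  // NaN would break the ordering
        std::size_t j = i;
        while (j > 0 && items[j - 1].distance < current.distance) {
            items[j] = items[j - 1];  // shift nearer items one slot back
            --j;
        }
        items[j] = current;
    }
}
```

The important part is to keep the same vector alive between frames rather than rebuilding it, since the speed-up comes entirely from last frame's order still being mostly valid. After a camera cut or teleport the list is no longer nearly sorted, and the worst case is O(n^2).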

Render 1000+ shapes in opengl

How can I render a bunch of hand-drawn shapes in OpenGL 1.x? I know about instancing, but how is it possible in old OpenGL? Could I get examples of some sort? This is for a game. I'm expecting a thousand or so shapes, all of which will need to be updated every frame.

Assuming that (at least most of) the shapes remain unchanged from one frame to the next, so most of the update is just moving them around, you could at least consider building a display list for each shape, then rendering the display lists during an update.
The amount of good you'll get from this varies widely depending on the hardware (and possibly driver) in use though. Some hardware supports display lists directly, and gains a lot from it. With other hardware, you'll be hard put to find any difference at all.
The good points are that at worst this won't do any harm, and building/using display lists is pretty quick and easy. So, in the worst case you don't lose much, and in the best case you might gain quite a bit.

Directx 9 Terrain collision

I searched and found some tutorials on terrain collision, but they were using .raw files and I'm using .x. Still, I think I can do the same thing they did: they took the x, y, z values of an object and checked them against every single triangle in the terrain. It makes sense, but it looks like it will be slow, just as picking by checking against every single triangle is slow.
Is there a faster way to do it that still works well?
UPDATE
My terrain is not flat; if it were, I would use bounding boxes.

Last time I did this, I used the Bullet library, and it worked great. It has various collision shapes to choose from, optimised for different scenarios, including general triangle meshes and heightfields. You can use the library's collision routines without the physics.

One common way to significantly reduce the time it takes to detect collisions is to organize the space into an octree, which will allow you to very quickly determine whether or not a collision could occur in a particular node. Generally speaking, it's easier to accomplish these sorts of tasks with a game engine.

Mahjong-solitaire solver algorithm, which needs a speed-up

I'm developing a Mahjong solitaire solver and so far it's going pretty well. However, it is not as fast as I would like it to be, so I'm asking for any additional optimization techniques you might know of.
All the tiles are known from the layouts, but the solution isn't. At the moment, I have a few rules which guarantee safe removal of certain pairs of identical tiles (pairs which cannot be an obstacle to a possible solution).
For clarity, a tile is free when it can be picked at any time, and a tile is loose when it doesn't block any other tiles at all.
If there are four identical free tiles available, remove them immediately.
If there are three tiles that can be picked up and at least one of them is a loose tile, remove the non-loose ones.
If there are three tiles that can be picked up and only one of them is a non-loose free tile (the other two are loose), remove the free one and one random loose one.
If there are three loose tiles available, remove two of them (it doesn't matter which ones).
Since there are exactly four copies of each tile, if only two of them are left, remove them, since they're the only ones left.
My algorithm searches for a solution recursively in multiple threads. Once a branch is finished (it reaches a position where there are no more moves) and it didn't lead to a solution, it puts the position in a vector containing the bad ones. Every time a new branch is launched, it iterates over the bad positions to check whether that particular position has already been checked.
This process continues until a solution is found or all possible positions have been checked.
This works nicely on a layout which contains, say, 36 or 72 tiles. But when there are more, this algorithm becomes pretty much useless due to the huge number of positions to search.
So, I ask you once more whether any of you have good ideas for more safe tile-removal rules or any other speed-up for the algorithm.
Very best regards,
nhaa123

I don't completely understand how your solver works. When you have a choice of moves, how do you decide which possibility to explore first?
If you pick an arbitrary one, it's not good enough: it's basically just brute-force search. You might need to explore the "better branches" first. To determine which branches are "better", you need a heuristic function that evaluates a position. Then you can use one of the popular heuristic search algorithms. Check these:
A* search
beam search
(Google is your friend)

Some years ago, I wrote a program that solves Solitaire Mahjongg boards by peeking. I used it to examine one million turtles (took a day or something on half a computer: it had two cores) and it appeared that about 2.96 percent of them cannot be solved.
http://www.math.ru.nl/~debondt/mjsolver.html
Ok, that was not what you asked, but you might have a look at the code to find some pruning heuristics in it that did not cross your mind thus far. The program does not use more than a few megabytes of memory.

Instead of a vector containing the "bad" positions, use a hash set (for example std::unordered_set), which has constant lookup time on average instead of a linear one.
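A sketch of such a hash set of dead positions could look like this. It assumes a position can be encoded as one bit per tile (set while the tile is still on the board) for a layout of 144 tiles; the encoding, the `Position` typedef and the `DeadPositions` class are illustrative and not taken from the solver in the question.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Assumed encoding: bit i is set while tile i is still on the board.
const std::size_t kTileCount = 144;
typedef std::bitset<kTileCount> Position;

class DeadPositions {
public:
    // Records a position that led to no solution.
    // Returns true if it was not known before.
    bool markDead(const Position& p) {
        // Tiles leave the board in pairs, so an even number remains.
        assert(p.count() % 2 == 0);
        return seen_.insert(p).second;
    }

    // Average constant-time lookup instead of a linear scan over a vector.
    bool isDead(const Position& p) const {
        return seen_.count(p) != 0;
    }

    std::size_t size() const { return seen_.size(); }

private:
    std::unordered_set<Position> seen_;  // uses std::hash<std::bitset<N>>
};
```

Since the solver in the question runs several threads, access to a shared instance would have to be guarded (for example with a `std::mutex`), or each thread could keep its own set.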

If 4 tiles are visible but cannot be picked up, the other tiles around them have to be removed first. The path should use your rules to remove a minimum of tiles on the way towards these tiles, to open them up.
If tiles are hidden by other tiles, the problem does not have full information to find a path, and the probability of the remaining tiles needs to be calculated.
Very nice problem!
