How to store game objects in a quadtree efficiently - C++

I'm implementing a quadtree structure to simplify my collision code, but I'm unsure of the best practice for doing so. Currently, the quadtree creates subtrees during setup down to a preset maximum depth, and then I insert each object into its appropriate subtree, for use in pair generation (the actual maths stuff).
However, I've heard of other approaches, which only generate subtrees when a certain number of objects are stored.
I know my method has a space overhead, but it might be computationally faster during update cycles.
What would be the best way to handle it?

One approach is to store k elements in each node, starting with one parent node which spans the entire collision space. When inserting element k+1, you subdivide the space and place the new element in the correct quadrant.
Additionally you can use this approach to statically allocate the data structure, assuming you know the maximum number of nodes that will be used, and that there will be some maximum density. This requires a fixed array of nodes and elements to be allocated for the life of the application, but it avoids costly dynamic allocations, which should be a speed gain.
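For illustration, a minimal sketch of that insert-then-subdivide scheme might look like the following. The names (Point, Rect, Quad) and the bucket size of 4 are my own assumptions, not part of the question, and real code would also need a depth cap for the case of more than k coincident points:

```cpp
#include <array>
#include <memory>
#include <vector>

struct Point { float x, y; };

struct Rect {
    float x, y, w, h;  // top-left corner and extents
    bool contains(Point p) const {
        return p.x >= x && p.x < x + w && p.y >= y && p.y < y + h;
    }
};

class Quad {
    static constexpr std::size_t kBucketSize = 4;  // k elements per node
    Rect bounds_;
    std::vector<Point> items_;
    std::array<std::unique_ptr<Quad>, 4> children_;  // empty until split

    bool isLeaf() const { return !children_[0]; }

    void split() {
        float hw = bounds_.w / 2, hh = bounds_.h / 2;
        children_[0] = std::make_unique<Quad>(Rect{bounds_.x,      bounds_.y,      hw, hh});
        children_[1] = std::make_unique<Quad>(Rect{bounds_.x + hw, bounds_.y,      hw, hh});
        children_[2] = std::make_unique<Quad>(Rect{bounds_.x,      bounds_.y + hh, hw, hh});
        children_[3] = std::make_unique<Quad>(Rect{bounds_.x + hw, bounds_.y + hh, hw, hh});
        for (Point p : items_) insertIntoChild(p);  // redistribute existing elements
        items_.clear();
    }

    void insertIntoChild(Point p) {
        for (auto& c : children_)
            if (c->bounds_.contains(p)) { c->insert(p); return; }
    }

public:
    explicit Quad(Rect b) : bounds_(b) {}

    void insert(Point p) {
        if (isLeaf()) {
            if (items_.size() < kBucketSize) { items_.push_back(p); return; }
            split();  // element k+1 triggers subdivision
        }
        insertIntoChild(p);
    }

    std::size_t count() const {
        std::size_t n = items_.size();
        if (!isLeaf()) for (auto& c : children_) n += c->count();
        return n;
    }
};
```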

Related

Calculate memory usage of a tree structure in C++

I have a tree structure
struct TrieNode {
    std::unordered_map<std::string, TrieNode> children;
    std::vector<std::string> terminals;
};
Some details about its usage:
The tree is not modified after it's been populated.
The keys in the unordered map are short strings (they do not exceed 5 characters).
This structure can grow very large, and I need to calculate its size in memory. The size does not need to be very precise.
Are there any existing approaches to do that?
If not, I was thinking of these options:
I can keep track of modifications to this structure separately.
Use a custom allocator for the containers that keeps track of the space used (is there a common implementation for that?).
Overload the new operator for my structure to keep track of memory (not sure how to keep track of insertions into the vector after that).
Calculate the size after the tree has been populated by traversing the entire tree (a last resort, as for a large tree it would take a really long time, but the result is more precise).
What would be the best approach?
The last one, for the following reasons:
It's the simplest of the four approaches.
Because the tree is fixed once populated, lazily evaluating the size makes more sense:
when the size is never needed, we save the time spent calculating it.
It takes no meaningful extra time: the complexity is still O(n), and the only extra cost is the recursive function calls.
It avoids the presence of a global variable.

How to store waypoints efficiently

I have a game with a very large map, and I need to store a lot of waypoints (millions, if not billions) to then use for pathfinding with the A* algorithm.
What I need:
Efficient way to store a lot of them
Fast way to access them directly for the A* algorithm.
At first I thought to use a simple vector, but this would soon use all the available memory.
Then I thought I should use MySQL; this might be a good idea, as I can query the database for an area of waypoints.
The big problem is that for A* I need to access the waypoints as fast as possible, so maybe I need a unique ID per waypoint.
What is the best way to accomplish this?
I think you should consider going slightly lower level and managing the memory yourself: call new and delete explicitly for every node of your graph, and reference your nodes by memory address. If you are worried about allocating a lot of small chunks of memory, you can consider using the tcmalloc library.
The nodes should contain adjacency lists. If the graph is static, I would suggest storing the adjacency in a dynamically created array in each node, whose size exactly matches the number of neighbors. If the number of neighbors can change over time, std::vector may be the next best thing.
Note: I assume the graph is irregular. For regular graphs (e.g. grid as pointed out by Potatoswatter) the location of the nodes can be learned implicitly.
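A minimal sketch of such a node, assuming a 2-D position for the A* heuristic and an exactly-sized neighbor array (Node and its members are illustrative names, not from the question):

```cpp
#include <cstddef>

struct Node {
    float x, y;           // world position, used by the A* heuristic
    std::size_t degree;   // number of neighbors
    Node** neighbors;     // array of exactly `degree` pointers

    Node(float px, float py, std::size_t deg)
        : x(px), y(py), degree(deg), neighbors(new Node*[deg]()) {}
    ~Node() { delete[] neighbors; }

    // Nodes are referenced by address, so forbid copying.
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;
};
```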

Kd tree: data stored only in leaves vs stored in leaves and nodes

I am trying to implement a Kd tree to perform nearest neighbor and approximate nearest neighbor search in C++. So far I have come across two versions of the most basic Kd tree:
One where data is stored in internal nodes as well as in leaves, such as here
One where data is stored only in leaves, such as here
They seem to be fundamentally the same, having the same asymptotic properties.
My question is: are there some reasons why choose one over another?
I figured two reasons so far:
The tree which stores data in internal nodes too is shallower by one level.
The tree which stores data only in leaves has an easier-to-implement delete function.
Are there some other reasons I should consider before deciding which one to make?
You can just mark nodes as deleted, and postpone any structural changes to the next tree rebuild. k-d-trees degrade over time, so you'll need to do frequent tree rebuilds. k-d-trees are great for low-dimensional data sets that do not change, or where you can easily afford to rebuild an (approximately) optimal tree.
As for implementing the tree, I recommend using a minimalistic structure. I usually do not use nodes. I use an array of data object references. The axis is defined by the current search depth; there is no need to store it anywhere. Left and right neighbors are given by the binary-search structure of the array. (Otherwise, just add an array of bytes, half the size of your dataset, for storing the axes you used.) Loading the tree is done by a specialized QuickSort. In theory it's O(n^2) worst-case, but with a good heuristic such as median-of-5 you can get O(n log n) quite reliably and with minimal constant overhead.
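A sketch of loading such an implicit, array-based k-d tree with std::nth_element, which is precisely this kind of quickselect-style partitioning (one QuickSort partition pass per level). Pt and buildKd are illustrative names; the axis simply cycles with depth, as described above:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Pt { float coord[2]; };  // 2-D points for brevity

// Recursively partition pts[lo, hi) so that the median on the current
// axis sits at the midpoint; the subtree layout is implicit in the array.
void buildKd(std::vector<Pt>& pts, std::size_t lo, std::size_t hi, int depth = 0) {
    if (hi - lo <= 1) return;
    std::size_t mid = lo + (hi - lo) / 2;
    int axis = depth % 2;  // axis follows the depth; nothing stored per node
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Pt& a, const Pt& b) {
                         return a.coord[axis] < b.coord[axis];
                     });
    buildKd(pts, lo, mid, depth + 1);      // left half
    buildKd(pts, mid + 1, hi, depth + 1);  // right half
}
```

Rebuilding after a batch of buffered insertions is then just one call to buildKd over the whole array.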
While it doesn't hold as much for C/C++, in many other languages you will pay quite a price for managing a lot of objects. A type*[] is the cheapest data structure you'll find, and in particular it does not require a lot of management effort. To mark an element as deleted, you can null it, and search both sides when you encounter a null. For insertions, I'd first collect them in a buffer. And when the modification counter reaches a threshold, rebuild.
And that's the whole point of it: if your tree is really cheap to rebuild (as cheap as resorting an almost pre-sorted array!) then it does not harm to frequently rebuild the tree.
Linear scanning over a short "insertion list" is very CPU cache friendly. Skipping nulls is very cheap, too.
If you want a more dynamic structure, I recommend looking at R*-trees. They are actually designed to balance on inserts and deletions, and they organize the data in a disk-oriented block structure. But even for R-trees, there have been reports that keeping an insertion buffer etc. to postpone structural changes improves performance. And bulk loading helps a lot in many situations, too!

Graph memory implementation

The two ways commonly used to represent a graph in memory are an adjacency list and an adjacency matrix. An adjacency list is implemented using an array of pointers to linked lists. Is there any reason that this is faster than using a vector of vectors? I feel like it should make searching and traversals faster, because backtracking would be a lot simpler.
The vector of linked adjacencies is a favorite textbook meme with many variations in practice. Certainly you can use vectors of vectors. What are the differences?
One is that links (doubly linked ones, anyway) allow edges to be easily added and deleted in constant time. This obviously matters only when the edge set shrinks as well as grows. With vectors for edges, any individual operation may require O(k), where k is the incident edge count.
NB: If the order of edges in adjacency lists is unimportant for your application, you can easily get O(1) deletions with vectors. Just copy the last element to the position of the one to be deleted, then delete the last! Alas, there are many cases (e.g. where you're worried about embedding in the plane) when order of adjacencies is important.
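The swap-and-pop trick above can be sketched in a few lines (eraseUnordered is an illustrative name):

```cpp
#include <cstddef>
#include <vector>

// O(1) deletion when adjacency order does not matter: overwrite the
// doomed slot with the last element, then drop the last element.
void eraseUnordered(std::vector<int>& adj, std::size_t i) {
    adj[i] = adj.back();
    adj.pop_back();
}
```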
Even if order must be maintained, you can arrange for copying costs to amortize to an average that is O(1) per operation over many operations. Still in some applications this is not good enough, and it requires "deleted" marks (a reserved vertex number suffices) with compaction performed only when the number of marked deletions is a fixed fraction of the vector. The code is tedious and checking for deleted nodes in all operations adds overhead.
Another difference is space overhead. Adjacency-list nodes are quite small: just a node number. Double links may require 4 times the space of the number itself (if the number is 32 bits and both pointers are 64 bits each). For a very large graph, a space overhead of 400% is not so good.
Finally, linked data structures that are frequently edited over a long period may easily lead to highly non-contiguous memory accesses. This decreases cache performance compared to linear access through vectors. So here the vector wins.
In most applications, the difference is not worth worrying about. Then again, huge graphs are the way of the modern world.
As others have said, it's a good idea to use a generalized List container for the adjacencies, one that may be quickly implemented either with linked nodes or vectors of nodes. E.g. in Java, you'd use List and implement/profile with both LinkedList and ArrayList to see which works best for your application. NB ArrayList compacts the array on every remove. There is no amortization as described above, although adds are amortized.
There are other variations: Suppose you have a very dense graph, where there's a frequent need to search all edges incident to a given node for one with a certain label. Then you want maps for the adjacencies, where the keys are edge labels. Of course the maps can be hashes or trees or skiplists or whatever you like.
The list goes on. How to implement for efficient vertex deletion? As you might expect, there are alternatives here, too, each with advantages and disadvantages.

Hashmap to implement adjacency lists

I've implemented an adjacency list using the vector-of-vectors approach, where the nth element of the outer vector refers to the friend list of node n.
I was wondering if a hash map data structure would be more useful. I still have hesitations, because I simply cannot identify the difference between them: for example, if I would like to run an operation on the nth element's neighbors (search, delete), how could a hash map be more efficient than the vector-of-vectors approach?
A vector<vector<ID>> is a good approach if the set of nodes is fixed. If, however, you suddenly decide to remove a node, you'll be annoyed: you cannot shrink the vector, because that would displace the elements stored after it and you would lose the references. On the other hand, if you keep a list of free (reusable) IDs on the side, you can just "nullify" the slot and reuse it later. Very efficient.
An unordered_map<ID, vector<ID>> allows you to delete nodes much more easily. You can go ahead and assign new IDs to newly created nodes, and you will not be left with empty slots. It is not as compact, especially with collisions, but it's not so bad either. With older compilers there can be some slowdowns on rehashing, when a vector needs to be moved.
Finally, an unordered_multimap<ID, ID> is probably one of the easiest to manage. It also scatters memory to the wind, but hey :)
Personally, I would start prototyping with a unordered_multimap<ID, ID> and switch to another representation only if it proves too slow for my needs.
Note: you can cut the number of stored entries in half if the adjacency relationship is symmetric, by establishing that the relation (x, y) is stored under min(x, y) only.
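A sketch of that normalization, assuming the unordered_multimap<ID, ID> representation above (addEdge and hasEdge are illustrative names):

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>

using ID = int;

// Store each undirected edge (x, y) exactly once, under min(x, y).
void addEdge(std::unordered_multimap<ID, ID>& g, ID x, ID y) {
    if (x > y) std::swap(x, y);
    g.emplace(x, y);
}

// Lookups normalize the pair the same way before probing.
bool hasEdge(const std::unordered_multimap<ID, ID>& g, ID x, ID y) {
    if (x > y) std::swap(x, y);
    auto range = g.equal_range(x);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second == y) return true;
    return false;
}
```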
Vector of vectors
Vector of vectors is a good solution when you don't need to delete edges.
You can add an edge in O(1), and you can iterate over a node's neighbours in O(k), where k is the number of neighbours.
You can delete an edge with vector[node].erase(edge), but it will be slow: O(k) in the node's degree, since the remaining elements must be shifted.
Hash map
I am not sure how you want to use a hash map. If inserting an edge means setting hash_map[edge] = 1, then notice that you are unable to iterate over a node's neighbours.