For my recent project I'm right now looking for an efficient way to structure and store the board information with consideration of the usage for patternmatching.
I'm having a square board, and for pattern matching, I'm using bitfields with 2 bits representing one field of the board. The patterns to match have a diamond shape, that could be centered around any possible field on the board. (so the center is not static, I need to be able to do it for any center)
Example of diamond area around O:
..X..
.XXX.
XXOXX
.XXX.
..X..
If parts of the diamond are outside the the playing area, the bits will be set to 11. The diamonds can have differing radiuses, aboves example would have a radius of 2.
Another important thing for the efficiency of the system is, that I have to be able to quickly rotate/mirror the pattern into all 8 possible symmetries.
For this, it may be beneficial to actually NOT store the information of the central point in the pattern, and as this is not required for my algorithm anyway, this may be a valuable timesaver. Because now some bitshifting magic is possible to quickly rotate/mirror the patterns.
As this kind of patternmatching has to be done at a high frequency, it can prove to be a severe bottleneck of my overall project, when implemented badly.
When trying to get a nice model for doing all this work, I figured, there are 3 important keyareas that require thinking about, but are of course tightly connected.
A. How is the data stored in the board implementation.
Currently this is done in a rather difficult manner, which would be too difficult to read from with such high frequency. But it would be no problem or timeloss to actually store and update the 2 bit data in any possible way for the entire board.
Easiest would be to just store the entire board in an bitset with the size of twice the board, and then each two bit represent the value of a single field. But there is no necessarity for doing it in a special sequence or in only one bitset, even though at first it may look natural to do so.
Anyway, this is the part I'm most felxible about, as this can be done without performance issues in any way it seves the other 2 critical parts of the problem the best.
B. How is the data stored in the pattern.
This is already more difficult. As said, my intention is to store them in a bitset of the appropriate size, but there is he question in what order.
There seem to be two ways, that quickly come to mind:
a) (this could be done with or without the central point C)
...0...
..123..
.45678.
9ABCDEF
.GHIJK.
..LMN..
...O...
b)
...0...
..N14..
.ML235.
KJI.678
.HFC9A.
..GDB..
...E...
If we are just talking about the patterns, b) seems clearly superior. A rotation of the pattern is done by a simple rotateshift (3 bitops total per rotation) and even mirroring the pattern can be done with about a dozen bitops. This kind of operations are much more time consuming with a).
But b) has also some severe drawback... And this leads to:
C. How is the data read from the board implementation to the pattern.
Looking at aboves 2 potential ways to order the pattern bits, now a) is clearly superior. a) can be read by a bunch of bitops from a potential array, as discussed in A. you bitshift each line (getting the line by AND with a bitset nulling all other bits) to the appropriate place and put them together with some OR-operations. Even near the board edges this is done very quick.
Problem of course is, that this would still only get me one possible symmetry of the pattern, but rotations/mirrors are not that easily done. This could be circumvented by saving each pattern to match agaisnt 8 times, but this would look very crude, and may cause troubles elsewhere.
With b) this is much more difficult... Honestly, I don't see a way how it can be done quick, without checking every single bit individually. But when increasing the pattern size (like radius 15) this takes forever, when done very often, especially as the [] operator of bitsets is rather slow.
One possible solution I thought of writing it in CUDA, with each thread generating a pattern around one field, and each block of the thread checking one fixed position around this center. But as I haven't used CUDA before, I don't know how reasonable this is, but if done parallel, this sounds more reasonable than iterating over all positions serially.
As I still didn't find a satisfying solution for the problem, I wanted to ask here, if someone probably knows how it can be done better:
- either rotate/mirror patterns of type a)
- or quickly read pattern of type b) (possibly by arranging the data in a better way in step A., I'm flexible here)
- or if the CUDA idea may actually solve that problem
- or maybe some completely different way, I didn't think of, as I'm sure this has been done before by smarter people
If it matters: I'm coding with VS Pro 2013 and don't mind using boost. If CUDA could solve this effectively, I would also use it.
EDIT:
Okay... So I continued thinking about the whole thing. Maybe there are some other ways to make the whole thing more efficient, by doing some work in more efficient batches.
First of all, what I usually need: On a given board position (and we are talking about 10k positions per second) I need for a large set of positions (every empty field of the board, so most fields) all patterns from size 15 down to size 3. I only need the biggest pattern matched by my database, but in any case, I may often need most of them. So there are 2 things, that could make some time savings possible:
1) some efficient way to use the larger pattern, to generate the pattern one size smaller. This should actually possible, when using the bitordering from b), if it is done the proper way... Then it would only need a few bit ops to cut out the outer ring...
2) As often neighboring fields need their specific pattern, if there would be some way to create their patterns in some sort of batch operation... But I admit, I don't see how this could be done very well... But there may be some time savings.
Oh, and another additional comment, as I had the discussion earlier today with some friend: No it is not an option, t instead of matching the board position against the pattern database, to reverse it and do it the other way around (check if DB pattern matches some board position) I have way too many patterns for that. When doing it the first way, i can just look, if the bitstring exists in my database and be done.
Edit2:
Another Update... First I looked into CUDA, and as it seems incompatible with VS2013, this is a severe blow to that idea. Second I thought about the process how patterns are matched. In fact, it may seem possible, instead of going from the large patterns down to the small ones, doing it in reverse. Now suddenly my pattern library is less of a dictionary but more of a searchtree, as larger patterns certainly have their inner core saved as pattern as well. This should speed up any lookups, but still does not solve my problem of the patterngeneration, sadly.
Edit3:
As I felt, it is more worth of an answer then an edit, I just posted my own new idea (which is different from what I had in mind when posting this question) below.
Okay, as I was thinking about this more and more, I now think, that the following solution may be the best to tackle the problems. This is certainly not final, but my currently best idea. So any criticism is welcome and improvements can surely be made.
As the discussion in the comments led me to the believe, that the approach imagined in the question is not practical for the problem at hand, I now drastically changed my idea. Instead of trying to read the pattern around each empty intersection after each move, I will now update the surrounding pattern of each empty intersection after each move made.
This can be made in an efficient way, as we can use 2 very important features of our patterns:
1) each larger patterns core (so the pattern reduced to a lower radius) will guaranteed to be in the database
2) most patterns will have a rather low radius, and in most cases, not many positions on the board are changed with each move, resulting in not too many positions needing a recheck of their patterns.
My idea is, to store the currently largest pattern, it's radius and it's evaluation with each empty intersection. Now, while a move is made, I generate a list of all positions changed during that move. (usually one) Once the move is finished, I iterate over all empty positions on the board and look at their distance to the closest change. Now we are having 3 possible cases:
a) the distance is smaller or equal the radius of currently matched pattern. Now we have to recheck the pattern.
b) the distance is one bigger then the radius of the currently matched pattern. Now we have to check, if actually a (r+1) size pattern exists, matching the surrounding. If it does, we have to check r+2 etc, until we found the largest.
c) the distance is even bigger: We can keep everything as it is.
As we are having now basically a tree of patterns, with each pattern having lots of child pattern with an incremented radius, it is actually practical to store the pattern information in a series of bitets, each representing a ring of a certain radius around the center.
I hope that this system maximizes the reusability of all the information at hand and is fast enough for my needs. As mentioned before, I welcome criticism and opinions for improvement and if there is not better solution found, will probably implement it in the near future. Once done, I can probably report back on the results.
Related
I'm trying to figure out an efficient algorithm that takes in two QAbstractItemModels (trees) (A,B) and computes the differences between them, such that I get a list of Items that are not present in A (but are in B - added), or items that have been modified / deleted.
The only current way I can think of is doing a Breadth search of A for every item item in B. But this doesn't seem very efficient. Any ideas are welcome.
Have you tried using magic?
Seriously though, this is a very broad question, especially if we consider the fact it is an QAbstractItemModels and not a QAbstractListModel. For a list it would be much simpler, but an abstract item model implements a tree structure so there are a lot of variables.
do you check for total item count
do you check for item count per level
do you check if item is contained in both models
if so, is it contained at the same level
if so, is it contained at the same index
is the item in its original state or has it been modified
You need to make all those considerations and come up with an efficient solution. And don't expect it will as simple as a "by the book algorithm". Good news, since you are dealing with isolated items, it will be easier than trying to do that for text, and in the case of the latter, you can't hope to get anywhere nearly as concise as with isolated items. I've had my fair share of absurdly mindless github diff results.
And just in case that's your actual goal, it will be much easier to achieve by tracking the history of the derived data set than doing a blind comparison. Tracking history is much easier if you want to establish what is added, what is deleted, what is moved and what is modified. Because it will consider the actual event flow rather than just the end result comparison. Especially if you don't have any persistent ID scheme implemented. Is there a way to tell if item X has been deleted or moved to a new level/index and modified and stuff like that.
Also, worry about efficiency only after you have empirically established a performance issue. Some algorithms may seem overly complex, but modern machines are overly fast, and unless you are running that in a tight loop you shouldn't really worry about it. In the end, it doesn't boil down to how complex it is, it boils down to whether it is fast enough or not.
I'm trying to make a game or 3D application using openGL. The game/program will have many objects in them and drawn to the screen(around 7000 of them). When I render them, I would need to calculate the distance between the camera and the object and sort them in order to correctly render the objects within the scene. Knowing this, what is the best way to sort them? I really want the sorting to be done really fast, but I've heard there are "trade off" for them, so what algorithm should I use to get the best performance out of it?
Any help would be greatly appreciated.
Edit: a lot of people are talking about the z-buffer/depth buffer. This doesn't work in some cases like a few people talked about. This is why I asked this question.
Sorting by distance doesn't solve the transparency problem perfectly. Consider the situation where two transparent surfaces intersect and each has a part which is closer to you. Perhaps rare in games, but still something to consider if you don't want an occasional glitched look to your renderer.
The better solution is order-independent transparency. With the latest graphics hardware supporting atomic operations, you can use an A-buffer to do this with little memory overhead and in a single pass so it is pretty efficient. See for example this article.
The issue of sorting your scene is still a valid one, though, even if it isn't for transparency -- it is still useful to sort opaque objects front to back to to allow depth testing to discard unseen fragments. For this, Vaughn provided the great solution of BSP trees -- these have been used for this purpose for as long as 3D games have been around.
Use http://en.wikipedia.org/wiki/Insertion_sort which has O(n) complexity for nearly sorted arrrays.
In your case by exploiting temporal cohesion insertion sort gives fastest results.
It is used for http://en.wikipedia.org/wiki/Sweep_and_prune
From link above:
In many applications, the configuration of physical bodies from one time step to the next changes very little. Many of the objects may not move at all. Algorithms have been designed so that the calculations done in a preceding time step can be reused in the current time step, resulting in faster completion of the calculation.
So in such cases insertion sort is best(or similar sorts with O(n) at best case)
I've been struggling with this problem just like everyone else and I'm quite sure there has been more than enough posts to explain this problem. However in terms of understanding it fully, I wanted to share my thoughts and get more efficient solutions from all the great people in here related to Subset Sum problem.
I've searched it over the Internet and there is actually a lot sources but I'm really willing to re-implement an algorithm or finding my own in order to understand fully.
The key thing I'm struggling with is the efficiency considering the set size will be large. (I do not have a limit, just conceptually large). The two phases I'm trying to implement ideas on is finding two numbers that are equal to given integer T, finding three numbers and eventually K numbers. Some ideas I've though;
For the two integer part I'm thing basically sorting the array O(nlogn) and for each element in the array searching for its negative value. (i.e if the array element is 3 searching for -3). Maybe a hash table inclusion could be better, providing a O(1) indexing the element?
For the three or more integers I've found an amazing blog post;http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/. However even the author itself states that it is not applicable for large numbers.
So I was for 2 and 3 and more integers what ideas could be applied for the subset problem. I'm struggling with setting up a dynamic programming method that will be efficient for the large inputs as well.
That blog post you linked to looked pretty great, actually. This is, after all, an NP-complete problem...
But I bet you could speed it up even further. I haven't done any benchmarks, but I'm gonna guess that his use of a matrix is his single biggest time sink. First, it'll take a huge amount of memory for some really trivial inputs (For example: [-1000, 1000] will need 2001 columns! Good grief!), and then you're wasting a ton of cycles scanning through each row looking for "T"s, which are often gonna be pretty sparse.
So instead: Use a "set" data structure. That'll keep space and iteration time to a minimum,* but store values just as well: If it's in the set, it's a "T"; otherwise, it's an "F".
Hope that helps!
*: Of course, "minimum" doesn't necessarily = "small."
Intuitively, it would seem that given a dozen or so 2d images from different angles of almost any object, it should be easy to construct a 3d representation of that object. Subsequently a library of 3d representations attained in this way could be used to identify new 2d images.
What literature is there along these lines, and why has it not yet produced strong object recognition?
It is your word "intuitively" that is causing you trouble there. Your brain is not designed to be very good at certain tasks, like multiplying thousands of numbers in an instant. However for raw computational power your brain makes the fastest computer look like mere tiddly-winks (neural response time of only about 10 milliseconds, but all those 10^14 or so neurons all working in parallel totally beats any modern machine). Its just that your brain is designed to solve problems that are intensely more computationally complex, like recognizing objects in a picture, parsing sound data and picking out individual speakers amidst background noise. Learning to classify and deal with tens of thousands of types of objects.
The incredibly computationally intense things your brain is designed to do really well are the things that, to a person, seem "intuitive". The things it isn't designed to do really well seem "unintuitive" or difficult. But the raw computation needed for strong object recognition (because there are just so MANY kinds of objects, many of which really have subobjects, and multiple classifications, and non-rigid forms, e.g. "trousers", "water", "dog") is WAY more than what is needed accomplish things one considers only possible for a computer. Things like using "common sense" to solve an every day problem are similarly trivial for a person, but computationally incredibly complex.
What you want to do is indeed possible, but (there are quite a few buts)
for the 3D reconstruction:
For anything but the simplest shapes you need more than just a few dozen images.
The shape you are reconstructing needs to have a lot of recognizable features that look similar enough from different angles so that you can match them.
Lighting needs to be fairly constant over your entire set of images, otherwise shadows will throw you off (or you need even more images)
even with very feature rich objects (i.e. lot of variation in colour and shape) 3D reconstruction accuracy from any matched pair of features is going to be terrible if you do not have full knowledge of the parameters (position, view direction and opening angle) of the camera used to take each picture.
These are all problems can be solved, so suppose you did, and now you have a new picture from the object that you want to match to your 3D shape.
You could of course try to find a 2D projection of your shape that fit the new picture, but the search space there is enormous. It would probably be a lot easier and faster to use the feature finding and matching system you built for the initial 3D reconstruction to directly match the new picture to the existing set, and find where it fits on the object that way.
So once you've solved the problem of creating the initial 3D reconstruction your second step is basically done as well.
Photosynth is a brilliant example of these two steps. Browse the site, try to find some of the references they have there.
As for your final step, strong object recognition, just imagine the search space! What you need for strong object recognition, apart from a good representation of the objects you want to recognize, is a good way to search the space of objects you know, and a good way to represent your new object (the image of an object in this case) in that space. This is something I know nearly nothing about.
For just matching the same object in different 2D images there are SIFT features. But I don't think this translates well to 3D.
Note that what you're describing is instance recognition. Computer can indeed do a good job of instance recognition these days. For example, Google Goggles is very good at recognizing landmarks like the Golden Gate Bridge and Eiffel Tower.
However, computers are less good at doing category recognition and classification. Creating dozens of 2D snapshots for all possible objects under all types of lighting conditions etc. becomes intractable very quickly. The fact that certain objects such as a dog can move around makes the space of possibilities even bigger. Computers become much worse at this.
Also, from the biological standpoint, our visual field is around 100 million pixels. Graphics cards have only now started to become capable of rendering that much data in real-time. Making sense of that much data is even more computationally intensive.
One often talks about having a machine reach a 5 year old's ability to process information. But let's think about how much data that is. 100 million pixels with 3 color channels and 1 byte per pixel = 300MB/s. Now multiply that by 30 frames per second, 31,556,926 seconds per year, and 5 years, you end up with roughly 1.4 exabytes (1.4x10^18).
I'm developing a Mahjong-solitaire solver and so far, I'm doing pretty good. However,
it is not so fast as I would like it to be so I'm asking for any additional optimization
techniques you guys might know of.
All the tiles are known from the layouts, but the solution isn't. At the moment, I have few
rules which guarantee safe removal of certain pairs of same tiles (which cannot be an obstacle to possible solution).
For clarity, a tile is free when it can be picked any time and tile is loose, when it doesn't bound any other tiles at all.
If there's four free free tiles available, remove them immediately.
If there's three tiles that can be picked up and at least one of them is a loose tile, remove the non-loose ones.
If there's three tiles that can be picked up and only one free tile (two looses), remove the free and one random loose.
If there's three loose tiles available, remove two of them (doesn't matter which ones).
Since there is four times the exact same tile, if two of them are left, remove them since they're the only ones left.
My algorithm searches solution in multiple threads recursively. Once a branch is finished (to a position where there is no more moves) and it didn't lead to a solution, it puts the position in a vector containing bad ones. Now, every time a new branch is launched it'll iterate via the bad positions to check, if that particular position has been already checked.
This process continues until solution is found or all possible positions are being checked.
This works nicely on a layout which contains, say, 36 or 72 tiles. But when there's more,
this algorithm becomes pretty much useless due to huge amount of positions to search from.
So, I ask you once more, if any of you have good ideas how to implement more rules for safe tile-removal or any other particular speed-up regarding the algorithm.
Very best regards,
nhaa123
I don't completely understand how your solver works. When you have a choice of moves, how do you decide which possibility to explore first?
If you pick an arbitrary one, it's not good enough - it's just brute search, basically. You might need to explore the "better branches" first. To determine which branches are "better", you need a heuristic function that evaluates a position. Then, you can use one of popular heuristic search algorithms. Check these:
A* search
beam search
(Google is your friend)
Some years ago, I wrote a program that solves Solitaire Mahjongg boards by peeking. I used it to examine one million turtles (took a day or something on half a computer: it had two cores) and it appeared that about 2.96 percent of them cannot be solved.
http://www.math.ru.nl/~debondt/mjsolver.html
Ok, that was not what you asked, but you might have a look at the code to find some pruning heuristics in it that did not cross your mind thus far. The program does not use more than a few megabytes of memory.
Instead of a vector containing the "bad" positions, use a set which has a constant lookup time instead of a linear one.
If 4 Tiles are visible but can not be picked up, the other Tiles around have to be removed first. The Path should use your Rules to remove a minimum of Tiles, towards these Tiles, to open them up.
If Tiles are hidden by other Tiles, the Problem has no full Information to find a Path and a Probability of remaining Tiles needs to be calculated.
Very nice Problem!