I'm interested in creating programs, such as games, that learn while playing and save the information for later use. For example, a tic-tac-toe game where the program saves every game it won or lost, builds a tree of some kind that stores the information about each game, and writes it to a file when the program quits. The problem I'm having is how to save a tree to a file efficiently. Any suggestions?
Thanks in advance.
(I'm programming in C++.)
One option is to use Boost Property Tree, which can help you load and store tree structures in XML files (several other formats are supported as well). The library is well documented, and you can find many examples directly on its website.
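As a minimal sketch of what that could look like (the node names such as games.game.moves are made up for this example, not anything the library prescribes):

    #include <boost/property_tree/ptree.hpp>
    #include <boost/property_tree/xml_parser.hpp>

    int main() {
        namespace pt = boost::property_tree;
        pt::ptree tree;

        // Record one hypothetical game: its outcome and the moves played.
        tree.put("games.game.result", "win");
        tree.add("games.game.moves.move", 4);
        tree.add("games.game.moves.move", 0);
        tree.add("games.game.moves.move", 8);

        pt::write_xml("games.xml", tree);    // save when the program quits

        pt::ptree loaded;
        pt::read_xml("games.xml", loaded);   // reload on the next startup
        return 0;
    }

Repeated add() calls with the same path create sibling elements, which is how ptree represents lists of moves.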
What you're looking for is also called serialization. There are quite a few hits here on Stack Overflow for serializing trees in C++. I suggest you look into JSON and YAML as possible formats; there are multiple C++ libraries for both.
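For instance, with nlohmann/json (one popular JSON library; the field names here are invented for the sketch), dumping a nested structure is nearly a one-liner:

    #include <fstream>
    #include <nlohmann/json.hpp>

    int main() {
        using nlohmann::json;

        // Nested objects and arrays map naturally onto a game tree.
        json game;
        game["result"] = "win";
        game["moves"] = {4, 0, 8};

        std::ofstream("games.json") << game.dump(2);   // pretty-printed save

        json loaded;
        std::ifstream in("games.json");
        in >> loaded;                                  // reload
        return 0;
    }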
I think I'd try nogard's answer first (I've upvoted it). However, if you find that you don't get much value out of the runtime support (i.e. the tree in memory) and it really only serves as a serialization/deserialization mechanism, I'd suggest trying pugixml. It's a very complete and performant XML library that is simple to integrate.
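A rough sketch of round-tripping a tree with pugixml (the element and attribute names are hypothetical):

    #include <pugixml.hpp>

    int main() {
        pugi::xml_document doc;

        // Build a small tree: one game with its result and a move.
        pugi::xml_node game = doc.append_child("game");
        game.append_attribute("result") = "win";
        game.append_child("move").append_attribute("square") = 4;

        doc.save_file("games.xml");         // serialize to disk

        pugi::xml_document loaded;
        loaded.load_file("games.xml");      // deserialize on the next run
        return 0;
    }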
Furthermore, to aid with the in-memory part, I've had good success simply composing trees out of STL containers (it's simpler than it seems; see the sketch below). I've found that different trees have different requirements (i.e. pointer to parent, no pointer to parent, etc.). If the tree has a large number of nodes, those per-node pointers add up, and if the way you iterate through your tree never uses them, they're just waste.
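A minimal sketch of that idea, with no parent pointers since this traversal never looks "up":

    #include <iostream>
    #include <string>
    #include <vector>

    // A tree composed purely of STL containers: each node owns its children.
    struct Node {
        std::string value;
        std::vector<Node> children;
    };

    // Depth-first traversal only ever walks downward, so no parent pointer is stored.
    void print(const Node& n, int depth = 0) {
        std::cout << std::string(depth * 2, ' ') << n.value << '\n';
        for (const Node& c : n.children)
            print(c, depth + 1);
    }

    int main() {
        Node root{"root", {{"win", {}}, {"loss", {{"draw-line", {}}}}}};
        print(root);
        return 0;
    }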
If your tree is an STL container such as a map, you can create a functor that serializes each element's data to a string, or directly to a file, and then walk the tree using an algorithm like for_each, which will call the functor for you.
That part is simple. Re-creating the tree is a matter of reading the elements from the file, turning them back into objects, and adding them to a new tree structure. You'll need to keep the keys of the tree, not just the values.
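A sketch of that round trip, using a lambda in place of a hand-written functor (the key/value types are arbitrary examples):

    #include <algorithm>
    #include <fstream>
    #include <map>
    #include <string>

    int main() {
        std::map<std::string, int> tree = {{"wins", 10}, {"losses", 3}};

        // Walk the tree; the lambda serializes one element per line.
        std::ofstream out("tree.txt");
        std::for_each(tree.begin(), tree.end(),
                      [&out](const std::pair<const std::string, int>& e) {
                          out << e.first << ' ' << e.second << '\n';  // key, then value
                      });
        out.close();

        // Re-create: read keys and values back into a fresh map.
        std::map<std::string, int> rebuilt;
        std::ifstream in("tree.txt");
        std::string key;
        int value;
        while (in >> key >> value)
            rebuilt[key] = value;
        return rebuilt == tree ? 0 : 1;
    }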
I have a list of unsigned shorts that act as local IDs for a database. I was wondering what the most memory-efficient way to store the allowed IDs is. Over the lifetime of my project the allowed-ID list will be dynamic, so the set of allowed IDs will grow and shrink over time, ranging anywhere from none allowed to all allowed.
What would be the best method to store these? I've considered the following:
1. A list of allowed IDs
2. A bool vector/array of true/false per ID
3. A byte array that can be iterated through, similar to option 2
Let me know which of these would be best or if another, better method exists.
Thanks.
EDIT: If possible, can a vector have a value put at, say, index 1234 without storing all the preceding values, or would that suit a map or similar type better?
I'm looking at using an Arduino with 2 KB of total RAM and using external storage to help manage a large block of data, but I'm exploring what my options are.
"Best" is opinion-based, unless you are aiming for memory efficiency at the expense of all other considerations. Is that really what you want?
First of all, I hope we're talking <vector> here, not <list>, because a std::list<short> would already be quite wasteful.
What is the possible value range of those IDs? Do they use the full range of 0..USHRT_MAX, or is there, e.g., a high bit you could use to mark the allowed ones?
If that doesn't work, or you are willing to sacrifice a bit of space (no pun intended) for a somewhat cleaner implementation, go for a vector partitioned into allowed IDs first, disallowed second. To check whether a given ID is allowed, find it in the vector and compare its position against the cut-off iterator you got from the partitioning (see the sketch below). That would be the most memory-efficient standard-container solution, and quite close to a memory-optimal solution either way. You would, however, need to re-shuffle and update the cut-off iterator whenever the "allowedness" of an entry changes.
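A sketch of the partition approach (the initial allowed-set rule is a made-up placeholder):

    #include <algorithm>
    #include <vector>

    int main() {
        // All known IDs; after partitioning, allowed ones come first.
        std::vector<unsigned short> ids = {7, 42, 3, 99, 15};
        auto allowed_rule = [](unsigned short id) { return id < 50; };  // hypothetical

        auto cutoff = std::partition(ids.begin(), ids.end(), allowed_rule);

        // Membership test: locate the ID and compare against the cut-off.
        auto is_allowed = [&](unsigned short id) {
            auto it = std::find(ids.begin(), ids.end(), id);
            return it != ids.end() && it < cutoff;
        };

        return is_allowed(42) && !is_allowed(99) ? 0 : 1;
    }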
One suitable data structure for your problem is a trie (string tree) that holds your allowed or disallowed IDs.
You can treat an ID's binary representation as the string. A trie is a compact way to store the IDs (memory-wise), and lookup time is bounded by the longest ID length (in your case a constant 16 bits).
I'm not familiar with a standard-library C++ implementation, but if efficiency is crucial you can find an existing implementation or implement it yourself.
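For illustration, a bare-bones binary trie over 16-bit IDs, one bit consumed per level (a from-scratch sketch, not any particular library):

    #include <cstdint>
    #include <memory>

    // Binary trie over 16-bit IDs: each level consumes one bit of the ID.
    struct TrieNode {
        std::unique_ptr<TrieNode> child[2];
    };

    void insert(TrieNode& root, uint16_t id) {
        TrieNode* node = &root;
        for (int bit = 15; bit >= 0; --bit) {
            int b = (id >> bit) & 1;
            if (!node->child[b])
                node->child[b] = std::make_unique<TrieNode>();
            node = node->child[b].get();
        }
    }

    bool contains(const TrieNode& root, uint16_t id) {
        const TrieNode* node = &root;
        for (int bit = 15; bit >= 0; --bit) {
            node = node->child[(id >> bit) & 1].get();
            if (!node) return false;
        }
        return true;
    }

    int main() {
        TrieNode root;
        insert(root, 1234);
        return contains(root, 1234) && !contains(root, 4321) ? 0 : 1;
    }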
Fact: my code is in C++.
Question:
I have a list of unsigned long longs which I use as a representation for some objects, and the position in the list is the ID of the object.
Each unsigned long long is a sort of bitmask whose bit j marks whether the object has component j. So, say I have

    OBJ = (1 << 1) | (1 << 3)

which means the object has components 1 and 3.
I want to be fast when I insert, erase, and retrieve objects from the list.
Retrieval is usually performed by components, so I will look for objects with the same components.
I was using std::vector, but as soon as I started thinking about performance it no longer seemed the best choice: every time I delete an object from the list, all the objects after it get shifted (and erasure can be really frequent); moreover, as soon as the underlying array is full, the vector allocates a new, bigger array and copies all the elements into it.
I was thinking of an "efficientArray", which is a simple array, except that each time an element is removed I "mark" its position as free and store that position in a list of available positions; whenever I have to add a new element, I store it in the first available position (see the sketch below).
This way all the elements are stored in a contiguous area, as with a vector, but I avoid the "erase" problem.
I'm still not avoiding the "resize" problem (maybe I can't), and objects with the same components will not necessarily be close together.
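A sketch of that "efficientArray" idea, a slot vector plus a free list. Treating mask 0 as the "empty slot" marker is an assumption of this sketch; it clashes with objects that genuinely have no components, so a real version might keep a separate occupancy flag:

    #include <cstdint>
    #include <vector>

    // Slot vector plus free list: erase marks a slot free instead of shifting.
    class SlotArray {
        std::vector<uint64_t> slots;      // one component mask per object ID
        std::vector<size_t>   free_list;  // indices of erased slots, ready for reuse
    public:
        size_t insert(uint64_t mask) {
            if (!free_list.empty()) {     // reuse a freed slot first
                size_t idx = free_list.back();
                free_list.pop_back();
                slots[idx] = mask;
                return idx;
            }
            slots.push_back(mask);        // otherwise append at the end
            return slots.size() - 1;
        }
        void erase(size_t idx) {
            slots[idx] = 0;               // sketch assumption: 0 means "empty slot"
            free_list.push_back(idx);
        }
        // Retrieval by components: IDs of objects having every bit in `query`.
        std::vector<size_t> with_components(uint64_t query) const {
            std::vector<size_t> out;
            for (size_t i = 0; i < slots.size(); ++i)
                if (slots[i] != 0 && (slots[i] & query) == query)
                    out.push_back(i);
            return out;
        }
    };

    int main() {
        SlotArray objects;
        size_t a = objects.insert((1ull << 1) | (1ull << 3));  // components 1 and 3
        size_t b = objects.insert(1ull << 3);                  // component 3 only
        objects.erase(b);
        return objects.with_components(1ull << 3) == std::vector<size_t>{a} ? 0 : 1;
    }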
Are there other ideas/structures which I could use to get better performance?
Am I wrong to think that I want "similar" objects to be close together?
Thanks!
EDIT
Sorry, maybe the title and the question were not written well. I know vector is efficient, and I don't want to write a better vector. Since I'm learning, I would like to understand whether vector IN THIS CASE is good or bad and why; whether what I was thinking is bad and why; and whether there are better solutions and data structures (a tree? a map?), and if so, why. I also asked whether it is convenient to keep "similar" objects close together and whether that MAYBE can influence things like branch prediction or something else (no answer about that), or whether it is just nonsense. I just want to learn; even "wrong" answers can help me and others learn something, but it seems it was taken as a bad idea, as if I had asked *"my compiler works even if I write char ** which is wrong"* and I didn't understand why.
I recommend using either std::set or std::map. You want to know whether an item exists in a container, and both std::set and std::map have good performance for searches and lookups.
You could use std::bitset and assign each object an ID that is a power of 2.
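If I read this as keeping one std::bitset per object for its component flags (each flag a distinct power of 2), a sketch might look like:

    #include <bitset>

    int main() {
        // One bitset per object; bit j set means "object has component j".
        std::bitset<64> obj;
        obj.set(1);   // component 1
        obj.set(3);   // component 3

        std::bitset<64> query;
        query.set(3);

        // Does the object have every component in the query?
        bool match = (obj & query) == query;   // true here
        return match ? 0 : 1;
    }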
In any case, you need to profile. Profile your code without changes. Profile your code using different data structures. Choose the data structure with the best performance.
Some timing for different structures can be read here.
The problem with lists is that you're always chasing the next link, and each link is potentially a cache miss (and maybe a TLB miss on top of that).
The vector, on the other hand, incurs few cache misses, and the hardware prefetcher works optimally for this data structure.
If the data were much larger, the results would not be so clear-cut.
There are several open-source implementations of conditional random fields (CRFs) in C++, such as CRF++, FlexCRF, etc. But from the manuals I can only understand how to use them for 1-D problems such as text tagging; it's not clear how to apply them to 2-D vision problems, supposing I have already computed the association potentials at each node and the interaction potentials at each edge.
Has anyone used these packages for vision problems, e.g. segmentation? Or can they simply not be used in this way?
All in all, are there any open-source CRF packages for vision problems?
Thanks a lot!
The newest version of dlib has support for learning pairwise Markov random field models over arbitrary graph structures (including 2-D grids). It estimates the parameters in a max-margin sense (i.e. using a structural SVM) rather than in a maximum likelihood sense (i.e. CRF), but if all you want to do is predict a graph labeling then either method is just as good.
There is an example program that shows how to use this stuff on a simple example graph. The example puts feature vectors at each node and the structured SVM uses them to learn how to correctly label the nodes in the graph. Note that you can change the dimensionality of the feature vectors by modifying the typedefs at the top of the file. Also, if you already have a complete model and just want to find the most probable labeling then you can call the underlying min-cut based inference routine directly.
In general, I would say that the best way to approach these problems is to define the graphical model you want to use and then select a parameter learning method that works with it. So in this case I imagine you are interested in some kind of pairwise Markov random field model. In particular, the kind of model where the most probable assignment can be found with a min-cut/max-flow algorithm. Then in this case, it turns out that a structural SVM is a natural way to find the parameters of the model since a structural SVM only requires the ability to find maximum probability assignments. Finding the parameters via maximum likelihood (i.e. treating this as a CRF) would require you to additionally have some way to compute sums over the graph variables, but this is pretty hard with these kinds of models. For this kind of model, all the CRF methods I know about are approximations, while the SVM method in dlib uses an exact solver. By that I mean, one of the parameters of the algorithm is an epsilon value that says "run until you find the optimal parameters to within epsilon accuracy", and the algorithm can do this efficiently every time.
There was a good tutorial on this topic at this year's Computer Vision and Pattern Recognition (CVPR) conference. There is also a good book on Structured Prediction and Learning in Computer Vision written by the presenters.
I have a school project to build an AI for a 2D racing game in which it will compete with several other AIs.
We are given a black-and-white bitmap image of the racing track, and we are allowed to choose basic stats for our car (handling, acceleration, max speed, and brakes) after we receive the map. The AI connects to the game's server and, several times a second, sends it numbers for the current acceleration and steering. The language I chose is C++, by the way.
The question is
What is the best strategy or algorithm (since I want to try to win)? I currently have in mind some ideas found on the net and one or two of my own, but before I start coding I would like to make sure my approach is among the best.
What good books are there on that matter?
What sites should I refer to?
There's no "right answer" for this problem - it's pretty open-ended and many different options might work out.
You may want to look into reinforcement learning as a way of trying to get the AI to best determine how to control the car once it's picked the different control statistics. Reinforcement learning models can train the computer to try to work toward a good system for making particular maneuvers in terms of the underlying control system.
To determine what controls you'll want to use, you could use some flavor of reinforcement learning, or you may want to investigate supervised learning algorithms that can play around with different combinations of controls and see how good of a "fit" they give for the particular map. For example, you might break the map apart into small blocks, then try seeing what controls do well in the greatest number of blocks.
In terms of plotting out the path you'll want to take, A* is a well-known algorithm for finding shortest paths (a sketch follows below). In your case I'm not sure how useful it will be, but it's the textbook informed-search algorithm.
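To make that concrete, here is a minimal, unoptimized A* sketch over a black-and-white track bitmap. The grid indexing, Manhattan heuristic, and 4x4 demo grid are all assumptions of the sketch, not anything from the game:

    #include <climits>
    #include <cstdlib>
    #include <functional>
    #include <initializer_list>
    #include <queue>
    #include <vector>

    // A node in the open set: f = g + heuristic, idx = cell index in the bitmap.
    struct Node { int f, idx; };
    bool operator>(Node a, Node b) { return a.f > b.f; }

    // Returns the path goal -> start as cell indices (4-connected grid).
    std::vector<int> a_star(const std::vector<bool>& track, int W, int H,
                            int start, int goal) {
        auto heur = [&](int i) {   // Manhattan distance to the goal cell
            return std::abs(i % W - goal % W) + std::abs(i / W - goal / W);
        };
        std::vector<int> g(W * H, INT_MAX), parent(W * H, -1);
        std::priority_queue<Node, std::vector<Node>, std::greater<Node>> open;
        g[start] = 0;
        open.push({heur(start), start});
        while (!open.empty()) {
            int cur = open.top().idx;
            open.pop();
            if (cur == goal) break;
            for (int d : {1, -1, W, -W}) {             // right, left, down, up
                int nxt = cur + d;
                if (nxt < 0 || nxt >= W * H || !track[nxt]) continue;
                if ((d == 1 && nxt % W == 0) || (d == -1 && cur % W == 0)) continue;
                if (g[cur] + 1 < g[nxt]) {
                    g[nxt] = g[cur] + 1;
                    parent[nxt] = cur;
                    open.push({g[nxt] + heur(nxt), nxt});
                }
            }
        }
        std::vector<int> path;
        for (int i = goal; i != -1; i = parent[i])     // walk parents back to start
            path.push_back(i);
        return path;
    }

    int main() {
        std::vector<bool> track(16, true);             // hypothetical 4x4 drivable grid
        std::vector<int> path = a_star(track, 4, 4, 0, 15);
        return path.back() == 0 ? 0 : 1;               // path should end at the start cell
    }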
For avoiding opponent racers and trying to drive them into trickier situations, you may need to develop some sort of opponent modeling system. Universal portfolios are one way to do this, though I'm not sure how useful they'll be in this instance. One option might be to develop a potential field around the track and opponent cars to help your car try to avoid obstacles; this may actually be a better choice than A* for pathfinding. If you're interested in tactical maneuvers, a straightforward minimax search may be a good way to avoid getting trapped or to find ways to trap opponents.
I am no AI expert, but I think the above links might be a good starting point. Best of luck with the competition!
What good books are there on that matter?
The best book I have read on this subject is "Programming Game AI by Example" by Mat Buckland. It has chapters on both path planning and steering behaviors, and much more (state machines, graph theory, the list goes on).
All the solutions above are good, and people have gone to great lengths to test them. Look up "Togelius and Lucas" or "Loiacono and Lanzi". They have tried things like neuroevolution, imitation (done via reinforcement learning), force fields, etc. From my point of view the best way to go is a center-line follower; that will take an hour to implement. In contrast, neuroevolution (for example) is neither easy nor fast. I did my dissertation on that, and it can easily take several months full-time, if you have the right hardware.
I need to map primitive keys (int, maybe long) to struct values in a high-performance hash map data structure.
My program will have a few hundred of these maps, and each map will generally have at most a few thousand entries. However, the maps will be "refreshing" or "churning" constantly; imagine processing millions of add and delete messages a second.
What libraries in C or C++ have a data structure that fits this use case? Or, how would you recommend building your own? Thanks!
I would recommend you try Google SparseHash (or the C++11 version, Google SparseHash-c11) and see if it suits your needs. It has a memory-efficient implementation as well as one optimized for speed.
I did a benchmark a long time ago; it was the best hashtable implementation available in terms of speed (though with drawbacks).
What libraries in C or C++ have a data structure that fits this use case? Or, how would you recommend building your own? Thanks!
Check out the LGPL'd Judy arrays. I've never used them myself, but they've been advertised to me on a few occasions.
You can also try benchmarking STL containers (std::unordered_map, etc.). Depending on the platform/implementation and source-code tuning (preallocate as much as you can; dynamic memory management is expensive), they could be performant enough.
Also, if the performance of the final solution trumps its cost, you can try ordering the system with enough RAM to put everything into plain arrays. The performance of access by index is unbeatable.
The add/delete operations are much (100x) more frequent than the get operation.
That hints that you might want to concentrate on improving algorithms first. If data are only written, not read, then why write them at all?
Just use boost::unordered_map (or tr1, etc.) by default. Then profile your code and see whether that code is the bottleneck. Only then would I suggest precisely analyzing your requirements to find a faster substitute.
If you have a multithreaded program, you can find some useful hash tables in the Intel Threading Building Blocks (TBB) library. For example, tbb::concurrent_unordered_map has the same API as std::unordered_map, but its main functions are thread-safe.
Also have a look at Facebook's folly library; it has a high-performance concurrent hash table and a skip list.
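A small sketch of the TBB map under concurrent insertion (the Value struct is a stand-in for your own; note that erase is one of the operations concurrent_unordered_map does not make concurrency-safe):

    #include <tbb/concurrent_unordered_map.h>
    #include <thread>

    struct Value { int payload; };   // stand-in for your struct values

    int main() {
        tbb::concurrent_unordered_map<int, Value> map;

        // Two threads inserting at the same time, no external locking needed.
        std::thread a([&] { for (int i = 0;    i < 1000; ++i) map.insert({i, Value{i}}); });
        std::thread b([&] { for (int i = 1000; i < 2000; ++i) map.insert({i, Value{i}}); });
        a.join();
        b.join();

        auto it = map.find(42);      // lookups are safe alongside inserts
        return it != map.end() ? 0 : 1;
    }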
khash is very efficient. The author published a detailed benchmark here: https://attractivechaos.wordpress.com/2008/10/07/another-look-at-my-old-benchmark/ — it shows khash beating many other hash libraries.
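Basic khash usage looks like this (the Value struct and the map name i2v are arbitrary; khash is a C header that also compiles as C++):

    #include <stdlib.h>
    #include "khash.h"

    typedef struct { int payload; } Value;   /* stand-in for your struct values */

    /* Instantiate a hash map type with int keys and Value values. */
    KHASH_MAP_INIT_INT(i2v, Value)

    int main(void) {
        int ret;
        khash_t(i2v) *h = kh_init(i2v);

        khiter_t k = kh_put(i2v, h, 42, &ret);   /* insert key 42 */
        kh_value(h, k).payload = 7;

        k = kh_get(i2v, h, 42);                  /* lookup */
        if (k != kh_end(h))
            kh_del(i2v, h, k);                   /* delete */

        kh_destroy(i2v, h);
        return 0;
    }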
From the Android sources (thus Apache 2 licensed):
https://github.com/CyanogenMod/android_system_core/tree/ics/libcutils
Look at hashmap.c and include/cutils/hashmap.h; if you don't need thread safety, you can remove the mutex code. A sample implementation is in libcutils/str_parms.c.
First check whether an existing solution like libmemcache fits your needs.
If not...
Hash maps seem to be the definite answer to your requirement. They provide O(1) lookup based on the key. Most STL libraries provide some sort of hash these days, so use the one provided by your platform.
Once that part is done, you have to test the solution to see whether the default hashing algorithm is good enough for your needs performance-wise.
If it is not, you should explore some of the good, fast hashing algorithms found on the net:
good old prime number multiply algo
http://www.azillionmonkeys.com/qed/hash.html
http://burtleburtle.net/bob/
http://code.google.com/p/google-sparsehash/
If that is not good enough, you could roll your own hashing module that fixes the problems you saw with the STL containers you tested, using one of the hashing algorithms above. Be sure to post the results somewhere.
Oh, and it's interesting that you have multiple maps... perhaps you can simplify by making your key a 64-bit number, with the high bits used to distinguish which map it belongs to, and adding all key/value pairs to one giant hash (see the sketch below). I have seen hashes with a hundred thousand or so symbols working perfectly well on the basic prime-number hashing algorithm.
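A sketch of that key-packing idea (std::unordered_map stands in for whatever table you end up choosing):

    #include <cstdint>
    #include <unordered_map>

    struct Value { int payload; };   // stand-in for your struct values

    // Pack the map's index into the high 32 bits and the key into the low 32.
    uint64_t make_key(uint32_t map_id, uint32_t key) {
        return (static_cast<uint64_t>(map_id) << 32) | key;
    }

    int main() {
        std::unordered_map<uint64_t, Value> giant;   // one hash for all the maps
        giant[make_key(3, 12345)] = Value{42};       // "map 3", key 12345
        auto it = giant.find(make_key(3, 12345));
        return it != giant.end() ? 0 : 1;
    }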
You can check how that solution performs compared to hundreds of maps; I think it could be better from a memory-profiling point of view. Please do post the results somewhere if you get to do this exercise.
I believe that, more than the hashing algorithm, it could be the constant allocation/deallocation of memory (can it be avoided?) and the CPU cache-usage profile that are crucial to your application's performance.
Good luck!
Try hash tables from Miscellaneous Container Templates. Its closed_hash_map is about the same speed as Google's dense_hash_map, but is easier to use (no restriction on contained values) and has some other perks as well.
I would suggest uthash. Just #include "uthash.h", add a UT_hash_handle to the structure, and choose one or more fields in your structure to act as the key. A word about performance here.
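A minimal sketch of that setup (field names are arbitrary; the malloc cast keeps it valid as C++ too):

    #include <stdlib.h>
    #include "uthash.h"

    struct entry {
        int id;                /* the key field */
        int payload;           /* your struct's data */
        UT_hash_handle hh;     /* makes this structure hashable */
    };

    int main(void) {
        struct entry *table = NULL;  /* the hash is just a NULL-initialized pointer */

        struct entry *e = (struct entry*)malloc(sizeof *e);
        e->id = 42;
        e->payload = 7;
        HASH_ADD_INT(table, id, e);          /* insert, keyed on the `id` field */

        int key = 42;
        struct entry *found = NULL;
        HASH_FIND_INT(table, &key, found);   /* lookup */

        if (found)
            HASH_DEL(table, found);          /* delete (does not free the struct) */
        free(e);
        return 0;
    }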
http://incise.org/hash-table-benchmarks.html — GCC has a very, very good implementation. However, mind that it must respect a very bad standard decision:
"If a rehash happens, all iterators are invalidated, but references and pointers to individual elements remain valid. If no actual rehash happens, no changes."
http://www.cplusplus.com/reference/unordered_map/unordered_map/rehash/
This basically means the standard requires the implementation to be based on linked lists (separate chaining).
It prevents open addressing, which has better performance.
I think Google's sparse hash uses open addressing, though in these benchmarks only the dense version outperforms the competition.
However, the sparse version outperforms all the competition in memory usage. (It also doesn't have any plateau: memory is a pure straight line with respect to the number of elements.)