Is there an easy way to visualize non-linear (non-contiguous) data structures, such as linked lists or binary trees, of a program in the GNU Debugger (GDB)?
To visualize an array of structures we can simply use:
print *array@len
If more indirection is needed we can also use:
print **array@len
But the above works only for linear data structures like arrays.
Please let me know if non-linear (non-contiguous) data structures like linked lists or binary trees can be visualized in a similar way. Thanks in advance.
You might give ddd a try. It'll even create fancy maps of your data structure.
GDB 7.x contains an embedded Python interpreter (if so configured), which can be used to examine arbitrarily complicated data structures.
In particular, it can print contents of std::map and std::set, which are much more complicated "inside" than binary trees.
More info here and here.
We aim at using HDF5 for our data format. HDF5 has been selected because it is a hierarchical filesystem-like cross-platform data format and it supports large amounts of data.
The file will contain arrays and some parameters. The question is about how to store the parameters (which are not made up by large amounts of data), considering also file versioning issues and the efforts to build the library. Parameters inside the HDF5 could be stored as either (A) human-readable attribute/value pairs or (B) binary data in the form of HDF5 compound data types.
Just as an example, let's consider as a parameter a polygon with three vertices. Under case A we could have a variable named Polygon holding the string representation of the series of vertices, e.g. (1, 2); (3, 4); (4, 1). Under case B, we could instead have a variable named Polygon made up of a [2 x 3] matrix.
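To make the two options concrete, here is a rough sketch of the polygon example in plain C++, leaving the HDF5 calls out entirely. The names `Vertex`, `to_text`, `from_text` and `to_matrix` are made up for illustration, and the text grammar is only one guess at what case A might look like.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

struct Vertex { double x, y; };

// Case A: serialize to text such as "(1, 2); (3, 4); (4, 1)".
std::string to_text(const std::vector<Vertex>& poly) {
    std::string out;
    char buf[64];
    for (std::size_t i = 0; i < poly.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%s(%g, %g)", i ? "; " : "", poly[i].x, poly[i].y);
        out += buf;
    }
    return out;
}

// Case A: parse the text back; returns an empty vector on malformed input.
std::vector<Vertex> from_text(const std::string& text) {
    std::vector<Vertex> poly;
    const char* p = text.c_str();
    while (*p) {
        Vertex v;
        int used = -1;  // stays -1 if the closing ')' is never matched
        if (std::sscanf(p, " (%lf , %lf )%n", &v.x, &v.y, &used) != 2 || used < 0) return {};
        poly.push_back(v);
        p += used;
        while (*p == ' ' || *p == ';') ++p;  // skip the separator
    }
    return poly;
}

// Case B: pack into a row-major [2 x N] matrix (row 0 = x, row 1 = y).
std::vector<double> to_matrix(const std::vector<Vertex>& poly) {
    std::vector<double> m(2 * poly.size());
    for (std::size_t i = 0; i < poly.size(); ++i) {
        m[i] = poly[i].x;
        m[poly.size() + i] = poly[i].y;
    }
    return m;
}
```

Case A needs a parser and a decision about what to do with malformed input, while case B is just a numeric array whose shape carries the meaning.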
We have some idea, but it would be great to have inputs from people who have already worked with something similar. More precisely, could you please list pro/cons of A and B and also say under what circumstances which would be preferable?
Speaking as someone who's had to do exactly what you're talking about a number of times, rr got it basically right, but I would change the emphasis a little.
For file versioning, text is basically the winner.
Since you're using an hdf5 library, I assume both serializing and parsing are equivalent human-effort.
text files are more portable. You can transfer the files across generations of hardware with minimal risk.
text files are easier for humans to work with. If you want to extract a subset of the data and manipulate it, you can do that with many programs on many computers. If you are working with binary data, you will need a program that allows you to do so. Depending on how you see people working with your data, this can make a huge difference to the accessibility of the data and maintenance costs. You'll be able to sed, grep, and even edit the data in excel.
input and output of binary data (for large data sets) will be vastly faster than text.
working with those binary files in a new environment (e.g. a 128 bit little endian computer in some sci-fi future) will require some engineering.
similarly, if you write applications in other languages, you'll need to handle the encoding identically between applications. This will either mean engineering effort, or having the same libraries available on all platforms. With plain text this is easier.
If you want others to write applications that work with your data, plain text is simpler. If you're providing binary files, you'll have to provide a file specification which they can follow. With plain text, anyone can just look at the file and figure out how to parse it.
you can archive the text files with compression, so space concerns are primarily an issue for the data you are actively working with.
debugging binary data storage is significantly more work than debugging plain-text storage.
So in the end it depends a little on your use case. Is it meaningful to look at the data in the myriad tools that handle plain-text? Is it only meaningful to look at it with big-data hdf5 viewers? Will writing plain text be onerous to you in terms of time and space?
In general, when I'm faced with this issue, I basically always do the same thing: I store the data in plain text until I realize the speed problems are more irritating than working with binary would be, and then I switch. If you don't know in advance whether you'll cross that threshold, start with plain text, and write the interface to your persistence layer in such a way that it will be easy to switch later. This is a tiny bit of additional work, which you will probably get back thanks to plain text being easier to debug.
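A minimal sketch of what "easy to switch later" could look like, assuming the parameters boil down to a list of doubles. `ParamCodec`, `TextCodec` and `BinaryCodec` are hypothetical names, not part of any HDF5 API. The rest of the program would only talk to the interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical interface: callers never see which format is behind it.
struct ParamCodec {
    virtual ~ParamCodec() = default;
    virtual std::string encode(const std::vector<double>& values) const = 0;
    virtual std::vector<double> decode(const std::string& bytes) const = 0;
};

// Plain-text backend: one space-separated line, easy to grep and diff.
struct TextCodec : ParamCodec {
    std::string encode(const std::vector<double>& values) const override {
        std::ostringstream out;
        out.precision(17);  // enough significant digits to round-trip a double
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (i) out << ' ';
            out << values[i];
        }
        return out.str();
    }
    std::vector<double> decode(const std::string& bytes) const override {
        std::istringstream in(bytes);
        std::vector<double> values;
        for (double v; in >> v;) values.push_back(v);
        return values;
    }
};

// Binary backend: raw native-endian doubles; fast, but opaque and not portable as-is.
struct BinaryCodec : ParamCodec {
    std::string encode(const std::vector<double>& values) const override {
        std::string bytes(values.size() * sizeof(double), '\0');
        if (!values.empty()) std::memcpy(&bytes[0], values.data(), bytes.size());
        return bytes;
    }
    std::vector<double> decode(const std::string& bytes) const override {
        assert(bytes.size() % sizeof(double) == 0);
        std::vector<double> values(bytes.size() / sizeof(double));
        if (!values.empty()) std::memcpy(values.data(), bytes.data(), bytes.size());
        return values;
    }
};
```

Swapping formats then means constructing a different codec in one place, and the text backend can stay around as a debugging aid.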
If you expect to edit the file by hand often (like XMLs or JSONs), then go with human readable format.
Otherwise go with binary - it's much easier to create a parser for it and it will run faster than any grammar parser.
Also note how there's nothing that prevents you from creating a converter between binary and human-readable form later.
Versioning files might sound nice, but are you really going to inspect the diffs for files "containing large arrays"?
I'm implementing C++ code communicating with hardware which runs a number of hardware-assisted data structures (direct access tables, and search trees). So I need to maintain a local cache which would store data before pushing it down on the hardware.
To replicate the H/W tree structure I think I could use std::map, but what about the direct table (basically it is implemented as a sequential array of results and allows direct-access lookups)?
Are there close enough analogues in the STL to implement such structures, or would a simple array suffice?
Thanks.
If you are working with hardware structures, you are probably best off mimicking the structures as exactly as possible using C structs and C arrays.
This will give you the ability to map the hardware structure as exactly as possible and to move the data around with a simple memcpy.
The STL will probably not be terribly useful since it does lots of stuff behind the scenes and you have no control of the memory layout. This will mean that each write to hardware will involve a complex serialization exercise that you will probably want to avoid.
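As a rough sketch of that approach: the entry layout below is invented for illustration (the real one has to come from the hardware documentation), and `push_to_device` just copies into a byte buffer standing in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical hardware table entry; field names and widths are assumptions.
struct TableEntry {
    std::uint32_t key;
    std::uint16_t result;
    std::uint16_t flags;
};

// Catch layout surprises at compile time rather than on the wire.
static_assert(std::is_trivially_copyable<TableEntry>::value, "must be safe to memcpy");
static_assert(sizeof(TableEntry) == 8, "unexpected padding in TableEntry");

constexpr std::size_t kEntries = 4;  // assumed table size

// Local cache that mirrors the hardware table byte for byte.
struct TableCache {
    TableEntry entries[kEntries];
};

// Copies the whole cache into a staging buffer; returns the number of bytes written.
inline std::size_t push_to_device(const TableCache& cache, unsigned char* device,
                                  std::size_t capacity) {
    assert(capacity >= sizeof(cache.entries));
    std::memcpy(device, cache.entries, sizeof(cache.entries));
    return sizeof(cache.entries);
}
```

The static_asserts are the important part: they fail the build if the compiler lays the struct out differently from what the hardware expects. Byte order still has to be handled separately if host and device disagree.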
I believe you're looking for std::vector. Or, if the size is known at compile time, std::array (since C++11).
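For example, a direct-access table on top of std::array might look roughly like this. `DirectTable`, its members and the "empty slot" marker are all hypothetical choices made for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kEmptySlot = 0xFFFFFFFFu;  // assumed "no entry" marker

// Direct-access table: the index is the key, the slot holds the result.
template <std::size_t N>
class DirectTable {
public:
    DirectTable() { slots_.fill(kEmptySlot); }

    void set(std::size_t index, std::uint32_t result) {
        assert(index < N);
        slots_[index] = result;
    }

    // Returns true and writes the result if the slot is populated.
    bool lookup(std::size_t index, std::uint32_t& result) const {
        if (index >= N || slots_[index] == kEmptySlot) return false;
        result = slots_[index];
        return true;
    }

    // Contiguous storage, so the whole table can be handed over in one call.
    const std::uint32_t* data() const { return slots_.data(); }

private:
    std::array<std::uint32_t, N> slots_;
};
```

Both std::array and std::vector guarantee contiguous storage, so `data()` can be passed to whatever routine writes to the hardware.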
C++11 has std::unordered_map and std::unordered_set, which are hash tables. The ordered containers (std::map, std::set) iterate in sorted key order, while the unordered ones typically give faster lookups (average constant time instead of logarithmic).
But first you should run a profiler to see whether your data structures are what slows your program down.
Basically I want to make my own Huffman encoder/decoder, but I do not want to use STL libraries (priority queues (heaps), stacks, vectors, etc).
I know that I will have to implement SOME data structures, but since I am writing them all myself I want to know which ones are easy to write and will do the job for Huffman encoding. I feel like a min-heap might be all I need to sort the subtrees, but obviously I need to somehow create the trees via some linked structure.
So what are the data structures that are necessary for a Huffman encoder?
P.S. Any links that talk about how to create the codebook for Huffman coding would be greatly appreciated as well.
Thanks
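For what it's worth, a min-heap of node pointers plus a plain linked tree does seem to be enough. Below is a minimal sketch without any STL containers, assuming a byte alphabet (256 symbols). `Node`, `MinHeap`, `build_tree`, `build_codebook` and `free_tree` are names picked for this example, and it only covers building the codebook, not the bit-level encoding.

```cpp
#include <cassert>
#include <cstddef>

// Tree node: leaves carry a symbol, internal nodes only link two subtrees.
struct Node {
    unsigned long weight;
    int symbol;  // -1 for internal nodes
    Node* left;
    Node* right;
};

// Fixed-capacity binary min-heap of node pointers, ordered by weight.
struct MinHeap {
    Node* items[256];
    std::size_t size = 0;

    void swap(std::size_t a, std::size_t b) {
        Node* t = items[a]; items[a] = items[b]; items[b] = t;
    }
    void push(Node* n) {
        assert(size < 256);
        std::size_t i = size++;
        items[i] = n;
        while (i > 0 && items[(i - 1) / 2]->weight > items[i]->weight) {  // sift up
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    Node* pop() {
        assert(size > 0);
        Node* top = items[0];
        items[0] = items[--size];
        std::size_t i = 0;
        for (;;) {  // sift down
            std::size_t l = 2 * i + 1, r = l + 1, m = i;
            if (l < size && items[l]->weight < items[m]->weight) m = l;
            if (r < size && items[r]->weight < items[m]->weight) m = r;
            if (m == i) break;
            swap(i, m);
            i = m;
        }
        return top;
    }
};

// Builds the Huffman tree from per-byte frequencies; nullptr if all are zero.
Node* build_tree(const unsigned long freq[256]) {
    MinHeap heap;
    for (int s = 0; s < 256; ++s)
        if (freq[s] > 0) heap.push(new Node{freq[s], s, nullptr, nullptr});
    if (heap.size == 0) return nullptr;
    while (heap.size > 1) {  // repeatedly merge the two lightest subtrees
        Node* a = heap.pop();
        Node* b = heap.pop();
        heap.push(new Node{a->weight + b->weight, -1, a, b});
    }
    return heap.pop();
}

// Fills codes[s] with the '0'/'1' path to each leaf; prefix is scratch space.
void build_codebook(const Node* n, char* prefix, unsigned depth, char codes[256][256]) {
    if (!n) return;
    if (n->symbol >= 0) {
        if (depth == 0) prefix[depth++] = '0';  // a lone symbol still needs one bit
        for (unsigned i = 0; i < depth; ++i) codes[n->symbol][i] = prefix[i];
        codes[n->symbol][depth] = '\0';
        return;
    }
    prefix[depth] = '0';
    build_codebook(n->left, prefix, depth + 1, codes);
    prefix[depth] = '1';
    build_codebook(n->right, prefix, depth + 1, codes);
}

void free_tree(Node* n) {
    if (!n) return;
    free_tree(n->left);
    free_tree(n->right);
    delete n;
}
```

The heap never holds more than 256 nodes because every merge removes two entries and adds one, so a fixed array is sufficient. Which of two equal-weight subtrees goes left is arbitrary, so the exact codes can differ between implementations while the code lengths stay optimal.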
I would like to convert a struct from C source code (same file), which can itself contain more structs of structs, and I would like to walk that "structure". Is there an optimal way to do this? Some sort of Tree class that I could convert to a graphical tree would be nice. At the farthest "leaf" struct, I'll want to be able to access the struct members as well, not just the tree structure of the structs. I'm open to any algorithms. This strikes me as a recursive type of algorithm. I just don't want to waste time reinventing the wheel. No, it's not homework, but it is work related :) . If anyone knows of preexisting tools that already do this, please let me know. I can provide more details below if anyone needs them.
If you are just documenting your code, I recommend using doxygen software.
If you are to parse your code file, take a look at boost libraries:
This example might give you some motivation.
For graphical display I can't recommend anything in C++, but maybe NodeBox, which is based on boost.
Nevertheless, C++ is probably not the right tool for tasks like this. Try (say) Python instead.
As a last option, you can also go and do your own recursive traverser and output your data into an / that can be read by some graph displaying program.
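A hand-rolled recursive traverser can be quite small. The sketch below assumes the struct layout has already been parsed into a simple tree (`StructNode` is a made-up type) and assumes Graphviz DOT as the output format, which is only one possible choice.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory description of a struct: a name plus nested members.
struct StructNode {
    std::string name;
    std::vector<StructNode> members;  // empty for leaf members
};

// Recursively writes one DOT edge per parent/member pair, depth first.
void emit_edges(const StructNode& node, std::ostream& out) {
    for (const StructNode& m : node.members) {
        out << "  \"" << node.name << "\" -> \"" << m.name << "\";\n";
        emit_edges(m, out);
    }
}

// Wraps the edges in a digraph so a graph viewer can render the result.
std::string to_dot(const StructNode& root) {
    std::ostringstream out;
    out << "digraph structs {\n";
    emit_edges(root, out);
    out << "}\n";
    return out.str();
}
```

Member names are used directly as node identifiers here, so two members with the same name in different structs would collapse into one node; prefixing each name with its parent's path would avoid that.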
I need a map-like data structure (in C++) for storing pairs (Key,T) with the following functionality:
You can insert new elements (Key,T) into the current structure
You can search for elements based on Key in the current structure
You can make a "snapshot" of the current version of the structure
You can switch to one of the versions of the structures which you took the snapshot of and continue all operations from there
Completely remove one of the versions
What I don't need
Element removal from the structure
Merging of different versions of the structure into one
Iteration over all (or some of) elements currently stored in the structure
In other words, you have some search structure that you can build up, but at any point you can jump in history, and expand the earlier/different version of the structure in a different way. Later on you may jump between those different versions.
In my project, Key and T are likely to be integers or pointer values, but not strings.
The primary objective is to reduce the time complexity; space consumption is secondary (but should be reasonable as well). To clarify, for me log(N)+log(S) (where N-number of elements, S-number of snapshots) would be enough, although faster is better :)
I have a rough idea of how to implement it. For example, if the structure is a binary search tree, inserting a new element can clone the path from the root to the insertion location while keeping the rest of the tree intact. Switching tree versions would be equivalent to picking a different version of the root node, for which some changes are simply not visible.
However, to make this custom tree efficient (e.g. self-balancing) it will require some additional effort and careful coding. Of course I can do it myself but perhaps there are already existing libraries to do exactly that?
Also, there is probably a proper name for this kind of data structure that I simply don't know, making my Google searches (or SO searches) total failures...
Thank you for your help!
I think what you are looking for is an immutable map. Functional (or functionally inspired) programming languages (such as Haskell or Scala) have immutable versions of most of the containers you'd find in the STL. Operations such as insertion or removal then return a copy of the map (preserving the original), with the copy containing your requested modification. A lot of work has gone into designing these data structures so that the copies share as much of the original as possible, which reduces the time and memory cost of each operation.
You can find a lot more details in a book such as this one: http://www.amazon.co.uk/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
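To illustrate the sharing idea in C++, here is a minimal, unbalanced sketch using path copying with std::shared_ptr. `PNode`, `Tree`, `tree_insert` and `tree_find` are names invented for this example. A production version would need balancing (for instance a persistent red-black tree) to guarantee logarithmic depth.

```cpp
#include <cassert>
#include <memory>

// Immutable node: never modified after construction, so versions can share it.
struct PNode {
    int key;
    int value;
    std::shared_ptr<const PNode> left;
    std::shared_ptr<const PNode> right;
};

using Tree = std::shared_ptr<const PNode>;  // a "version" is just a root pointer

// Returns a new root; only the nodes on the search path are copied.
Tree tree_insert(const Tree& root, int key, int value) {
    if (!root)
        return std::make_shared<PNode>(PNode{key, value, nullptr, nullptr});
    if (key < root->key)
        return std::make_shared<PNode>(
            PNode{root->key, root->value, tree_insert(root->left, key, value), root->right});
    if (key > root->key)
        return std::make_shared<PNode>(
            PNode{root->key, root->value, root->left, tree_insert(root->right, key, value)});
    // Same key: the new version gets the new value, older versions keep theirs.
    return std::make_shared<PNode>(PNode{key, value, root->left, root->right});
}

// Returns a pointer to the value stored under key in this version, or nullptr.
const int* tree_find(const Tree& root, int key) {
    const PNode* n = root.get();
    while (n) {
        if (key < n->key) n = n->left.get();
        else if (key > n->key) n = n->right.get();
        else return &n->value;
    }
    return nullptr;
}
```

Taking a snapshot is just keeping a copy of the root pointer, switching versions is using a different root, and dropping a version is letting its root pointer go out of scope; reference counting frees the nodes that no other version shares.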
While searching for persistent search tree libraries I stumbled on this:
http://cg.scs.carleton.ca/~dana/pbst/
While it does not have the exact same functionality as needed, it seems pretty close to it. I will investigate.
(posting here, as someone may find it useful as well)