We heavily use the tree doctrine extension in our zf2 project - with some big tree data structures. We know that inserts and updates in a nested set are expensive. We also know that the tree plugin uses the "root" column to find out which tree shall be updated.
Yesterday I read the tree documentation again and found:
"Support for multiple roots in nested-set"
What does it mean and how does it work? I couldn't find any documentation for this feature.
Our hope would be that we could define a second root item of a lower branch of a big tree so that inserts and updates into this lower branch will not affect the whole tree but only this branch. Is it possible?
Yes, it is possible. Tree root branches are separated by level 0 nodes; see the mapping example for the TreeRoot column (there should be examples for all mapping types). The column must be of the same type as the ID. It does not support a ManyToOne relation for now, but there is a plan to support it someday.
root1
    child
root2
    child
    child2
When updating or inserting any child on the root1 or root2 branch, only that branch is affected. Also note that the tree is still not safe under concurrent writes; you have to manage locking yourself (see the documentation reference).
The doc directory contains most of the information given here.
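To illustrate why a separate root limits the cost of an insert, here is a minimal in-memory sketch in Python. It is not the extension's code: the `Row` class, the `insert_last_child` helper and the sample numbering are assumptions made for illustration, and `child2` is assumed to be a sibling of `child`. The point is only that the lft/rgt renumbering is restricted to rows sharing the same root value.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One nested-set row; `root` plays the role of the TreeRoot column."""
    id: int
    title: str
    root: int
    lft: int
    rgt: int
    lvl: int


def insert_last_child(rows, parent_id, new_id, title):
    """Append a node as the last child of `parent_id`.

    Returns the ids of the existing rows whose lft/rgt had to be shifted.
    """
    parent = next(r for r in rows if r.id == parent_id)
    edge = parent.rgt  # the new node takes over the parent's old right edge
    touched = []
    for r in rows:
        if r.root != parent.root:
            continue  # rows belonging to other roots are never renumbered
        shifted = False
        if r.lft > edge:
            r.lft += 2
            shifted = True
        if r.rgt >= edge:
            r.rgt += 2
            shifted = True
        if shifted:
            touched.append(r.id)
    rows.append(Row(new_id, title, parent.root, edge, edge + 1, parent.lvl + 1))
    return touched


# Hypothetical data matching the root1/root2 example above.
rows = [
    Row(1, "root1", 1, 1, 4, 0),
    Row(2, "child", 1, 2, 3, 1),
    Row(3, "root2", 2, 1, 6, 0),
    Row(4, "child", 2, 2, 3, 1),
    Row(5, "child2", 2, 4, 5, 1),
]
touched = insert_last_child(rows, parent_id=3, new_id=6, title="child3")
print(touched)  # only ids from the root2 branch can appear here
```

In a real table the same restriction would show up as an extra condition on the root column in the UPDATE statements, which is why a big tree split into several roots should see cheaper writes per branch.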
Related
I am looking for the best solution for trees in Doctrine using an extension.
I need a tree:
in one table
with multiple roots
where one root item can be added to another root item
where tree items, including roots, have an order (weight)
that can be listed with one select and without recursion
I tried the strategies from this extension: https://github.com/Atlantic18/DoctrineExtensions/blob/v2.4.x/doc/tree.md
nested
- I cannot order root items
closure
- not a good idea, it needs 2 tables
materializedPath
- maybe the best way, but it has no order (weight)
Does anyone have experience with how to solve this?
Short Description
I need to build a non-binary tree (language doesn't matter for now, but preferably in C++) from a list of items that do have dependencies to each other, but non-recurring and not cyclic.
The data for the nodes are read from a file and incrementally inserted into the tree.
The troubling part is how to handle those nodes which do not have parent-nodes yet that fulfill the dependency of the inserted node.
Detailed Description
Rough Outline
The assignment is easy: represent a bunch of Tasks and Subtasks in a non-binary tree.
This assignment would be quite easy to understand and implement, if not for a tiny condition: the list of Tasks has to be generated incrementally, so do the nodes in the tree.
Scenario
The Tasks are generated asynchronously and have to be added into the tree once the data to a certain Task is received.
This is "simulated" by reading a csv-file which has a certain Task in each line with some data, the most important ones being the PID and PPID attributes.
After a line is read and parsed, a Task is being created and inserted into the tree.
The tree should automatically resolve the dependencies following two simple rules:
1. Only show the node when the dependency is met (namely when a parent node has been inserted before), but memorize the (now orphaned) node.
2. Whenever a Task (node) is added, check whether it is the parent node of one of the above-mentioned orphaned ones and reconcile the nodes if rule #1 isn't infringed while doing so.
Please disregard the faulty logic behind this scenario: Normally, there can't be any SubTask without a ParentTask existing (at least in monolithic kernel designs).
And while the list of Tasks certainly does contain the ParentTasks needed to model the tree, it is unknown when the parent node's data is read and inserted into the tree.
Desired outcome
Below is a figure showing the "raw data", a list of (unsorted) Tasks which has been created incrementally while adding one Task after another to the list.
The tree represents the subset of Tasks which has been inserted so far:
Please keep in mind that the tree is completely "naked" until the Tasks with the PIDs 1, 2 and 3 are inserted, because the other nodes depend on them.
What I did so far
I've written a Qt-C++ Code with three rough components:
TaskTree which holds a Root-Node (a node without any task-data)
TaskNode which has a field to hold the task-data and a QList<TaskNode> which is, in simple terms, a vector of TaskNodes to reference childnodes
Task has the related attributes (like pid and ppid)
It is no problem to insert a TaskNode if the parentnode already exists.
This only works though in a perfect world, in which the Tasks are sorted upon their respective dependencies AND there's a determined amount of Tasks to be added.
I don't have to tell you that such a scenario is highly unlikely though, so the tree creation has to memorize any orphaned node (which is a node that doesn't have a parent yet, duh).
I've tackled this "memorization" in different ways, but failed altogether because I couldn't wrap my head around the algorithms behind it.
The two most promising thoughts I had were these:
1. Insert every orphaned node into a vector. Upon inserting a parent node, check if it has children in the orphan vector and reconcile. Do this recursively for the newly created subtree to match all possibilities.
2. Assign the PPID to the tree's RootNode, being 0 for the topmost one. When an orphaned node appears, create a new TaskTree, assign the PPID of the orphan to the newly created tree and add the orphan to it.
This creates subtrees which can be quite intricate themselves if several orphans match one of the trees. After each inserted node, try to reconcile the subtrees with the root tree.
Unfortunately I had to give up on those two concepts due to several spontaneous SIGSEGVs and other problems occurring because of the recursion etc.
So in the end I'm here trying to find a way to actually make this work without cutting down the complexity of the problem through assumptions and other cheats...
Do you guys and gals have an idea which algorithm I could use for this problem or what category of problem this even is?
Approach 2 is the right one to take. The piece you are missing is an unordered_map called parent_needed that maps as-yet-unseen parent nodes to a vector of child trees that are waiting for them. You need a similar one, node_seen, that maps each node that has been seen to its associated tree.
Then when you see a node you perform the following:
Create TaskTree with only this node.
Add this TaskTree to the node_seen map
If this node's ID is in the parent_needed map:
    Add each tree waiting under this ID in the parent_needed map to this tree
    Remove this node's ID from the parent_needed map
If this node has no parent:
    Add this node's tree to the root tree
Else if this node's parent ID is in the node_seen map:
    Add this node's tree to the parent tree
Else if this node's parent ID is in the parent_needed map:
    Append this node's tree to the parent_needed vector
Else:
    Create a vector containing this node's tree
    Add a mapping from this node's parent ID to that vector in the parent_needed map
Assuming no bugs (HAH! Bugs are part of life...), this should work.
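The steps above can be sketched roughly as follows. This is a Python sketch rather than Qt/C++, written only to show the bookkeeping; the dictionaries stand in for the two unordered_maps. The class names, the `dump` helper and the convention that a PPID of 0 means "no parent" are assumptions, not part of the original description.

```python
class TaskTree:
    """A node plus the subtree collected below it so far."""

    def __init__(self, pid, ppid):
        self.pid = pid
        self.ppid = ppid
        self.children = []


class TreeBuilder:
    def __init__(self):
        # Assumption: the artificial root has pid 0, so ppid == 0 means "no parent".
        self.root = TaskTree(pid=0, ppid=None)
        self.node_seen = {}      # pid -> TaskTree for every task seen so far
        self.parent_needed = {}  # missing parent pid -> list of waiting subtrees

    def add(self, pid, ppid):
        tree = TaskTree(pid, ppid)
        self.node_seen[pid] = tree
        # Adopt every subtree that was waiting for this pid (and forget the entry).
        tree.children.extend(self.parent_needed.pop(pid, []))
        if ppid == self.root.pid:
            self.root.children.append(tree)
        elif ppid in self.node_seen:
            self.node_seen[ppid].children.append(tree)
        else:
            # Parent not seen yet: queue this subtree until it arrives.
            self.parent_needed.setdefault(ppid, []).append(tree)
        return tree


def dump(node):
    """Nested dict {pid: {...}} of everything reachable from `node`."""
    return {child.pid: dump(child) for child in node.children}


builder = TreeBuilder()
builder.add(4, 2)          # parent 2 unknown: hidden for now
builder.add(2, 1)          # adopts 4, but parent 1 is still unknown
before = dump(builder.root)
builder.add(1, 0)          # adopts 2 (and with it 4), becomes visible
builder.add(3, 0)
builder.add(5, 3)
after = dump(builder.root)
print(before, after)
```

Because orphans are never linked into the root until their whole ancestor chain exists, rule #1 holds without any recursion: each insert is a handful of dictionary lookups.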
After some deliberate design changes, I've come up with - what I think - the easiest way to implement this:
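InsertTask(TaskNode newTask)
{
    // Find the node whose pid matches the new task's ppid.
    TaskNode parentTask = searchTreeForParent(newTask->ppid)
    If (parentTask not found)
    {
        // No parent yet: park the new task under the root as an orphan.
        parentTask = treeRootNode
    }
    // Adopt orphans parked under the root that were waiting for this task.
    // Iterate over a copy, because children are removed inside the loop.
    For (every child in copy of treeRootNode's children)
    {
        If (child->ppid != treeRootNode->pid AND child->ppid == newTask->pid)
        {
            newTask->addChild(child)
            treeRootNode->remove(child)
        }
    }
    parentTask->addChild(newTask)
}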
The algorithm behind it is pretty easy: You add the new Tasks to the root node if there is no parent node yet and at the same time check if the newly added Task has potential children in the root node (because those orphaned ones were added to the root node before).
So if you actually insert all the Tasks to fulfill the dependencies, you end up with a complete and valid tree.
If you don't supply all the parent nodes, you end up with some of the branches being complete and valid and a bunch of orphaned ones in the root node.
But that's no problem because there is an easy trick to differentiate between a complete branch and orphans: just check if the ppid equals the root node's pid and voila, you output only those branches that are complete.
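For comparison, the same idea in runnable form. This is a Python sketch under the assumption that the root node has pid 0; `find`, `insert_task` and `complete_branches` are illustrative names, not the Qt code described above.

```python
class TaskNode:
    def __init__(self, pid, ppid):
        self.pid = pid
        self.ppid = ppid
        self.children = []


def find(node, pid):
    """Depth-first search for the node with the given pid (None if absent)."""
    if node.pid == pid:
        return node
    for child in node.children:
        hit = find(child, pid)
        if hit is not None:
            return hit
    return None


def insert_task(root, new):
    parent = find(root, new.ppid)
    if parent is None:
        parent = root  # no parent yet: park the orphan under the root
    # Iterate over a copy, because the list is modified inside the loop.
    for child in list(root.children):
        if child.ppid != root.pid and child.ppid == new.pid:
            root.children.remove(child)
            new.children.append(child)
    parent.children.append(new)


def complete_branches(root):
    """Root children that really belong there, i.e. everything except orphans."""
    return [child for child in root.children if child.ppid == root.pid]


root = TaskNode(pid=0, ppid=None)  # assumption: the root carries pid 0
insert_task(root, TaskNode(4, 2))
insert_task(root, TaskNode(2, 1))
orphans_only = [c.pid for c in complete_branches(root)]
insert_task(root, TaskNode(1, 0))
resolved = [c.pid for c in complete_branches(root)]
print(orphans_only, resolved)
```

Note that every insert searches the whole tree, so this trades the extra maps of the other answer for a linear scan per inserted task.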
I am thinking about using this STL-like tree library for C++ http://tree.phi-sci.com/ to store hierarchical data (think organisation chart).
In my case the tree only contains the structure, the 'payload' of each node is stored elsewhere. So it will probably end up as a tree<int> or a tree<simple_class_containing_a_couple_of_ints>
I would like to find the best way to persist the tree. To be more specific I would like to find the best way to persist the tree to a SQL database so it can be loaded back into the application on startup.
So my question is: How can I persist a tree contained in a tree.hh container to a SQL database?
Note: It is not necessary to store it as a tree structure in the database (i.e. no need for nested set, adjacency list). There is no need to query the database as the whole tree will be loaded into memory.
UPDATE:
I have found this class as an alternative to tree.hh here: http://stlplus.sourceforge.net/stlplus3/docs/ntree.html
I cannot comment yet on any performance differences, but it mostly implements what I need and has a persistence class (sorry no link as not enough reputation) that I can dump to a BLOB. I haven't entered this as an answer yet because I am still interested in any alternative solutions.
I would persist each node in one SQL table (one row per node) and perhaps each node -> sibling relation in another table.
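A minimal sketch of the one-row-per-node idea, using Python's sqlite3 module in place of the C++ container. The table layout (`node` with `parent_id` and `position` columns) is an assumption, and a position column is used here as a simplification instead of a separate sibling table.

```python
import sqlite3


def save_tree(conn, children, root_id):
    """Store one row per node: (id, parent_id, position among siblings)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        " id INTEGER PRIMARY KEY, parent_id INTEGER, position INTEGER NOT NULL)"
    )
    conn.execute("DELETE FROM node")
    rows = [(root_id, None, 0)]
    stack = [root_id]
    while stack:
        parent = stack.pop()
        for position, child in enumerate(children.get(parent, [])):
            rows.append((child, parent, position))
            stack.append(child)
    conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)
    conn.commit()


def load_tree(conn):
    """Rebuild {node_id: [child ids in sibling order]} plus the root id."""
    children, root_id = {}, None
    query = "SELECT id, parent_id FROM node ORDER BY parent_id, position"
    for node_id, parent_id in conn.execute(query):
        if parent_id is None:
            root_id = node_id
        else:
            children.setdefault(parent_id, []).append(node_id)
    return children, root_id


conn = sqlite3.connect(":memory:")
org_chart = {1: [2, 3], 2: [4, 5], 3: [6]}  # hypothetical structure-only tree
save_tree(conn, org_chart, root_id=1)
loaded, root = load_tree(conn)
print(loaded, root)
```

Since the whole tree is loaded at startup, a single ordered SELECT is enough and no hierarchical query is needed.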
I am not sure SQL is the best way to persist. You could consider using JSON.
I have searched for all nodes that have an attribute containing a given string (as a substring). These nodes are found at different levels of the tree, sometimes 5 or 6 levels deep. I'd like to know which parent/ancestor node they correspond to at a specified level, 2 levels deep. The search itself should return many more results than there are corresponding parents.
EDIT to include code:
/xs:schema/xs:element/descendant::node()/@*[starts-with(., 'my-search-string-here')]
EDIT to clarify my intent:
When I execute the Xpath above sometimes the results are
/xs:schema/xs:element/xs:complexType/xs:attribute or
/xs:schema/xs:element/xs:complexType/xs:sequence/xs:element or
/xs:schema/xs:element/xs:complexType/xs:complexContent/xs:extension/xs:sequence/xs:element
These results indicate a place in the Schema where I have added application specific code. However, I need to remove this code now. I'm building an "adapter" schema that will redefine the original Schema (untouched) and import my schema. The String I am searching for is my prefix. What I need is the @name of the /xs:schema/node() in which the prefix is found, so I can create a new schema defining these elements. They will be imported into the adapter and redefine another schema (that I'm not supposed to modify).
To reiterate, I need to search all the attributes (descendants of /xs:schema/xs:element) for a prefix, and then get the corresponding /xs:schema/xs:element/@name for each of the matches to the search.
/xs:schema/xs:element[descendant::*/@*[starts-with(., 'my-search-string-here')]]/@name
This should do it:
/xs:schema/xs:element[starts-with(descendant::node()/@*, 'my-search-string-here')]
You want to think of it as
select the xs:elements which contain a node with a matching attribute
rather than
select the matching attributes of descendant nodes of xs:elements, then work back up
As Eric mentioned, I need to change my thought process to select the xs:elements which contain a node with a matching attribute, rather than select the matching attributes of descendant nodes of xs:elements and then work back up. This is critical. However, the code sample he posted does not work: as far as I can tell, in XPath 1.0 starts-with() converts a node-set argument to the string value of its first node only, so only one attribute is ever tested. We need another solution.
Here is the code that works to select an element that contains an attribute whose value starts with a given string.
/xs:schema/child::node()[descendant::node()/@*[starts-with(., 'my-prefix-here')]]
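To make the "select the containing element" idea concrete, here is a small Python sketch. The standard library's ElementTree does not evaluate starts-with() in its XPath subset, so the sketch walks the tree by hand and mirrors the xs:element variant of the expression. The sample schema, the `myapp-` prefix and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def top_level_names(schema_xml, prefix):
    """Names of /xs:schema/xs:element nodes that have a descendant
    with any attribute value starting with `prefix`."""
    root = ET.fromstring(schema_xml)
    names = []
    for element in root.findall(XS + "element"):
        # iter() yields the element itself first; skip it to keep descendants only.
        descendants = list(element.iter())[1:]
        if any(
            value.startswith(prefix)
            for node in descendants
            for value in node.attrib.values()
        ):
            names.append(element.get("name"))
    return names


# Hypothetical schema: only "order" contains an attribute with the prefix.
SAMPLE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:attribute name="myapp-flag" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="total" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

print(top_level_names(SAMPLE, "myapp-"))  # prints ['order']
```

Each top-level element is reported once, no matter how many matching attributes sit below it, which is the difference from selecting the attributes first and working back up.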
Currently, I am working on the migration mentioned in the title line. The problem is that the application configuration kept in the registry has a tree-like structure, for example:
X
|->Y
   |->Z
      |->SomeKey someValue
W
|->AnotherKey anotherValue
and so on.
How can I model this structure in SQLite (or any other DB)? If you have experience with similar problems, please share it. Thanks in advance.
Baris, this structure is similar to a directory/file structure.
You can model this with a simple parent<>child relationship on the directories and key-value pairs related to the directory.
Something like
Directory:
    id integer auto_increment;
    name string not null;
    parent_id integer not null default 0;
Property:
    id integer auto_increment;
    key string;
    value string;
    directory_id integer not null;
With this you can address the root directories searching for directories with parent_id=0, child directories by looking at WHERE parent_id=someid and for properties on that looking for directory_id=someid.
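A runnable sketch of that layout, using Python's sqlite3 module. The column types are translated to SQLite, and the `add_directory` helper and sample rows are assumptions that mirror the X/Y/Z example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directory (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    name      TEXT NOT NULL,
    parent_id INTEGER NOT NULL DEFAULT 0   -- 0 marks a root directory
);
CREATE TABLE property (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    key          TEXT,
    value        TEXT,
    directory_id INTEGER NOT NULL
);
""")


def add_directory(name, parent_id=0):
    """Insert a directory and return its generated id."""
    cur = conn.execute(
        "INSERT INTO directory (name, parent_id) VALUES (?, ?)", (name, parent_id)
    )
    return cur.lastrowid


x = add_directory("X")
y = add_directory("Y", x)
z = add_directory("Z", y)
w = add_directory("W")
conn.executemany(
    "INSERT INTO property (key, value, directory_id) VALUES (?, ?, ?)",
    [("SomeKey", "someValue", z), ("AnotherKey", "anotherValue", w)],
)

roots = [name for (name,) in conn.execute(
    "SELECT name FROM directory WHERE parent_id = 0 ORDER BY id")]
children_of_x = [name for (name,) in conn.execute(
    "SELECT name FROM directory WHERE parent_id = ? ORDER BY id", (x,))]
props_of_z = conn.execute(
    "SELECT key, value FROM property WHERE directory_id = ?", (z,)).fetchall()
print(roots, children_of_x, props_of_z)
```

Resolving a full path such as X/Y/Z still takes one lookup per level with this layout, which is the trade-off the other answers discuss.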
Hope this helps :)
Representing hierarchies in a relational database is pretty easy. You just use self-references. For example, you have a category table with a field called ParentCategoryId that is either null (for top-level categories) or the id of the parent category. You can even set up foreign keys to enforce valid relationships. This kind of structure is easy to traverse in code, usually via recursion, but a pain when it comes to writing SQL queries.
One way around this for a registry clone is to use the registry key path as the key. That is, have an entry where Path is "X/Y/Z/SomeKey" and Value is "someValue". This is easier to query but may not express the hierarchy the way you might like. That is, you will only have the values and not the overall structure of the hierarchy.
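A small sketch of the path-as-key variant with Python's sqlite3 module. The table name `setting` and the helpers `get_value` and `subtree` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE setting (path TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO setting (path, value) VALUES (?, ?)",
    [("X/Y/Z/SomeKey", "someValue"), ("W/AnotherKey", "anotherValue")],
)


def get_value(path):
    """Exact lookup of one setting; returns None if the path is unknown."""
    row = conn.execute(
        "SELECT value FROM setting WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None


def subtree(prefix):
    """All (path, value) pairs stored below `prefix`.

    Note: '%' and '_' inside `prefix` would act as LIKE wildcards;
    this sketch assumes key names do not contain them.
    """
    pattern = prefix.rstrip("/") + "/%"
    return conn.execute(
        "SELECT path, value FROM setting WHERE path LIKE ? ORDER BY path",
        (pattern,),
    ).fetchall()


print(get_value("X/Y/Z/SomeKey"))
print(subtree("X"))
```

Empty intermediate keys (a directory with no values below it) simply do not exist in this table, which is the loss of structure mentioned above.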
The bottom line is you have to compromise to map a hierarchy with an unknown number of levels onto a relational database structure.
The self-referencing tables mentioned by previous posters are nice when you want to store a hierarchy, but they become less attractive when you start selecting leaves of the tree.
Could you clarify the use-case of retrieving data from the configuration?
Are you going to load the whole configuration at once or to retrieve each parameter separately?
How arbitrary will the depth of the leaf nodes be?