Protege: how to express 'not hasNext'? - list

I am currently developing an ontology using Protégé and would like to determine whether a node is the last one in a list. Basically, a list points to a node, and every node has some content and can have another node:
List startsWith some Node
Node hasContent some Content
Node hasNext some Node
Now I'd like to define a subclass named EndNode that doesn't point to another Node. This is what I've tried so far, but after classifying, EndNode always equals Nothing:
Node and not(hasNext some Node)
Node and (hasNext exactly 0 Node)

First, there is a built-in List construct in RDF which you can use in the following way:
ex:myList rdf:type rdf:List .
ex:myList rdf:first ex:firstElement .
ex:myList rdf:rest _:sublist1 .
_:sublist1 rdf:first ex:SecondElement .
_:sublist1 rdf:rest rdf:nil .
Here, in order to know that you have reached the end of the list, you need a special list called rdf:nil. This plays the same role as a null pointer at the end of a linked list in programming languages.
However, even though rdf:List is widely used in existing data on the Web, it doesn't constrain in any way the use of the predicates rdf:first and rdf:rest, so you can have many first elements for a given list without triggering an inconsistency.
So, if you really want to model a linked list in a strict way, you need pretty expressive features of OWL. I did it a while ago and it can be found at http://purl.org/az/List.
It's normal that you get an empty class, as you specified that every Node must have a next node (Node hasNext some Node), which contradicts the definition of EndNode. You should not impose that Nodes have content or a next element. You should rather say that the cardinality is at most 1, that the domain and range of hasNext are Node, and that EndNode is a Node with no next node. But that's still not enough, as it does not impose that there is an EndNode at all. You may have an infinite sequence or a loop.
If you want to avoid loops or infinite sequences, you have to define a transitive property hasFollower and say that every Node has at least one follower in the class EndNode.
All in all, implementing strict lists in OWL completely sucks in terms of performance and is most of the time totally useless, as rdf:List is sufficient for the vast majority of situations.

Related

Should B-Tree nodes contain a pointer to their parent (C++ implementation)?

I am trying to implement a B-tree and from what I understand this is how you split a node:
Attempt to insert a new value V at a leaf node N
If the leaf node has no space, create a new node and pick a middle value of N. Move everything to the right of the middle value to the new node and leave everything to the left of it in the old node (shifted left to free up the right indices), then insert V into the appropriate one of the two nodes
Insert the middle value we picked into the parent node of N and also add the newly created node to the list of children of the parent of N (thus making N and the new node siblings)
If the parent of N has no free space, perform the same operation and along with the values also split the children between the two split nodes (so this last part applies only to non-leaf nodes)
Continue trying to insert the previous split's middle point into the parent until you reach the root and potentially split the root itself, making a new root
This brings me to the question - how do I traverse upwards, am I supposed to keep a pointer of the parent?
Because I can only know if I have to split the leaf node when I have reached it for insertion. So once I have to split it, I have to somehow go back to its parent and if I have to split the parent as well, I have to keep going back up.
Otherwise I would have to re-traverse the tree again and again each time to find the next parent.
Here is an example of my node class:
template<typename KEY, typename VALUE, int DEGREE>
struct BNode
{
    KEY Keys[DEGREE];
    VALUE Values[DEGREE];
    BNode<KEY, VALUE, DEGREE>* Children[DEGREE + 1];
    BNode<KEY, VALUE, DEGREE>* Parent;
    bool IsLeaf;
};
(Maybe I should not have an IsLeaf field and instead just check if it has any children, to save space)
Even if you don't use recursion or an explicit stack while going down the tree, you can still do it without parent pointers if you split nodes a bit sooner with a slightly modified algorithm, which has this key characteristic:
When encountering a node that is full, split it, even when it is not a leaf.
With this pre-emptive splitting algorithm, you only need to keep a reference to the immediate parent (not any other ancestor) to make the split possible, since it is now guaranteed that a split will not lead to another, cascading split further up the tree. This algorithm requires that the maximum degree (number of children) of the B-tree is even (as otherwise one of the two split nodes would have too few keys to be considered valid).
See also Wikipedia which describes this alternative algorithm as follows:
An alternative algorithm supports a single pass down the tree from the root to the node where the insertion will take place, splitting any full nodes encountered on the way preemptively. This prevents the need to recall the parent nodes into memory, which may be expensive if the nodes are on secondary storage. However, to use this algorithm, we must be able to send one element to the parent and split the remaining 𝑈−2 elements into two legal nodes, without adding a new element. This requires 𝑈 = 2𝐿 rather than 𝑈 = 2𝐿−1, which accounts for why some textbooks impose this requirement in defining B-trees.
The same article defines 𝑈 and 𝐿:
Every internal node contains a maximum of 𝑈 children and a minimum of 𝐿 children.
For a comparison with the standard insertion algorithm, see also Will a B-tree with preemptive splitting always have the same height for any input order of keys?
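As a rough illustration of the single-pass idea, here is a minimal sketch in the style of the textbook algorithm. It is not the asker's BNode: the names Node, splitChild, insertKey and inOrder are made up for this example, values are omitted, and T is assumed to be the minimum degree, so a node holds at most 2*T - 1 keys and 2*T children (an even maximum degree, as required above).

```cpp
#include <array>
#include <memory>
#include <vector>

// Hypothetical node layout (keys only): at most 2*T - 1 keys, 2*T children.
template <typename Key, int T>
struct Node {
    int count = 0;     // number of keys currently stored
    bool leaf = true;
    std::array<Key, 2 * T - 1> keys{};
    std::array<std::unique_ptr<Node>, 2 * T> children{};
};

// Split the full child parent.children[i]. The parent must not be full,
// which the pre-emptive strategy guarantees.
template <typename Key, int T>
void splitChild(Node<Key, T>& parent, int i) {
    Node<Key, T>& left = *parent.children[i];
    auto right = std::make_unique<Node<Key, T>>();
    right->leaf = left.leaf;
    right->count = T - 1;
    // The upper T - 1 keys (and upper T children) move to the new sibling.
    for (int j = 0; j < T - 1; ++j) right->keys[j] = left.keys[j + T];
    if (!left.leaf)
        for (int j = 0; j < T; ++j) right->children[j] = std::move(left.children[j + T]);
    left.count = T - 1;
    // Make room in the parent for the median key and the new child.
    for (int j = parent.count; j > i; --j) parent.children[j + 1] = std::move(parent.children[j]);
    for (int j = parent.count - 1; j >= i; --j) parent.keys[j + 1] = parent.keys[j];
    parent.keys[i] = left.keys[T - 1];
    parent.children[i + 1] = std::move(right);
    ++parent.count;
}

// Single pass from the root to a leaf; only the current node (the parent of
// the next child) is needed, no parent pointers and no stack.
template <typename Key, int T>
void insertKey(std::unique_ptr<Node<Key, T>>& root, const Key& key) {
    if (!root) root = std::make_unique<Node<Key, T>>();
    if (root->count == 2 * T - 1) {  // full root: the tree grows by one level
        auto newRoot = std::make_unique<Node<Key, T>>();
        newRoot->leaf = false;
        newRoot->children[0] = std::move(root);
        root = std::move(newRoot);
        splitChild(*root, 0);
    }
    Node<Key, T>* node = root.get();
    while (!node->leaf) {
        int i = 0;
        while (i < node->count && node->keys[i] < key) ++i;
        if (node->children[i]->count == 2 * T - 1) {  // split before descending
            splitChild(*node, i);
            if (node->keys[i] < key) ++i;
        }
        node = node->children[i].get();
    }
    // The leaf is guaranteed to have room: shift larger keys right and insert.
    int j = node->count;
    while (j > 0 && key < node->keys[j - 1]) {
        node->keys[j] = node->keys[j - 1];
        --j;
    }
    node->keys[j] = key;
    ++node->count;
}

// In-order traversal, included only to inspect the result.
template <typename Key, int T>
void inOrder(const Node<Key, T>* node, std::vector<Key>& out) {
    if (!node) return;
    for (int i = 0; i < node->count; ++i) {
        if (!node->leaf) inOrder(node->children[i].get(), out);
        out.push_back(node->keys[i]);
    }
    if (!node->leaf) inOrder(node->children[node->count].get(), out);
}
```

Starting from an empty `std::unique_ptr<Node<int, 2>> root;`, repeated calls to `insertKey(root, k)` build the tree, and `inOrder(root.get(), out)` should yield the keys in sorted order.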
You don't need parent pointers if all your operations start at the root.
I usually code the insert recursively, such that calling node.insert(key) either returns null or a new key to insert at its parent's level. The insert starts with root.insert(key), which finds the appropriate child and calls child.insert(key).
When a leaf node is reached, the insert is performed, and non-null is returned if the leaf splits. The parent would then insert the new internal key and return non-null if it splits, etc. If root.insert(key) returns non-null, then it's time to make a new root.
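A minimal sketch of that recursive scheme, under some assumptions of mine: int keys only, std::vector storage, a capacity of kMaxKeys = 3, and std::optional standing in for the null / non-null return value. The names RNode, Split, insertInto, insertAtRoot and inOrder are hypothetical, not from the answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical node: keys only; children is empty for a leaf.
struct RNode {
    std::vector<int> keys;
    std::vector<std::unique_ptr<RNode>> children;
};

// What a split hands back to the parent: the promoted key and the new sibling.
struct Split {
    int key;
    std::unique_ptr<RNode> right;
};

constexpr std::size_t kMaxKeys = 3;  // assumed capacity for this example

// Returns nullopt if the key was absorbed, or a Split for the caller to insert.
std::optional<Split> insertInto(RNode& node, int key) {
    const auto i = std::upper_bound(node.keys.begin(), node.keys.end(), key) - node.keys.begin();
    if (node.children.empty()) {
        node.keys.insert(node.keys.begin() + i, key);
    } else {
        std::optional<Split> childSplit = insertInto(*node.children[i], key);
        if (!childSplit) return std::nullopt;
        // The child split: take its middle key and its new right sibling.
        node.keys.insert(node.keys.begin() + i, childSplit->key);
        node.children.insert(node.children.begin() + i + 1, std::move(childSplit->right));
    }
    if (node.keys.size() <= kMaxKeys) return std::nullopt;

    // Overflow: promote the middle key and move everything right of it.
    const std::size_t mid = node.keys.size() / 2;
    Split result{node.keys[mid], std::make_unique<RNode>()};
    RNode& right = *result.right;
    right.keys.assign(node.keys.begin() + mid + 1, node.keys.end());
    node.keys.resize(mid);
    if (!node.children.empty()) {
        std::move(node.children.begin() + mid + 1, node.children.end(),
                  std::back_inserter(right.children));
        node.children.resize(mid + 1);
    }
    return std::optional<Split>(std::move(result));
}

// Entry point: a split reported by the root means the tree grows by one level.
void insertAtRoot(std::unique_ptr<RNode>& root, int key) {
    if (!root) root = std::make_unique<RNode>();
    if (auto split = insertInto(*root, key)) {
        auto newRoot = std::make_unique<RNode>();
        newRoot->keys.push_back(split->key);
        newRoot->children.push_back(std::move(root));
        newRoot->children.push_back(std::move(split->right));
        root = std::move(newRoot);
    }
}

// In-order traversal, included only to inspect the result.
void inOrder(const RNode& node, std::vector<int>& out) {
    for (std::size_t i = 0; i <= node.keys.size(); ++i) {
        if (!node.children.empty()) inOrder(*node.children[i], out);
        if (i < node.keys.size()) out.push_back(node.keys[i]);
    }
}
```

The call stack plays the role of the parent pointers here: each level of the recursion already knows which child it descended into, so nothing has to be stored in the nodes.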

an algorithm to find if a FORKED list is circular

A forked list is a kind of linked list in which at least one node has two 'next's, next1 and next2. That is, each node is made up of 3 attributes: value, next1 and next2. For every node apart from at least one, next2 may be None.
I am looking for an algorithm that takes a forked list as input and returns True if there is a cycle in the list and False otherwise.
P.S. We can assume that the list starts and ends like a normal linked list (with only one next) and that the node(s) with 2 'next's are inner nodes.
Treat the forked list as a directed graph and look for circuits in the graph, e.g. with Johnson's algorithm.
Treat the forked list as a set of dependencies from an item to its directly linked next items. A topological sort of the items will fail if there are any circularities, and thereby detect them.
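A simpler variant of the directed-graph idea (plain depth-first search with "in progress" / "done" marks rather than Johnson's algorithm, which enumerates every circuit) could look like the sketch below. ForkNode, reachesCycle and hasCycle are hypothetical names, and the node layout is just my reading of the question.

```cpp
#include <unordered_map>

// Hypothetical node layout taken from the question: value, next1, next2.
struct ForkNode {
    int value = 0;
    ForkNode* next1 = nullptr;
    ForkNode* next2 = nullptr;  // null for ordinary nodes
};

enum class Mark { InProgress, Done };

// Depth-first search: reaching a node that is still InProgress means we
// followed a path back to one of its own ancestors, i.e. a cycle.
bool reachesCycle(const ForkNode* node,
                  std::unordered_map<const ForkNode*, Mark>& marks) {
    if (node == nullptr) return false;
    auto it = marks.find(node);
    if (it != marks.end()) return it->second == Mark::InProgress;
    marks[node] = Mark::InProgress;
    if (reachesCycle(node->next1, marks) || reachesCycle(node->next2, marks)) {
        return true;
    }
    // Fully explored: reaching it again from another branch is a merge,
    // not a cycle.
    marks[node] = Mark::Done;
    return false;
}

bool hasCycle(const ForkNode* head) {
    std::unordered_map<const ForkNode*, Mark> marks;
    return reachesCycle(head, marks);
}
```

Note that two branches that merely rejoin further down are not reported as a cycle, which is why a plain "have I seen this node before?" check is not enough. The recursion depth equals the longest path, so for very long lists an explicit stack would be safer.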

Is there a way to access non leaf nodes in a C++ Boost rtree

Sorry in advance, this is a very specific question and I cannot provide any code, as this is for my job and thus confidential.
I am using the Boost R-trees, and an algorithm that I need to implement requires access to the non-leaf nodes of the tree. With the Boost rtree library, I can only access leaf nodes in an easy way. I noticed that there is a function to print all the nodes including the non-leaf nodes (which means they exist and are computed), with their position, their level in the tree etc., but I cannot access them in the same way as the leaf nodes.
For now, the best solution that I have is to implement a visitor for the tree and overload operator() to gather the nodes (this is what the print method does to access the nodes).
My question is, does anybody know an easier way to access the non-leaf nodes? This one does not seem to be efficient, and I'm losing time each time I want to access a non-leaf node. Moreover, I need to replicate the structure of the tree without the points, and I cannot do that if I cannot access the non-leaf nodes.
Thank you in advance !
I don't know exactly what you would like to do, so this will be a general answer.
In order to access the tree nodes for the first time you have to traverse the tree structure. In the Boost.Geometry rtree, the visitor pattern is used for that. You could do it manually, but internally Boost.Variant is used to represent the nodes, so you'll end up with a variant visitor instead. At this point you have a few options depending on what you are going to do with the nodes. Are you going to modify the r-tree? Will the rtree be moved in memory? Will the addresses of nodes change? How many nodes are you going to access? Do you want to store some kind of reference to a node and traverse the tree structure from that point? Do you want to traverse the structure downward or upward?
One option, as you noticed, is to traverse the tree structure each time. This is a good approach if the tree structure can change. The obvious drawback is that you have to check all child nodes at each node using some condition (whatever you do in order to pick the node of interest).
If the tree structure does not change but the tree is copied to a different place in memory, you can represent the node as a path from the root to the node of interest, as a list of indexes of child nodes. E.g. a list {1, 2, 3} means: traverse the tree using child node 1 of the root node, then at the next level pick child node 2, then your node will be child node 3 at the next level. In this case you still have to traverse the tree but don't have to check conditions again.
If the tree does not change and the nodes stay in the same place in memory, you can simply use pointers or references.
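To illustrate the path-of-indexes option on its own, here is a small sketch on a generic tree. It deliberately does not use Boost.Geometry's internal node types; TreeNode and followPath are hypothetical names, and in the real rtree the same walk would have to go through a visitor.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an internal tree node (not a Boost type).
struct TreeNode {
    int id = 0;
    std::vector<std::unique_ptr<TreeNode>> children;
};

// Follow a stored path of child indexes from the root. No per-child condition
// is evaluated on the way down. Returns nullptr if the path no longer matches
// the tree structure.
const TreeNode* followPath(const TreeNode& root,
                           const std::vector<std::size_t>& path) {
    const TreeNode* node = &root;
    for (std::size_t index : path) {
        if (index >= node->children.size()) return nullptr;
        node = node->children[index].get();
    }
    return node;
}
```

Because the path holds indexes rather than addresses, it stays valid for a copy of the tree with the same structure, for example `followPath(copyOfRoot, {1, 2, 3})`.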

How to properly use sentinel nodes?

There will be multiple (closely) related questions instead of a single one. For our comfort, I will number them accordingly.
Based on this Wikipedia article, this question and lectures, I think I already understand the idea behind sentinel nodes and their usage in linked lists. However, a few things are still not clear to me even after reading these materials.
I was given a basic implementation of a doubly linked list (it stores only int values) and the task is to change the implementation so it uses a sentinel node like this:
Illustrative image (not allowed to embed images yet, sorry)
Question 1
I am assuming that the head variable of the list will point to the first real node (the one after sentinel node) and the tail variable will simply point to the last node. Am I correct or should the head point to the sentinel node? I am asking for a best-practice or the most standard approach here.
Question 2
I understand that when searching for a value in the list, I no longer have to check for nullptr since I am using a sentinel node. Since the list basically formed a circle thanks to the sentinel node, I have to terminate it after iterating through the whole list and reaching it. Can I do it by putting the value I am looking for in the sentinel node and use it as a sentinel value of sorts and then check if the result is returned from the sentinel node when the loop ends? Some sources claim that sentinel nodes should not store any values at all. Is my approach correct/reasonably effective?
Question 3
When simply iterating and not searching for a particular value (e.g. counting nodes, outputting the whole list into the console), do I have to check for the sentinel node the same way as I would for a nullptr (to terminate the iterating loop) or is there a different or smarter way of doing this?
Answer 1
Yes, this is a valid position for the sentinel node to take. head and tail can point to the actual beginning and end of the data. But your add and delete functions will need to be aware of the aberrations caused at the list boundaries by virtue of the sentinel node.
Answer 2
Yes, this is a valid search strategy and is in fact called the Elephant in Cairo technique.
Answer 3
Yes, the purpose of the sentinel node is to let you know that it is the sentinel node. You could just maintain a constant pointer (or whatever your language of choice supports) to this sentinel node to check whether you are at the sentinel node, or just stick a flag in the node.
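The three answers can be put together in a small sketch of a circular doubly linked list of int with one sentinel node. This is only one possible layout, not the implementation the asker was given; DNode, IntList, pushBack, contains and size are names I made up.

```cpp
// Hypothetical node type for a doubly linked list of int.
struct DNode {
    int value;
    DNode* prev;
    DNode* next;
};

class IntList {
public:
    // An empty list is just the sentinel pointing at itself.
    IntList() { sentinel_.next = sentinel_.prev = &sentinel_; }
    IntList(const IntList&) = delete;
    IntList& operator=(const IntList&) = delete;
    ~IntList() {
        for (DNode* n = sentinel_.next; n != &sentinel_;) {
            DNode* dead = n;
            n = n->next;
            delete dead;
        }
    }

    // No special case for the empty list: the sentinel is always there.
    void pushBack(int value) {
        DNode* node = new DNode{value, sentinel_.prev, &sentinel_};
        sentinel_.prev->next = node;
        sentinel_.prev = node;
    }

    // Question 2: plant the searched value in the sentinel so the loop needs
    // only one comparison per step and is guaranteed to stop.
    bool contains(int value) {
        sentinel_.value = value;
        DNode* n = sentinel_.next;
        while (n->value != value) n = n->next;
        return n != &sentinel_;  // stopping on the sentinel means "not found"
    }

    // Question 3: plain iteration compares against the sentinel's address,
    // exactly where a null-terminated list would compare against nullptr.
    int size() const {
        int count = 0;
        for (const DNode* n = sentinel_.next; n != &sentinel_; n = n->next) ++count;
        return count;
    }

private:
    DNode sentinel_{0, nullptr, nullptr};
};
```

In this layout the "head" is simply `sentinel_.next` and the "tail" is `sentinel_.prev`, so no separate head and tail variables have to be kept in sync. The value stored in the sentinel is scratch space for searches and is never treated as list data.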

Prolog: Graph representation of list and difference list

I've been trying to understand what a list and a difference list would look like in a graph structure. I understand the basic structure of a list like [a1,a2,a3,..an|[]].
But I can't grasp what a difference list would look like,
for example [1,2,3,4]-[3,4].
X-Y is the term -(X, Y). So [1,2,3,4]-[3,4] is primarily just a pair of two lists, each of which can be readily displayed in a tree like the one you show.
Consider now a list of the form [E1,E2,...,E_n|Rest], that is, one where the final tail is not yet instantiated. Again, you can readily display this in a tree like the one you show: just replace end-of-list (which is wrong anyway, because it should actually be the atom nil: []) by Rest.
The idea is now to always keep track of the tail that is not yet instantiated, which is a single logical variable. By instantiating this variable, again to a list whose tail is not yet instantiated and of whose tail you again keep track separately, you can always append further elements, in time independent of the length that the original list has already reached.
You can represent such a list and its final tail as the pair [E1,E2,...,E_n|Rest]-Rest, but it's actually preferable to use two different arguments and pass around the list and its uninstantiated final tail as two separate arguments (explanation).