I've been trying to understand what a list and a difference list would look like as a graph structure. I understand the basic structure of a list like [a1,a2,a3,...,an|[]].
But I can't grasp what a difference list would look like.
For example: [1,2,3,4]-[3,4]
X-Y is the term -(X, Y). So [1,2,3,4]-[3,4] is primarily just a pair of two lists, each of which can be readily displayed in a tree like the one you show.
Consider now a list of the form [E1,E2,...,E_n|Rest], that is, one whose final tail is not yet instantiated. Again, you can readily display this in a tree like the one you show; just replace the end of the list (which is wrong in your drawing anyway, because it should actually be the atom []) with Rest.
The idea is now to always keep track of the tail that is not yet instantiated, which is a single logical variable. By instantiating this variable, again to a list whose tail is not yet instantiated and of whose tail you again keep track separately, you can always append further elements, in time independent of the length that the original list has already reached.
You can represent such a list and its final tail as the pair [E1,E2,...,E_n|Rest]-Rest, but it's actually preferable to use two different arguments and pass around the list and its uninstantiated final tail as two separate arguments (explanation).
If I were given a set of lists within a list in OCaml, for example [[3;1;3]; [4]; [1;2;3]], how can we implement a function that returns a list that is the union of the values of the nested lists (so the example above would return [1;2;3;4])? I tried removing duplicates from the list, but it didn't work as intended. I am also restricted to using the List module only.
Restricted to using the List module only? Sounds like homework with an arbitrary limit like that, so I don't want to give a fully working solution. However, if you look through the List documentation, you'll see a couple of functions that can be combined to do what you want.
concat, which takes a list of lists and flattens them out into a single list, and sort_uniq, which sorts a list and removes duplicates.
So you just have to take your list of lists, turn it into a single list, and sort_uniq that (with an appropriate comparison function) to get your desired result.
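For the record, here is a minimal sketch of how those two functions compose, using the polymorphic compare as the comparison function (the helper name union_of_lists is just for illustration):

(* Flatten the nested lists, then sort and drop duplicates in one pass. *)
let union_of_lists lists =
  List.sort_uniq compare (List.concat lists)

let () =
  union_of_lists [[3; 1; 3]; [4]; [1; 2; 3]]
  |> List.iter (Printf.printf "%d ")   (* prints: 1 2 3 4 *)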
Let's say we have a list of elements:
[(a,b); (c,d); (e,f)]
What function would check whether an element (let's say A, where A=(x,y)) is in the list or not?
Use List.mem to do the search for a match.
let a = (3,4)
List.mem a [(1,2); (3,4); (5,6)]   (* evaluates to true *)
You can also use List.memq if you want to check whether the list contains an item that references the same entity in memory (physical equality).
Here's a hint on how to write this yourself. The natural way to process a list is to look at the first element, then recursively check the remainder of the list until you reach an empty list. For this problem, you could state the algorithm in English as follows:
If the list is empty then the item is not in the list,
else if the first list element equals the item then it is in,
else it is the answer to (Is the item in the remainder of the list?)
Now you just need to translate this into OCaml code (using a recursive function).
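One possible direct transcription of that description (the function name is arbitrary, and structural equality with = is assumed to be adequate for the element type):

(* Recursive membership test, following the three cases above. *)
let rec contains item lst =
  match lst with
  | [] -> false                          (* empty list: the item is not in it *)
  | x :: rest ->
      x = item || contains item rest     (* first element matches, or search the rest *)

With the pair list above, contains (3, 4) [(1, 2); (3, 4); (5, 6)] evaluates to true.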
In general, if you can describe what you want to do in terms of smaller or simpler parts of the problem, then writing the recursive code is straightforward (although you have to be careful the base cases are correct). When using a list or tree-structured data the way to decompose the problem is usually obvious.
I've always thought that appending a list to another one meant copying the objects from the first list and then pointing to the appended list as described for example here.
However, in this blog post and in its comment, it says that it is only the pointers that are copied and not the underlying objects.
So what is correct?
Drawing from Snowbear's answer, a more accurate picture of combining two lists (than the one presented in the first article referenced in the question) would be as shown below.
let FIRST = [1;2;3]
let SECOND = [4;5;6]
let COMBINED = FIRST @ SECOND
In the functional world, lists are immutable. This means that node sharing is possible because the original lists will never change. Because the first list ends with the empty list, its nodes must be copied in order to point its last node to the second list.
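You can observe that sharing directly. The snippet below is OCaml rather than F# (lists and @ behave the same way in both); == is physical equality, used here only to show which cells are shared:

let second = [4; 5; 6]
let combined = [1; 2; 3] @ second

(* The three cells holding 1, 2 and 3 are fresh copies, but the tail of
   combined after them is the very same cells as second: nothing from
   the second list was copied. *)
let shares_tail = List.tl (List.tl (List.tl combined)) == second   (* true *)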
If you mean this statement, then the answer seems to be pretty simple. The author of the first article is talking about list nodes when he says "nodes". A list node is not the same thing as the list item itself. Take a look at the pictures in the first article: there are arrows going from every element to the next node. These arrows are pointers. But the integer type (which is what is stored in the list) has no such pointers; there is presumably some list node type that wraps those integers and stores the pointers. When the author says that nodes must be copied, he is talking about these wrappers being copied. The underlying objects (had they not been value types, as they are in this case) would not be cloned; the new wrappers would simply point to the same objects as before.
F# lists hold references (not to be confused with F#'s ref) to their elements; list operations copy those references (pointers), but not the elements themselves.
There are two ways you might append items to an existing list, which is why there seems to be a discrepancy between the articles (though they both look to be correct):
Cons operator (::): The cons operator prepends a single item to an F# list, producing a new list. It's very fast (O(1)), since it only needs to call a very simple constructor to produce the new list.
Append operator (@): The append operator appends two F# lists together, producing a new list. It's not as fast (O(n)), because in order for the elements of the combined list to be ordered correctly, it needs to traverse and copy the entire list on the left-hand side of the operator. You'll still see this used in production if the list on the left-hand side is known to be very small, but in general you'll get much better performance from using ::.
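A sketch of why that matters in practice (written in OCaml, whose list syntax is essentially identical to F#'s here): building a list by repeatedly consing touches one cell per element, while building it by repeatedly appending at the end re-copies the growing list on every step.

(* O(n): each step allocates exactly one new cons cell. *)
let build_with_cons n =
  let rec go i acc = if i < 1 then acc else go (i - 1) (i :: acc) in
  go n []

(* O(n^2): each append copies the whole accumulator built so far. *)
let build_with_append n =
  let rec go i acc = if i > n then acc else go (i + 1) (acc @ [i]) in
  go 1 []

Both produce the list [1; 2; ...; n]; only the cost differs.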
I am currently developing an ontology using Protégé and would like to determine whether a node is the last one in a list. Basically, a list points to a node, and every node has some content and can have another node:
List startsWith some Node
Node hasContent some Content
Node hasNext some Node
Now I'd like to define a subclass named EndNode that doesn't point to another Node. This is what I've tried so far, but after classifying, EndNode always comes out equivalent to Nothing:
Node and not(hasNext some Node)
Node and (hasNext exactly 0 Node)
First, there is a built-in List construct in RDF which you can use in the following way:
ex:myList rdf:type rdf:List .
ex:myList rdf:first ex:firstElement .
ex:myList rdf:rest _:sublist1 .
_:sublist1 rdf:first ex:secondElement .
_:sublist1 rdf:rest rdf:nil .
Here, in order to know that you have reached the end of the list, you need a special list called rdf:nil. This plays the same role as a null pointer at the end of a linked list in programming languages.
However, even though rdf:List is well used in existing data on the Web, it doesn't constrain in any way the use of the predicates rdf:first and rdf:rest, so you can have many first elements for a given list without triggering an inconsistency.
So, if you really want to model linked lists in a strict way, you need pretty expressive features of OWL. I did it a while ago, and it can be found at http://purl.org/az/List.
It's normal that you get an empty class, since you specified that a Node must have a next Node. You should not require that Nodes have content or a next element. Instead, say that the cardinality of hasNext is at most 1, that the domain and range of hasNext are Node, and that EndNode is a Node with no next Node. But that's still not enough, as it does not require that there is an EndNode at all: you may have an infinite sequence or a loop.
If you want to avoid loops or infinite sequences, you have to define a transitive property hasFollower (as a superproperty of hasNext) and say that every Node has at least one follower in the class EndNode.
All in all, implementing strict lists in OWL completely sucks in terms of performance and is most of the time totally useless, as rdf:List is sufficient for the vast majority of situations.
I've noticed that in functional languages such as Haskell and OCaml you can do two things with lists. First, x:xs, where x is an element and xs is a list: the result is a new list with x prepended to xs, in constant time. Second, x++y, where both x and y are lists: the result is a new list with y appended to the end of x, in time linear in the number of elements in x.
Now, I'm no expert in how languages are designed and compilers are built, but this seems to me a lot like a simple implementation of a linked list with one pointer to the first item. If I were to implement this data structure in a language like C++, I would find it generally trivial to add a pointer to the last element. If these languages are implemented this way (assuming they do use linked lists as described), adding a "pointer" to the last item would make it much more efficient to add items to the end of a list and would allow pattern matching on the last element.
My question is are these data structures really implemented as linked lists, and if so why do they not add a reference to the last element?
Yes, they really are linked lists. But they are immutable. The advantage of immutability is that you don't have to worry about who else has a pointer to the same list. You might choose to write x++y, but some other part of the program might be relying on x remaining unchanged.
People who work on compilers for such languages (of whom I am one) don't worry about this cost because there are plenty of other data structures that provide efficient access:
A functional queue represented as two lists provides constant-time access to both ends and amortized constant time for put and get operations.
A more sophisticated data structure like a finger tree can provide several kinds of list access at very low cost.
If you just want constant-time append, John Hughes developed an excellent, simple representation of lists as functions, which provides exactly that. (In the Haskell library they are called DList; a rough sketch of the idea appears after this list.)
If you're interested in these sorts of questions you can get good info from Chris Okasaki's book Purely Functional Data Structures and from some of Ralf Hinze's less intimidating papers.
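To make the Hughes representation concrete, here is a rough OCaml sketch (the names are mine, not from any library): a list is represented as a function that prepends its elements to whatever suffix it is eventually given, so appending is just function composition.

(* A "Hughes list": a list represented as a prepending function. *)
type 'a dlist = 'a list -> 'a list

let empty : 'a dlist = fun suffix -> suffix
let singleton x : 'a dlist = fun suffix -> x :: suffix

(* O(1) append, regardless of how long either side is. *)
let append (xs : 'a dlist) (ys : 'a dlist) : 'a dlist =
  fun suffix -> xs (ys suffix)

(* Convert back to an ordinary list by supplying the empty suffix. *)
let to_list (xs : 'a dlist) : 'a list = xs []

For example, to_list (append (singleton 1) (append (singleton 2) (singleton 3))) evaluates to [1; 2; 3], and each append costs the same no matter how many elements have already accumulated.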
You said:
"Second is x++y where both x and y are lists and the resulting action is y gets appended to the end of x in linear time with respect to the number of elements in x."
This is not really true in a functional language like Haskell; y gets appended to a copy of x, since anything holding onto x is depending on it not changing.
If you're going to copy all of x anyway, holding onto its last node doesn't really gain you anything.
Yes, they are linked lists. In languages like Haskell and OCaml, you don't add items to the end of a list, period. Lists are immutable. There is one operation to create new lists — cons, the : operator you refer to earlier. It takes an element and a list, and creates a new list with the element as the head and the list as the tail. The reason x++y takes linear time is because it must cons the last element of x with y, and then cons the second-to-last element of x with that list, and so on with each element of x. None of the cons cells in x can be reused, because that would cause the original list to change as well. A pointer to the last element of x would not be very helpful here — we still have to walk the whole list.
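That copying is easy to see from the definition of append itself; this is roughly how ++ (or OCaml's @) is defined, written here in OCaml:

(* One new cons cell is allocated per element of xs; ys is shared as-is,
   which is why the cost is linear in the length of xs and why a pointer
   to xs's last cell would not save any work. *)
let rec append xs ys =
  match xs with
  | [] -> ys
  | x :: rest -> x :: append rest ys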
++ is just one of dozens of "things you can do with lists". The reality is that lists are so versatile that one rarely uses other collections. Also, we functional programmers almost never feel the need to look at the last element of a list - if we need to, there is a function last.
However, just because lists are convenient does not mean that we lack other data structures. If you're really interested, have a look at this book: http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf (Purely Functional Data Structures). You'll find trees, queues, lists with O(1) append of an element at the tail, and so forth.
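As a taste of what's in there, here is a minimal OCaml sketch of one classic structure from that book, the two-list ("batched") queue, which gives O(1) enqueue and amortized O(1) dequeue without mutating anything:

(* Front list holds elements ready to dequeue; back list holds newly
   enqueued elements in reverse order. *)
type 'a queue = { front : 'a list; back : 'a list }

let empty = { front = []; back = [] }

(* Enqueue onto the reversed back list: O(1). *)
let enqueue x q = { q with back = x :: q.back }

(* Dequeue from the front list; when it runs out, reverse the back list
   once. Each element is reversed at most once, so the amortized cost
   per operation stays constant. *)
let dequeue q =
  match q.front with
  | x :: rest -> Some (x, { q with front = rest })
  | [] ->
      (match List.rev q.back with
       | [] -> None
       | x :: rest -> Some (x, { front = rest; back = [] }))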
Here's a bit of an explanation on how things are done in Clojure:
The easiest way to avoid mutating state is to use immutable data structures. Clojure provides a set of immutable lists, vectors, sets and maps. Since they can't be changed, 'adding' or 'removing' something from an immutable collection means creating a new collection just like the old one but with the needed change. Persistence is a term used to describe the property wherein the old version of the collection is still available after the 'change', and that the collection maintains its performance guarantees for most operations. Specifically, this means that the new version can't be created using a full copy, since that would require linear time. Inevitably, persistent collections are implemented using linked data structures, so that the new versions can share structure with the prior version. Singly-linked lists and trees are the basic functional data structures, to which Clojure adds a hash map, set and vector both based upon array mapped hash tries.
(emphasis mine)
So basically it looks like you're mostly correct, at least as far as Clojure is concerned.