How to concat lists in Erlang without creating nested lists?

I'm trying to be a good erlanger and avoid "++". I need to add a tuple to the end of a list without creating a nested list (and hopefully without having to build it backwards and reverse it). Given tuple T and lists L0 and L1:
When I use [T|L0] I get [tuple,list0].
But when I use [L0|T], I get nested list [[list0]|tuple]. Similarly, [L0|L1] returns [[list0]|list1].
Removing the outside list brackets L0|[T] produces a syntax error.
Why is "|" not symmetric? Is there a way to do what I want using "|"?

| is not "symmetric" because a non-empty list has a head and a tail where the head is a single item and the tail is another list. In the expression [foo | bar] foo denotes the head of the list and bar is the tail. If the tail is not a proper list, the result won't be a proper list either. If the head is a list, the result will simply be a list with that list as its first element.
There is no way to append at the end of a linked list in less than O(n) time. This is why using ++ for that is generally shunned. If there were special syntax to append at the end of the list, it would still need to take O(n) time and using that syntax wouldn't make you any more of a "good erlanger" than using ++ would.
If you want to avoid the O(n) cost per insertion, you'll need to prepend and then reverse. If you're willing to pay the cost, you might just as well use ++.
A little more detail on how lists work:
[ x | y ] is something called a cons cell. In C terms it's basically a struct with two members. A proper list is either the empty list ([]) or a cons cell whose second member is a proper list (in which case the first member is called its head, and the second member is called its tail).
So when you write [1, 2, 3] this creates the following cons cells: [1 | [2 | [3 | []]]]. I.e. the list is represented as a cons cell whose first member (its head) is 1 and the second member (the tail) is another cons cell. That other cons cell has 2 as its head and yet another cons cell as its tail. That cell has 3 as its head and the empty list as its tail.
Traversing such a list is done recursively by first acting on the head of the list and then calling the traversal function on the tail of the list.
Now if you want to prepend an item to that list, this is very easy: you simply create another cons cell whose head is the new item and whose tail is the old list.
Appending an item however is much more expensive because creating a single cons cell does not suffice. You have to create a list that is the same as the old one, except the tail of the last cons cell must be a new cons cell whose head is the new element and whose tail is the empty list. So you can't append to a list without going through the whole list, which is O(n).
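To make the cost difference concrete, here is a minimal sketch of cons cells in Python rather than Erlang. The names `Cons`, `from_items`, `to_items`, `prepend` and `append` are made up for this illustration, and `None` stands in for the empty list; this is not how the Erlang runtime is implemented, only a model of the idea described above.

```python
from typing import Any, NamedTuple, Optional


class Cons(NamedTuple):
    """A hypothetical cons cell: a head plus a tail (None = empty list)."""
    head: Any
    tail: Optional["Cons"]


def from_items(items):
    """Build a cons-cell list from a Python iterable, back to front."""
    result = None
    for item in reversed(list(items)):
        result = Cons(item, result)
    return result


def to_items(cell):
    """Walk the cells and collect the heads into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out


def prepend(item, lst):
    """O(1): one new cell whose tail is the old list, left untouched."""
    return Cons(item, lst)


def append(lst, item):
    """O(n): every cell of the old list is rebuilt on the way to the end."""
    if lst is None:
        return Cons(item, None)
    return Cons(lst.head, append(lst.tail, item))


old = from_items([1, 2, 3])
front = prepend(0, old)   # shares all of `old`
back = append(old, 4)     # copies all of `old`
```

Here `front.tail` is the very same object as `old`, whereas `back` shares no cells with it.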

Related

OCaml: list @ [x] vs x :: list [duplicate]

I've recently started learning Scala, and I've come across the :: (cons) function, which prepends to a list.
In the book "Programming in Scala" it states that there is no append function because appending to a list has performance O(n) whereas prepending has a performance of O(1).
Something just strikes me as wrong about that statement.
Isn't performance dependent on implementation? Isn't it possible to simply implement the list with both forward and backward links and store the first and last element in the container?
The second question I suppose is what I'm supposed to do when I have a list, say 1,2,3 and I want to add 4 to the end of it?
The key is that x :: somelist does not mutate somelist, but instead creates a new list, which contains x followed by all elements of somelist. This can be done in O(1) time because you only need to set somelist as the successor of x in the newly created, singly linked list.
If doubly linked lists were used instead, x would also have to be set as the predecessor of somelist's head, which would modify somelist. So if we want to be able to do :: in O(1) without modifying the original list, we can only use singly linked lists.
Regarding the second question: You can use ::: to concatenate a single-element list to the end of your list. This is an O(n) operation.
List(1,2,3) ::: List(4)
Other answers have given good explanations for this phenomenon. If you are appending many items to a list in a subroutine, or if you are creating a list by appending elements, a functional idiom is to build up the list in reverse order, cons'ing the items on the front of the list, then reverse it at the end. This gives you O(n) performance instead of O(n²).
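As a rough illustration of the O(n) versus O(n²) claim, the following Python sketch counts how many cells each strategy allocates. Cells are modelled as `(head, tail)` tuples with `None` as the empty list, and the function names are hypothetical; it is a counting model, not Scala's actual List implementation.

```python
def to_items(cell):
    """Collect the heads of a (head, tail) tuple list into a Python list."""
    out = []
    while cell is not None:
        out.append(cell[0])
        cell = cell[1]
    return out


def build_by_append(n):
    """Append 0..n-1 one at a time; returns (list, cells_allocated)."""
    cells = 0
    lst = None
    for i in range(n):
        items = to_items(lst)
        new = (i, None)          # the new last cell
        cells += 1
        for item in reversed(items):
            new = (item, new)    # every earlier cell has to be rebuilt
            cells += 1
        lst = new
    return lst, cells


def build_by_prepend_then_reverse(n):
    """Prepend 0..n-1, then reverse once; returns (list, cells_allocated)."""
    cells = 0
    rev = None
    for i in range(n):
        rev = (i, rev)           # one cell per item
        cells += 1
    lst = None
    cur = rev
    while cur is not None:       # a single reversing pass at the end
        lst = (cur[0], lst)
        cells += 1
        cur = cur[1]
    return lst, cells


appended, append_cells = build_by_append(100)
reversed_list, prepend_cells = build_by_prepend_then_reverse(100)
```

For 100 items the first approach allocates 1 + 2 + ... + 100 = 5050 cells, the second 2 × 100 = 200, and both produce the same list.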
Since the question was just updated, it's worth noting that things have changed here.
In today's Scala, you can simply use xs :+ x to append an item at the end of any sequential collection. (There is also x +: xs to prepend. The mnemonic for many of Scala's 2.8+ collection operations is that the colon goes next to the collection.)
This will be O(n) with the default linked implementation of List or Seq, but if you use Vector or IndexedSeq, this will be effectively constant time. Scala's Vector is probably Scala's most useful list-like collection—unlike Java's Vector which is mostly useless these days.
If you are working in Scala 2.8 or higher, the collections introduction is an absolute must read.
Prepending is faster because it only requires two operations:
Create the new list node
Have that new node point to the existing list
Appending requires more operations because you have to traverse to the end of the list since you only have a pointer to the head.
I've never programmed in Scala before, but you could try a ListBuffer.
Most functional languages prominently figure a singly-linked-list data structure, as it's a handy immutable collection type. When you say "list" in a functional language, that's typically what you mean (a singly-linked list, usually immutable). For such a type, append is O(n) whereas cons is O(1).

Prolog: working with lists

I got this assignment with Prolog lists and I need some help.
Build a program in Prolog that
Check if a list is empty
Check if a list is not empty
Check if a list only has one element
Check if a list has 2 or more elements
Get the first element from a list
Get the second element from a list
Get a list without the first element (tail)
Add an element to the head of the list
It sounds like you are at the very beginning of Prolog. These questions mostly relate to how Prolog unifies variables and expressions.
Check if a list is empty
empty([]).
In Prolog, you state facts and predicates. Here, you are simply stating that any empty list is true. It is implied that all other expressions are false.
Check if a list is not empty
not_empty([_|_]).
(Improved by lurker). This rule matches a list that has at least a head and zero or more tail elements, so empty list would fail.
Check if a list only has one element
one([_]).
When Prolog checks this fact, it can only bind to a list with one element. So the fact that it bound already proves it is a one-element list.
Check if a list has 2 or more elements
two([_,_|_]).
The first 2 underscores bind to 2 elements in the list, the 3rd underscore to zero or more trailing elements. So this will only evaluate to true on lists with two or more elements.
Get the first element from a list
first([H|_], H).
Prolog will bind H to the first element of the list in the first argument and the second argument. You call it with first([1,2,3],F).. Prolog will bind F to the first element of the list. You can also call it with first([1,2,3],1). to ask if 1 is the first element.
Get the second element from a list
second([_,I|_], I).
Just using simple binding, the first underscore binds with the first element, I with the second element, and the second underscore with the rest of the list, (if any). If you start asking for much higher elements, it is easier to use built-in predicates like nth1 to do the work for you.
Get a list without the first element (tail)
tail([_|T],T).
Prolog binds the tail to T, which must match the second T to be considered true.
Add an element to the head of the list
addelem(H,T,[H|T]).
Just using Prolog binding, the H will be bound to the front of the list in the 3rd argument, and T to the tail of the list. Call with
addelem(1,[2,3,4],T). — Binds T to [1,2,3,4].
addelem(1,[2,3,4],[1,2,3,4]). — Proves that this result is correct.
addelem(H, [2,3,4], [1,2,3,4]). — Pulls the first element of the 3rd argument, if the second argument matches the tail.
addelem(1, T, [1,2,3,4]). — another way of getting the tail, if the head is 1.
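For readers more familiar with imperative languages, the same eight checks can be approximated in Python with sequence unpacking. This is only a loose analogy: the function names below are invented for the sketch, Python lists are arrays rather than cons lists, and each function runs in one direction only, so there is no counterpart to calling addelem with an unbound first or second argument. A pattern that fails to unify in Prolog corresponds here to a ValueError from unpacking.

```python
def is_empty(lst):
    return len(lst) == 0          # empty([]).


def not_empty(lst):
    return len(lst) >= 1          # not_empty([_|_]).


def has_one(lst):
    return len(lst) == 1          # one([_]).


def two_or_more(lst):
    return len(lst) >= 2          # two([_,_|_]).


def first(lst):
    head, *_ = lst                # first([H|_], H).
    return head


def second(lst):
    _, snd, *_ = lst              # second([_,I|_], I).
    return snd


def tail(lst):
    _, *rest = lst                # tail([_|T], T).
    return rest


def add_elem(head, rest):
    return [head, *rest]          # addelem(H, T, [H|T]).
```

For example, `add_elem(1, [2, 3, 4])` plays the role of `addelem(1,[2,3,4],T).`, while `first([])` raises ValueError where the Prolog query would simply fail.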

Prolog: List translation

I've been trying to figure out what list another list representation corresponds to:
[1|[2|[3|[4|[]]]]] is equivalent to the list [1,2,3,4] and [1,2,3,4|[]] etc.
But I just can't seem to figure out what list this corresponds to:
[[[[[]|1]|2]|3]|4]
If anyone can explain this for me I would be very grateful.
[H|T] is syntactic sugar for the real representation using the . functor: .(H,T) where H is the "head" (one list element), and T is the "tail" (which is itself a list in a standard list structure). So [1|[2|[3|[4|[]]]]] is .(1,.(2,.(3,.(4,[])))). Prolog also allows a non-list value for T, so [[[[[]|1]|2]|3]|4] is .(.(.(.([],1),2),3),4). The second structure doesn't really simplify any further than that from a list notational standpoint.
If you want to think in terms of "list" then [[[[[]|1]|2]|3]|4] is a list whose head is [[[[]|1]|2]|3] and tail is 4. And since 4 is not a list, the original list can be described as "improper" as @z5h indicated since the tail isn't a list (not even the empty list). Drilling down, the head [[[[]|1]|2]|3] is itself a list with head [[[]|1]|2] and tail 3. Another "improper" list. And so on. Therefore, the overall structure is an embedded list of lists, four levels deep, in which each list has a single head and a non-list tail (an "improper" list).
It's interesting to note that some of Prolog's predicates handle this type of list. For example:
append([[]], 1, L).
Will yield:
L = [[]|1]
You can then build your oddly formed list using append:
append([[]], 1, L1), % append 1 as the tail to [[]] giving L1
append([L1], 2, L2), % append 2 as the tail to [L1] giving L2
append([L2], 3, L3), % append 3 as the tail to [L2] giving L3
append([L3], 4, L4). % append 4 as the tail to [L3] giving L4
Which yields:
L4 = [[[[[]|1]|2]|3]|4]
Each append takes a list of one element (which is itself a list from the prior append, starting with [[]]) and appends a non-list tail to it.
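The difference between the two structures can also be modelled outside Prolog. The Python sketch below uses a made-up `Cons` pair type with `None` standing in for `[]`, plus two hypothetical helpers: `is_proper` follows the tails until it hits something that is not a pair, and `show` prints a term in Prolog-like bracket notation. It is an illustration of the idea only, not of how any Prolog system stores terms.

```python
from typing import Any, NamedTuple


class Cons(NamedTuple):
    """A hypothetical pair .(Head, Tail); the tail may be any value."""
    head: Any
    tail: Any


NIL = None  # stands in for Prolog's []


def is_proper(term):
    """A proper list is a chain of pairs whose last tail is NIL."""
    while isinstance(term, Cons):
        term = term.tail
    return term is NIL


def show(term):
    """Render a term in Prolog-like bracket notation."""
    if term is NIL:
        return "[]"
    if not isinstance(term, Cons):
        return repr(term)
    parts = []
    while isinstance(term, Cons):
        parts.append(show(term.head))
        term = term.tail
    body = ",".join(parts)
    if term is not NIL:
        body += "|" + show(term)   # a non-list tail is written after "|"
    return "[" + body + "]"


proper = Cons(1, Cons(2, Cons(3, Cons(4, NIL))))
odd = Cons(Cons(Cons(Cons(NIL, 1), 2), 3), 4)
```

With these definitions `show(proper)` gives `[1,2,3,4]` and `show(odd)` gives `[[[[[]|1]|2]|3]|4]`; only the first one passes `is_proper`.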
It's called an "improper list".
Let me explain.
I think a few different ideas are at play here.
In general we know what a list is: it's an ordered collection.
How we represent a list literal in a programming language is another facet.
How the list is represented internally in the language, is another facet still.
What you have realized, is that there is a data structure you can represent in Prolog, with similar syntax to a list, but it somehow seems "improper". (It's not clear that you have asked much beyond "what is the meaning of this confusing thing?").
It turns out, that this structure is simply known as an "improper list". This data structure shows up often in languages that store lists internally as nested cons cells or similar. If you Google for that, you'll find plenty of resources for examples, usages, etc.

Writing a list append function in OCaml

I've defined a custom list type as part of a homework exercise.
type 'a myType =
| Item of ('a * 'a myType)
| Empty;;
I've already done 'length' and now I need a 'append' function.
My length function:
let length l =
  let rec _length n = function
    | Empty -> n
    | Item(_, next) -> _length (n + 1) next
  in _length 0 l;;
But I really don't know how to make the append function.
let append list1 list2 = (* TODO *)
I can't use the list module so I can't use either :: or @.
I guess my comments are getting too lengthy to count as mere comments. I don't really want to answer, I just want to give hints. Otherwise it defeats the purpose.
To repeat my hints:
a. The second parameter will appear unchanged in your result, so you can just
spend your time worrying about the first parameter.
b. You first need to know how to append something to an empty list. I.e., you need
to know what to do when the first parameter is Empty.
c. You next need to know how to break down the non-empty case into a smaller append
problem.
If you don't know how to create an item, then you might start by writing a function that takes (say) an integer and a list of integers and returns a new list with the integer at the front. Here is a function that takes an integer and returns a list containing just that one integer:
let list1 k =
Item (k, Empty)
One way to think of this is that every time Item appears in your code, you're creating a new item. Item is called a constructor because it constructs an item.
I hope this helps.
Your structure is a list, so you should start by defining a value nil that is the empty list, and a function cons head tail that prepends the head element to the list tail.
Another piece of advice: sometimes it helps a lot to start with a simple example and work through it by hand, i.e. decomposing what you want to do into simple operations that you perform yourself. Then you can generalize and write the code...
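The shape that hints a, b and c describe can be sketched in a different language, which leaves the OCaml version as the exercise. In the Python sketch below, `Item` and `Empty` are stand-ins for the constructors of the custom type (with `None` playing the role of `Empty`), and `append` and `to_items` are hypothetical names. It is one possible decomposition, not necessarily the one the exercise expects.

```python
from typing import Any, NamedTuple


class Item(NamedTuple):
    """Stand-in for the Item constructor: a value and the rest of the list."""
    value: Any
    rest: Any  # another Item, or Empty


Empty = None  # stand-in for the Empty constructor


def append(list1, list2):
    # Hint b: appending to an empty list gives the second list, unchanged.
    if list1 is Empty:
        return list2
    # Hint c: keep the first value and solve the smaller problem
    # of appending list2 to the rest of list1.
    return Item(list1.value, append(list1.rest, list2))


def to_items(lst):
    """Collect the values into a Python list, for inspection only."""
    out = []
    while lst is not Empty:
        out.append(lst.value)
        lst = lst.rest
    return out


left = Item(1, Item(2, Empty))
right = Item(3, Item(4, Empty))
joined = append(left, right)
```

As hint a predicts, the second parameter shows up unchanged: the cells of `right` are reused as the end of `joined`, and only the cells of `left` are rebuilt.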

Lists in Haskell : data type or abstract data type?

From what I understand, the list type in Haskell is implemented internally using a linked list. However, the user of the language does not get to see the details of the implementation, nor does he have the ability to modify the "links" that make up the linked list to allow it to point to a different memory address. This, I suppose, is done internally.
How, then, should the list type in Haskell be classified? Is it a "data type" or an "abstract data type"? And what about the linked list type of the implementation?
Additionally, since the list type provided by the Prelude is not a linked list type, how can the basic linked list functions be implemented ?
Take, for example, this piece of code designed to add an element a at the index n of a list :
add [] acc _ _ = reverse acc
add (x:xs) acc 0 a = add xs (x:a:acc) (-1) a
add (x:xs) acc n a = add xs (x:acc) (n-1) a
Using a "real" linked list, adding an element would just consist of modifying a pointer to a memory address. This is not possible in Haskell (or is it ?), thus the question : is my implementation of adding an element to a list the best possible one, or am I missing something (the use of the reverse function is, I think, particularly ugly, but is it possible to do without ?)
Please, do not hesitate to correct me if anything I have said is wrong, and thank you for your time.
You're confusing mutability with data structure. It is a proper list — just not one you're allowed to modify. Haskell is purely functional, meaning values are constant — you can't change an item in a list any more than you could turn the number 2 into 3. Instead, you perform calculations to create new values with the changes you desire.
You could define that function most simply this way:
add ls idx el = take idx ls ++ el : drop idx ls
The list el : drop idx ls reuses the tail of the original list, so you only have to generate a new list up to idx (which is what the take function does). If you want to do it using explicit recursion, you could define it like so:
add ls 0 el = el : ls
add (x:xs) idx el
  | idx < 0 = error "Negative index for add"
  | otherwise = x : add xs (idx - 1) el
add [] _ el = [el]
This reuses the tail of the list in the same way (that's the el : ls in the first case).
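The tail reuse can be made visible with object identity in a language that exposes it. The Python sketch below mirrors the explicit-recursion version, using `(head, tail)` tuples as cells and `None` as the empty list; `insert_at` is a name invented for this illustration, and Python is used only because it lets us check sharing with `is`.

```python
def insert_at(lst, idx, el):
    """Return a new list with el inserted at position idx.

    Cells before idx are rebuilt; everything from idx onward is shared
    with the original list.
    """
    if idx == 0:
        return (el, lst)              # add ls 0 el = el : ls
    if lst is None:
        return (el, None)             # add [] _ el = [el]
    if idx < 0:
        raise ValueError("Negative index for insert_at")
    head, tail = lst
    return (head, insert_at(tail, idx - 1, el))


original = (1, (2, (3, None)))
updated = insert_at(original, 1, 99)
```

After the call, `updated` is `(1, (99, (2, (3, None))))`, and `updated[1][1] is original[1]` holds: the cells for 2 and 3 were not copied, and `original` itself is unchanged.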
Since you seem to be having trouble seeing how this is a linked list, let's be clear about what a linked list is: It's a data structure consisting of cells, where each cell has a value and a reference to the next item. In C, it might be defined as:
struct ListCell {
    void *value; /* This is the head */
    struct ListCell *next; /* This is the tail */
};
In Lisp, it's defined as (head . tail), where head is the value and tail is the reference to the next item.
In Haskell, it's defined as data [] a = [] | a : [a], where a is the value and [a] is the reference to the next item.
As you can see, these data structures are all equivalent. The only difference is that in C and Lisp, which are not purely functional, the head and tail values are things you can change. In Haskell, you can't change them.
Haskell is a purely functional programming language. This means no change can be done at all.
Lists are a non-abstract type; a list is just a linked list.
You can think of them defined in this way:
data [a] = a : [a] | []
which is exactly the way a linked list is defined - A head element and (a pointer to) the rest.
Note that the internal representation is no different. If you want more efficient types, use Data.Sequence or Data.Array. (But since no change is allowed, you never need to copy a list to keep two versions apart, which can be a performance gain compared to imperative languages.)
In Haskell, "data type" and "abstract type" are terms of art:
A "data type" (which is not abstract) has visible value constructors which you can pattern-match on in case expressions or function definitions.
An "abstract type" does not have visible value constructors, so you cannot pattern match on values of the type.
Given a type a, [a] (list of a) is a data type because you can pattern match on the visible constructors cons (written :) and nil (written []). An example of an abstract type would be IO a, which you cannot deconstruct by pattern matching.
Your code might work, but it's definitely not optimal. Take the case where you want to insert an item at index 0. An example:
add [200, 300, 400] [] 0 100
If you follow the derivation for this, you end up with:
add [200, 300, 400] [] 0 100
add [300, 400] (200:100:[]) (-1) 100
add [400] (300:[200, 100]) (-2) 100
add [] (400:[300, 200, 100]) (-3) 100
reverse [400, 300, 200, 100]
[100, 200, 300, 400]
But we are only adding an item to the beginning of the list! Such an operation is simple! It's (:)
add [200, 300, 400] [] 0 100
100:[200, 300, 400]
[100, 200, 300, 400]
Think about how much of the list really needs to be reversed.
You ask about whether the runtime modifies the pointers in the linked list. Because lists in Haskell are immutable, nobody (not even the runtime) modifies the pointers in the linked list. This is why, for example, it is cheap to add an item to the front of a list, but expensive to append an element at the back of a list. When you add an item to the front of the list, you can re-use all of the existing list. But when you append an item at the end, a completely new linked list has to be built. The immutability of data is required in order for operations at the front of a list to be cheap.
Re: adding an element at a given position in a List, I'd suggest using the (++) operator and the splitAt function:
add xs a n = beg ++ (a : end)
  where
    (beg, end) = splitAt n xs
The List is a linked-list, but it's read-only. You can't modify a List in place - you instead create a new List structure which has the elements you want. I haven't read it, but this book probably gets at your underlying question.
HTH
The compiler is free to choose any internal representation it wants for a list. And in practice it does actually vary. Clearly the list "[1..]" is not implemented as a classical series of cons cells.
In fact a lazy list is stored as a thunk which evaluates to a cons cell containing the next value and the next thunk (a thunk is basically a function pointer plus the arguments for the function, which gets replaced by the actual value once the function is called). On the other hand if the strictness analyser in the compiler can prove that the entire list will always be evaluated then the compiler does just create the entire list as a series of cons cells.
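The thunk idea can be imitated in a strict language. The Python sketch below is a simplified model, not GHC's actual representation: `LazyCons`, `ints_from` and `take` are hypothetical names, the head is stored eagerly, and only the tail is a zero-argument function that gets replaced by its result the first time it is demanded.

```python
class LazyCons:
    """A cell whose tail is computed on first access and then cached."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk   # zero-argument function producing the tail
        self._tail = None
        self._forced = False

    @property
    def tail(self):
        if not self._forced:
            self._tail = self._thunk()   # evaluate the thunk once
            self._thunk = None           # drop it; the value replaces it
            self._forced = True
        return self._tail


def ints_from(n):
    """An unbounded list n, n+1, n+2, ... in the spirit of [n..]."""
    return LazyCons(n, lambda: ints_from(n + 1))


def take(k, cell):
    """Collect the first k heads, forcing only as many tails as needed."""
    out = []
    while cell is not None and len(out) < k:
        out.append(cell.head)
        if len(out) < k:
            cell = cell.tail
    return out


naturals = ints_from(1)
first_five = take(5, naturals)
```

Only the cells actually visited by `take` are ever created, and asking for `naturals.tail` twice returns the same cached cell rather than recomputing it.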