Adding elements to an immutable list in Scala

In Scala, the way you add elements to an immutable list is as follows:
val l = 1 :: 2 :: Nil
l: List[Int] = List(1, 2)
What this means is that you first create a Nil (empty) list, and to that you add 2 and then 1, i.e. these operations are right-associative. So, effectively, it can be rewritten in a clearer way, like so:
val l = (1 :: (2 :: Nil))
l: List[Int] = List(1, 2)
The question is: if List is supposed to preserve the order of insertion, and if 2 is added first to an empty list and then 1 is added, then why is the answer not l: List[Int] = List(2, 1)?

This is because elements are prepended: first 2, then 1.
From the definition of the cons method:
def ::[B >: A] (x: B): List[B] =
new scala.collection.immutable.::(x, this)
Here you can see that each time, a new instance of the case class scala.collection.immutable.:: is created:
case class ::[B](val head: B, var tail: List[B]) extends List[B]
You just use your new element as the head of the new list and your whole previous list as its tail.
Also, the prepend operation for an immutable List takes constant time, O(1), while append is linear, O(n) (from the Scala docs).
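A minimal sketch of the points above, using only the standard List. The object name ConsDemo and the value names are just for illustration. It shows that the right-associative chain is sugar for method calls on the list to the right, and that prepending reuses the existing list as the tail instead of copying it.

```scala
object ConsDemo {
  val tail: List[Int] = 2 :: Nil   // List(2)
  val l: List[Int] = 1 :: tail     // List(1, 2)

  // `a :: xs` is sugar for `xs.::(a)`, so the chain is evaluated right to left.
  val desugared: List[Int] = Nil.::(2).::(1)

  // The old list becomes the tail of the new one; no elements are copied.
  val sharesTail: Boolean = l.tail eq tail
}
```

Because 1 is prepended last, it ends up at the head, which is why the result prints as List(1, 2).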

It's just convention. Lists are basically stacks. It's most efficient to access or modify the most-recently added items. You could just as well consider the head of the list to be the final item ordinally, in which case, your suggested notation would be appropriate.
I would speculate that the reason for the convention is that we don't typically put much care into how a list was constructed, but we do often want to consider the first item accessed to be the initial item in the ordering, and so the notation reflects that.


Append an element at the end of a list [duplicate]

How to append an element at the end of a list in ReasonML (the equivalent of Array.concat in JavaScript)?
While Neil's answer is technically correct, it glosses over some details that you might want to consider before reaching for append; specifically that while adding an element to the beginning of a list is very cheap, adding an element to the end is very expensive.
To understand why, let's look at how a list is defined and constructed. The (conceptual) definition of a list is:
// Reason
type list('a) = Cons('a, list('a)) | Nil;
(* OCaml *)
type 'a list = Cons of 'a * 'a list | Nil
where Nil represents the end of a list (and by itself an empty list) and Cons represents a node in the list, containing an element of type 'a and a pointer to the rest of the list (list('a), OCaml: 'a list).
If we took away all the syntax sugar and every helper function, you would have to construct a list like this:
// Reason
let myList = Cons(1, Cons(2, Cons(3, Nil)));
(* OCaml *)
let myList = Cons (1, Cons (2, Cons (3, Nil)))
To add an element to the head of this list then, we construct a node containing our new element and a pointer to the old list:
// Reason
let myBiggerList = Cons(0, myList);
(* OCaml *)
let myBiggerList = Cons (0, myList)
This is exactly the same as doing [0, ...myList] (OCaml: 0 :: myList). If myList could change we wouldn't be able to do this, of course, but we know it doesn't since lists are immutable. That makes this very cheap, and for the same reason it's just as cheap to pop the head off, which is why you'll usually see list processing functions implemented using recursion, like this:
// Reason
let rec map = f =>
  fun
  | [] => []
  | [x, ...xs] => [f(x), ...map(f, xs)];
(* OCaml *)
let rec map f = function
  | [] -> []
  | x::xs -> (f x) :: (map f xs)
Ok, so then why is it so expensive to add an element to the tail of the list? If you look back at myList, adding an element to the end means replacing the final Nil with, say, Cons(4, Nil). But then we need to replace Cons(3, ...) since that points to the old Nil, and Cons(2, ...) because it points to the old Cons(3, ...), and so on through the entire list. And you have to do that every time you add an element. That quickly adds up.
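To make the cost concrete, here is a rough sketch in Scala rather than Reason (Scala's List is the same kind of immutable cons list). The helper appendOne is hypothetical and written out by hand only to show the copying; it is not how any standard library implements append.

```scala
object AppendCost {
  // Rebuilds every cell on the way to the end: one new cons per existing element.
  def appendOne[A](xs: List[A], x: A): List[A] = xs match {
    case Nil          => x :: Nil
    case head :: tail => head :: appendOne(tail, x)
  }

  val myList: List[Int] = List(1, 2, 3)
  val prepended: List[Int] = 0 :: myList          // one new cell, myList reused as the tail
  val appended: List[Int] = appendOne(myList, 4)  // three cells rebuilt plus one new cell
}
```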
So what should you do instead?
If you're adding to the end and either just iterating through it or always taking elements off the end, like you often would in JavaScript, you can most likely just reverse your logic. Instead of adding to and taking off the end, add to and take off the beginning.
If you actually need a FIFO data structure, where elements are inserted at one end and taken off at the other, consider using a Queue instead. In general, have a look at this comparison of the performance characteristics of the standard containers.
Or if this is all a bit much and you'd really just like to do it like you're used to from JavaScript, just use an array instead of a list. You'll find all the functions you're familiar with in the Js.Array module.
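The first two alternatives could look roughly like this, sketched in Scala rather than Reason. The names BuildInReverse and squares are made up for the example, and Queue here is Scala's scala.collection.immutable.Queue, not the Reason/OCaml Queue module.

```scala
import scala.collection.immutable.Queue

object BuildInReverse {
  // Collect results by prepending (cheap), then reverse once at the end.
  def squares(n: Int): List[Int] =
    (1 to n).foldLeft(List.empty[Int])((acc, i) => (i * i) :: acc).reverse

  // FIFO: elements go in at one end and come out at the other.
  val q: Queue[Int] = Queue.empty[Int].enqueue(1).enqueue(2)
  val (first, rest) = q.dequeue
}
```

Reversing once costs a single pass over the list, instead of one pass per appended element.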
You can use List.append or the @ operator, which is shorthand for List.append.
let lstA = [ 1 ];
let lstB = lstA @ [ 2 ];
let lstC = List.append(lstB, [ 3 ]);
Here is the documentation for List methods: https://reasonml.github.io/api/List.html
See a playground link here: https://reasonml.github.io/en/try.html?reason=DYUwLgBMDOYIIQLwQNoQIwQLoG4BQokMYAQklLAgAKoQBM2+hFYAwuQDICWsAdAIYAHQSAB2AEwAUxEgBpaAZmwBKfEA

Understanding Prolog's empty lists

I am reading Bratko's Prolog: Programming for Artificial Intelligence. The easiest way for me to understand lists is visualising them as binary trees, which goes well. However, I am confused about the empty list []. It seems to me that it has two meanings.
1. When part of a list or enumeration, it is seen as an actual (empty) list element (because somewhere in the tree it is part of some Head), e.g. [a, []]
2. When it is the only item inside a Tail, it isn't an element; it literally is nothing, e.g. [a|[]]
My issue is that I do not see the logic behind 2. Why is it required for lists to have this possible ‘nothingness’ as a final tail? Simply because the trees have to be binary? Or is there another reason? (In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?) Also, are there cases where the final (rightmost, deepest) node of a tree is not ‘nothing’?
In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?
Those are two different things. Lists in Prolog are (degenerate) binary trees, but also very much like a singly linked list in a language that has pointers, say C.
In C, you would have a struct with two members: the value, and a pointer to the next list element. Importantly, when the pointer to next points to a sentinel, this is the end of the list.
In Prolog, you have a functor with arity 2: ./2 that holds the value in the first argument, and the rest of the list in the second:
.(a, Rest)
The sentinel for a list in Prolog is the special []. This is not a list, it is the empty list! Traditionally, it is an atom, or a functor with arity 0, if you wish.
In your question:
[a, []] is actually .(a, .([], []))
[a|[]] is actually .(a, [])
which is why:
?- length([a,[]], N).
N = 2.
This is now a list with two elements, the first element is a, the second element is the empty list [].
?- [a|[]] = [a].
true.
This is a list with a single element, a. The [] at the tail just closes the list.
Question: what kind of list is .([], [])?
Also, are there cases where the final (rightmost, deepest) node of a tree is not ‘nothing’?
Yes, you can leave a free variable there; then, you have a "hole" at the end of the list that you can fill later. Like this:
?- A = [a, a|Tail], % partial list with two 'a's and the Tail
B = [b,b], % proper list
Tail = B. % the tail of A is now B
A = [a, a, b, b], % we appended A and B without traversing A
Tail = B, B = [b, b].
You can also make circular lists, for example, a list with infinitely many x in it would be:
?- Xs = [x|Xs].
Xs = [x|Xs].
Is this useful? I don't know for sure. You could for example get a list that repeats a, b, c with a length of 7 like this:
?- ABCs = [a,b,c|ABCs], % a list that repeats "a, b, c" forever
length(L, 7), % a proper list of length 7
append(L, _, ABCs). % L is the first 7 elements of ABCs
ABCs = [a, b, c|ABCs],
L = [a, b, c, a, b, c, a].
In R at least many functions "recycle" shorter vectors, so this might be a valid use case.
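For comparison, the same "recycling" idea can be sketched in Scala with a lazy iterator standing in for the cyclic Prolog term. The helper name cycle is an assumption for this example, and it assumes a non-empty input list (an empty one would never produce an element).

```scala
object Recycle {
  // An endless repetition of xs, produced lazily. Assumes xs is non-empty.
  def cycle[A](xs: List[A]): Iterator[A] =
    Iterator.continually(xs).flatMap(ys => ys)

  // The first 7 elements of the repeating a, b, c sequence.
  val firstSeven: List[Char] = cycle(List('a', 'b', 'c')).take(7).toList
}
```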
See this answer for a discussion on difference lists, which is what A and Tail from the partial-list example above are usually called.
See this answer for implementation of a queue using difference lists.
Your confusion comes from the fact that lists are printed (and read) according to a special human-friendly format. Thus:
[a, b, c, d]
... is syntactic sugar for .(a, .(b, .(c, .(d, [])))).
The . functor holds two values: the item stored in a list and a sublist. When [] is present in the data argument, it is printed as data.
In other words, this:
[[], []]
... is syntactic sugar for .([], .([], [])).
The last [] is not printed because in that context it does not need to be. It is only used to mark the end of the current list. The other [] are lists stored in the main list.
I understand that but I don't quite get why there is such a need for that final empty list.
The final empty list is a convention. It could be written empty or nil (like Lisp), but in Prolog this is denoted by the [] atom.
Note that in Prolog, you can leave the sublist part uninstantiated, like in:
[a | T]
which is the same as:
.(a, T)
Those are known as difference lists.
Your understanding of 1. and 2. is correct -- where by "nothing" you mean, element-wise. Yes, an empty list has nothing (i.e. no elements) inside it.
The logic behind having a special sentinel value SENTINEL = [] to mark the end of a cons-cells chain, as in [1,2,3] = [1,2|[3]] = [1,2,3|SENTINEL] = .(1,.(2,.(3,SENTINEL))), as opposed to some ad-hoc encoding, like .(1,.(2,3)) = [1,2|3], is type consistency. We want the first field of a cons cell (or, in Prolog, the first argument of a . functored term) to always be treated as "a list's element", and the second -- as "a list". That's why [] in [1, []] counts as a list's element (as it appears as the 1st argument of a .-functored compound term), while the [] in [1 | []] does not (as it appears as the 2nd argument of such a term).
Yes, the trees have to be binary -- i.e. the functor . as used to encode lists is binary -- and so what should we put there in the final node's tail field, that would signal to us that it is in fact the final node of the chain? It must be something, consistent and easily testable. And it must also represent the empty list, []. So it's only logical to use the representation of an empty list to represent the empty tail of a list.
And yes, having a non-[] final "tail" is perfectly valid, like in [1,2|3], which is a perfectly valid Prolog term -- it just isn't a representation of a list {1 2 3}, as understood by the rest of Prolog's built-ins.
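The same distinction between "empty list as an element" and "empty list as the terminating tail" shows up in any cons-list language. As a rough analogy in Scala (not Prolog), with Nil playing the role of []:

```scala
object EmptyListRoles {
  // Nil in element position, like [a, []]: the second element is the empty list.
  val asElement: List[Any] = List("a", Nil)

  // Nil in tail position, like [a|[]]: it only terminates the one-element list.
  val asTail: List[Any] = "a" :: Nil
}
```

Here asElement has length 2 and asTail has length 1, mirroring the length/2 and unification results shown earlier in the thread.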

More efficient way to update an element in a list in Elm?

Is there a more efficient way to update an element in a list in Elm than mapping over each element?
{ model | items = List.indexedMap (\i x -> if i == 2 then "z" else x) model.items }
Maybe Elm's compiler is sophisticated enough to optimize this so that map or indexedMap isn't unnecessarily copying over every element except 1. What about nested lists?
Clojure has assoc-in to update an element inside a nested list or record (can be combined too). Does Elm have an equivalent?
More efficient in terms of amount of code would be (this is similar to @MichaelKohl's answer):
List.take n list ++ newN :: List.drop (n+1) list
PS: if n < 0, the new item will be added at the front of the list; if n > (length of list - 1), it will be added at the end.
PPS: I seem to recall that a :: alist is slightly better performing than [a] ++ alist.
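For readers who want to try the take/drop idea outside Elm, here is an equivalent sketch in Scala. The function name replaceAt is hypothetical. Scala's take and drop also tolerate out-of-range counts, so the edge cases from the PS behave the same way; the parentheses around the :: part are needed in Scala because ++ binds tighter than ::.

```scala
object ReplaceAt {
  // Everything before index n, then the new element, then everything after index n.
  def replaceAt[A](list: List[A], n: Int, newN: A): List[A] =
    list.take(n) ++ (newN :: list.drop(n + 1))
}
```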
If you mean efficient in terms of performance/ number of operations:
As soon as your lists get large, it is more efficient to use an Array (or a Dict) instead of a List as your type.
But there is a trade-off:
Array and Dict are very efficient/ performant when you frequently retrieve/ update/ add items.
List is very performant when you do frequent sorting and filtering and other operations where you actually need to map over the entire set.
That is why in my code, List is what I use a lot in view code. On the data side (in my update functions) I use Dict and Array more.
Basically, an Elm list is not meant for such a use case. Instead, consider using an Array. Array contains a set function you can use for what is conceptually an in-place update. Here's an example:
import Html exposing (text)
import Array

type alias Model = { items : Array.Array String }

model =
    { items = Array.fromList ["a", "b", "c"]
    }

main =
    let
        m = { model | items = Array.set 2 "z" model.items }
        z = Array.get 2 m.items
        output = case z of
            Just n -> n
            Nothing -> "Nothing"
    in
        text output -- The output will be "z"
If for some reason you need model.items to be a List, note that you can convert back and forth between Array and List.
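The same trade-off exists in other languages with persistent collections. As a loose Scala analogue (Vector standing in for Elm's Array; the names are only for illustration), an indexed update and the conversion back to a list might look like this:

```scala
object IndexedUpdate {
  val items: Vector[String] = Vector("a", "b", "c")

  // Returns a new vector; the original `items` is left unchanged.
  val updated: Vector[String] = items.updated(2, "z")

  // Convert back when a List is needed, for example for view code.
  val backToList: List[String] = updated.toList
}
```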
I'm not overly familiar with Elm, but given that it's immutable by default, I'd assume it uses structural sharing for its underlying data structures, so your concern re memory may be unfounded.
Personally I think there's nothing wrong with your approach posted above, but if you don't like it, you can try something like this (or List.concat):
List.take n list ++ newN :: List.drop (n+1) list
I'm definitely not an Elm expert, but a look at Elm's List documentation did not reveal any function to update the element at a given index in a list.
I like Michael's answer. It's quite elegant. If you prefer a less-elegant, recursive approach, you can do something like the following. (Like I said, I'm not an Elm expert, but hopefully the intention of the code is clear if it's not quite right. Also, I don't do any error handling.)
updateListAt : List a -> Int -> a -> List a
updateListAt list i x =
    case ( list, i ) of
        ( [], _ ) -> []  -- index out of range: nothing to update
        ( _ :: tail, 0 ) -> x :: tail
        ( head :: tail, _ ) -> head :: updateListAt tail (i - 1) x
However, both the runtime and space complexity will be O(n) in both the average and worst cases, regardless of the method used. This is a consequence of Elm's List being a singly linked list.
Regarding assoc-in, if you look at the Clojure source, you'll see that assoc-in is just recursively defined in terms of assoc. However, I think you'd have trouble typing it for arbitrary, dynamic depth in Elm.

Flatten a list of tuples in Scala?

I would have thought that a list of tuples could easily be flattened:
scala> val p = "abcde".toList
p: List[Char] = List(a, b, c, d, e)
scala> val q = "pqrst".toList
q: List[Char] = List(p, q, r, s, t)
scala> val pq = p zip q
pq: List[(Char, Char)] = List((a,p), (b,q), (c,r), (d,s), (e,t))
scala> pq.flatten
But instead, this happens:
<console>:15: error: No implicit view available from (Char, Char) => scala.collection.GenTraversableOnce[B].
pq.flatten
^
I can get the job done with:
scala> (for (x <- pq) yield List(x._1, x._2)).flatten
res1: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
But I'm not understanding the error message. And my alternative solution seems convoluted and inefficient.
What does that error message mean and why can't I simply flatten a List of tuples?
If the implicit conversion can't be found you can supply it explicitly.
pq.flatten {case (a,b) => List(a,b)}
If this is done multiple times throughout the code then you can save some boilerplate by making it implicit.
scala> import scala.language.implicitConversions
import scala.language.implicitConversions
scala> implicit def flatTup[T](t:(T,T)): List[T]= t match {case (a,b)=>List(a,b)}
flatTup: [T](t: (T, T))List[T]
scala> pq.flatten
res179: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
jwvh's answer covers the "coding" solution to your problem perfectly well, so I am not going to go into any more detail about that. The only thing I wanted to add was clarifying why the solution that both you and jwvh found is needed.
As stated in the Scala library, Tuple2 (which (,) translates to) is:
A tuple of 2 elements; the canonical representation of a Product2.
And following up on that:
Product2 is a cartesian product of 2 components.
...which means that Tuple2[T1,T2] represents:
The set of all possible pairs of elements whose components are members of two sets (all elements in T1 and T2 respectively).
A List[T], on the other hand, represents an ordered collection of T elements.
What all this means practically is that there is no absolute way to translate any possible Tuple2[T1,T2] to a List[T], simply because T1 and T2 could be different. For example, take the following tuple:
val tuple = ("hi", 5)
How could such a tuple be flattened? Should the 5 be made a String? Or maybe just flatten to a List[Any]? While both of these solutions could be used, they are working around the type system, so they are not encoded in the Tuple API by design.
All this comes down to the fact that there is no default implicit view for this case and you have to supply one yourself, as both jwvh and you already figured out.
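If you would rather not define an implicit view at all, one alternative is to say how each pair becomes a list at the call site with flatMap. This is just a sketch; the object and value names are made up. The mixed value illustrates the List[Any] fallback mentioned above, via the standard productIterator method.

```scala
object FlattenPairs {
  val pq: List[(Char, Char)] = "abcde".toList.zip("pqrst".toList)

  // Both components have the same type, so each pair maps cleanly to a List[Char].
  val flat: List[Char] = pq.flatMap { case (a, b) => List(a, b) }

  // With different component types, the best common element type is Any.
  val mixed: List[Any] = ("hi", 5).productIterator.toList
}
```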
We needed to do this recently. Allow me to explain the use case briefly before noting our solution.
Use case
Given a pool of items (which I'll call type T), we want to do an evaluation of each one against all others in the pool. The result of these comparisons is a Set of failed evaluations, which we represent as a tuple of the left item and the right item in said evaluation: (T, T).
Once these evaluations are complete, it becomes useful for us to flatten the Set[(T, T)] into another Set[T] that highlights all the items that have failed any comparisons.
Solution
Our solution for this was a fold:
val flattenedSet =
  set.foldLeft(Set[T]())
    { case (acc, (x, y)) => acc + x + y }
This starts with an empty set (the initial parameter to foldLeft) as the accumulator.
Then, for each element in the consumed Set[(T, T)] (named set here), the fold function is passed:
the last value of the accumulator (acc), and
the (T, T) tuple for that element, which the case deconstructs into x and y.
Our fold function then returns acc + x + y, which returns a set containing all the elements in the accumulator in addition to x and y. That result is passed to the next iteration as the accumulator—thus, it accumulates all the values inside each of the tuples.
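A self-contained version of that fold, with String standing in for T and some made-up sample pairs, might look like this:

```scala
object FailedPairs {
  // Hypothetical failed evaluations: (left item, right item).
  val failed: Set[(String, String)] = Set(("a", "b"), ("b", "c"), ("d", "a"))

  // Start from an empty set and add both components of every pair.
  val flattened: Set[String] =
    failed.foldLeft(Set.empty[String]) { case (acc, (x, y)) => acc + x + y }
}
```

Because the accumulator is a Set, items that appear in several failed pairs are only kept once.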
Why not Lists?
I appreciated this solution in particular since it avoided creating intermediate Lists while doing the flattening—instead, it directly deconstructs each tuple while building the new Set[T].
We could also have changed our evaluation code to return List[T]s containing the left and right items in each failed evaluation—then flatten would Just Work™. But we thought the tuple more accurately represented what we were going for with the evaluation—specifically one item against another, rather than an open-ended type which could conceivably represent any number of items.

SML: function with multiple outputs

I'm a newbie in SML and I'd like to update my function so that it has two outputs: a list AND 1 or 0. The function was proposed here: SML: Remove the entry from the List. It returns an updated list without a row that contains 'elem'.
fun removeElem elem myList = filter (fn x => x <> elem) myList
The function should return a new list AND 1 if a row has been deleted. Otherwise, it should return the old list AND 0.
Any hint or example is highly appreciated. Thanks.
Note that all SML functions take a single input and return a single output. Instead, think of returning a tuple containing the new list and a flag indicating whether any elements were removed. One possibility is to use a couple of functions from the standard basis to test whether elem is in myList and build up a tuple consisting of that and the results from the filter shown in the question. The test might look like:
Option.isSome (List.find (fn x => x = elem) myList)
There are more concise ways to write that, but it shows the idea. Note that it returns a bool instead of an int; this is more precise, so I won't convert to the integers requested in the question.
A drawback of the above is that it requires traversing the list twice. To avoid that, consider the type that the function must return: a tuple of a list without elem and a flag showing whether any elems have been removed. We can then write a function that takes a new value and a (valid) tuple, and returns a valid tuple. One possibility:
fun update(x, (acc, flag)) = if x = elem then (acc, true) else (x :: acc, flag)
We can then apply update to each element of myList one-by-one. Since we want the order of the list to stay the same, apart from the removed elements, we should work through myList from right to left, accumulating the results into an initially empty list. The function foldr will do this directly:
foldr update ([], false) myList
However, there is a lot of logic hidden in the foldr higher-order function.
To use this as a learning exercise, I'd suggest using this problem to implement the function in a few ways:
as a recursive function
as a tail-recursive function
using the higher order functions foldl and foldr
Understanding the differences between these versions will shed a lot of light on how SML works. For each version, let the types guide you.
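For comparison with the foldr version, here is the same single-pass idea sketched in Scala rather than SML, using foldRight. The signature, with elem passed first and a (list, flag) pair returned, is an assumption for this example.

```scala
object RemoveWithFlag {
  // One right-to-left pass: keeps the original order and records whether anything was dropped.
  def removeElem[A](elem: A, xs: List[A]): (List[A], Boolean) =
    xs.foldRight((List.empty[A], false)) { case (x, (acc, flag)) =>
      if (x == elem) (acc, true) else (x :: acc, flag)
    }
}
```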
As has been stated in some of your previous questions, returning 0 or 1 as an indicator of what happened is a really bad design, as the types give you no guarantee that you won't get, say, -42 as the result. Since you are working with a strongly typed language, you might as well use this to your advantage:
The most obvious thing to do instead would be to return a boolean, as that is actually what you are emulating with 0 and 1. In this case you could return the pair (true, modified_list) or (false, original_list).
Since you want to associate some data with the result, there is another (perhaps, for some, less obvious) thing to do: return the result as an option, indicating a change in the list as SOME modified_list and no change as NONE.
In either case you would have to "remember" whether or not you actually removed any elements from the original list, and thus you can't use the filter function. Instead you would have to do this for yourself using somewhat the same code as you originally posted.
One way would be like this
fun removeElem _ [] = (false, [])
  | removeElem elem (x::xs) =
    let
      val (b, xs') = removeElem elem xs
    in
      if elem = x then
        (true, xs')
      else
        (b, x::xs')
    end
Another way would be to use an accumulator parameter to store the resulting list:
fun removeElem elem xs =
  let
    fun removeElem' [] true res = SOME (rev res)
      | removeElem' [] false _ = NONE
      | removeElem' (x::xs) b res =
        if elem = x then
          removeElem' xs true res
        else
          removeElem' xs b (x::res)
  in
    removeElem' xs false []
  end
Since the solution is being built in the reverse order, we reverse the result just before we return it. This makes sure that we don't have to use the costly append operation when adding elements to the result list: res @ [x]
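The accumulator approach translates fairly directly to other languages. A rough Scala equivalent, where the names removeElemOpt and loop are made up for the example, could be:

```scala
import scala.annotation.tailrec

object RemoveWithOption {
  // Some(newList) if at least one element was removed, None if the list is unchanged.
  def removeElemOpt[A](elem: A, xs: List[A]): Option[List[A]] = {
    @tailrec
    def loop(rest: List[A], removed: Boolean, acc: List[A]): Option[List[A]] =
      rest match {
        case Nil                    => if (removed) Some(acc.reverse) else None
        case x :: tail if x == elem => loop(tail, removed = true, acc)
        case x :: tail              => loop(tail, removed, x :: acc) // prepend; reversed once at the end
      }
    loop(xs, removed = false, Nil)
  }
}
```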