Given two lists a and b, how can one efficiently concatenate them into a new list (the order is not relevant)?
I could not figure out from the Scala API whether a ::: b and a ++ b are efficient. Maybe I missed something.
In Scala 2.9, the code for ::: (prepend to list) is as follows:
def :::[B >: A](prefix: List[B]): List[B] =
if (isEmpty) prefix
else (new ListBuffer[B] ++= prefix).prependToList(this)
whereas ++ is more generic, since it takes a CanBuildFrom parameter, i.e. it can return a collection type different from List:
override def ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[List[A], B, That]): That = {
val b = bf(this)
if (b.isInstanceOf[ListBuffer[_]]) (this ::: that.seq.toList).asInstanceOf[That]
else super.++(that)
}
So if your return type is List, the two perform identically.
The ListBuffer is a clever mechanism in that it can be used as a mutating builder, but eventually "consumed" by the toList method. So what (new ListBuffer[B] ++= prefix).prependToList(this) does, is first sequentially add all the elements in prefix (in the example a), taking O(|a|) time. It then calls prependToList, which is a constant time operation (the receiver, or b, does not need to be taken apart). Therefore, the overall time is O(|a|).
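To make the cost argument a bit more concrete, here is a small sketch. It checks that ::: and ++ give the same List, and that the suffix b is reused rather than copied. The object and value names are mine, and the reference-equality check reflects the standard library implementations I am aware of rather than a documented guarantee.

```scala
object ConcatSharing {
  val a = List(1, 2, 3)
  val b = List(4, 5)

  // Both operators produce the same List when the result type is List.
  val viaPrepend: List[Int] = a ::: b
  val viaPlusPlus: List[Int] = a ++ b

  // Only the cells of `a` are copied. After skipping them we expect to
  // land on the very same object as `b` (reference equality, not ==).
  val suffixIsShared: Boolean = viaPrepend.drop(a.length) eq b
}
```

That sharing is why only |a| matters for the running time.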
On the other hand, as pst pointed out, we have reverse_:::, which is defined as:
def reverse_:::[B >: A](prefix: List[B]): List[B] = {
var these: List[B] = this
var pres = prefix
while (!pres.isEmpty) {
these = pres.head :: these
pres = pres.tail
}
these
}
So with a reverse_::: b, this again takes O(|a|), hence it is no more or less efficient than the other two methods (although for small list sizes, you save the overhead of creating an intermediate ListBuffer).
In other words, if you have knowledge about the relative sizes of a and b, you should make sure that the prefix is the smaller of the two lists. If you do not have that knowledge, there is nothing you can do, because the size operation on a List takes O(N) :)
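If the caller already tracks the sizes, picking the smaller prefix could look roughly like this. The helper concat and its size parameters are hypothetical, not part of the standard library. Since the order is irrelevant here, it uses reverse_::: and skips the ListBuffer.

```scala
object UnorderedConcat {
  // Hypothetical helper: sizeA and sizeB are supplied by the caller, who is
  // assumed to know them already (computing them here would cost O(N)).
  def concat[A](a: List[A], sizeA: Int, b: List[A], sizeB: Int): List[A] =
    if (sizeA <= sizeB) a reverse_::: b // copies the cells of a, shares b
    else b reverse_::: a                // copies the cells of b, shares a
}
```

For example, UnorderedConcat.concat(List(1, 2, 3), 3, List(4, 5), 2) copies only the two cells of the second list and yields List(5, 4, 1, 2, 3).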
On the other hand, in a future Scala version you may see an improved Vector concatenation algorithm, as demonstrated in this ScalaDays talk. It promises to solve the task in O(log N) time.
I am trying to return a value if something occurs when iterating through a list. Is it possible to return a string if X happens when iterating through the list, otherwise return another string if it never happens?
let f elem =
if not String.contains str elem then "false" in
List.iter f alphlist;
"true";
This is not working in my implemented method sadly.
OCaml is a functional language, so you pretty much need to concentrate on the values returned by functions. There are ways to return different values in exceptional cases, but (IMHO) the best way to learn is to start just with ordinary old nested function calls.
List.iter always returns the same value: (), which is known as unit.
For this reason, the expression List.iter f alphlist will also always return () no matter what f does.
There is another kind of list-handling function that works by maintaining a value across all the calls and returning that value at the end. It's called a fold.
So, if you want to compute some value that's a kind of summary of what it saw in all of the string lists in alphlist, you should probably be using a fold, say List.fold_left.
Here is a function any_has_7 that determines whether any one of the specified lists contains the integer 7:
let any_has_7 lists =
let has_7 sofar list =
sofar || List.mem 7 list
in
List.fold_left has_7 false lists
Here's how it looks when you run it:
# any_has_7 [[1;2]; [3;4]];;
- : bool = false
# any_has_7 [[1;2]; [5;7]; [8;9]];;
- : bool = true
In other words, this function does something a lot like what you're asking for. It returns true when one or more of the lists contains a certain value, and false when none of them contains the value.
I hope this helps.
I want to convert a sequence to a list using List.init. At each step, I want to retrieve the i-th value of s.
let to_list s =
let n = length s in
List.init n
(fun _i ->
match s () with
| Nil -> assert false
| Cons (a, sr) -> a)
This is giving me a list initialized with the first element of s only. Is it possible in OCaml to initialize the list with all the values of s?
It may help to study the definition of List.init.
There are two variations depending on the size of the list: a tail recursive one, init_tailrec_aux, whose result is in reverse order, and a basic one, init_aux. They have identical results, so we need only look at init_aux:
let rec init_aux i n f =
if i >= n then []
else
let r = f i in
r :: init_aux (i+1) n f
This function recursively increments a counter i until it reaches a limit n. For each value of the counter that is strictly less than the limit, it adds the value given by f i to the head of the list being produced.
The question now is, what does your anonymous function do when called with different values of i?:
let f_anon =
(fun _i -> match s () with
|Nil -> assert false
|Cons(a, sr) -> a)
Regardless of _i, it always gives the head of the list produced by s (), and if s () always returns the same list, then f_anon 0 = f_anon 1 = f_anon 2 = f_anon 3 = hd (s ()).
Jeffrey Scofield's answer describes a technique for giving a different value at each _i, and I agree with his suggestion that List.init is not the best solution for this problem.
The essence of the problem is that you're not saving sr, which would let you retrieve the next element of the sequence.
However, the slightly larger problem is that List.init passes only an int as an argument to the initialization function. So even if you did keep track of sr, there's no way it can be passed to your initialization function.
You can do what you want using the impure parts of OCaml. E.g., you could save sr in a global reference variable at each step and retrieve it in the next call to the initialization function. However, this is really quite a cumbersome way to produce your list.
I would suggest not using List.init. You can write a straightforward recursive function to do what you want. (If you care about tail recursion, you can write a slightly less straightforward function.)
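For readers who think in Scala rather than OCaml, the recursive approach can be sketched as below. Everything here (Node, SNil, SCons, toList, fromList) is a hypothetical Scala rendering of an OCaml-style sequence, not an existing library API. The point is only that the loop carries the rest of the sequence (sr) along, which List.init cannot do.

```scala
object SeqToList {
  // A sequence is a thunk that yields either the end marker or a head
  // together with the thunk for the remainder (mirrors OCaml's Nil / Cons).
  sealed trait Node[+A]
  case object SNil extends Node[Nothing]
  final case class SCons[+A](head: A, rest: () => Node[A]) extends Node[A]

  def toList[A](s: () => Node[A]): List[A] = {
    @scala.annotation.tailrec
    def loop(cur: () => Node[A], acc: List[A]): List[A] = cur() match {
      case SNil         => acc.reverse          // restore the original order
      case SCons(a, sr) => loop(sr, a :: acc)   // keep sr for the next step
    }
    loop(s, Nil)
  }

  // Hypothetical helper used only to build a sequence for the demo.
  def fromList[A](xs: List[A]): () => Node[A] = () => xs match {
    case Nil    => SNil
    case h :: t => SCons(h, fromList(t))
  }
}
```

Each element is visited once, so this is linear in the length of the sequence.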
Using a recursive function will increase the complexity, so I think that initializing the list (or array) directly at the corresponding length would be better. But I don't really know how to get a different value at each _i, as Jeffrey Scofield said. I am not really familiar with OCaml, especially sequences, so I have some difficulties doing that :(
Is there a more efficient way to update an element in a list in Elm than mapping over each element?
{ model | items = List.indexedMap (\i x -> if i == 2 then "z" else x) model.items }
Maybe Elm's compiler is sophisticated enough to optimize this so that map or indexedMap isn't unnecessarily copying over every element except 1. What about nested lists?
Clojure has assoc-in to update an element inside a nested list or record (can be combined too). Does Elm have an equivalent?
More efficient in terms of amount of code would be (this is similar to @MichaelKohl's answer):
List.take n list ++ newN :: List.drop (n+1) list
PS: if n < 0 or n > (length of list - 1), the new item will be added at the start or at the end of the list, respectively.
PPS: I seem to recall that a :: alist is slightly better performing than [a] ++ alist.
If you mean efficient in terms of performance or number of operations:
As soon as your lists get large, it is more efficient to use an Array (or a Dict) instead of a List as your type.
But there is a trade-off:
Array and Dict are very efficient when you frequently retrieve, update, or add items.
List is very performant when you do frequent sorting and filtering and other operations where you actually need to map over the entire set.
That is why in my code, List is what I use a lot in view code. On the data side (in my update functions) I use Dict and Array more.
Basically, an Elm list is not meant for such a use case. Instead, consider using an Array. Array has a set function you can use for what is conceptually an in-place update. Here's an example:
import Html exposing (text)
import Array
type alias Model = { items : Array.Array String }
model =
    { items = Array.fromList ["a", "b", "c"]
    }
main =
    let
        m = { model | items = Array.set 2 "z" model.items }
        z = Array.get 2 m.items
        output = case z of
            Just n -> n
            Nothing -> "Nothing"
    in
    text output -- The output will be "z"
If for some reason you need model.items to be a List, note that you can convert back and forth between Array and List.
I'm not overly familiar with Elm, but given that it's immutable by default, I'd assume it uses structural sharing for its underlying data structures, so your concern re memory may be unfounded.
Personally I think there's nothing wrong with your approach posted above, but if you don't like it, you can try something like this (or List.concat):
List.take n list ++ newN :: List.drop (n+1) list
I'm definitely not an Elm expert, but a look at Elm's List documentation did not reveal any function to update the element at a given index in a list.
I like Michael's answer. It's quite elegant. If you prefer a less-elegant, recursive approach, you can do something like the following. (Like I said, I'm not an Elm expert, but hopefully the intention of the code is clear if it's not quite right. Also, I don't do any error handling.)
updateListAt : List a -> Int -> a -> List a
updateListAt list i x =
    case list of
        [] -> []
        head :: tail ->
            if i == 0 then x :: tail else head :: updateListAt tail (i - 1) x
However, both the runtime and space complexity will be O(n) in both the average and worst cases, regardless of the method used. This is a consequence of Elm's List being a singly linked list.
Regarding assoc-in, if you look at the Clojure source, you'll see that assoc-in is just recursively defined in terms of assoc. However, I think you'd have trouble typing it for arbitrary, dynamic depth in Elm.
I would have thought that a list of tuples could easily be flattened:
scala> val p = "abcde".toList
p: List[Char] = List(a, b, c, d, e)
scala> val q = "pqrst".toList
q: List[Char] = List(p, q, r, s, t)
scala> val pq = p zip q
pq: List[(Char, Char)] = List((a,p), (b,q), (c,r), (d,s), (e,t))
scala> pq.flatten
But instead, this happens:
<console>:15: error: No implicit view available from (Char, Char) => scala.collection.GenTraversableOnce[B].
pq.flatten
^
I can get the job done with:
scala> (for (x <- pq) yield List(x._1, x._2)).flatten
res1: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
But I'm not understanding the error message. And my alternative solution seems convoluted and inefficient.
What does that error message mean and why can't I simply flatten a List of tuples?
If the implicit conversion can't be found you can supply it explicitly.
pq.flatten {case (a,b) => List(a,b)}
If this is done multiple times throughout the code then you can save some boilerplate by making it implicit.
scala> import scala.language.implicitConversions
import scala.language.implicitConversions
scala> implicit def flatTup[T](t:(T,T)): List[T]= t match {case (a,b)=>List(a,b)}
flatTup: [T](t: (T, T))List[T]
scala> pq.flatten
res179: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
jwvh's answer covers the "coding" solution to your problem perfectly well, so I am not going to go into any more detail about that. The only thing I wanted to add was clarifying why the solution that both you and jwvh found is needed.
As stated in the Scala library, Tuple2 (which (,) translates to) is:
A tuple of 2 elements; the canonical representation of a Product2.
And following up on that:
Product2 is a cartesian product of 2 components.
...which means that Tuple2[T1,T2] represents:
The set of all possible pairs of elements whose components are members of two sets (all elements in T1 and T2 respectively).
A List[T], on the other hand, represents an ordered collection of T elements.
What all this means practically is that there is no absolute way to translate any possible Tuple2[T1,T2] to a List[T], simply because T1 and T2 could be different. For example, take the following tuple:
val tuple = ("hi", 5)
How could such a tuple be flattened? Should the 5 be made a String? Or maybe just flatten to a List[Any]? While both of these solutions could be used, they are working around the type system, so they are not encoded in the Tuple API by design.
All this comes down to the fact that there is no default implicit view for this case and you have to supply one yourself, as both jwvh and you already figured out.
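As a minimal sketch of "supplying the view yourself" without defining any implicit, flatMap takes the tuple-to-list function directly. The second value illustrates the List[Any] escape hatch mentioned above, using productIterator, which every tuple has. The object and value names are only for illustration.

```scala
object FlattenPairs {
  val pq: List[(Char, Char)] = "abcde".toList zip "pqrst".toList

  // flatMap receives the (Char, Char) => List[Char] function explicitly,
  // so the compiler never has to look for an implicit view.
  val flat: List[Char] = pq.flatMap { case (a, b) => List(a, b) }

  // For a mixed tuple the only common element type is Any.
  val mixed: List[Any] = ("hi", 5).productIterator.toList
}
```

Here flat comes out as List(a, p, b, q, c, r, d, s, e, t), the same as the for-comprehension in the question.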
We needed to do this recently. Allow me to explain the use case briefly before noting our solution.
Use case
Given a pool of items (which I'll call type T), we want to do an evaluation of each one against all others in the pool. The result of these comparisons is a Set of failed evaluations, which we represent as a tuple of the left item and the right item in said evaluation: (T, T).
Once these evaluations are complete, it becomes useful for us to flatten the Set[(T, T)] into another Set[T] that highlights all the items that have failed any comparisons.
Solution
Our solution for this was a fold:
val flattenedSet =
set.foldLeft(Set[T]())
{ case (acc, (x, y)) => acc + x + y }
This starts with an empty set (the initial parameter to foldLeft) as the accumulator.
Then, for each element in the consumed Set[(T, T)] (named set here), the fold function is passed:
the last value of the accumulator (acc), and
the (T, T) tuple for that element, which the case deconstructs into x and y.
Our fold function then returns acc + x + y, which returns a set containing all the elements in the accumulator in addition to x and y. That result is passed to the next iteration as the accumulator—thus, it accumulates all the values inside each of the tuples.
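A self-contained version of the same fold, with T fixed to Int and made-up data purely for illustration, might look like this:

```scala
object FlattenPairSet {
  // Hypothetical failed evaluations (left item, right item), with T = Int.
  val failed: Set[(Int, Int)] = Set((1, 2), (2, 3), (5, 1))

  // Start from an empty Set[Int] and add both components of every pair.
  // Duplicates such as 1 and 2 collapse because the accumulator is a Set.
  val flattened: Set[Int] =
    failed.foldLeft(Set.empty[Int]) { case (acc, (x, y)) => acc + x + y }
}
```

With this input, flattened is Set(1, 2, 3, 5).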
Why not Lists?
I appreciated this solution in particular since it avoided creating intermediate Lists while doing the flattening—instead, it directly deconstructs each tuple while building the new Set[T].
We could also have changed our evaluation code to return List[T]s containing the left and right items in each failed evaluation—then flatten would Just Work™. But we thought the tuple more accurately represented what we were going for with the evaluation—specifically one item against another, rather than an open-ended type which could conceivably represent any number of items.
I've got a list of objects, List[Object], which are all instantiated from the same class. This class has a field, Object.property, which must be unique. What is the cleanest way to iterate over the list of objects and remove all objects (but the first) with the same property?
list.groupBy(_.property).map(_._2.head)
Explanation: The groupBy method accepts a function that converts an element to a key for grouping. _.property is just shorthand for elem: Object => elem.property (the compiler generates a unique name, something like x$1). So now we have a map Map[Property, List[Object]]. A Map[K,V] extends Traversable[(K,V)], so it can be traversed like a list, but its elements are tuples. This is similar to Java's Map#entrySet(). The map method creates a new collection by iterating over each element and applying a function to it. In this case the function is _._2.head, which is shorthand for elem: (Property, List[Object]) => elem._2.head. _2 is just a method of Tuple that returns the second element. The second element is a List[Object], and head returns its first element.
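Spelled out step by step on a toy class (Item and its fields are made up for the example), the two stages look like this. The order of the final collection comes from the Map, so the sketch only relies on which elements survive, not on where they end up.

```scala
object GroupByHead {
  final case class Item(property: Int, label: String)

  val items = List(Item(1, "a"), Item(2, "b"), Item(1, "c"))

  // Step 1: a Map from each key to the elements sharing that key.
  // Within a group the elements keep their original relative order.
  val grouped: Map[Int, List[Item]] = items.groupBy(_.property)

  // Step 2: traverse the (key, group) pairs and keep each group's head.
  val firsts = grouped.map(_._2.head)
}
```

Here grouped(1) is List(Item(1,a), Item(1,c)), so its head is the first Item carrying property 1.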
To get the result to be a type you want:
import collection.breakOut
val l2: List[Object] = list.groupBy(_.property).map(_._2.head)(breakOut)
To explain briefly, map actually expects two arguments, a function and an object that is used to construct the result. In the first code snippet you don't see the second value because it is marked as implicit and so provided by the compiler from a list of predefined values in scope. The result is usually obtained from the mapped container. This is usually a good thing. map on List will return List, map on Array will return Array etc. In this case however, we want to express the container we want as result. This is where the breakOut method is used. It constructs a builder (the thing that builds results) by only looking at the desired result type. It is a generic method and the compiler infers its generic types because we explicitly typed l2 to be List[Object].
Or, to preserve order (assuming Object#property is of type Property):
list.foldRight((List[Object](), Set[Property]())) {
  case (o, cum @ (objects, props)) =>
    if (props(o.property)) cum else (o :: objects, props + o.property)
}._1
foldRight is a method that accepts an initial result and a function that accepts an element and returns an updated result. The method iterates each element, updating the result according to applying the function to each element and returning the final result. We go from right to left (rather than left to right with foldLeft) because we are prepending to objects - this is O(1), but appending is O(N). Also observe the good styling here, we are using a pattern match to extract the elements.
In this case, the initial result is a pair (tuple) of an empty list and a set. The list is the result we're interested in and the set is used to keep track of what properties we already encountered. In each iteration we check if the set props already contains the property. (In Scala, obj(x) is translated to obj.apply(x). In Set, the method apply is def apply(a: A): Boolean; that is, it accepts an element and returns true or false depending on whether it exists.) If the property exists (already encountered), the result is returned as-is. Otherwise the result is updated to contain the object (o :: objects) and the property is recorded (props + o.property).
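One subtlety worth checking on a small input: because foldRight visits the rightmost element first, the object that survives for each property is the rightmost one, not the first. If the first occurrence must win, as the question asks, a foldLeft that prepends and reverses once at the end is one option. The sketch below uses a made-up Item class, and keepLast and keepFirst are my names, not part of the answer.

```scala
object DedupOrder {
  final case class Item(property: Int, label: String)

  // Same shape as the foldRight above: the rightmost element is seen
  // first, so the rightmost object per property is the one kept.
  def keepLast(items: List[Item]): List[Item] =
    items.foldRight((List.empty[Item], Set.empty[Int])) {
      case (item, acc @ (kept, seen)) =>
        if (seen(item.property)) acc
        else (item :: kept, seen + item.property)
    }._1

  // foldLeft sees the leftmost element first. Prepending stays O(1),
  // and one reverse at the end restores the original order.
  def keepFirst(items: List[Item]): List[Item] =
    items.foldLeft((List.empty[Item], Set.empty[Int])) {
      case (acc @ (kept, seen), item) =>
        if (seen(item.property)) acc
        else (item :: kept, seen + item.property)
    }._1.reverse
}
```

For List(Item(1,"a"), Item(2,"b"), Item(1,"c")), keepLast gives List(Item(2,b), Item(1,c)) while keepFirst gives List(Item(1,a), Item(2,b)).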
Update: @andreypopp wanted a generic method:
import scala.collection.IterableLike
import scala.collection.generic.CanBuildFrom
class RichCollection[A, Repr](xs: IterableLike[A, Repr]){
def distinctBy[B, That](f: A => B)(implicit cbf: CanBuildFrom[Repr, A, That]) = {
val builder = cbf(xs.repr)
val i = xs.iterator
var set = Set[B]()
while (i.hasNext) {
val o = i.next
val b = f(o)
if (!set(b)) {
set += b
builder += o
}
}
builder.result
}
}
implicit def toRich[A, Repr](xs: IterableLike[A, Repr]) = new RichCollection(xs)
To use:
scala> list.distinctBy(_.property)
res7: List[Obj] = List(Obj(1), Obj(2), Obj(3))
Also note that this is pretty efficient as we are using a builder. If you have really large lists, you may want to use a mutable HashSet instead of a regular set and benchmark the performance.
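A List-only sketch of the mutable HashSet variant could look like the following. The name distinctByMutable is mine, chosen to avoid clashing with the method above. Whether it is actually faster on your data is something to benchmark rather than assume.

```scala
import scala.collection.mutable

object DistinctByMutable {
  def distinctByMutable[A, B](xs: List[A])(f: A => B): List[A] = {
    val seen = mutable.HashSet.empty[B] // keys encountered so far
    val out = List.newBuilder[A]        // appends in traversal order
    // HashSet#add returns true only when the key was not present yet,
    // so each key lets exactly its first element through.
    xs.foreach { x => if (seen.add(f(x))) out += x }
    out.result()
  }
}
```

For instance, DistinctByMutable.distinctByMutable(List(("a", 2), ("b", 2), ("a", 5)))(_._1) returns List((a,2), (b,2)).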
Starting Scala 2.13, most collections are now provided with a distinctBy method which returns all elements of the sequence ignoring the duplicates after applying a given transforming function:
list.distinctBy(_.property)
For instance:
List(("a", 2), ("b", 2), ("a", 5)).distinctBy(_._1) // List((a,2), (b,2))
List(("a", 2.7), ("b", 2.1), ("a", 5.4)).distinctBy(_._2.floor) // List((a,2.7), (a,5.4))
Here is a little bit sneaky but fast solution that preserves order:
list.filterNot{ var set = Set[Property]()
obj => val b = set(obj.property); set += obj.property; b}
Although it uses a var internally, I think it is easier to read and understand than the fold-based solution.
A lot of good answers above. However, distinctBy is already in Scala, but in a not-so-obvious place. Perhaps you can use it like
def distinctBy[A, B](xs: List[A])(f: A => B): List[A] =
scala.reflect.internal.util.Collections.distinctBy(xs)(f)
Preserving order:
def distinctBy[L, E](list: List[L])(f: L => E): List[L] =
list.foldLeft((Vector.empty[L], Set.empty[E])) {
case ((acc, set), item) =>
val key = f(item)
if (set.contains(key)) (acc, set)
else (acc :+ item, set + key)
}._1.toList
distinctBy(list)(_.property)
One more solution
import scala.annotation.tailrec
@tailrec
def collectUnique(l: List[Object], s: Set[Property], u: List[Object]): List[Object] = l match {
  case Nil => u.reverse
  case (h :: t) =>
    if (s(h.property)) collectUnique(t, s, u) else collectUnique(t, s + h.property, h :: u)
}
I found a way to make it work with groupBy, with one intermediary step:
def distinctBy[T, P, From[X] <: TraversableLike[X, From[X]]](collection: From[T])(property: T => P): From[T] = {
val uniqueValues: Set[T] = collection.groupBy(property).map(_._2.head)(breakOut)
collection.filter(uniqueValues)
}
Use it like this:
scala> distinctBy(List(redVolvo, bluePrius, redLeon))(_.color)
res0: List[Car] = List(redVolvo, bluePrius)
Similar to IttayD's first solution, but it filters the original collection based on the set of unique values. If my expectations are correct, this does three traversals: one for groupBy, one for map and one for filter. It maintains the ordering of the original collection, but does not necessarily take the first value for each property. For example, it could have returned List(bluePrius, redLeon) instead.
Of course, IttayD's solution is still faster since it does only one traversal.
My solution also has the disadvantage that, if the collection has Cars that are actually the same, both will be in the output list. This could be fixed by removing filter and returning uniqueValues directly, with type From[T]. However, it seems like CanBuildFrom[Map[P, From[T]], T, From[T]] does not exist... suggestions are welcome!
With a collection and a function from a record to a key this yields a list of records distinct by key. It's not clear whether groupBy will preserve the order in the original collection. It may even depend on the type of collection. I'm guessing either head or last will consistently yield the earliest element.
collection.groupBy(keyFunction).values.map(_.head)
When will Scala get a nubBy? It's been in Haskell for decades.
If you want to remove duplicates and preserve the order of the list you can try this two liner:
val tmpUniqueList = scala.collection.mutable.Set[String]()
val myUniqueObjects = for(o <- myObjects if tmpUniqueList.add(o.property)) yield o
This is entirely a rip of @IttayD's answer, but unfortunately I don't have enough reputation to comment.
Rather than creating an implicit function to convert your iterable, you can simply create an implicit class:
import scala.collection.IterableLike
import scala.collection.generic.CanBuildFrom
implicit class RichCollection[A, Repr](xs: IterableLike[A, Repr]){
def distinctBy[B, That](f: A => B)(implicit cbf: CanBuildFrom[Repr, A, That]) = {
val builder = cbf(xs.repr)
val i = xs.iterator
var set = Set[B]()
while (i.hasNext) {
val o = i.next
val b = f(o)
if (!set(b)) {
set += b
builder += o
}
}
builder.result
}
}