Split a list when a duplicate is found (Scala)

I have a list of elements in Scala and I am looking for a way to split the list when a duplicate is found.
For example: List(x,y,z,e,r,y,g,a) would be converted to List(List(x,y,z,e,r),List(y,g,a))
or List(x,y,z,x,y,z) to List(x,y,z), List(x,y,z)
and List(x,y,z,y,g,x) to List(x,y,z), List(y,g,x)
Is there a more efficient way than iterating and checking every element separately?

Quick and dirty O(n) using O(n) additional memory:
import scala.collection.mutable.HashSet
import scala.collection.mutable.ListBuffer
val list = List("x", "y", "z", "e", "r", "y", "g", "a", "x", "m", "z")
val result = new ListBuffer[ListBuffer[String]]()
var partition = new ListBuffer[String]()
// Elements of the current partition; a HashSet keeps each lookup constant time
// (ListBuffer.contains would scan the whole partition).
val seen = new HashSet[String]()
list.foreach { i =>
  if (seen.contains(i)) {
    // Duplicate within the current partition: close it and start a new one.
    result += partition
    partition = new ListBuffer[String]()
    seen.clear()
  }
  partition += i
  seen += i
}
if (partition.nonEmpty) {
  result += partition
}
result
// ListBuffer(ListBuffer(x, y, z, e, r), ListBuffer(y, g, a, x, m, z))

This solution comes with a few caveats:
I'm not making a claim as to 'performance', though I think it's better than O(n^2), which is the brute-force.
This is assuming you are splitting when you find a duplicate, where 'duplicate' means 'something that exists in the previous split'. I cheat a little by only checking the last segment. The reason is that I think it clarifies how to use foldLeft a little, which is a natural way to go about this.
Everything here is reversed, but maintains order. This can be easily corrected, but adds an additional O(n) (cumulative) call, and may not actually be needed (depending on what you're doing with it).
Here is the code:
import scala.collection.immutable.ListSet
def partition(ls: List[String]): List[ListSet[String]] = {
  ls.foldLeft(List(ListSet.empty[String]))((partitionedLists, elem: String) => {
    if (partitionedLists.head.contains(elem)) {
      // Duplicate in the latest segment: start a new segment at the front.
      ListSet(elem) :: partitionedLists
    } else {
      (partitionedLists.head + elem) :: partitionedLists.tail
    }
  })
}
partition(List("x","y","z","e","r","y","g","a"))
// res0: List[scala.collection.immutable.ListSet[String]] = List(ListSet(a, g, y), ListSet(r, e, z, y, x))
// (newest segment first, because each new segment is prepended)
I'm using ListSet to get both the benefits of a Set and ordering, which is appropriate to your use case.
foldLeft is a function that takes an accumulator value (in this case the List(ListSet.empty[String])) and modifies it as it moves through the elements of your collection. If we structure that accumulator, as done here, to be a list of segments, by the time we're done it will have all the ordered segments of the original list.
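As a rough sketch of how the reversal mentioned in the caveats could be undone, the fold below keeps plain lists plus a Set of the elements in the current segment, then reverses once at the end. The name splitOnDuplicates and the tuple-shaped accumulator are illustrative choices, not part of the answer above.

```scala
// Sketch: split whenever an element repeats within the current segment.
def splitOnDuplicates[A](xs: List[A]): List[List[A]] = {
  // Accumulator: (segments, newest first and each stored in reverse;
  //               set of elements already in the current segment)
  val (segments, _) = xs.foldLeft((List(List.empty[A]), Set.empty[A])) {
    case ((current :: done, seen), elem) =>
      if (seen(elem)) (List(elem) :: current :: done, Set(elem)) // start a new segment
      else ((elem :: current) :: done, seen + elem)              // extend the current one
    case (acc, _) => acc // not reached: the segment list is never empty
  }
  // Undo both reversals; the filter drops the empty seed segment for empty input.
  segments.filter(_.nonEmpty).map(_.reverse).reverse
}
```

The single reverse at the end costs one extra pass, which is the additional O(n) work the caveat refers to.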

One statement tail-recursive version (but not very efficient because of the contains on the list)
val xs = List('x','y','z','e','r','y','g','a')
def splitAtDuplicates[A](splits: List[List[A]], right: List[A]): List[List[A]] =
  if (right.isEmpty) // done
    splits.map(_.reverse).reverse
  else if (splits.head contains right.head) // need to split here
    splitAtDuplicates(List() :: splits, right)
  else // continue building current sublist
    splitAtDuplicates((right.head :: splits.head) :: splits.tail, right.tail)
Speed it up with a Set to track what we've seen so far:
def splitAtDuplicatesOptimised[A](seen: Set[A],
                                  splits: List[List[A]],
                                  right: List[A]): List[List[A]] =
  if (right.isEmpty)
    splits.map(_.reverse).reverse
  else if (seen(right.head))
    splitAtDuplicatesOptimised(Set(), List() :: splits, right)
  else
    splitAtDuplicatesOptimised(seen + right.head,
                               (right.head :: splits.head) :: splits.tail,
                               right.tail)
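Both helpers have to be seeded with a list that already contains one empty sublist, otherwise splits.head fails on the first step. A small wrapper can hide that detail. The sketch below restates the Set-based version as a local loop so that it is self-contained; the names splitList and loop are made up for illustration.

```scala
import scala.annotation.tailrec

def splitList[A](xs: List[A]): List[List[A]] = {
  @tailrec
  def loop(seen: Set[A], splits: List[List[A]], right: List[A]): List[List[A]] =
    right match {
      case Nil => splits.map(_.reverse).reverse
      // Duplicate: open a new sublist and forget what was seen so far.
      case head :: _ if seen(head) => loop(Set.empty, Nil :: splits, right)
      // Otherwise keep extending the current sublist.
      case head :: tail => loop(seen + head, (head :: splits.head) :: splits.tail, tail)
    }
  // Seed with one empty sublist; return Nil directly for empty input.
  if (xs.isEmpty) Nil else loop(Set.empty, List(Nil), xs)
}
```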

You will basically need to iterate with a look-up table. I can provide help with the following immutable and functional tailrec implementation.
import scala.collection.immutable.HashSet
import scala.annotation.tailrec
val list = List("x","y","z","e","r","y","g","a", "x", "m", "z", "ll")
def splitListOnDups[A](list: List[A]): List[List[A]] = {
  @tailrec
  def _split(list: List[A], cList: List[A], hashSet: HashSet[A], lists: List[List[A]]): List[List[A]] = {
    list match {
      case Nil => lists // only reached for an empty input list
      case a :: Nil if hashSet.contains(a) => List(a) +: (cList +: lists)
      case a :: Nil => (a +: cList) +: lists
      case a :: tail if hashSet.contains(a) => _split(tail, List(a), hashSet, cList +: lists)
      case a :: tail => _split(tail, a +: cList, hashSet + a, lists)
    }
  }
  _split(list, List[A](), HashSet[A](), List[List[A]]()).reverse.map(_.reverse)
}
def splitListOnDups2[A](list: List[A]): List[List[A]] = {
  @tailrec
  def _split(list: List[A], cList: List[A], hashSet: HashSet[A], lists: List[List[A]]): List[List[A]] = {
    list match {
      case Nil => lists // only reached for an empty input list
      case a :: Nil if hashSet.contains(a) => List(a) +: (cList +: lists)
      case a :: Nil => (a +: cList) +: lists
      // Reset the look-up table, but keep a in it: a is the first element of the new sublist.
      case a :: tail if hashSet.contains(a) => _split(tail, List(a), HashSet(a), cList +: lists)
      case a :: tail => _split(tail, a +: cList, hashSet + a, lists)
    }
  }
  _split(list, List[A](), HashSet[A](), List[List[A]]()).reverse.map(_.reverse)
}
splitListOnDups(list)
// List[List[String]] = List(List(x, y, z, e, r), List(y, g, a), List(x, m), List(z, ll))
splitListOnDups2(list)
// List[List[String]] = List(List(x, y, z, e, r), List(y, g, a, x, m, z, ll))

Related

Optimize solution for the given coding problem

I am a newbie in Scala and I am trying to resolve the following simple coding problem:
Write a listOfLists recursive method that takes a number of strings as varargs and then
creates a list of lists of strings, with one less string in each, so for example:
listOfLists("3","2","1") should give back: List(List("3","2","1"), List("2","1"), List("1"))
The solution I've found is the following:
def listOfLists(strings: String*): List[List[String]] = {
  val strLength = strings.length
  @tailrec // needs: import scala.annotation.tailrec
  def recListOfList(result: List[List[String]], accumulator: Int): List[List[String]] = {
    accumulator match {
      case x if x < strLength =>
        recListOfList(result :+ (strings.toList.takeRight(strings.length - accumulator)), accumulator + 1)
      case _ => result
    }
  }
  val res: List[List[String]] = List(strings.toList)
  recListOfList(res, 1)
}
The solution works, but I think it could be written much better.
One problem I can see is that I convert the varargs to a List with the toList method. A hint in the problem says to use the eta expansion _*, but I don't know how to use it in this context.
I also tried to find a more efficient way to write the following expression:
strings.toList.takeRight(strings.length - accumulator)
but this is the only solution that came to mind.
Any review is welcome (even saying that this solution is a total mess :D, as long as you give the reasons).
This meets all the specified requirements.
def listOfLists(strings: String*): List[List[String]] =
if (strings.isEmpty) Nil
else strings.toList :: listOfLists(strings.tail:_*)
You can do this:
def listOfLists(strings: String*): List[List[String]] = {
  @annotation.tailrec
  def loop(remaining: List[String], acc: List[List[String]]): List[List[String]] =
    remaining match {
      case head :: tail =>
        loop(remaining = tail, (head :: tail) :: acc)
      case Nil =>
        acc.reverse
    }
  loop(remaining = strings.toList, acc = List.empty)
}
I believe the code is self-explanatory; but, feel free to ask any questions you may have.
Not a recursive method but worth noting that tails in the standard library can do most of this. Then map and filter to convert to correct type and filter out empty list.
def listOfLists(strings: String *): List[List[String]] = strings.tails.map(_.toList).filter(_.nonEmpty).toList
Test:
scala> listOfLists("a","b","c")
val res6: List[List[String]] = List(List(a, b, c), List(b, c), List(c))
Using almost the same idea you can rewrite your solution in cleaner way:
def listOfLists(strings: String*): List[List[String]] = {
  @tailrec // needs: import scala.annotation.tailrec
  def recListOfList(curr: List[String], accumulator: Seq[List[String]]): Seq[List[String]] = {
    curr match {
      case head :: tail => recListOfList(tail, curr +: accumulator)
      case _ => accumulator
    }
  }
  recListOfList(strings.toList, Nil)
    .reverse
    .toList
}
With the splat operator (_*), which adapts a sequence (Array, List, Seq, Vector, etc.) to a varargs parameter, you can write a shorter solution, but it will not be tail-recursive:
def listOfLists(strings: String*): List[List[String]] = {
val curr = strings.toList
curr match {
case Nil => Nil
case x :: tail => curr :: listOfLists(tail:_*)
}
}
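To make the call syntax concrete, here is a minimal sketch of passing an existing sequence to a varargs parameter. The helper joinAll is hypothetical and exists only to demonstrate the `: _*` ascription.

```scala
// Hypothetical helper used only to demonstrate the call syntax.
def joinAll(parts: String*): String = parts.mkString("-")

// Arguments passed one by one.
val fromVarargs = joinAll("3", "2", "1")

// An existing sequence expanded into the varargs parameter.
val items = List("3", "2", "1")
val fromList = joinAll(items: _*)
```

Without the `: _*` ascription the call would not compile, because a List[String] is not a String.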
From Scala 2.13 you can use List.unfold and Option.when:
def listOfLists(strings: String*): List[List[String]] = {
  List.unfold(strings) { s =>
    // Explicit tuple: (element to emit, next state)
    Option.when(s.nonEmpty)((s.toList, s.tail))
  }
}

Type mismatch when dealing with list comprehensions

def combinations(list: List[(Char,Int)]): List[List[(Char,Int)]] = {
list match {
case List() => List()
case x::xs => for(o <- List.range(0,x._2 + 1)) yield List((x._1,o)) :: combinations(xs)
}
}
This function won't compile properly as the comprehension will convert my result to a list resulting in a
List(List(List((Char,Int))))
The function is meant to find all the sub lists of a List(Char,Int) taking in consideration that ('a',2) is a sub list of ('a',5)
My question is can I somehow stop the comprehension making the end result a list? Am I missing the whole point of comprehension? Is this function even logically correct?
A for comprehension whose generator is a List yields a List. You are then putting that yielded List into another list.
The following compiles:
def combinations(list: List[(Char, Int)]): List[List[(Char, Int)]] = {
  list match {
    case List() => List()
    case (c, i) :: xs =>
      // The comprehension yields one List[(Char, Int)] for this character.
      val res = for {
        o <- List.range(0, i + 1)
      } yield (c, o)
      res :: combinations(xs)
  }
}
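If the goal is every way of picking a count for each character, one possible sketch puts both generators in a single comprehension, so the comprehension itself flattens the nesting. It assumes that zero counts stay in the result, as in the range used in the question; the name allCombinations is illustrative.

```scala
def allCombinations(list: List[(Char, Int)]): List[List[(Char, Int)]] = list match {
  // Exactly one way to choose from nothing: the empty list.
  case Nil => List(Nil)
  case (c, n) :: rest =>
    for {
      count <- List.range(0, n + 1)   // every admissible count for this character
      tail  <- allCombinations(rest)  // every choice for the remaining characters
    } yield (c, count) :: tail
}
```

Note the base case: returning List() there would make the comprehension yield nothing at all, since the second generator would be empty.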

Scala for loop Replace on List

This may be easy to fix, but can you help me out or guide me to a solution? I have a remove function that goes through a List of tuples, List[(String,Any)], and I'm trying to replace the value at index 1 of the tuple with Nil while the list is being looped over.
But when I try to replace the current v with Nil, it says that v is a "val" and cannot be reassigned. I understand that Scala lists are immutable, so maybe that is what is going wrong?
I tried a tail-recursive implementation as well, but when I get out of the def there is a type mismatch: found Unit, required Option[Any].
// remove(k) removes one value v associated with key k
// from the dictionary, if any, and returns it as Some(v).
// It returns None if k is associated to no value.
def remove(key:String):Option[Any] = {
for((k,v) <- d){
if(k == key){
var temp:Option[Any] = Some(v)
v = Nil
return temp
}
}; None
}
Here was the other way of trying to figure out
def remove(key:String):Option[Any] = {
def removeHelper(l:List[(String,Any)]):List[(String,Any)] =
l match {
case Nil => Nil
case (k,v)::t => if (key == k) t else (k,v)::removeHelper(t)
}
d = removeHelper(d)
}
Any suggestions? This is homework/a project for school; I thought I'd mention that for people who don't like to help with homework.
Well, there are many ways of answering that question. I'll be outlining the ones I can think of here with my own implementations, but the list is by no means exhaustive (nor, probably, the implementations optimal).
First, you can try with existing combinators - the usual suspects are map, flatMap, foldLeft and foldRight:
def remove_flatMap(key: String, list: List[(String, Any)]): List[(String, Any)] =
  // The Java developer in me rebels against creating that many "useless" instances.
  list.flatMap { a => if (a._1 == key) Nil else List(a) }
def remove_foldLeft(key: String, list: List[(String, Any)]): List[(String, Any)] =
  list.foldLeft(List[(String, Any)]()) { (acc, a) =>
    if (a._1 == key) acc
    else a :: acc
  // Note the call to reverse here.
  }.reverse
// This is more obviously correct than the foldLeft version, but is not tail-recursive.
def remove_foldRight(key: String, list: List[(String, Any)]): List[(String, Any)] =
  list.foldRight(List[(String, Any)]()) { (a, acc) =>
    if (a._1 == key) acc
    else a :: acc
  }
The problem with these is that, as far as I'm aware, you cannot stop them once a certain condition has been reached: I don't think they solve your problem directly, since they remove all instances of key rather than the first.
You also want to note that:
foldLeft must reverse the list once it's done, since it appends elements in the "wrong" order.
foldRight doesn't have that flaw, but is not tail recursive: it will cause memory issues on large lists.
map cannot be used for your problem, since it only lets us modify a list's values but not its structure.
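One combinator that does stop at the first match is span. As a sketch only (the name removeFirst and the tuple return type are assumptions, not something from the question), it can remove just the first pair with a given key and report the removed value:

```scala
// Sketch: remove only the first pair whose key matches and report its value.
def removeFirst[A, B](key: A, xs: List[(A, B)]): (Option[B], List[(A, B)]) = {
  // span splits the list just before the first pair whose key matches.
  val (before, from) = xs.span(_._1 != key)
  from match {
    case (_, value) :: after => (Some(value), before ::: after)
    case Nil                 => (None, xs) // key not present: list unchanged
  }
}
```

Returning the new list alongside the Option keeps the function pure; the caller decides whether to store the result back into a var.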
You can also use your own implementation. I've included two versions, one that is tail-recursive and one that is not. The tail-recursive one is obviously the better one, but is also more verbose (I blame the ugliness of using a List[(String, Any)] rather than a Map[String, Any]):
def remove_nonTailRec(key: String, list: List[(String, Any)]): List[(String, Any)] = list match {
  case h :: t if h._1 == key => t
  // This line is the reason our function is not tail-recursive.
  case h :: t => h :: remove_nonTailRec(key, t)
  case Nil => Nil
}
def remove_tailRec(key: String, list: List[(String, Any)]): List[(String, Any)] = {
  @scala.annotation.tailrec
  def run(list: List[(String, Any)], acc: List[(String, Any)]): List[(String, Any)] = list match {
    // We've been aggregating in the "wrong" order again...
    case h :: t if h._1 == key => acc.reverse ::: t
    case h :: t => run(t, h :: acc)
    case Nil => acc.reverse
  }
  run(list, Nil)
}
The better solution is of course to use the right tool for the job: a Map[String, Any].
Note that I do not think I answer your question fully: my examples remove key, while you want to set it to Nil. Since this is your homework, I'll let you figure out how to change my code to match your requirements.
List is the wrong collection to use if any key should only exist once. You should be using Map[String,Any]. With a list:
1. You have to do extra work to prevent duplicate entries.
2. Retrieval of a key will be slower the further down the list it appears. Attempting to retrieve a non-existent key will be slow in proportion to the size of the list.
I guess point 2 is maybe why you are trying to replace it with Nil rather than just removing the key from the list. Nil is not the right thing to use here, really. You are going to get different things back if you try and retrieve a non-existent key compared to one that has been removed. Is that really what you want? How much sense does it make to return Some(Nil), ever?
Here's a couple of approaches which work with mutable or immutable lists, but which don't assume that you successfully stopped duplicates creeping in...
val l1: List[(String, Any)] = List(("apple", 1), ("pear", "violin"), ("banana", Unit))
val l2: List[(Int, Any)] = List((3, 1), (4, "violin"), (7, Unit))
def remove[A,B](key: A, xs: List[(A,B)]) = (
xs collect { case x if x._1 == key => x._2 },
xs map { case x if x._1 != key => x; case _ => (key, Nil) }
)
scala> remove("apple", l1)
res0: (List[Any], List[(String, Any)]) = (List(1),List((apple,List()), (pear,violin), (banana,object scala.Unit)))
scala> remove(4, l2)
res1: (List[Any], List[(Int, Any)]) = (List(violin),List((3,1), (4,List()), (7,object scala.Unit)))
scala> remove("snark", l1)
res2: (List[Any], List[(String, Any)]) = (List(),List((apple,1), (pear,violin), (banana,object scala.Unit)))
That returns a list of matching values (so an empty list rather than None if no match) and the remaining list, in a tuple. If you want a version that just completely removes the unwanted key, do this...
def remove[A,B](key: A, xs: List[(A,B)]) = (
xs collect { case x if x._1 == key => x._2 },
xs filter { _._1 != key }
)
But also look at this:
scala> l1 groupBy {
  case (k, _) if k == "apple" => "removed"
  case _ => "kept"
}
res3: scala.collection.immutable.Map[String,List[(String, Any)]] = Map(removed -> List((apple,1)), kept -> List((pear,violin), (banana,object scala.Unit)))
That is something you could develop a bit. All you need to do is add ("apple", Nil) to the "kept" list and extract the value(s) from the "removed" list.
Note that I am using the List combinator functions rather than writing my own recursive code; this usually makes for clearer code and is often as fast or faster than a hand-rolled recursive function.
Note also that I don't change the original list. This means my function works with both mutable and immutable lists. If you have a mutable list, feel free to assign my returned list as the new value for your mutable var. Win, win.
But please use a map for this. Look how simple things become:
val m1: Map[String, Any] = Map(("apple", 1), ("pear", "violin"), ("banana", Unit))
val m2: Map[Int, Any] = Map((3, 1), (4, "violin"), (7, Unit))
def remove[A,B](key: A, m: Map[A,B]) = (m.get(key), m - key)
scala> remove("apple", m1)
res0: (Option[Any], scala.collection.immutable.Map[String,Any]) = (Some(1),Map(pear -> violin, banana -> object scala.Unit))
scala> remove(4, m2)
res1: (Option[Any], scala.collection.immutable.Map[Int,Any]) = (Some(violin),Map(3 -> 1, 7 -> object scala.Unit))
scala> remove("snark", m1)
res2: (Option[Any], scala.collection.immutable.Map[String,Any]) = (None,Map(apple -> 1, pear -> violin, banana -> object scala.Unit))
The combinator functions make things easier, but when you use the right collection, it becomes so easy that it is hardly worth writing a special function. Unless, of course, you are trying to hide the data structure - in which case you should really be hiding it inside an object.
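As a sketch of what hiding the data structure inside an object could look like, the class below wraps an immutable Map in a private var and exposes a remove with the signature from the question. The class name Dictionary and the put method are assumptions made for illustration.

```scala
// Sketch: the Map is an implementation detail; callers only see put and remove.
final class Dictionary {
  private var entries: Map[String, Any] = Map.empty

  def put(key: String, value: Any): Unit =
    entries = entries + (key -> value)

  // Removes the value associated with key, if any, and returns it as Some(v).
  // Returns None if key is associated with no value.
  def remove(key: String): Option[Any] = {
    val removed = entries.get(key)
    entries = entries - key
    removed
  }
}

val d = new Dictionary
d.put("apple", 1)
val first = d.remove("apple")  // the stored value, wrapped in Some
val second = d.remove("apple") // the key is gone now
```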

Run length encoding using Scala

Given a list of elements, some of which are repeated multiple times, I need to produce a new list of tuples, where each tuple contains the number of times an element is repeated in a row and the element itself.
For example, given
println(func(List())) // should be empty list
println(func(List(1, 1))) // (2,1) <- 1 is repeated 2 times
println(func(List(1, 1, 2, 1))) // (2,1)(1,2)(1,1)
This is my best attempt at this point. I feel that I am missing something very basic; please help me understand what.
def func[X](xs: List[X]): List[(Int, X)] = xs match {
case Nil => Nil
case y :: ys => ys match {
case Nil => (1, y) :: Nil
case z :: zs => if (y != z) (ys.prefixLength(_ == ys.head), y) :: func(ys)
else func(ys)
}
}
After analyzing the problem, it seems to me that at the point where I recursively call func(ys), ys does not have enough information to figure out the count of elements. Say we're dealing with List(1,1,1,2). So y is 1, z is 1 and (1::(2::Nil)) is zs. Following my logic above, the fact that 1 was seen 2 times is lost for the next call.
The problem may be that I am not thinking about the problem the right way. What I have in mind is "go along the list until you find that this element is not the same as the previous element, at which point count the number of occurrences of that element and make it into a tuple".
I recognize that in the above scenario (in my code) the problem is that when the numbers are in fact the same (1,1), the fact that we already saw a number is not reflected anywhere. But where can this be done, given that I am not yet ready to compose a tuple?
In answering this question, please stick to the case structure. I realize that there may be other, cleaner ways to address this problem, but I would like to better understand what I am doing wrong here.
You're on the right track. The problem is that you can't just incrementally build the result list here—you'll have to pull the head off the list you get from the recursive call and check whether you need to add a new pair or increment the count of the last one:
def func[X](xs: List[X]): List[(Int, X)] = xs match {
  case Nil => Nil
  case y :: ys => func(ys) match {
    case (c, `y`) :: rest => (c + 1, y) :: rest
    case rest             => (1, y) :: rest
  }
}
Note the backticks around y in the nested match pattern—this is necessary to avoid just defining a new variable named y.
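The difference the backticks make can be seen in a small standalone sketch. The names target, withBackticks and withoutBackticks are invented for this illustration.

```scala
val target = 2

def withBackticks(n: Int): String = n match {
  case `target` => "equal to target" // compares n with the existing value
  case _        => "something else"
}

def withoutBackticks(n: Int): String = n match {
  // Introduces a fresh variable named target that shadows the outer one,
  // so this case matches every input.
  case target => s"bound a new variable: $target"
}
```

Here withBackticks(5) falls through to the second case, while withoutBackticks(5) matches its only case and binds 5 to the new variable.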
Here's a simpler solution using span:
def runLength[T](xs: List[T]): List[(Int, T)] = xs match {
  case Nil => List()
  case x :: l => {
    val (front, back) = l.span(_ == x)
    (front.length + 1, x) :: runLength(back)
  }
}
It is indeed run-length encoding.
Here's a straightforward, though generic, attempt...
package rrs.scribble
object RLE {
  def rle[T](tSeq: List[T]): List[(Int, T)] = {
    // rle is never empty here: it is seeded with a zero-count pair for the first element.
    def doRLE(seqT: List[T], rle: List[(Int, T)]): List[(Int, T)] =
      seqT match {
        case t :: moreT if t == rle.head._2 => doRLE(moreT, (rle.head._1 + 1, t) :: rle.tail)
        case t :: moreT => doRLE(moreT, (1, t) :: rle)
        case Nil => rle
      }
    if (tSeq.isEmpty)
      List.empty[(Int, T)]
    else
      doRLE(tSeq, List((0, tSeq.head))).reverse
  }
}
In the REPL:
scala> import rrs.scribble.RLE._
import rrs.scribble.RLE._
scala> rle(List(1, 1, 2, 1))
res0: List[(Int, Int)] = List((2,1), (1,2), (1,1))
This is called run-length encoding. Check out problem 10 of 99 Scala Problems (click on the problem numbers for solutions).

How to replace(fill) None entries on List of Options from another List using idiomatic Scala?

I have a List[Option[MyClass]] with None in random positions and I need to 'fill' that list again, from a List[MyClass], maintaining the order.
Here are sample lists and expected result:
val listA = List(Some(3),None,Some(5),None,None)
val listB = List(7,8,9)
val expectedList = List(Some(3), Some(7), Some(5), Some(8), Some(9))
So, what would be an idiomatic Scala way to process that list?
def fillL[T](a: List[Option[T]], b: List[T]) = {
  val iterB = b.iterator
  a.map(_.orElse(Some(iterB.next())))
}
The iterator solution is arguably idiomatic Scala, and is definitely concise and easy to understand, but it's not functional—any time you call next on an iterator you're firmly in the land of side effects.
A more functional approach would be to use a fold:
def fillGaps[A](gappy: List[Option[A]], filler: List[A]) =
  gappy.foldLeft((List.empty[Option[A]], filler)) {
    case ((current, fs), Some(item)) => (current :+ Some(item), fs)
    case ((current, f :: fs), None) => (current :+ Some(f), fs)
    case ((current, Nil), None) => (current :+ None, Nil)
  }._1
Here we move through the gappy list while maintaining two other lists: one for the items we've processed, and the other for the remaining filler elements.
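This kind of solution isn't necessarily better than the other (Scala is designed to let you mix functional and imperative constructions in that way), but it does have potential advantages: there is no hidden mutable state, and the case where the filler runs out is handled explicitly instead of throwing from next.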
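One detail of the fold is that appending to a List with :+ copies the list each time, so the whole fold is quadratic in the length of the input. A possible variant, sketched here under the hypothetical name fillGapsLinear, prepends instead and reverses once at the end:

```scala
def fillGapsLinear[A](gappy: List[Option[A]], filler: List[A]): List[Option[A]] =
  gappy.foldLeft((List.empty[Option[A]], filler)) {
    // Existing value: keep it, leave the filler untouched.
    case ((done, fs), Some(item)) => (Some(item) :: done, fs)
    // Gap and filler available: consume one filler element.
    case ((done, f :: fs), None) => (Some(f) :: done, fs)
    // Gap but no filler left: keep the None.
    case ((done, Nil), None) => (None :: done, Nil)
  }._1.reverse // the accumulator was built back to front
```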
I'd just write it in the straightforward way, matching on the heads of the lists and handling each case appropriately:
def fill[A](l1: List[Option[A]], l2: List[A]) = (l1, l2) match {
  case (Nil, _) => Nil
  case (_, Nil) => l1
  case (Some(x) :: xs, _) => Some(x) :: fill(xs, l2)
  case (None :: xs, y :: ys) => Some(y) :: fill(xs, ys)
}
Presumably once you run out of things to fill it with, you just leave the rest of the Nones in there.
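The iterator-based version shown earlier does not behave that way: calling next on an exhausted iterator throws. A guarded variant, sketched here with the made-up name fillGuarded, checks hasNext first and leaves the remaining Nones in place:

```scala
def fillGuarded[T](a: List[Option[T]], b: List[T]): List[Option[T]] = {
  val it = b.iterator
  // orElse evaluates its argument only when the element is None,
  // so the iterator advances only when a gap is actually filled.
  a.map(_.orElse(if (it.hasNext) Some(it.next()) else None))
}
```

This keeps the brevity of the iterator approach, at the cost of still relying on a side effect inside map.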