Scala partition into more than two lists

I have a list in Scala that I'm trying to partition into multiple lists based on a predicate that involves multiple elements of the list. For example, if I have
a: List[String] = List("a", "ab", "b", "abc", "c")
I want to get b: List[List[String]], a List of every List[String] whose string lengths sum to 3, e.g. List(List("a", "b", "c"), List("abc"), List("ab", "a"), ...)
[Edit] It needs to run in a reasonable time for lists of length 50 or less.

It is not possible to build an algorithm cheaper than O(2^n * p) for an arbitrary predicate p (where p is the cost of one predicate evaluation), because in the worst case every subset must be evaluated. You will never achieve something that works for n == 50.
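To get a feel for the numbers involved, a quick back-of-the-envelope check:
// number of non-empty subsets of a 50-element list
val subsetCount = BigInt(2).pow(50) - 1  // = 1125899906842623, roughly 10^15 predicate evaluations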

Build all possible sublists and filter:
def filter[A](list: List[A])(predicate: List[A] => Boolean): List[List[A]] = {
  (for {
    i <- 1 to list.length
    subList <- list.combinations(i)
    if predicate(subList)
  } yield subList).toList
}
val a = List("a", "ab", "b", "abc", "c")
val result = filter(a)(_.foldLeft(0)(_ + _.length) == 3)
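For the example list above, the expected result (worked out by hand, not a captured REPL run) would be:
// List(List("abc"), List("a", "ab"), List("ab", "b"), List("ab", "c"), List("a", "b", "c"))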

I think Sergey is on a good track here, but we can optimize his code a little. First of all, notice that if the target sum of string lengths is N, then there is no need to check combinations of more than N strings, since every string is at least one character long. Additionally, we can get away without the for syntactic sugar and use the sum method instead of the much more generic (and thus probably not as quick) foldLeft.
For clarity's sake, let's first define a small helper function which will compute the sum of strings lengths:
def sumOfStr(list: List[String]) = list.map(_.length).sum
And now the main method:
def split(list: List[String], sum: Int) =
(1 to sum).map(list.combinations(_).filter(sumOfStr(_) == sum)).flatten.toList
EDIT: With our powers combined, we give you a still very inefficient, but hey-that's-the-best-we-can-do-in-reasonable-time version:
def sumOfStr(lst: List[String]) = {
  var sum = 0
  lst.foreach { s => sum += s.length }
  sum
}

def split(lst: List[String], sum: Int) =
  (1 to sum).par
    .map(lst.combinations(_).filter(sumOfStr(_) == sum))
    .flatten.toList

Related

Scala reduce a List based on a condition

I have a List of a certain type that I want to reduce based on a condition. The type wraps an Interval (a DateTime interval with a start and an end):
case class MyType(interval: Interval, value: Double)
I have a List[MyType] of entries that I want to reduce to a smaller List[MyType] by merging entries with adjacent intervals and the same value. I also do not want to go over the List more than once, which my current approach does.
Say I have:
val a = MyType(interval1, 2)
val b = MyType(interval2, 2)
val c = MyType(interval3, 1)
val d = MyType(interval4, 6)
val e = MyType(interval5, 2)
val original = List(a, b, c, d, e)
I now have to reduce the original List based on the following conditions:
1. the intervals must be contiguous; the merged entry takes the start of the first entry and the end of the second
2. the double values must be the same
So, assuming that interval1 and interval2 are contiguous, the result should look like:
val result = Seq(MyType(new Interval(a.interval.start, b.interval.end), 2), c, d, e)
Is there a much more elegant solution or an idea?
In the reduce function, check if the condition is true, and if it is, return the current accumulator instead of what you would otherwise compute.
Here's how you would sum only even numbers:
Seq(1,4,6,3).foldLeft(0)( (acc, a) =>
if (a % 2 == 0) acc + a else acc
)
res5: Int = 10
Response to the edited question: it appears you have some conditions that have to hold for consecutive elements. In that case you can use the sliding method.
Seq(a, b, c, d, e).sliding(2).foldLeft(0.0) {
  case (acc, Seq(MyType(ai, a), MyType(bi, b))) =>
    if (ai.end == bi.start) acc + a else acc
}
But you have probably guessed that it would not be as performant as you would like. I hope you are not doing any premature optimization, because, you know, that's the root of all evil. If you really do need performance, rewrite the code in terms of while loops (i.e. fall back to a Java-like style).
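If you go that route, a minimal while-loop sketch of the merge might look like the following (assuming the MyType and Interval definitions from the question, with the .start/.end accessors used above, and entries already ordered by start):
def reduceLoop(xs: List[MyType]): List[MyType] = {
  val out = scala.collection.mutable.ListBuffer.empty[MyType]
  val arr = xs.toArray
  var i = 0
  while (i < arr.length) {
    var current = arr(i)
    var j = i + 1
    // extend `current` while the next entry starts where it ends and carries the same value
    while (j < arr.length && current.interval.end == arr(j).interval.start && current.value == arr(j).value) {
      current = MyType(new Interval(current.interval.start, arr(j).interval.end), current.value)
      j += 1
    }
    out += current
    i = j
  }
  out.toList
}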
This should work:
def reduce(xs: List[MyType]): List[MyType] = {
  xs match {
    case a :: b :: tail =>
      if (a.interval.end == b.interval.start && a.value == b.value)
        reduce(MyType(new Interval(a.interval.start, b.interval.end), a.value) :: tail)
      else
        a :: reduce(b :: tail)
    case _ => xs
  }
}
The if condition might need minor tweaking depending on your exact needs, but the algorithm should work.
Given a list xs
If the first two items a and b can be merged into c, merge them and go back to step 1 with xs = c :: tail
If a and b cannot be merged, reduce all elements but the first, and prepend a to that result
Otherwise (list has 1 element or is empty), return xs
Note that your task can produce multiple distinct solutions which cannot be reduced further.
So as a result you will get a set of solutions: Set[Set[MyType]]
I use Set[MyType] instead of the proposed List[MyType] and Seq[MyType] because order is not important here and my answer needs to compare different solutions (in order to avoid duplicates).
My answer makes no assumptions about the order of items; any order is OK.
Besides that, in order to simplify the code, I have replaced Interval with two fields, from and to, which can easily be converted.
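For reference, converting from the Interval-based MyType in the question to the from/to variant defined below could look roughly like this (a sketch, assuming a joda-time Interval with getStartMillis/getEndMillis):
//illustrative only: build the from/to variant from an Interval plus a value
def fromInterval(interval: org.joda.time.Interval, value: Double): MyType =
  MyType(interval.getStartMillis, interval.getEndMillis, value)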
Here is the code for reduction:
case class MyType(from: Long, to: Long, value: Double)
object MyType {
//Returns all possible variants of the reduced source.
//If reduction is not possible, returns an empty set.
private def strictReduce(source: Set[MyType]): Set[Set[MyType]] = {
if (source.size <= 1) {Set.empty} else {
val active = source.head //get some item
val otherItems = source.tail //all other items
val reducedWithActive: Set[Set[MyType]] = otherItems.flatMap {
case after if active.to == after.from =>
//we have already found a reduction (active->after),
// so further reductions are not strictly required
reduce(otherItems - after + MyType(active.from, after.to, active.value))
case before if before.to == active.from =>
//we have already found a reduction (before->active),
// so further reductions are not strictly required
reduce(otherItems - before + MyType(before.from, active.to, active.value))
case notContinuous => Set.empty[Set[MyType]]
}
//check if we can reduce items without active
val reducedIgnoringActive = strictReduce(otherItems).
//if so, re-insert active and try to reduce it further, but not strictly anymore
flatMap (reducedOther => reduce(reducedOther + active))
reducedWithActive ++ reducedIgnoringActive
}
}
//Returns all possible variants of the reduced source.
//If reduction is not possible, returns the source as the single result.
private def reduce(source: Set[MyType]): Set[Set[MyType]] = strictReduce(source) match {
case empty if empty.isEmpty => Set(source)
case reduced => reduced
}
//Reduces source, which contains items with different values
def reduceAll(source: Set[MyType]): Set[Set[MyType]] = source.
groupBy(_.value). //divide by values, because they are not merge-able
mapValues(reduce). //reduce for every group
values.reduceLeft((solutionSetForValueA, solutionSetForValueB) =>
//merge solutions for different groups
for(subSolutionForValueA <- solutionSetForValueA;
subSolutionForValueB <- solutionSetForValueB)
yield (subSolutionForValueA ++ subSolutionForValueB) //merge subSolutions
)
}
And here is the sample, which uses it:
object Example extends App {
val source = Set(
MyType(0L, 1L, 1.0),
MyType(1L, 2L, 2.0), //different value
MyType(1L, 3L, 1.0), //competing with next
MyType(1L, 4L, 1.0), //competing with prev
MyType(3L, 5L, 1.0), //joinable with pre-prev
MyType(2L, 4L, 2.0), //joinable with second
MyType(0L, 4L, 3.0) //lonely
)
val solutions: Set[Set[MyType]] = MyType.reduceAll(source)
//here you could choose the best solution (for example by size)
//printing out
solutions.foreach(solution => println(solution.toList.sortBy(_.from).sortBy(_.value).
map(item => s"${item.from}->${item.to}(${item.value})").mkString(", ")))
}
My result is:
0->5(1.0), 1->4(1.0), 1->4(2.0), 0->4(3.0)
0->4(1.0), 1->5(1.0), 1->4(2.0), 0->4(3.0)
Here is what I came up with:
def reduce(accumulator: Seq[MyType], original: Seq[MyType]): Seq[MyType] = original match {
  case Nil => accumulator
  case head :: xs =>
    val found = xs.find(_.interval.getStart == head.interval.getEnd)
    if (found.isDefined && found.get.value == head.value)
      reduce(
        accumulator :+ MyType(new Interval(head.interval.getStart, found.get.interval.getEnd), head.value),
        original.diff(Seq(found.get, head))
      )
    else
      reduce(
        accumulator :+ head,
        xs
      )
}

Finding index of row from a list

I am trying to get the index of a row in Scala from a list of lists of integers (List[List[Int]]). I already have two functions that, given the row/column index and the grid as parameters, output all the elements in that row. What I need is a function that, given a particular element (e.g. 0), finds its row and column indices and puts them in a list: List[(Int, Int)]. I tried to code a function that gives back an index when it encounters 0 and then passed that function over the whole grid. I don't know if I'm doing it the right way. Also, I couldn't figure out how to return the list.
Also, I cannot use any loops.
Thanks in advance.
def Possibilities(): List[Int] = {
def getRowIndex(elem: Int): Int = elem match
{
case 0 => sudoku.grid.indexOf(sudoku.row(elem))
case x => x
}
val result1 = sudoku.grid map {row => row map getRowIndex}
}
I think with two dimensions it is much easier to write such a method with for comprehensions.
Given a List[List[Int]] like this:
val grid = List(
List(1, 2, 3),
List(4, 5, 6),
List(3, 2, 1))
we can simply walk through all the rows and columns, and check whether each element is the one we are looking for:
def possibilities(findElem: Int): List[(Int, Int)] = {
for (
(row, rowIndex) <- grid.zipWithIndex;
(elem, colIndex) <- row.zipWithIndex
if elem == findElem
) yield (rowIndex, colIndex)
}
The yield keyword collects the results of the for loop into a collection. More details on Scala's for loop syntax, and how it desugars into map, flatMap, etc., can be found in the Scala documentation.
So, if you don't want to use a for loop, simply 'translate' it into an equivalent expression using map, flatMap, and withFilter:
def possibilities(findElem: Int): List[(Int, Int)] = {
grid.zipWithIndex flatMap { rowAndIndex =>
rowAndIndex._1.zipWithIndex.withFilter(_._1 == findElem) map { colAndIndex =>
(rowAndIndex._2, colAndIndex._2)
}
}
}
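For the sample grid above, both versions should behave identically; for example (expected output worked out by hand, not a captured REPL session):
possibilities(2)
// List((0,1), (2,1)) -- the value 2 appears at row 0, column 1 and at row 2, column 1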
Step 1: create a collection of all possible index tuples with a for comprehension (for looks like a loop, but it is not):
val tuples = for (i <- grid.indices; j <- grid.head.indices) yield (i, j)
Step 2: filter this collection:
tuples.filter { case (i, j) => grid(i)(j) == valueToFind }
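Both steps can also be fused into a single expression along these lines (a sketch; valueToFind names the element being searched for):
def possibilities(valueToFind: Int): List[(Int, Int)] =
  (for {
    i <- grid.indices
    j <- grid.head.indices
    if grid(i)(j) == valueToFind
  } yield (i, j)).toList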

Generate a List with values generated by function in Scala

I must generate some random numbers and sum them. Something like
result = generateList(range(0, max), generatorFunctionReturningInt()).foreach(sum _)
where generateList generates a List of size max with values produced by generatorFunctionReturningInt.
Or maybe something like
result = range(0, max).map(generatorFunctionReturningInt).foreach(sum _)
How about this?
Stream.continually(generatorFunctionReturningInt()).take(max).sum
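On Scala 2.13 and later, where Stream is deprecated, the equivalent would presumably be:
LazyList.continually(generatorFunctionReturningInt()).take(max).sum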
The companion objects for various collection types have some handy factory methods. Try:
Seq.fill(max)(generate)
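For example, with a throwaway random generator (names are illustrative):
val rnd = new scala.util.Random
val result = Seq.fill(max)(rnd.nextInt(100)).sum  // max random Ints below 100, summed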
Simply
(1 to max).map(_ => (new util.Random).nextInt(max)).sum
where max defines both the count of numbers and the (exclusive) upper bound of the random range.
The foreach method is intended for side-effecting functions (like println) that return nothing (Unit). foreach is not for returning results; use map instead:
val result = Range (0, max).map (generatorFunctionReturningInt).map (sum _)
specifically, sum is already predefined, isn't it?
val result = Range (0, max).map (generatorFunctionReturningInt).sum
working code (we all like working code)
val result = Range (0, 15).map (3 + _).sum
result: Int = 150
For this trivial case it is the same as Range (3, 18).sum.

Scala insert into list at specific locations

This is a problem that I did solve, but being a total imperative Scala noob, I feel that what I came up with is not at all elegant. Any ideas for improvement are appreciated.
val l1 = 4 :: 1 :: 2 :: 3 :: 4 :: Nil // original list
val insert = List(88,99) // list I want to insert on certain places
// method that finds all indexes of a particular element in a particular list
def indexesOf(element:Any, inList:List[Any]) = {
var indexes = List[Int]()
for(i <- 0 until inList.length) {
if(inList(i) == element) indexes = indexes :+ i
}
indexes
}
var indexes = indexesOf(4, l1) // get indexes where 4 appears in the original list
println(indexes)
var result = List[Any]()
// iterate through indexes and insert in front
for(i <- 0 until indexes.length) {
var prev = if(i == 0) 0 else indexes(i-1)
result = result ::: l1.slice(prev, indexes(i)) ::: insert
}
result = result ::: l1.drop(indexes.last) // append the last bit from original list
println(result)
I was thinking a more elegant solution would be achievable with something like this, but that's just pure speculation:
var final:List[Any] = (0 /: indexes) {(final, i) => final ::: ins ::: l1.slice(i, indexes(i))
def insert[A](xs: List[A], extra: List[A])(p: A => Boolean) = {
xs.map(x => if (p(x)) extra ::: List(x) else List(x)).flatten
}
scala> insert(List(4,1,2,3,4),List(88,99)){_ == 4}
res3: List[Int] = List(88, 99, 4, 1, 2, 3, 88, 99, 4)
Edit: explanation added.
Our goal here is to insert a list (called extra) in front of selected elements in another list (here called xs--commonly used for lists, as if one thing is x then lots of them must be the plural xs). We want this to work on any type of list we might have, so we annotate it with the generic type [A].
Which elements are candidates for insertion? When writing the function, we don't know, so we provide a function that says true or false for each element (p: A => Boolean).
Now, for each element in the list x, we check--should we make the insertion (i.e. is p(x) true)? If yes, we just build it: extra ::: List(x) is just the elements of extra followed by the single item x. (It might be better to write this as extra :+ x--add the single item at the end.) If no, we have only the single item, but we make it List(x) instead of just x because we want everything to have the same type. So now, if we have something like
4 1 2 3 4
and our condition is that we insert 5 6 before 4, we generate
List(5, 6, 4) List(1) List(2) List(3) List(5, 6, 4)
This is exactly what we want, except we have a list of lists. To get rid of the inner lists and flatten everything into a single list, we just call flatten.
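Incidentally, map followed by flatten is exactly what flatMap does, so the body could equivalently be written as:
xs.flatMap(x => if (p(x)) extra ::: List(x) else List(x))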
The flatten trick is cute; I wouldn't have thought of using map here myself. From my perspective this problem is a typical application for a fold, as you want to go through the list and "collect" something (the result list). Since we don't want our result list reversed, foldRight (a.k.a. :\) is the right variant here:
def insert[A](xs: List[A], extra: List[A])(p: A => Boolean) =
xs.foldRight(List[A]())((x,xs) => if (p(x)) extra ::: (x :: xs) else x :: xs)
Here's another possibility, using Seq#patch to handle the actual inserts. You need foldRight so that later indices are handled first (an insert shifts the indices of all elements after the insertion point, so it would be tricky otherwise).
def insert[A](xs: Seq[A], ys: Seq[A])(pred: A => Boolean) = {
val positions = xs.zipWithIndex filter(x => pred(x._1)) map(_._2)
positions.foldRight(xs) { (pos, xs) => xs patch (pos, ys, 0) }
}
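Applied to the same example as the earlier answers, this version should give the same result (worked out by hand rather than run):
insert(Seq(4, 1, 2, 3, 4), Seq(88, 99))(_ == 4)
// 88, 99, 4, 1, 2, 3, 88, 99, 4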

scala return on first Some in list

I have a list l: List[T1] and currently I'm doing the following:
myfun: T1 => Option[T2]
val x: Option[T2] = l.map { myfun(_) }.flatten.find(_ => true)
The myfun function returns None or Some, flatten throws away all the Nones, and find returns the first element of the list, if any.
This seems a bit hacky to me. I'm thinking there might be some for comprehension or similar that does this a bit less wastefully or more cleverly.
For example: I don't need any subsequent answers once myfun returns a Some while mapping over the list l.
How about:
l.toStream flatMap (myfun andThen (_.toList)) headOption
Stream is lazy, so it won't map everything in advance, but it won't remap things either. Instead of flattening things, convert Option to List so that flatMap can be used.
In addition to using toStream to make the search lazy, we can use Stream::collectFirst:
List(1, 2, 3, 4, 5, 6, 7, 8).toStream.map(myfun).collectFirst { case Some(d) => d }
// Option[String] = Some(hello)
// given def myfun(i: Int): Option[String] = if (i == 5) Some("hello") else None
This:
Transforms the List into a Stream in order to stop the search early.
Transforms elements using myfun into Option[T]s.
Collects the first mapped element which is not None and extracts it.
Starting Scala 2.13, with the deprecation of Streams in favor of LazyLists, this would become:
List(1, 2, 3, 4, 5, 6, 7, 8).to(LazyList).map(myfun).collectFirst { case Some(d) => d }
Well, this is almost it, but not quite:
val x = (l flatMap myfun).headOption
But you are returning an Option rather than a List from myfun, so this may not work. If so (I have no REPL to hand), then try instead:
val x = (l flatMap(myfun(_).toList)).headOption
Well, the for-comprehension equivalent is pretty easy:
(for (x <- l; y <- myfun(x)) yield y).headOption
which, if you actually do the translation, works out the same as what oxbow_lakes gave. Assuming reasonable laziness of List.flatMap, this is both a clean and efficient solution.
As of 2017, the previous answers seem to be outdated. I ran some benchmarks (list of 10 million Ints, first match roughly in the middle, Scala 2.12.3, Java 1.8.0, 1.8 GHz Intel Core i5). Unless otherwise noted, list and map have the following types:
list: scala.collection.immutable.List
map: A => Option[B]
Simply call map on the list: ~1000 ms
list.map(map).find(_.isDefined).flatten
First call toStream on the list: ~1200 ms
list.toStream.map(map).find(_.isDefined).flatten
Call toStream.flatMap on the list: ~450 ms
list.toStream.flatMap(map(_).toList).headOption
Call flatMap on the list: ~100 ms
list.flatMap(map(_).toList).headOption
First call iterator on the list: ~35 ms
list.iterator.map(map).find(_.isDefined).flatten
Recursive function find(): ~25 ms
def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
list match {
case Nil => None
case head::tail => map(head) match {
case None => find(tail, map)
case result @ Some(_) => result
}
}
}
Iterative function find(): ~25 ms
def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
for (elem <- list) {
val result = map(elem)
if (result.isDefined) return result
}
return None
}
You can further speed up things by using Java instead of Scala collections and a less functional style.
Loop over indices in java.util.ArrayList: ~15 ms
def find[A,B](list: java.util.ArrayList[A], map: A => Option[B]) : Option[B] = {
var i = 0
while (i < list.size()) {
val result = map(list.get(i))
if (result.isDefined) return result
i += 1
}
return None
}
Loop over indices in java.util.ArrayList with function returning null instead of None: ~10 ms
def find[A,B](list: java.util.ArrayList[A], map: A => B) : Option[B] = {
var i = 0
while (i < list.size()) {
val result = map(list.get(i))
if (result != null) return Some(result)
i += 1
}
return None
}
(Of course, one would usually declare the parameter type as java.util.List, not java.util.ArrayList. I chose the latter here because it's the class I used for the benchmarks. Other implementations of java.util.List will show different performance - most will be worse.)
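As a closing aside (not part of the benchmarks above): when the function already has type A => Option[B], collectFirst combined with Function.unlift is a concise standard-library way to stop at the first Some, though its performance relative to the variants above is untested here:
def findFirst[A, B](list: List[A], myfun: A => Option[B]): Option[B] =
  list.collectFirst(Function.unlift(myfun))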