I'm trying to code the fast Non Dominated Sorting algorithm (NDS) of Deb used in NSGA2 in immutable way using Scala.
But the problem seems more difficult than i think, so i simplify here the problem to make a MWE.
Imagine a population of Seq[A], and each A element is decoratedA with a list which contains pointers to other elements of the population Seq[A].
A function evalA(a:decoratedA) take the list of linkedA it contains, and decrement value of each.
Next i take a subset list decoratedAPopulation of population A, and call evalA on each. I have a problem, because between each iteration on element on this subset list decoratedAPopulation, i need to update my population of A with the new decoratedA and the new updated linkedA it contain ...
More problematic, each element of population need an update of 'linkedA' to replace the linked element if it change ...
Hum as you can see, it seem complicated to maintain all linked list synchronized in this way. I propose another solution bottom, which probably need recursion to return after each EvalA a new Population with element replaced.
How can i do that correctly in an immutable way ?
It's easy to code in a mutable way, but i don't find a good way to do this in an immutable way, do you have a path or an idea to do that ?
object test extends App{
case class A(value:Int) {def decrement()= new A(value - 1)}
case class decoratedA(oneAdecorated:A, listOfLinkedA:Seq[A])
// We start algorithm loop with A element with value = 0
val population = Seq(new A(0), new A(0), new A(8), new A(1))
val decoratedApopulation = Seq(new decoratedA(population(1),Seq(population(2),population(3))),
new decoratedA(population(2),Seq(population(1),population(3))))
def evalA(a:decoratedA) = {
val newListOfLinked = a.listOfLinkedA.map{e => e.decrement()
new decoratedA(a.oneAdecorated,newListOfLinked)}
}
def run()= {
//decoratedApopulation.map{
// ?
//}
}
}
Update 1:
About the input / output of the initial algorithm.
The first part of Deb algorithm (Step 1 to Step 3) analyse a list of Individual, and compute for each A : (a) domination count, the number of A which dominate me (the value attribute of A) (b) a list of A i dominate (listOfLinkedA).
So it return a Population of decoratedA totally initialized, and for the entry of Step 4 (my problem) i take the first non dominated front, cf. the subset of elements of decoratedA with A value = 0.
My problem start here, with a list of decoratedA with A value = 0; and i search the next front into this list by computing each listOfLinkedA of each of this A
At each iteration between step 4 to step 6, i need to compute a new B subset list of decoratedA with A value = 0. For each , i decrementing first the domination count attribute of each element into listOfLinkedA, then i filter to get the element equal to 0. A the end of step 6, B is saved to a list List[Seq[DecoratedA]], then i restart to step 4 with B, and compute a new C, etc.
Something like that in my code, i call explore() for each element of B, with Q equal at the end to new subset of decoratedA with value (fitness here) = 0 :
case class PopulationElement(popElement:Seq[Double]){
implicit def poptodouble():Seq[Double] = {
popElement
}
}
class SolutionElement(values: PopulationElement, fitness:Double, dominates: Seq[SolutionElement]) {
def decrement()= if (fitness == 0) this else new SolutionElement(values,fitness - 1, dominates)
def explore(Q:Seq[SolutionElement]):(SolutionElement, Seq[SolutionElement])={
// return all dominates elements with fitness - 1
val newSolutionSet = dominates.map{_.decrement()}
val filteredSolution:Seq[SolutionElement] = newSolutionSet.filter{s => s.fitness == 0.0}.diff{Q}
filteredSolution
}
}
A the end of algorithm, i have a final list of seq of decoratedA List[Seq[DecoratedA]] which contain all my fronts computed.
Update 2
A sample of value extracted from this example.
I take only the pareto front (red) and the {f,h,l} next front with dominated count = 1.
case class p(x: Int, y: Int)
val a = A(p(3.5, 1.0),0)
val b = A(p(3.0, 1.5),0)
val c = A(p(2.0, 2.0),0)
val d = A(p(1.0, 3.0),0)
val e = A(p(0.5, 4.0),0)
val f = A(p(0.5, 4.5),1)
val h = A(p(1.5, 3.5),1)
val l = A(p(4.5, 1.0),1)
case class A(XY:p, value:Int) {def decrement()= new A(XY, value - 1)}
case class ARoot(node:A, children:Seq[A])
val population = Seq(
ARoot(a,Seq(f,h,l),
ARoot(b,Seq(f,h,l)),
ARoot(c,Seq(f,h,l)),
ARoot(d,Seq(f,h,l)),
ARoot(e,Seq(f,h,l)),
ARoot(f,Nil),
ARoot(h,Nil),
ARoot(l,Nil))
Algorithm return List(List(a,b,c,d,e), List(f,h,l))
Update 3
After 2 hour, and some pattern matching problems (Ahum...) i'm comming back with complete example which compute automaticaly the dominated counter, and the children of each ARoot.
But i have the same problem, my children list computation is not totally correct, because each element A is possibly a shared member of another ARoot children list, so i need to think about your answer to modify it :/ At this time i only compute children list of Seq[p], and i need list of seq[A]
case class p(x: Double, y: Double){
def toSeq():Seq[Double] = Seq(x,y)
}
case class A(XY:p, dominatedCounter:Int) {def decrement()= new A(XY, dominatedCounter - 1)}
case class ARoot(node:A, children:Seq[A])
case class ARootRaw(node:A, children:Seq[p])
object test_stackoverflow extends App {
val a = new p(3.5, 1.0)
val b = new p(3.0, 1.5)
val c = new p(2.0, 2.0)
val d = new p(1.0, 3.0)
val e = new p(0.5, 4.0)
val f = new p(0.5, 4.5)
val g = new p(1.5, 4.5)
val h = new p(1.5, 3.5)
val i = new p(2.0, 3.5)
val j = new p(2.5, 3.0)
val k = new p(3.5, 2.0)
val l = new p(4.5, 1.0)
val m = new p(4.5, 2.5)
val n = new p(4.0, 4.0)
val o = new p(3.0, 4.0)
val p = new p(5.0, 4.5)
def isStriclyDominated(p1: p, p2: p): Boolean = {
(p1.toSeq zip p2.toSeq).exists { case (g1, g2) => g1 < g2 }
}
def sortedByRank(population: Seq[p]) = {
def paretoRanking(values: Set[p]) = {
//comment from #dk14: I suppose order of values isn't matter here, otherwise use SortedSet
values.map { v1 =>
val t = (values - v1).filter(isStriclyDominated(v1, _)).toSeq
val a = new A(v1, values.size - t.size - 1)
val root = new ARootRaw(a, t)
println("Root value ", root)
root
}
}
val listOfARootRaw = paretoRanking(population.toSet)
//From #dk14: Here is convertion from Seq[p] to Seq[A]
val dominations: Map[p, Int] = listOfARootRaw.map(a => a.node.XY -> a.node.dominatedCounter) //From #dk14: It's a map with dominatedCounter for each point
val listOfARoot = listOfARootRaw.map(raw => ARoot(raw.node, raw.children.map(p => A(p, dominations.getOrElse(p, 0)))))
listOfARoot.groupBy(_.node.dominatedCounter)
}
//Get the first front, a subset of ARoot, and start the step 4
println(sortedByRank(Seq(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p)).head)
}
Talking about your problem with distinguishing fronts (after update 2):
val (left,right) = population.partition(_.node.value == 0)
List(left, right.map(_.copy(node = node.copy(value = node.value - 1))))
No need for mutating anything here. copy will copy everything but fields you specified with new values. Talking about the code, the new copy will be linked to the same list of children, but new value = value - 1.
P.S. I have a feeling you may actually want to do something like this:
case class A(id: String, level: Int)
val a = A("a", 1)
val b = A("b", 2)
val c = A("c", 2)
val d = A("d", 3)
clusterize(List(a,b,c,d)) === List(List(a), List(b,c), List(d))
It's simple to implement:
def clusterize(list: List[A]) =
list.groupBy(_.level).toList.sortBy(_._1).map(_._2)
Test:
scala> clusterize(List(A("a", 1), A("b", 2), A("c", 2), A("d", 3)))
res2: List[List[A]] = List(List(A(a,1)), List(A(b,2), A(c,2)), List(A(d,3)))
P.S.2. Please consider better naming conventions, like here.
Talking about "mutating" elements in some complex structure:
The idea of "immutable mutating" some shared (between parts of a structure) value is to separate your "mutation" from the structure. Or simply saying, divide and conquerror:
calculate changes in advance
apply them
The code:
case class A(v: Int)
case class AA(a: A, seq: Seq[A]) //decoratedA
def update(input: Seq[AA]) = {
//shows how to decrement each value wherever it is:
val stats = input.map(_.a).groupBy(identity).mapValues(_.size) //domination count for each A
def upd(a: A) = A(a.v - stats.getOrElse(a, 0)) //apply decrement
input.map(aa => aa.copy(aa = aa.seq.map(upd))) //traverse and "update" original structure
}
So, I've introduced new Map[A, Int] structure, that shows how to modify the original one. This approach is based on highly simplified version of Applicative Functor concept. In general case, it should be Map[A, A => A] or even Map[K, A => B] or even Map[K, Zipper[A] => B] as applicative functor (input <*> map). *Zipper (see 1, 2) actually could give you information about current element's context.
Notes:
I assumed that As with same value are same; that's default behaviour for case classess, otherwise you need to provide some additional id's (or redefine hashCode/equals).
If you need more levels - like AA(AA(AA(...)))) - just make stats and upd recursive, if dеcrement's weight depends on nesting level - just add nesting level as parameter to your recursive function.
If decrement depends on parent node (like decrement only A(3)'s, which belongs to A(3)) - add parent node(s) as part of stats's key and analise it during upd.
If there is some dependency between stats calculation (how much to decrement) of let's say input(1) from input(0) - you should use foldLeft with partial stats as accumulator: val stats = input.foldLeft(Map[A, Int]())((partialStats, elem) => partialStats ++ analize(partialStats, elem))
Btw, it takes O(N) here (linear memory and cpu usage)
Example:
scala> val population = Seq(A(3), A(6), A(8), A(3))
population: Seq[A] = List(A(3), A(6), A(8), A(3))
scala> val input = Seq(AA(population(1),Seq(population(2),population(3))), AA(population(2),Seq(population(1),population(3))))
input: Seq[AA] = List(AA(A(6),List(A(8), A(3))), AA(A(8),List(A(6), A(3))))
scala> update(input)
res34: Seq[AA] = List(AA(A(5),List(A(7), A(3))), AA(A(7),List(A(5), A(3))))
I want a shifter list for a input buffer. My code is:
//Simulate input whit Slider. Is work perfect. Only work for changes by the user.
list = Table[0, {10}];
Slider[Dynamic[b, (b = #; list = Take[Join[list, {b}], -10]) &], {0,
10, 1}]
Dynamic#list
//x is a simulation of data input
Dynamic[x = RandomInteger[10], UpdateInterval -> 1]
//Shifter list. As 'a' change, the code is replayed.
Dynamic[Take[AppendTo[a, x], -10],UpdateInterval -> 1]
I want to run the code only for 'x' changes. No for changes of 'a'. Help me, please.
Not sure how your Slider is related to the question but here's an answer:
use TrackedSymbols to specify what can trigger second Dynamic.
Dynamic[x = RandomInteger[10], UpdateInterval -> .2]
a = {};
Dynamic[a = PadLeft[Flatten#{a, x}, 10], TrackedSymbols :> {x}]
no need for UpdateInterval then.
Keep in mind that operations in Dynamic will be performed only when such cell is visible. Maybe better approach is to use ScheduledTasks or regular Do + Pause.
I am counting values in each window and find the top values and want to save only the top 10 frequent values of each window to hdfs rather than all the values.
eegStreams(a) = KafkaUtils.createStream(ssc, zkQuorum, group, Map(args(a) -> 1),StorageLevel.MEMORY_AND_DISK_SER).map(_._2)
val counts = eegStreams(a).map(x => (math.round(x.toDouble), 1)).reduceByKeyAndWindow(_ + _, _ - _, Seconds(4), Seconds(4))
val sortedCounts = counts.map(_.swap).transform(rdd => rdd.sortByKey(false)).map(_.swap)
ssc.sparkContext.parallelize(rdd.take(10)).saveAsTextFile("hdfs://ec2-23-21-113-136.compute-1.amazonaws.com:9000/user/hduser/output/" + (a+1))}
//sortedCounts.foreachRDD(rdd =>println("\nTop 10 amplitudes:\n" + rdd.take(10).mkString("\n")))
sortedCounts.map(tuple => "%s,%s".format(tuple._1, tuple._2)).saveAsTextFiles("hdfs://ec2-23-21-113-136.compute-1.amazonaws.com:9000/user/hduser/output/" + (a+1))
I can print top 10 as above (commented).
I have also tried
sortedCounts.foreachRDD{ rdd => ssc.sparkContext.parallelize(rdd.take(10)).saveAsTextFile("hdfs://ec2-23-21-113-136.compute-1.amazonaws.com:9000/user/hduser/output/" + (a+1))}
but I get the following error. My Array is not serializable
15/01/05 17:12:23 ERROR actor.OneForOneStrategy:
org.apache.spark.streaming.StreamingContext
java.io.NotSerializableException:
org.apache.spark.streaming.StreamingContext
Can you try this?
sortedCounts.foreachRDD(rdd => rdd.filterWith(ind => ind)((v, ind) => ind <= 10).saveAsTextFile(...))
Note: I didn't test the snippet...
Your first version should work. Just declare #transient ssc = ... where the Streaming Context is first created.
The second version won't work b/c StreamingContext cannot be serialized in a closure.
I have a list of items and for each item I am computing a value. Computing this value is a bit computationally intensive so I want to minimise it as much as possible.
The algorithm I need to implement is this:
I have a value X
For each item
a. compute the value for it, if it is < 0 ignore it completely
b. if (value > 0) && (value < X)
return pair (item, value)
Return all (item, value) pairs in a List (that have the value > 0), ideally sorted by value
To make it a bit clearer, step 3 only happens if none of the items have a value less than X. In step 2, when we encounter the first item that is less than X we should not compute the rest and just return that item (we can obviously return it in a Set() by itself to match the return type).
The code I have at the moment is as follows:
val itemValMap = items.foldLeft(Map[Item, Int)]()) {
(map : Map[Item, Int], key : Item) =>
val value = computeValue(item)
if ( value >= 0 ) //we filter out negative ones
map + (key -> value)
else
map
}
val bestItem = itemValMap.minBy(_._2)
if (bestItem._2 < bestX)
{
List(bestItem)
}
else
{
itemValMap.toList.sortBy(_._2)
}
However, what this code is doing is computing all the values in the list and choosing the best one, rather than stopping as a 'better' one is found. I suspect I have to use Streams in some way to achieve this?
OK, I'm not sure how your whole setup looks like, but I tried to prepare a minimal example that would mirror your situation.
Here it is then:
object StreamTest {
case class Item(value : Int)
def createItems() = List(Item(0),Item(3),Item(30),Item(8),Item(8),Item(4),Item(54),Item(-1),Item(23),Item(131))
def computeValue(i : Item) = { Thread.sleep(3000); i.value * 2 - 2 }
def process(minValue : Int)(items : Seq[Item]) = {
val stream = Stream(items: _*).map(item => item -> computeValue(item)).filter(tuple => tuple._2 >= 0)
stream.find(tuple => tuple._2 < minValue).map(List(_)).getOrElse(stream.sortBy(_._2).toList)
}
}
Each calculation takes 3 seconds. Now let's see how it works:
val items = StreamTest.createItems()
val result = StreamTest.process(2)(items)
result.foreach(r => println("Original: " + r._1 + " , calculated: " + r._2))
Gives:
[info] Running Main
Original: Item(3) , calculated: 4
Original: Item(4) , calculated: 6
Original: Item(8) , calculated: 14
Original: Item(8) , calculated: 14
Original: Item(23) , calculated: 44
Original: Item(30) , calculated: 58
Original: Item(54) , calculated: 106
Original: Item(131) , calculated: 260
[success] Total time: 31 s, completed 2013-11-21 15:57:54
Since there's no value smaller than 2, we got a list ordered by the calculated value. Notice that two pairs are missing, because calculated values are smaller than 0 and got filtered out.
OK, now let's try with a different minimum cut-off point:
val result = StreamTest.process(5)(items)
Which gives:
[info] Running Main
Original: Item(3) , calculated: 4
[success] Total time: 7 s, completed 2013-11-21 15:55:20
Good, it returned a list with only one item, the first value (second item in the original list) that was smaller than 'minimal' value and was not smaller than 0.
I hope that the example above is easily adaptable to your needs...
A simple way to avoid the computation of unneeded values is to make your collection lazy by using the view method:
val weigthedItems = items.view.map{ i => i -> computeValue(i) }.filter(_._2 >= 0 )
weigthedItems.find(_._2 < X).map(List(_)).getOrElse(weigthedItems.sortBy(_._2))
By example here is a test in the REPL:
scala> :paste
// Entering paste mode (ctrl-D to finish)
type Item = String
def computeValue( item: Item ): Int = {
println("Computing " + item)
item.toInt
}
val items = List[Item]("13", "1", "5", "-7", "12", "3", "-1", "15")
val X = 10
val weigthedItems = items.view.map{ i => i -> computeValue(i) }.filter(_._2 >= 0 )
weigthedItems.find(_._2 < X).map(List(_)).getOrElse(weigthedItems.sortBy(_._2))
// Exiting paste mode, now interpreting.
Computing 13
Computing 1
defined type alias Item
computeValue: (item: Item)Int
items: List[String] = List(13, 1, 5, -7, 12, 3, -1, 15)
X: Int = 10
weigthedItems: scala.collection.SeqView[(String, Int),Seq[_]] = SeqViewM(...)
res27: Seq[(String, Int)] = List((1,1))
As you can see computeValue was only called up to the first value < X (that is, up to 1)
I ve got the following class and I want to write some Spec test cases, but I am really new to it and I don't know how to start. My class do loke like this:
class Board{
val array = Array.fill(7)(Array.fill(6)(None:Option[Coin]))
def move(x:Int, coin:Coin) {
val y = array(x).indexOf(None)
require(y >= 0)
array(x)(y) = Some(coin)
}
def apply(x: Int, y: Int):Option[Coin] =
if (0 <= x && x < 7 && 0 <= y && y < 6) array(x)(y)
else None
def winner: Option[Coin] = winner(Cross).orElse(winner(Naught))
private def winner(coin:Coin):Option[Coin] = {
val rows = (0 until 6).map(y => (0 until 7).map( x => apply(x,y)))
val cols = (0 until 7).map(x => (0 until 6).map( y => apply(x,y)))
val dia1 = (0 until 4).map(x => (0 until 6).map( y => apply(x+y,y)))
val dia2 = (3 until 7).map(x => (0 until 6).map( y => apply(x-y,y)))
val slice = List.fill(4)(Some(coin))
if((rows ++ cols ++ dia1 ++ dia2).exists(_.containsSlice(slice)))
Some(coin)
else None
}
override def toString = {
val string = new StringBuilder
for(y <- 5 to 0 by -1; x <- 0 to 6){
string.append(apply(x, y).getOrElse("_"))
if (x == 6) string.append ("\n")
else string.append("|")
}
string.append("0 1 2 3 4 5 6\n").toString
}
}
Thank you!
I can only second Daniel's suggestion, because you'll end up with a more practical API by using TDD.
I also think that your application could be nicely tested with a mix of specs2 and ScalaCheck. Here the draft of a Specification to get you started:
import org.specs2._
import org.scalacheck.{Arbitrary, Gen}
class TestSpec extends Specification with ScalaCheck { def is =
"moving a coin in a column moves the coin to the nearest empty slot" ! e1^
"a coin wins if" ^
"a row contains 4 consecutive coins" ! e2^
"a column contains 4 consecutive coins" ! e3^
"a diagonal contains 4 consecutive coins" ! e4^
end
def e1 = check { (b: Board, x: Int, c: Coin) =>
try { b.move(x, c) } catch { case e => () }
// either there was a coin before somewhere in that column
// or there is now after the move
(0 until 6).exists(y => b(x, y).isDefined)
}
def e2 = pending
def e3 = pending
def e4 = pending
/**
* Random data for Coins, x position and Board
*/
implicit def arbitraryCoin: Arbitrary[Coin] = Arbitrary { Gen.oneOf(Cross, Naught) }
implicit def arbitraryXPosition: Arbitrary[Int] = Arbitrary { Gen.choose(0, 6) }
implicit def arbitraryBoardMove: Arbitrary[(Int, Coin)] = Arbitrary {
for {
coin <- arbitraryCoin.arbitrary
x <- arbitraryXPosition.arbitrary
} yield (x, coin)
}
implicit def arbitraryBoard: Arbitrary[Board] = Arbitrary {
for {
moves <- Gen.listOf1(arbitraryBoardMove.arbitrary)
} yield {
val board = new Board
moves.foreach { case (x, coin) =>
try { board.move(x, coin) } catch { case e => () }}
board
}
}
}
object Cross extends Coin {
override def toString = "x"
}
object Naught extends Coin {
override def toString = "o"
}
sealed trait Coin
The e1 property I've implemented is not the real thing because it doesn't really check that we moved the coin to the nearest empty slot, which is what your code and your API suggests. You will also want to change the generated data so that the Boards are generated with an alternation of x and o. That should be a great way to learn how to use ScalaCheck!
I suggest you throw all that code out -- well, save it somewhere, but start from zero using TDD.
The Specs2 site has plenty examples of how to write tests, but use TDD -- test driven design -- to do it. Adding tests after the fact is suboptimal, to say the least.
So, think of the most simple case you want to handle of the most simple feature, write a test for that, see it fail, write the code to fix it. Refactor if necessary, and repeat for the next most simple case.
If you want help with how to do TDD in general, I heartily endorse the videos about TDD available on Clean Coders. At the very least, watch the second part where Bob Martin writes a whole class TDD-style, from design to end.
If you know how to do testing in general but are confused about Scala or Specs, please be much more specific about what your questions are.