Can't get flatten of Try into a for comprehension - list

This is a combination of a stylistic question, and my attempts to broaden my Scala understanding.
I've got a list containing Futures. I want to compute the values of the futures, transform them into Options, and flatten the list using a for comprehension:
import scala.util.Try
import scala.concurrent._
import ExecutionContext.Implicits.global
val l= List(Future.successful(1),Future.failed(new IllegalArgumentException))
implicit def try2Traversible[A](xo: Try[A]): Iterable[A] = xo.toOption.toList
val res = for{f <- l; v <- f.value} yield v
scala> res: List[scala.util.Try[Int]] = List(Success(1), Failure(java.lang.IllegalArgumentException))
res.flatten
res16: List[Int] = List(1)
What I want to do is get the flatten stage into the for comprehension, anyone got any suggestions?

Doing this is incorrect:
for{f <- l; v <- f.value} yield v
It appears to work in your case only because the futures are already fulfilled, which is why their value member is defined.
However, in the general case they might not yet be fulfilled when you execute the for comprehension, and thus value will return None
(despite the fact that at some point they will eventually be fulfilled).
For example, try this in the REPL:
val f1 = Future{
Thread.sleep(3000) // Just a test to illustrate, never do this!
1
}
val f2 = Future{
Thread.sleep(3000) // Just a test to illustrate, never do this!
throw new IllegalArgumentException
}
val l = List( f1, f2 )
for{f <- l; v <- f.value} yield v
The result is an empty list, because none of the futures in l is fulfilled yet. Then wait a bit (at most 3 seconds) and re-execute the for comprehension (the last line), and you will get a non-empty list because the futures have finally been fulfilled.
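The same effect can be reproduced without sleeping. The following is only a sketch: it uses a Promise to control when the future is fulfilled, and the names PendingValueDemo and run are made up for illustration.

```scala
import scala.concurrent.Promise
import scala.util.Try

object PendingValueDemo {
  // Returns what the for comprehension yields before and after the future is fulfilled.
  def run(): (List[Try[Int]], List[Try[Int]]) = {
    val promise = Promise[Int]()   // lets us decide when the future gets its value
    val l = List(promise.future)

    // The future is still pending here, so f.value is None and nothing is yielded.
    val before = for { f <- l; v <- f.value } yield v

    promise.success(1)             // fulfil the future now

    // The same comprehension now sees Some(Success(1)).
    val after = for { f <- l; v <- f.value } yield v

    (before, after)
  }
}
```

Calling PendingValueDemo.run() should give an empty list first and a one-element list second, which is exactly why reading value is unreliable for futures that may still be running.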
To fix this, you will have to either block (that is, wait for all the futures to be fulfilled) using scala.concurrent.Await, or stay in the asynchronous world by using something like Future.map or Future.flatMap.
For example, if you want to block, you could do:
Await.result( Future.sequence( l ), duration.Duration.Inf )
Await.result waits for the result of the future, allowing you to go from the asynchronous world to the synchronous world. The result of the above is a List[Int].
The problem now is that you lose the failure cases (the result is not List[Try[Int]] as you wanted), and the call will actually rethrow the first exception.
To fix this, you can use this helper method that I posted in another answer: https://stackoverflow.com/a/15776974/1632462
Using it, you can do:
Await.result( Future.sequence( l map mapValue ), duration.Duration.Inf )
This will wait until all the futures are fulfilled (either with a correct value, or with an error) and return the expected List[Try[Int]].
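For readers who cannot follow the link, here is a rough sketch of what such a helper could look like. It is an assumption, not the code from the linked answer: liftToTry is a hypothetical name, and the 10 second timeout is arbitrary.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object LiftToTryDemo {
  // Hypothetical helper: turns a Future[T] that may fail into a Future[Try[T]]
  // that always succeeds and carries the outcome as a value.
  def liftToTry[T](f: Future[T]): Future[Try[T]] =
    f.map[Try[T]](Success(_)).recover { case e => Failure(e) }

  def run(): List[Try[Int]] = {
    val l = List(Future.successful(1), Future.failed[Int](new IllegalArgumentException))
    // No lifted future can fail, so Future.sequence no longer loses the failure cases.
    Await.result(Future.sequence(l.map(f => liftToTry(f))), 10.seconds)
  }
}
```

Once you have the List[Try[Int]], something like `tries.collect { case Success(v) => v }` keeps only the successful values.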

The idea is to traverse the Try object as if it were an Option (i.e. a 0 or 1 element collection) within the for-comprehension itself.
For this traversal to work there has to be a conversion from the Try type to the Option type.
This should work:
implicit def try2option[A](xo: Try[A]) = xo.toOption
val res = for (f <- l; t <- f.value; x <- t) yield x
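If you would rather avoid the implicit conversion, the same flattening can be written with explicit conversions. This is a sketch only (ExplicitFlattenDemo and successes are illustrative names), and it still only sees futures that are already fulfilled when it runs.

```scala
import scala.concurrent.Future

object ExplicitFlattenDemo {
  // Keeps the values of the futures that are already fulfilled successfully.
  def successes(l: List[Future[Int]]): List[Int] =
    for {
      f <- l
      t <- f.value.toList      // Option[Try[Int]] becomes a list of 0 or 1 elements
      x <- t.toOption.toList   // Try[Int] becomes a list of 0 or 1 elements
    } yield x
}
```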

You should keep a Future around your final result to retain the asynchronous nature of the computation.
The nice way to do this (and obtain a Future[List[Int]]) would be (probably what you tried):
for {
f <- l // Extract individual future
v <- f // Extract value from future
} yield v
Unfortunately this translates to:
l.flatMap(f => f.map(v => v))
Which does not work, because Future does not inherit GenTraversableOnce (and probably shouldn't), but List needs this trait for its flatMap.
However, we can do this manually:
val res = l.foldRight(Future.successful(List.empty[Int])) {
case (x,xs) => xs.flatMap(vxs => x.map(vx => vx :: vxs))
}
Or we can use Future.sequence, which does the same thing:
Future.sequence(l)
This returns a Future[List[Int]] that completes only when all the futures have completed. It contains all the values if every future completed successfully; if any of them failed, the resulting future fails as well.
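To check how Future.sequence treats a failed element, a small sketch like the following can help. The object and method names are made up, and Await is used only to make the outcome observable.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object SequenceFailureDemo {
  // Waits for the combined future and captures either the full list or the exception.
  def allOrFailure(l: List[Future[Int]]): Try[List[Int]] =
    Try(Await.result(Future.sequence(l), 10.seconds))
}
```

With only successful futures this gives a Success holding the whole list. As soon as one element is a failed future, the result is a Failure and the successful values are not available.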

Related

Haskell append to a list conditionally

I have 2 lists which I am trying to fill with items. While reading from stdin, depending on the value of one of the things read, I want to append to a different list. Example,
import Control.Monad(replicateM)
main = do
n <- getLine
let l1 = [], l2 = []
in replicateM (read n) (getLine >>= (\line ->
case line of "Yes" ->
-- do something with line
-- and append value of that thing to l1
"No" ->
-- do something else
-- append this value to l2
putStrLn line))
I realise the above code has syntax errors and such, but hopefully you can see what I am trying to do and suggest something.
This is the answer I came up with
While we are at it, can someone explain why this gives me an infinite list:
let g = []
let g = 1:g
-- g now contains an infinite list of 1's
This is what I finally came up with:
import Control.Monad (replicateM)
import Data.Either (partitionEithers)

-- Each input line is expected to look like "<number> Yes" or "<number> No".
getEither :: [String] -> [Either Double Double]
getEither [] = []
getEither (line:rest) =
  let [f, heist] = words line
      fn = read f :: Double
      e = case heist of
            "Yes" -> Left fn
            "No"  -> Right fn
  in e : getEither rest

main = do
  n <- getLine
  lines <- replicateM (read n) getLine
  let tup = partitionEithers $ getEither lines :: ([Double], [Double])
  print tup
Not sure how fmap could have been used in this instance
Here is a short ghci session that may give you some ideas:
> :m + Control.Monad Data.Either
> partitionEithers <$> replicateM 3 readLn :: IO ([Int], [Bool])
Left 5
Right True
Left 7
([5,7],[True])
The answer to your second question is that let is recursive; so the two gs in let g = 1:g are referring to the same in-memory object.
You are thinking in terms of mutable variables: you are "initializing" l1,l2 to the empty list and then reasoning about updating them with longer lists. This design works fine in imperative programming, but not so simply in pure functional programming since it involves mutation.
Now, even in pure functional programming we have ways to simulate mutation, through monads. For instance, one can achieve mutation here through IORefs or StateT IO. In this case, though, it would be an unnecessarily complex way to solve the task.
You want to append data to form two lists. You want to use replicateM, which is fine. The point is that replicateM will build just one list, instead of two. The question now is: how can we create a list which is easily split into two?
A first ugly attempt is to generate a list of tagged values, i.e. a list of pairs:
case line of
"Yes" -> let value = ... in
return ("for l1", value)
"No" -> let value = ... in
return ("for l2", value)
Doing this would make replicateM produce a list such as
[("for l1", value1), ("for l1", value2), ("for l2", value3), ...]
which we can then split into two lists.
The use of strings for tags looks a bit inelegant, however, since a boolean would suffice:
case line of
"Yes" -> let value = ... in
return (True, value)
"No" -> let value = ... in
return (False, value)
An even better approach would be to use the Either a b type:
case line of
"Yes" -> let value1 = ... in
return (Left value1)
"No" -> let value2 = ... in
return (Right value2)
The nice consequence of the above is that value1 and value2 can even be of different types. The previous snippets forced them to share their type: since we build a list of pairs each pair must have the same type. The new list is now instead of type [Either a b] where a is the type of values to be put in l1, and b that for l2.
Once you get an [Either a b], you want to split it into [a] and [b]. As @DanielWagner suggests in his answer, you can exploit partitionEithers for this.

List of functions with duplicates

Suppose I have a list List[() => Int] and need to invoke all the functions to get the list of results.
def invoke(fs: List[() => Int]): List[Int] = fs map (_())
What if fs has duplicates? I can probably memoize the results but I need to invoke these functions concurrently. It looks like I need to do some preprocessing to make sure each function is invoked only once.
What would you suggest?
Generically, there's no way of knowing whether two functions are equal. Even if you were looking at two copies of the same function instance, they might invoke some side effect (e.g. generating a random number) so it would be incorrect in some sense to elide the second call. In the cases where the function provably doesn't have side effects, the JVM can probably figure that out for itself. So I honestly think you're solving the wrong problem here.
But if you really want to memoize, I'd use scalaz Memo. The different kinds of Memo document what thread safety guarantees they offer.
import scalaz.Memo
def execute(fs: List[() => Int]) = {
val m = Memo.mutableHashMapMemo({f: (() => Int) => f()})
fs map m
}
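If you want to avoid the scalaz dependency and still run the functions concurrently, one possible sketch is to start one Future per distinct function instance and let duplicates share it. All names here (InvokeOnceDemo, invokeOnce, countingThunk) are hypothetical, the timeout is arbitrary, and "duplicate" means the same function object, since functions are compared by reference.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object InvokeOnceDemo {
  type Thunk = () => Int

  // Helper for experiments: a function that records each invocation in a counter.
  def countingThunk(counter: AtomicInteger, result: Int): Thunk =
    () => { counter.incrementAndGet(); result }

  def invokeOnce(fs: List[Thunk]): List[Int] = {
    // One Future per distinct function instance, all started concurrently.
    val started: Map[Thunk, Future[Int]] =
      fs.distinct.map(f => f -> Future(f())).toMap
    // Every position in the original list looks up its (possibly shared) Future.
    Await.result(Future.sequence(fs.map(started)), 10.seconds)
  }
}
```

With f1 and f2 built by countingThunk, invokeOnce(List(f1, f2, f1)) returns three results while the shared counter is incremented only twice.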
Maybe use lazy:
def foo1(): Int = { println("f1"); 1 }
def foo2(): Int = { println("f2"); 2 }
def toLazy(f: () => Int): () => Int = {
lazy val res = f()
() => res
}
val f1 = toLazy(foo1)
val f2 = toLazy(foo2)
val flist = List(f1, f2, f1)
println("invoking...")
val res = invoke(flist)
println(res)
// invoking...
// f1
// f2
// List(1, 2, 1)
If a call to f1 fails and you handle the exception so that the invocation chain can proceed, the lazy val stays uninitialized; it will be initialized by the first successful call to f1.
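The retry behaviour can be checked with a small sketch. The flaky function below is invented for illustration: it fails on its first call and succeeds afterwards.

```scala
import scala.util.Try

object LazyRetryDemo {
  def toLazy(f: () => Int): () => Int = {
    lazy val res = f()
    () => res
  }

  // Returns the outcome of the first two calls and how often the wrapped function really ran.
  def run(): (Try[Int], Try[Int], Int) = {
    var attempts = 0
    // Hypothetical flaky function: throws on the first call only.
    val flaky = () => {
      attempts += 1
      if (attempts == 1) throw new IllegalStateException("first call fails") else 42
    }
    val cached = toLazy(flaky)
    val first = Try(cached())    // the initializer throws, so the lazy val stays uninitialized
    val second = Try(cached())   // the initializer runs again and succeeds
    cached()                     // served from the initialized lazy val, flaky is not run again
    (first, second, attempts)
  }
}
```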

How to flatten a List of Futures in Scala

I want to take this val:
val f = List(Future(1), Future(2), Future(3))
Perform some operation on it (I was thinking flatten)
f.flatten
And get this result
scala> f.flatten = List(1,2,3)
If the flatten method isn't appropriate here, that's fine. As long as I get to the result.
Thanks!
Future.sequence takes a List[Future[T]] and returns a Future[List[T]].
You can do
Future.sequence(f)
and then use map or onComplete on it to access the list of values.
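A minimal sketch of that, assuming the global execution context; the blocking Await at the end is there only to show the result, and the names are illustrative.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceThenMapDemo {
  def run(): List[Int] = {
    val f = List(Future(1), Future(2), Future(3))
    // List[Future[Int]] becomes Future[List[Int]]
    val all: Future[List[Int]] = Future.sequence(f)
    // Work with the values while staying inside the Future
    val doubled: Future[List[Int]] = all.map(values => values.map(_ * 2))
    // Blocking is only for demonstration; prefer map/onComplete in real code
    Await.result(doubled, 10.seconds)
  }
}
```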

How to stay true to functional style in Scala for expressions

I've struggled to find a way to stay true to functional style in for expressions when I need to collect multiple parameters of an object into a List.
As an example, say I have a Notification object, which has both a fromId (the user id the notification is from) and an objectOwnerId (the id of the user who created the original object). These can differ in facebook style notifications ("X also commented on Y's post").
I can collect the userIds with a for expression like so
val userIds = for { notification <- notifications } yield notification.fromId
however say I want to collect both the fromIds and the objectOwnerIds into a single list, is there any way to do this in a single for expression without the use of vars?
I've done something like this in the past:
var ids = List()
for {
notification <- notifications
ids = ids ++ List(notification.fromId, notification.objectOwnerId)
}
ids = ids.distinct
but it feels like there must be a better way. The use of a var, and the need to call distinct after I complete the collection are both ugly. I could avoid the distinct with some conditionals, but I'm trying to learn the proper functional methods to do things.
Thanks in advance for any help!
For such cases, there is foldLeft:
(notifications foldLeft Set.empty[Id]) { (set, notification) =>
set ++ Seq(notification.fromId, notification.objectOwnerId)
}
or in short form:
(Set.empty[Id] /: notifications) { (set, notification) =>
set ++ Seq(notification.fromId, notification.objectOwnerId)
}
A set doesn't hold duplicates. After the fold you can convert the set to another collection if you want.
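As a self-contained sketch of the fold, assuming a simplified Notification with String ids (the real class is not shown in the question):

```scala
object FoldIdsDemo {
  // Simplified stand-in for the Notification class from the question.
  final case class Notification(fromId: String, objectOwnerId: String)

  // Accumulates both ids of every notification into a Set, which drops duplicates.
  def userIds(notifications: List[Notification]): Set[String] =
    notifications.foldLeft(Set.empty[String]) { (ids, n) =>
      ids + n.fromId + n.objectOwnerId
    }
}
```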
val userIds = for {
notification <- notifications
id <- List(notification.fromId, notification.objectOwnerId)
} yield id
Apply distinct afterwards if required. If the id can only be duplicated on a single notification, you can apply distinct on the second generator instead.
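The two placements of distinct behave differently, which the following sketch tries to show. Notification is again a simplified stand-in with String ids, and the method names are made up.

```scala
object DistinctIdsDemo {
  final case class Notification(fromId: String, objectOwnerId: String)

  // distinct applied afterwards: removes duplicates across all notifications.
  def allIds(notifications: List[Notification]): List[String] =
    (for {
      n  <- notifications
      id <- List(n.fromId, n.objectOwnerId)
    } yield id).distinct

  // distinct on the second generator: removes duplicates only within one notification.
  def perNotificationIds(notifications: List[Notification]): List[String] =
    for {
      n  <- notifications
      id <- List(n.fromId, n.objectOwnerId).distinct
    } yield id
}
```

For List(Notification("a", "a"), Notification("a", "b")), allIds gives List("a", "b") while perNotificationIds still repeats "a", because the repetition spans two notifications.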
Sure, instead of just yielding the fromId, yield a tuple
val idPairs:List[(String, String)] = for(notification <- notifications) yield(notification.fromId, notification.objectOwnerId)
Well, here is my answer to the following:
How to map from [Y(x1, x2), Y(x3, x4)] to [x1,x2,x3,x4]?
Use flatMap (see Collection.Traversable, but note it's actually first defined higher up).
case class Y(a: Int, b: Int)
var in = List(Y(1,2), Y(3,4))
var out = in.flatMap(x => List(x.a, x.b))
> defined class Y
> in: List[Y] = List(Y(1,2), Y(3,4))
> out: List[Int] = List(1, 2, 3, 4)
Also, since for..yield is filter, map and flatMap in one (but also see "sugar for flatMap?" that points out that this isn't as efficient as it could be: there is an extra map):
var out = for { i <- in; x <- Seq(i.a, i.b) } yield x
I would likely pick one of the other answers, however, as this does not directly address the final problem being solved.
Happy coding.
You can also use Stream to transform the pairs into a stream of individual items:
def toStream(xs: List[Y]): Stream[Int] = {
xs match {
case Y(a, b) :: t => a #:: b #:: toStream(t)
case _ => Stream.empty
}
}
But like pst said, this doesn't solve your final problem of getting the distinct values, but once you have the stream it's trivial:
val result = toStream(ys).toList.distinct
Or a slight modification to the earlier suggestions to use flatten - add a function that turns a Y into a List:
def yToList(y: Y) = List(y.a, y.b)
Then you can do:
val ys = List(Y(1, 2), Y(3, 4))
(ys map yToList).flatten.distinct
I agree with Dave's solution but another approach is to fold over the list, producing your map of id to User object as you go. The function to apply in the fold will query the db for both users and add them to the map being accumulated.
What about simple map? AFAIK for yield gets converted to series of flatMap and map anyway. Your problem could be solved simply as follows:
notifications.map(n => (n.fromId, n.objectOwnerId)).distinct

scala return on first Some in list

I have a list l: List[T1] and currently I'm doing the following:
myfun : T1 -> Option[T2]
val x: Option[T2] = l.map(myfun).flatten.find(_ => true)
The myfun function returns None or Some, flatten throws away all the Nones and find returns the first element of the list, if any.
This seems a bit hacky to me. I'm thinking that there might exist some for-comprehension or similar that will do this in a less wasteful or more clever way.
For example: I don't need any subsequent answers if myfun returns any Some during the map of the list l.
How about:
l.toStream flatMap (myfun andThen (_.toList)) headOption
Stream is lazy, so it won't map everything in advance, but it won't remap things either. Instead of flattening things, convert Option to List so that flatMap can be used.
In addition to using toStream to make the search lazy, we can use Stream::collectFirst:
List(1, 2, 3, 4, 5, 6, 7, 8).toStream.map(function).collectFirst { case Some(d) => d }
// Option[String] = Some(hello)
// given def function(i: Int): Option[String] = if (i == 5) Some("hello") else None
This:
Transforms the List into a Stream in order to stop the search early.
Transforms elements into Option[T]s using function (which plays the role of myfun).
Collects the first mapped element which is not None and extracts it.
Starting Scala 2.13, with the deprecation of Streams in favor of LazyLists, this would become:
List(1, 2, 3, 4, 5, 6, 7, 8).to(LazyList).map(function).collectFirst { case Some(d) => d }
Well, this is almost, but not quite
val x = (l flatMap myfun).headOption
But you are returning an Option rather than a List from myfun, so this may not work. If so (I've no REPL to hand) then try instead:
val x = (l flatMap(myfun(_).toList)).headOption
Well, the for-comprehension equivalent is pretty easy
(for (x <- l; y <- myfun(x)) yield y).headOption
which, if you actually do the translation, works out the same as what oxbow_lakes gave. Assuming reasonable laziness of List.flatMap, this is both a clean and efficient solution.
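Whether the laziness assumption holds is easy to check by counting how often myfun is called. The sketch below uses a hypothetical counting function (counted) and compares flatMap on the List with a variant that goes through an iterator.

```scala
import java.util.concurrent.atomic.AtomicInteger

object FirstSomeDemo {
  // Hypothetical stand-in for myfun: returns Some only for 2 and counts its calls.
  def counted(calls: AtomicInteger): Int => Option[String] =
    i => { calls.incrementAndGet(); if (i == 2) Some("hit") else None }

  // flatMap on a List builds the whole result before headOption is taken.
  def strictFirst(l: List[Int], myfun: Int => Option[String]): Option[String] =
    l.flatMap(x => myfun(x).toList).headOption

  // An iterator maps elements on demand, so the search stops at the first Some.
  def lazyFirst(l: List[Int], myfun: Int => Option[String]): Option[String] =
    l.iterator.map(myfun).collectFirst { case Some(y) => y }
}
```

On List(1, 2, 3, 4) both variants return Some("hit"), but the List version calls the function for every element while the iterator version stops after the second one.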
As of 2017, the previous answers seem to be outdated. I ran some benchmarks (list of 10 million Ints, first match roughly in the middle, Scala 2.12.3, Java 1.8.0, 1.8 GHz Intel Core i5). Unless otherwise noted, list and map have the following types:
list: scala.collection.immutable.List
map: A => Option[B]
Simply call map on the list: ~1000 ms
list.map(map).find(_.isDefined).flatten
First call toStream on the list: ~1200 ms
list.toStream.map(map).find(_.isDefined).flatten
Call toStream.flatMap on the list: ~450 ms
list.toStream.flatMap(map(_).toList).headOption
Call flatMap on the list: ~100 ms
list.flatMap(map(_).toList).headOption
First call iterator on the list: ~35 ms
list.iterator.map(map).find(_.isDefined).flatten
Recursive function find(): ~25 ms
def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
list match {
case Nil => None
case head::tail => map(head) match {
case None => find(tail, map)
case result @ Some(_) => result
}
}
}
Iterative function find(): ~25 ms
def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
for (elem <- list) {
val result = map(elem)
if (result.isDefined) return result
}
return None
}
You can further speed up things by using Java instead of Scala collections and a less functional style.
Loop over indices in java.util.ArrayList: ~15 ms
def find[A,B](list: java.util.ArrayList[A], map: A => Option[B]) : Option[B] = {
var i = 0
while (i < list.size()) {
val result = map(list.get(i))
if (result.isDefined) return result
i += 1
}
return None
}
Loop over indices in java.util.ArrayList with function returning null instead of None: ~10 ms
def find[A,B](list: java.util.ArrayList[A], map: A => B) : Option[B] = {
var i = 0
while (i < list.size()) {
val result = map(list.get(i))
if (result != null) return Some(result)
i += 1
}
return None
}
(Of course, one would usually declare the parameter type as java.util.List, not java.util.ArrayList. I chose the latter here because it's the class I used for the benchmarks. Other implementations of java.util.List will show different performance - most will be worse.)