I would have thought that a list of tuples could easily be flattened:
scala> val p = "abcde".toList
p: List[Char] = List(a, b, c, d, e)
scala> val q = "pqrst".toList
q: List[Char] = List(p, q, r, s, t)
scala> val pq = p zip q
pq: List[(Char, Char)] = List((a,p), (b,q), (c,r), (d,s), (e,t))
scala> pq.flatten
But instead, this happens:
<console>:15: error: No implicit view available from (Char, Char) => scala.collection.GenTraversableOnce[B].
pq.flatten
^
I can get the job done with:
scala> (for (x <- pq) yield List(x._1, x._2)).flatten
res1: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
But I'm not understanding the error message. And my alternative solution seems convoluted and inefficient.
What does that error message mean and why can't I simply flatten a List of tuples?
If the implicit conversion can't be found you can supply it explicitly.
pq.flatten {case (a,b) => List(a,b)}
If this is done multiple times throughout the code then you can save some boilerplate by making it implicit.
scala> import scala.language.implicitConversions
import scala.language.implicitConversions
scala> implicit def flatTup[T](t:(T,T)): List[T]= t match {case (a,b)=>List(a,b)}
flatTup: [T](t: (T, T))List[T]
scala> pq.flatten
res179: List[Char] = List(a, p, b, q, c, r, d, s, e, t)
jwvh's answer covers the "coding" solution to your problem perfectly well, so I am not going to go into any more detail about that. The only thing I wanted to add was clarifying why the solution that both you and jwvh found is needed.
As stated in the Scala library, Tuple2 (which (,) translates to) is:
A tuple of 2 elements; the canonical representation of a Product2.
And following up on that:
Product2 is a cartesian product of 2 components.
...which means that Tuple2[T1,T2] represents:
The set of all possible pairs of elements whose components are members of two sets (all elements in T1 and T2 respectively).
A List[T], on the other hand, represents an ordered collection of T elements.
What all this means practically is that there is no absolute way to translate any possible Tuple2[T1,T2] to a List[T], simply because T1 and T2 could be different. For example, take the following tuple:
val tuple = ("hi", 5)
How could such a tuple be flattened? Should the 5 be made a String? Or maybe just flatten to a List[Any]? While both of these solutions could be used, they work around the type system, so by design they are not encoded in the Tuple API.
All this comes down to the fact that there is no default implicit view for this case and you have to supply one yourself, as both jwvh and you already figured out.
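To make the point concrete, here is a minimal sketch (the names pairs, flat, mixed and widened are illustrative and not from the question). When both components share a type, an explicit tuple-to-List function is enough; when they differ, the only element type left is a common supertype such as Any.

```scala
// Homogeneous pairs: both components are Char, so List[Char] is a natural target.
val pairs: List[(Char, Char)] = "abc".toList zip "pqr".toList
val flat: List[Char] = pairs.flatMap { case (a, b) => List(a, b) }

// Heterogeneous pair: String and Int share no useful supertype,
// so flattening is only possible by widening the element type to Any.
val mixed: (String, Int) = ("hi", 5)
val widened: List[Any] = mixed.productIterator.toList
```

The second half is exactly the "working around the type system" case: the result compiles, but the static type no longer says what the elements are.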
We needed to do this recently. Allow me to explain the use case briefly before noting our solution.
Use case
Given a pool of items (which I'll call type T), we want to do an evaluation of each one against all others in the pool. The result of these comparisons is a Set of failed evaluations, which we represent as a tuple of the left item and the right item in said evaluation: (T, T).
Once these evaluations are complete, it becomes useful for us to flatten the Set[(T, T)] into another Set[T] that highlights all the items that have failed any comparisons.
Solution
Our solution for this was a fold:
val flattenedSet =
set.foldLeft(Set[T]())
{ case (acc, (x, y)) => acc + x + y }
This starts with an empty set (the initial parameter to foldLeft) as the accumulator.
Then, for each element in the consumed Set[(T, T)] (named set here), the fold function is passed:
the last value of the accumulator (acc), and
the (T, T) tuple for that element, which the case deconstructs into x and y.
Our fold function then returns acc + x + y, which returns a set containing all the elements in the accumulator in addition to x and y. That result is passed to the next iteration as the accumulator—thus, it accumulates all the values inside each of the tuples.
Why not Lists?
I appreciated this solution in particular since it avoided creating intermediate Lists while doing the flattening—instead, it directly deconstructs each tuple while building the new Set[T].
We could also have changed our evaluation code to return List[T]s containing the left and right items in each failed evaluation—then flatten would Just Work™. But we thought the tuple more accurately represented what we were going for with the evaluation—specifically one item against another, rather than an open-ended type which could conceivably represent any number of items.
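For readers who want to try the fold, here is a small self-contained sketch with T fixed to Int. The data in failed is made up for illustration, and viaFlatMap is just one possible alternative, not part of our original code.

```scala
// Hypothetical data: each pair is one failed comparison between two item ids.
val failed: Set[(Int, Int)] = Set((1, 2), (2, 3), (5, 1))

// The fold described above: start from an empty Set and add both components of each pair.
val viaFold: Set[Int] =
  failed.foldLeft(Set.empty[Int]) { case (acc, (x, y)) => acc + x + y }

// An equivalent formulation; it allocates a small Set per pair, which the fold avoids.
val viaFlatMap: Set[Int] = failed.flatMap { case (x, y) => Set(x, y) }
```

Both values should contain the same items; the fold simply skips the per-pair intermediate collection.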
(Using Julia 0.3.11)
I'm having trouble type-annotating some of my code correctly. In an initial version we used ASCIIString to annotate any String, to "avoid" abstract types. Let's start with an example; this might be related to what I've seen referred to as "triangular dispatch" in some discussions here:
# How to type-annotate this? (sort of a "Dictionary with default")
function pushval!(dict, key, val)
key in keys(dict) ? push!(dict[key], val) : dict[key] = [val]
return dict
end
d1 = Dict{ASCIIString, Vector{Int}}()
d2 = Dict{String, Vector{Int}}()
pushval!(d1, "a", 1)
pushval!(d2, "a", 1)
Ok (firstly - if there's a more idiomatic way to construct a dictionary with defaults, in this case an empty array, I'd love to hear about it)
So now, I've tried to type-annotate it:
function pushval!{K, V} (dict::Dict{K, Vector{V}} , key, val)
Much more documenting, and works.
But now comes the trickier part: I want key to be any subtype of K, and val any subtype of V (right?). E.g. I would like to make a dictionary keyed by String, which is an abstract type, but use concrete keys, which are ASCIIString/ByteString/UTF8String.
I thought I should write one of the following:
function pushval!{K, V} (dict::Dict{K, Vector{V}} , key::KK <: K, val:: VV <: V)
function pushval!{K, V, KK <: K, VV <: V} (dict::Dict{K, Vector{V}} , key::KK, val::VV)
One solution would be, as suggested in ( Can I use a subtype of a function parameter in the function definition? ), something with 'convert'.
But this whole thing made me wonder about the Julia code I'm writing. I started to write a system using String, FloatingPoint, Number and similar abstract types, but when I actually tried running it, I reverted to converting everything to concrete types just to get things running for now...
Is there a recommended codebase to read as a reference to idiomatic Julia code?
Like the very implementation of Julia's dictionary-assign operator even. Is there a part of the standard library considered good to start with as a reference? thanks
Although I don't like this solution very much (low performance), it can be helpful:
function pushval!{K, V} (dict::Dict{K, Vector{V}} , key , val)
fun = function inner{tt1<:K,tt2<:V}(key::tt1,val::tt2)
key in keys(dict) ? push!(dict[key], val) : dict[key] = [val]
return dict
end
return fun(key,val)
end
# => pushval! (generic function with 1 method)
d1 = Dict{ASCIIString, Vector{Int}}()
# => Dict{ASCIIString,Array{Int32,1}} with 0 entries
d2 = Dict{String, Vector{Int}}()
# => Dict{String,Array{Int32,1}} with 0 entries
pushval!(d1, "a", 1)
# => Dict{ASCIIString,Array{Int32,1}} with 1 entry:
# "a" => [1]
pushval!(d2, "a", 1)
# => Dict{String,Array{Int32,1}} with 1 entry:
# "a" => [1]
I know this is only partly what you asked for, but maybe you find it sufficient.
What you call pushval! can be achieved using push!(get!(d1, "a", []), 1) (although it will return the dictionary value that was appended to instead of the dictionary itself). If you need to constrain the type of the inner collection's values, you can, for example, use:
push!(get!(d1, "a", Number[]), 1)
If you really need to define this as a function, I am afraid that, at the moment, you cannot define the types in the way you describe. As the accepted answer to the question you referenced notes, Julia does not implement triangular dispatch yet, although it is targeted for 0.5.
I'd recommend looking at the Julia Style Guide. It has some advice about using type annotations.
For your case you don't need type annotations on the pushval! function at all. Julia will get enough information from the Dict creation and deduce appropriate types for the pushval! arguments.
function pushval!(dict, key, val)
key in keys(dict) ? push!(dict[key], val) : dict[key] = [val]
return dict
end
d = Dict{String, Vector{Int}}() # These are all the annotations you need.
pushval!(d, "a", 1) # OK, "a" is ASCIIString which is subtype of AbstractString
pushval!(d, 1, 1) # ERROR, 1 is Int64 which is not subtype of AbstractString
I have a list of Objects (Items, in this case) which have category ids and properties (which itself is a list of custom types).
I am trying to def a function that takes a list of integers e.g. List(101, 102, 102, 103, 104) that correspond to the category ids for the Items and creates a list of tuples that include the category type (which is an Option) and each property type from a list of properties that go along with each category. So far I have the below, but I am getting an error that value _2 is not a member of Product with Serializable.
def idxToData(index: List[Int], items: Seq[Item]): List[(Option[Category], Property[_])] = {
def getId(ic: Option[Category]): Int = {
ic match {
case Some(e) => e._id
case None => 0
}
}
index.flatMap(t => items.map(i => if(t == getId(i.category)){
(i.category, i.properties.list.map(_.property).toList.sortWith(_._id < _._id))
} else {
None
}.filter(_ != None )
))
.map(x => x._2.map(d => (x._1, d)))
.toList
}
I am not sure how it is assigning that type (I am assuming at that point that I should have a list of tuples that I am trying to map).
Overall, is there a better way in scala to achieve the desired result from taking in a list of indices and using that to access the specific items in a list where a tuple of two parts of each corresponding item would "replace" the index to create the new list structure?
You should split your code, give names to things (add some vals and some defs), and when the compiler does not agree with you, write types, so that the compiler will tell you early where it disagrees (don't worry, we all did that when starting with FP)
Also, when posting such a question, you might want to give (relevant parts of) the interface of elements that are referenced but not defined. What are "is" (is that items?), Item, category, properties...., or simplify your code so that they do not appear.
Now, to the problem :
if(t == (i.category match { case Some(e) => e._id})){
(i.category, i.properties.list.map(_.property).toList.sortWith(_._id < _._id))
} else {
None
}
The first branch has a Tuple2 type, (Option[Category], whatever), while the second branch has the completely unrelated type None.type. Clearly, there is no common supertype better than AnyRef, so that is the type of the if expression. Then the type of is.map (supposing is is some sort of Seq) will be Seq[AnyRef]. filter does not change the type, so it is still Seq[AnyRef], and in the map(x =>...), x is an AnyRef too, not a Tuple2, so it has no _2.
Of course, the list actually contains only tuples, because originally it had tuples and Nones and you have removed the Nones. But that was lost to the compiler when it typed that AnyRef.
(as the compiler error message tells and as noted by Imm, the compiler finds a slightly more precise type than AnyRef, Product with Serializable; however, that will not do you any good, all of the useful typing information is still lost there).
To preserve the type, in general you should do something such as
if(....) {
Some(stuff)
} else {
None
}
That would have been typed Option[type of stuff], where type of stuff is your Pair.
However, there is something simpler: the collect method.
It is a bit like match, except that it takes a partial function and discards the elements for which the partial function is not defined.
So that would be
is.collect { case i if categoryId(i) == Some(t) =>
(i.category, i.properties....)
}
supposing you have defined
def categoryId(item: Item): Option[Int] = item.category.map(_._id)
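A runnable sketch of the collect approach might look like the following. The case classes are hypothetical stand-ins for the question's types (in particular, Property is simplified to a non-generic class and properties to a plain List), and matching is an illustrative name.

```scala
// Hypothetical stand-ins for the types in the question.
case class Category(_id: Int)
case class Property(_id: Int)
case class Item(category: Option[Category], properties: List[Property])

def categoryId(item: Item): Option[Int] = item.category.map(_._id)

val items = List(
  Item(Some(Category(101)), List(Property(2), Property(1))),
  Item(None, List(Property(9))),
  Item(Some(Category(102)), List(Property(3)))
)

// Keep only the items whose category id is t, and build the tuple in the same step.
// Elements for which no case matches are dropped, so no None ever enters the result.
def matching(t: Int): List[(Option[Category], List[Property])] =
  items.collect {
    case i if categoryId(i).contains(t) =>
      (i.category, i.properties.sortBy(_._id))
  }
```

Because every case produces a tuple, the compiler infers a list of tuples and _2 is available afterwards.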
When you do this:
is.map(i => if(t == getId(i.category)){
(i.category, i.properties.list.map(_.property).toList.sortWith(_._id < _._id))
} else {
None
})
you get a List[Product with Serializable] (what you should probably get is a type error, but that could be a long digression), because that's the only supertype of None and (Category, List[Property[_]]) or whatever that tuple type is. The compiler isn't smart enough to carry the union type through and figure out that when you filter(_ != None) anything left in the list must be the tuple.
Try to rephrase this part. E.g. you could do is.filter(i => t == getId(i.category)) first, before the map, and then you wouldn't need to mess around with Nones in your list.
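As a rough sketch of that rephrasing, assuming simplified stand-ins for the question's types (Property is made non-generic and properties a plain List here, which is my assumption), the whole function could be written as a for comprehension that filters before it maps:

```scala
// Hypothetical, simplified versions of the question's types.
case class Category(_id: Int)
case class Property(_id: Int)
case class Item(category: Option[Category], properties: List[Property])

// Same convention as the question: a missing category counts as id 0.
def getId(ic: Option[Category]): Int = ic.map(_._id).getOrElse(0)

def idxToData(index: List[Int], items: Seq[Item]): List[(Option[Category], Property)] =
  for {
    t    <- index                                             // each requested category id
    item <- items.filter(i => t == getId(i.category)).toList  // filter first: no Nones needed
    prop <- item.properties.sortBy(_._id)                     // one output tuple per property
  } yield (item.category, prop)
```

Every generator yields a well-typed value, so the result type is known at each step and nothing widens to Product with Serializable.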
Question is simple.
How to access a tuple by using Index variable in SML?
val index = 5;
val tuple1 = (1,2,3,4,5,6,7,8,9,10);
val correctValue = #index tuple1 ??
I hope, somebody would be able to help out.
Thanks in advance!
There doesn't exist a function which takes an integer value and a tuple, and extracts that element from the tuple. There are of course the #1, #2, ... functions, but these do not take an integer argument. That is, the name of the "function" is #5, it is not the function # applied to the value 5. As such, you cannot substitute the name index instead of the 5.
If you don't know in advance which position in the tuple holds the element you want, you're probably using tuples in a way they're not intended to be used.
You might want a list of values, for which the 'a list type is more natural. You can then access the nth element using List.nth.
To clarify why you can't do that, you need some more knowledge of what a tuple is in SML.
Tuples are actually represented as records in SML. Remember that records have the form {id = expr, id = expr, ..., id = expr} where each identifier is a label.
The relationship between tuples and records is given away by the way you index elements in a tuple: #1, #2, ... The tuple (1, "foo", 42.0) is a derived form of (equivalent to) {1 = 1, 2 = "foo", 3 = 42.0}. This is perhaps better seen by the type that SML/NJ gives that record:
- {1 = 1, 2 = "foo", 3 = 42.0};
val it = (1,"foo",42.0) : int * string * real
Note the type is not shown as a record type such as {1: int, 2: string, 3: real}. The tuple type is again a derived form of the record type.
Actually #id is not a function, and thus it can't be called with a variable as "argument". It is actually a derived form of (note the wildcard pattern row, in the record pattern match)
fn {id=var, ...} => var
So in conclusion, you won't be able to do what you want, since these derived forms (or syntactic sugar, if you will) aren't dynamic in any way.
One way, as Sebastian Paaske said, is to use lists. The drawback is that you need O(n) steps to access the nth element of a list. If you need to access an element in O(1) time, you can use arrays, which are part of the SML Basis Library.
You can find more about arrays at:
http://sml-family.org/Basis/array.html
I've got a list of objects List[Object] which are all instantiated from the same class. This class has a field which must be unique Object.property. What is the cleanest way to iterate the list of objects and remove all objects(but the first) with the same property?
list.groupBy(_.property).map(_._2.head)
Explanation: The groupBy method accepts a function that converts an element to a key for grouping. _.property is just shorthand for elem: Object => elem.property (the compiler generates a unique name, something like x$1). So now we have a map Map[Property, List[Object]]. A Map[K,V] extends Traversable[(K,V)]. So it can be traversed like a list, but elements are a tuple. This is similar to Java's Map#entrySet(). The map method creates a new collection by iterating each element and applying a function to it. In this case the function is _._2.head which is shorthand for elem: (Property, List[Object]) => elem._2.head. _2 is just a method of Tuple that returns the second element. The second element is List[Object] and head returns the first element
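The two steps can be tried on a small example. This is only a sketch: Obj and its fields are hypothetical, and .toList is used to get a List back on any Scala 2.x or 3 version.

```scala
// Hypothetical element type with the field that must be unique.
case class Obj(id: Int, property: String)
val list = List(Obj(1, "a"), Obj(2, "b"), Obj(3, "a"))

// Step 1: group by the key; for a List, each group keeps the original relative order.
val grouped: Map[String, List[Obj]] = list.groupBy(_.property)

// Step 2: take the first object of every group.
// The order of the groups themselves is not guaranteed, since Map is unordered.
val firsts: List[Obj] = grouped.map(_._2.head).toList
```

So the first object per property survives, but the surviving objects may come out in a different order than in the input.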
To get the result to be a type you want:
import collection.breakOut
val l2: List[Object] = list.groupBy(_.property).map(_._2.head)(breakOut)
To explain briefly, map actually expects two arguments: a function and an object that is used to construct the result. In the first code snippet you don't see the second value because it is marked as implicit and so provided by the compiler from a list of predefined values in scope. The result type is usually derived from the mapped container. This is usually a good thing: map on List will return List, map on Array will return Array, etc. In this case, however, we want to specify the container we want as the result. This is where the breakOut method is used. It constructs a builder (the thing that builds results) by only looking at the desired result type. It is a generic method, and the compiler infers its generic types because we explicitly typed l2 to be List[Object].
Or, to preserve order (assuming Object#property is of type Property):
list.foldRight((List[Object](), Set[Property]())) {
case (o, cum @ (objects, props)) =>
if (props(o.property)) cum else (o :: objects, props + o.property)
}._1
foldRight is a method that accepts an initial result and a function that accepts an element and returns an updated result. The method iterates each element, updating the result according to applying the function to each element and returning the final result. We go from right to left (rather than left to right with foldLeft) because we are prepending to objects - this is O(1), but appending is O(N). Also observe the good styling here, we are using a pattern match to extract the elements.
In this case, the initial result is a pair (tuple) of an empty list and a set. The list is the result we're interested in and the set is used to keep track of what properties we already encountered. In each iteration we check if the set props already contains the property (in Scala, obj(x) is translated to obj.apply(x). In Set, the method apply is def apply(a: A): Boolean. That is, accepts an element and returns true / false if it exists or not). If the property exists (already encountered), the result is returned as-is. Otherwise the result is updated to contain the object (o :: objects) and the property is recorded (props + o.property)
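One detail worth checking on a small input (Obj and its fields are hypothetical): because foldRight visits elements from the right, a fold of this shape appears to keep the right-most object for each property. If the left-most one is wanted, a foldLeft that prepends and reverses once at the end is one possible variant, sketched here under that assumption.

```scala
case class Obj(id: Int, property: String)
val list = List(Obj(1, "a"), Obj(2, "b"), Obj(3, "a"))

// foldRight sees Obj(3, "a") before Obj(1, "a"), so the right-most duplicate wins.
val keepLast: List[Obj] =
  list.foldRight((List.empty[Obj], Set.empty[String])) {
    case (o, acc @ (objects, props)) =>
      if (props(o.property)) acc else (o :: objects, props + o.property)
  }._1

// foldLeft sees elements in list order, so the left-most duplicate wins.
// Prepending keeps each step O(1); a single reverse restores the original order.
val keepFirst: List[Obj] =
  list.foldLeft((List.empty[Obj], Set.empty[String])) {
    case (acc @ (objects, props), o) =>
      if (props(o.property)) acc else (o :: objects, props + o.property)
  }._1.reverse
```

Both variants preserve the relative order of the survivors; they differ only in which duplicate is kept.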
Update: @andreypopp wanted a generic method:
import scala.collection.IterableLike
import scala.collection.generic.CanBuildFrom
class RichCollection[A, Repr](xs: IterableLike[A, Repr]){
def distinctBy[B, That](f: A => B)(implicit cbf: CanBuildFrom[Repr, A, That]) = {
val builder = cbf(xs.repr)
val i = xs.iterator
var set = Set[B]()
while (i.hasNext) {
val o = i.next
val b = f(o)
if (!set(b)) {
set += b
builder += o
}
}
builder.result
}
}
implicit def toRich[A, Repr](xs: IterableLike[A, Repr]) = new RichCollection(xs)
to use:
scala> list.distinctBy(_.property)
res7: List[Obj] = List(Obj(1), Obj(2), Obj(3))
Also note that this is pretty efficient as we are using a builder. If you have really large lists, you may want to use a mutable HashSet instead of a regular set and benchmark the performance.
Starting with Scala 2.13, most collections provide a distinctBy method, which returns the elements of the sequence while dropping every element whose key (the result of applying the given function) has already been seen:
list.distinctBy(_.property)
For instance:
List(("a", 2), ("b", 2), ("a", 5)).distinctBy(_._1) // List((a,2), (b,2))
List(("a", 2.7), ("b", 2.1), ("a", 5.4)).distinctBy(_._2.floor) // List((a,2.7), (a,5.4))
Here is a little bit sneaky but fast solution that preserves order:
list.filterNot{ var set = Set[Property]()
obj => val b = set(obj.property); set += obj.property; b}
Although it uses internally a var, I think it is easier to understand and to read than the foldLeft-solution.
A lot of good answers above. However, distinctBy is already in Scala, but in a not-so-obvious place. Perhaps you can use it like
def distinctBy[A, B](xs: List[A])(f: A => B): List[A] =
scala.reflect.internal.util.Collections.distinctBy(xs)(f)
Preserving order:
def distinctBy[L, E](list: List[L])(f: L => E): List[L] =
list.foldLeft((Vector.empty[L], Set.empty[E])) {
case ((acc, set), item) =>
val key = f(item)
if (set.contains(key)) (acc, set)
else (acc :+ item, set + key)
}._1.toList
distinctBy(list)(_.property)
One more solution
import scala.annotation.tailrec
@tailrec
def collectUnique(l: List[Object], s: Set[Property], u: List[Object]): List[Object] = l match {
case Nil => u.reverse
case (h :: t) =>
if (s(h.property)) collectUnique(t, s, u) else collectUnique(t, s + h.property, h :: u)
}
I found a way to make it work with groupBy, with one intermediary step:
def distinctBy[T, P, From[X] <: TraversableLike[X, From[X]]](collection: From[T])(property: T => P): From[T] = {
val uniqueValues: Set[T] = collection.groupBy(property).map(_._2.head)(breakOut)
collection.filter(uniqueValues)
}
Use it like this:
scala> distinctBy(List(redVolvo, bluePrius, redLeon))(_.color)
res0: List[Car] = List(redVolvo, bluePrius)
Similar to IttayD's first solution, but it filters the original collection based on the set of unique values. If my expectations are correct, this does three traversals: one for groupBy, one for map and one for filter. It maintains the ordering of the original collection, but does not necessarily take the first value for each property. For example, it could have returned List(bluePrius, redLeon) instead.
Of course, IttayD's solution is still faster since it does only one traversal.
My solution also has the disadvantage that, if the collection has Cars that are actually the same, both will be in the output list. This could be fixed by removing filter and returning uniqueValues directly, with type From[T]. However, it seems like CanBuildFrom[Map[P, From[T]], T, From[T]] does not exist... suggestions are welcome!
With a collection and a function from a record to a key this yields a list of records distinct by key. It's not clear whether groupBy will preserve the order in the original collection. It may even depend on the type of collection. I'm guessing either head or last will consistently yield the earliest element.
collection.groupBy(keyFunction).values.map(_.head)
When will Scala get a nubBy? It's been in Haskell for decades.
If you want to remove duplicates and preserve the order of the list you can try this two liner:
val tmpUniqueList = scala.collection.mutable.Set[String]()
val myUniqueObjects = for(o <- myObjects if tmpUniqueList.add(o.property)) yield o
This is entirely a rip of @IttayD's answer, but unfortunately I don't have enough reputation to comment.
Rather than creating an implicit function to convert your iterable, you can simply create an implicit class:
import scala.collection.IterableLike
import scala.collection.generic.CanBuildFrom
implicit class RichCollection[A, Repr](xs: IterableLike[A, Repr]){
def distinctBy[B, That](f: A => B)(implicit cbf: CanBuildFrom[Repr, A, That]) = {
val builder = cbf(xs.repr)
val i = xs.iterator
var set = Set[B]()
while (i.hasNext) {
val o = i.next
val b = f(o)
if (!set(b)) {
set += b
builder += o
}
}
builder.result
}
}