Turning Some(List()) into None - list

I have optional lists, for example:
val optionalEmptyList = Option(List[String]())
val optionalNonEmptyList = Option(List[String]("1","2"))
And I would like to replace the optional empty lists by None while keeping the optional non-empty lists as-is.
I came up with the following solution:
optionalEmptyList.flatMap(l => if (l.isEmpty) None else Option(l))
optionalNonEmptyList.flatMap(l => if (l.isEmpty) None else Option(l))
It works but seems convoluted. Any simpler solution?

optionalEmptyList.filter(_.nonEmpty)
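Option.filter keeps the wrapped value only when the predicate holds, so Some(List()) becomes None and non-empty lists pass through unchanged. A minimal sketch using the two values from the question (the names cleanedEmpty, cleanedNonEmpty and cleanedAbsent are only for illustration):

```scala
val optionalEmptyList = Option(List[String]())
val optionalNonEmptyList = Option(List[String]("1", "2"))

// filter keeps the wrapped list only if the predicate is true
val cleanedEmpty = optionalEmptyList.filter(_.nonEmpty)       // None
val cleanedNonEmpty = optionalNonEmptyList.filter(_.nonEmpty) // Some(List(1, 2))

// None stays None, so the same call is safe when the list is absent
val cleanedAbsent = Option.empty[List[String]].filter(_.nonEmpty) // None
```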

Related

Find maximum w.r.t. substring within each group of formatted strings

I am struggling to find a solution for the following scenario. I have a few files in a directory, for example:
vbBaselIIIData_201802_3_d.data.20180405.txt.gz
vbBaselIIIData_201802_4_d.data.20180405.txt.gz
vbBaselIIIData_201803_4_d.data.20180405.txt.gz
vbBaselIIIData_201803_5_d.data.20180405.txt.gz
Suppose the single-digit number after the second underscore is called the runnumber. I have to pick only the files with the latest runnumber within each group, so in this case I need to pick only two of the four files and put them in a mutable Scala list. The ListBuffer should contain:
vbBaselIIIData_201802_4_d.data.20180405.txt.gz
vbBaselIIIData_201803_5_d.data.20180405.txt.gz
Can anybody suggest how to implement this? I am using Scala, but an algorithm alone is also appreciated. Which data structures would be suitable? Which functions do we need to implement? Any suggestions are welcome.
Here is a hopefully somewhat inspiring proposal that demonstrates a whole bunch of different language features and useful methods on collections:
val list = List(
"vbBaselIIIData_201802_3_d.data.20180405.txt.gz",
"vbBaselIIIData_201802_4_d.data.20180405.txt.gz",
"vbBaselIIIData_201803_4_d.data.20180405.txt.gz",
"vbBaselIIIData_201803_5_d.data.20180405.txt.gz"
)
val P = """[^_]+_(\d+)_(\d+)_.*""".r
val latest = list
  .map { str => {val P(id, run) = str; (str, id, run.toInt) }}
  .groupBy(_._2) // group by id
  .mapValues(_.maxBy(_._3)._1) // find the last run for each id
  .values // throw away the id
  .toList
  .sorted // restore ordering, mostly for cosmetic purposes
latest foreach println
Brief explanation of the not-entirely-trivial parts that you might have missed when reading an introduction to Scala:
"regex pattern".r converts a string into a compiled regex pattern
A block { stmt1 ; stmt2 ; stmt3 ; ... ; stmtN; result } evaluates to the last expression result
Extractor syntax can be used for compiled regex patterns
val P(id, run) = str matches the second and third _-separated values
_.maxBy(_._3)._1 finds the triple with highest run number, then extracts the first component str again
Output:
vbBaselIIIData_201802_4_d.data.20180405.txt.gz
vbBaselIIIData_201803_5_d.data.20180405.txt.gz
It's not clear what performance needs you have, even though you're mentioning an 'algorithm'.
Provided you don't have more specific needs, something like this is easy to do with Scala's Collection API. Even if you were dealing with huge directories, you could probably achieve some good performance characteristics by moving to Streams (at least in memory usage).
So assuming you have a function like def getFilesFromDir(path: String): List[String] where the List[String] is a list of filenames, you need to do the following:
Group files by date (List[String] => Map[String, List[String]])
Extract the Runnumbers, preserving the original filename (List[String] => List[(String, Int)])
Select the max Runnumber (List[(String, Int)] => (String, Int))
Map to just the filename ((String, Int) => String)
Select just the values of the resulting Map (Map[String, String] => Iterable[String])
(Note: if you want to go the pure functional route, you'll want a function something like def getFilesFromDir(path: String): IO[List[String]])
With Scala's Collections API you can achieve the above with something like this:
def extractDate(fileName: String): String = ???
// Int rather than String, so that run numbers compare numerically
def extractRunnumber(fileName: String): Int = ???
def getLatestRunnumbersFromDir(path: String): List[String] =
  getFilesFromDir(path)
    .groupBy(extractDate) // List[String] => Map[String, List[String]]
    .mapValues(selectMaxRunnumber) // Map[String, List[String]] => Map[String, String]
    .values // Map[String, String] => Iterable[String]
    .toList // Iterable[String] => List[String]
def selectMaxRunnumber(fileNames: List[String]): String =
  fileNames.map(f => f -> extractRunnumber(f))
    .maxBy(p => p._2)
    ._1
I've left the extractDate and extractRunnumber implementations blank. These can be done using simple regular expressions — let me know if you're having trouble with that.
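One way to fill in the two blanks is shown here as a sketch, not as the answerer's own implementation. It assumes file names of the form prefix_yyyyMM_run_rest. The regex FileName and the helper latestRuns are names made up for this example, and extractRunnumber returns an Int here so that run numbers compare numerically:

```scala
// Assumed file-name layout: <prefix>_<yyyyMM>_<runnumber>_<rest>
val FileName = """[^_]+_(\d+)_(\d+)_.*""".r

// Second "_"-separated field, e.g. "201802"
def extractDate(fileName: String): String = fileName match {
  case FileName(date, _) => date
  case _ => throw new IllegalArgumentException(s"Unexpected file name: $fileName")
}

// Third "_"-separated field as a number, so that 10 > 9
def extractRunnumber(fileName: String): Int = fileName match {
  case FileName(_, run) => run.toInt
  case _ => throw new IllegalArgumentException(s"Unexpected file name: $fileName")
}

// File with the highest run number within one group
def selectMaxRunnumber(fileNames: List[String]): String =
  fileNames.maxBy(extractRunnumber)

// Hypothetical helper: takes the file names directly instead of reading a directory
def latestRuns(fileNames: List[String]): List[String] =
  fileNames.groupBy(extractDate).values.map(selectMaxRunnumber).toList.sorted
```

Throwing on an unexpected name is only one possible choice. Returning an Option and dropping non-matching files would work just as well if the directory may contain other files.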
If you have the file-names as a list, like:
val list = List("vbBaselIIIData_201802_3_d.data.20180405.txt.gz"
, "vbBaselIIIData_201802_4_d.data.20180405.txt.gz"
, "vbBaselIIIData_201803_4_d.data.20180405.txt.gz"
, "vbBaselIIIData_201803_5_d.data.20180405.txt.gz")
Then you can do:
list.map { f =>
  val s = f.split("_").toList
  (s(1), f)
}.groupBy(_._1)
  .map(_._2.max)
  .values
This returns:
MapLike.DefaultValuesIterable(vbBaselIIIData_201803_5_d.data.20180405.txt.gz, vbBaselIIIData_201802_4_d.data.20180405.txt.gz)
as you wanted.

filter a List according to multiple contains

I want to filter a List, and I only want to keep a string if the string contains .jpg,.jpeg or .png:
scala> var list = List[String]("a1.png","a2.amr","a3.png","a4.jpg","a5.jpeg","a6.mp4","a7.amr","a9.mov","a10.wmv")
list: List[String] = List(a1.png, a2.amr, a3.png, a4.jpg, a5.jpeg, a6.mp4, a7.amr, a9.mov, a10.wmv)
I am not finding that .contains will help me!
Required output:
List("a1.png","a3.png","a4.jpg","a5.jpeg")
Use the filter method:
list.filter( name => name.contains(pattern1) || name.contains(pattern2) )
If you have an arbitrary number of extensions:
val extensions = List("jpg", "png")
list.filter( p => extensions.exists(e => p.matches(s".*\\.$e$$")))
To select anything that contains one of an arbitrary number of extensions:
list.filter(p => extensions.exists(e => p.contains(e)))
This is what @SergeyLagutin said above, but I thought I'd point out that it doesn't need matches.
Why not use filter() with an appropriate function performing your selection/predicate?
e.g.
list.filter(x => x.endsWith(".jpg") || x.endsWith(".jpeg"))
etc.
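Combining the endsWith idea with a list of extensions gives a version that is easy to extend. This is a minimal sketch on the list from the question; the name images and the case-insensitive comparison are assumptions, not part of the original answers:

```scala
val list = List("a1.png", "a2.amr", "a3.png", "a4.jpg", "a5.jpeg",
                "a6.mp4", "a7.amr", "a9.mov", "a10.wmv")

// Extensions to keep, including the leading dot
val extensions = List(".jpg", ".jpeg", ".png")

// Keep a name if it ends with any of the extensions (ignoring case)
val images = list.filter(name => extensions.exists(ext => name.toLowerCase.endsWith(ext)))
// images == List("a1.png", "a3.png", "a4.jpg", "a5.jpeg")
```

Unlike contains, endsWith does not match names such as "jpg_notes.txt", where the extension text appears elsewhere in the name.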

Scala: Create new list of same type

I'm stuck and the solutions Google offered me (not that many) didn't work somehow. It sounds trivial but kept me busy for two hours now (maybe I should go for a walk...).
I've got a list of type XY, oldList: List[XY] with elements in it. All I need is a new, empty List of the same type.
I've already tried stuff like:
newList[classOf(oldList[0])]
newList = oldList.clone()
newList.clear()
But that either didn't work or required a MutableList, which I don't like. :/
Is there a best (or any working) practice to create a new List of a certain type?
Grateful for any advice,
Teapot
P.S. please don't be too harsh if it's simple, I'm new to Scala. :(
Because empty lists do not contain anything, the type parameter doesn't actually matter - an empty List[Int] is the same as an empty List[String]. And in fact, they are exactly the same object, Nil. Nil has type List[Nothing], and can be upcast to any other kind of List.
In general though when working with types that take type parameters, the type parameter for the old value will be in scope, and it can be used to create a new instance. So a generic method for a mutable collection, where the type parameter matters even if empty:
def processList[T](oldList: collection.mutable.Buffer[T]) = {
  val newList = collection.mutable.Buffer[T]()
  // Do something with oldList and newList
}
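Both points can be checked with a short sketch. The helper emptyLike and the value names are hypothetical, chosen only for this illustration:

```scala
import scala.collection.mutable

// Empty immutable lists of any element type are the same object, Nil
val emptyInts: List[Int] = List.empty[Int]
val emptyStrings: List[String] = List.empty[String]
val sameObject = emptyInts eq emptyStrings // true: reference equality

// Nil has type List[Nothing] and widens to any List[T]
val emptyDoubles: List[Double] = Nil

// For a mutable collection, reuse the type parameter that is in scope
def emptyLike[T](old: mutable.Buffer[T]): mutable.Buffer[T] =
  mutable.Buffer.empty[T]

val oldBuffer = mutable.Buffer("a", "b")
val newBuffer = emptyLike(oldBuffer) // an empty mutable.Buffer[String]
```

Because newBuffer is a separate mutable object, adding elements to it leaves oldBuffer unchanged.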
There are probably nicer solutions, but off the top of my head:
def nilOfSameType[T](l: List[T]) = List.empty[T]
val xs = List(1,2,3)
nilOfSameType(xs)
// List[Int] = List()
If you want it to be empty, all you should have to do is:
val newList = List[XY]()
That's really it, as long as I'm understanding your question.

convert List[Tuple2[A,B]] to Tuple2[Seq[A],Seq[B]]

Stuck here, trying to convert a List of case class tuples to a tuple of sequences and multi-assign the result.
val items = repo.foo.list // gives me a List[(A,B)]
I can pull off multi-assignment like so:
val(a,b) = (items.map(_._1).toSeq, items.map(_._2).toSeq)
but it would be nicer to do in 1 step, along the lines of:
val(a,b) = repo.foo.list.map{case(a,b) => (a,b)}
I am not sure if I understood the question correctly. Maybe unzip works for what you want?
Here is a link with some examples: http://daily-scala.blogspot.de/2010/03/unzip.html
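A minimal sketch of what unzip does, using a small hard-coded list as a stand-in for repo.foo.list (the element types Int and String are assumptions for the example):

```scala
// Stand-in for repo.foo.list from the question: a List[(A, B)]
val items: List[(Int, String)] = List((1, "one"), (2, "two"), (3, "three"))

// unzip splits a collection of pairs into a pair of collections
val (a, b) = items.unzip
// a == List(1, 2, 3)
// b == List("one", "two", "three")
```

Since List is already a Seq, no extra toSeq call is needed on the results.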
For a more general case you can look at product-collections. A CollSeqN is both a Seq[TupleN[A1..An]] and a TupleN[Seq[A1..An]]
In your example you could extract the Seqs like so:
items._1
items._2
...

Ocaml Error: Unbound record field label length

This is the error I'm getting and I have no idea why: "Error: Unbound record field label length "
Does anyone know?
let rastavi str =
let sublist = ref [] in
let list = ref [] in
for i = ((str.length str)1) [down]to 0 do
if str.[i] =' ' then (str.[i] :: !sublist)
else (list := (!sublist:: !list)) sublist = []
done ;;
You're using OO notation to get the length of a string. OCaml uses functional notation. So it looks like this:
String.length str
Not like this:
str.length (* OO notation, not in OCaml *)
Edit:
Side comment: this solution is very much an imperative take on the problem. If you're trying to learn the FP mindset, you should try to think recursively and immutably. Since this looks like homework, it's very likely a functional solution is what you want.
But here are a few other problems in your original code:
You have two expressions next to each other with nothing in between. If you want to "do" two things, you need to separate them with a semicolon ; (however, this is imperative style)
You're using = which compares two values for equality. If you want to assign a value to a reference you need to use :=. (Imperative style, again.)