How to find string in List using Scala? - list

I have a following list-
List( naa.60a9800042704577762b45634476337a ,
naa.6d867d9c7acd60001aed76eb2c70bd53 ,
naa.600a09804270457a7a5d455448735330)
I want to find a string 42704577762b45634476337a in above list.
Like first string in list contains given string 42704577762b45634476337a.
I dont want to fully match given string with list elements
How do I find string in list using scala??

Looking for a substring
scala> val x = List("123", "abc")
x: List[String] = List(123, abc)
scala> x.find(_.contains("12"))
res0: Option[String] = Some(123)
scala> x.find(_.contains("foo"))
res1: Option[String] = None
If you need an exact match, just replace contains with ==.

Related

how to print list all first name

I had a list with string first name and last name
val dataList = List("Narendra MODI","Amit SHA","Donald TRUMP","Ratan TATA","Abdul KALAM")
I want to print all the first from the list like Narendra,Amit,Donald,Ratan,Abdul
could you please help me on this in scala
The simplest option is to take the initial non-space characters from each string:
dataList.map(_.takeWhile(!_.isSpaceChar))
you can map over your list and use split on space and select the 1st index.
scala> val dataList = List("Narendra MODI","Amit SHA","Donald TRUMP","Ratan TATA","Abdul KALAM")
dataList: List[String] = List(Narendra MODI, Amit SHA, Donald TRUMP, Ratan TATA, Abdul KALAM)
scala> dataList.map( _.split(" ").headOption.getOrElse(None))
res2: List[java.io.Serializable] = List(Narendra, Amit, Donald, Ratan, Abdul)

strange behaviour with filter?

I want to extract MIME-like headers (starting with [Cc]ontent- ) from a multiline string:
scala> val regex = "[Cc]ontent-".r
regex: scala.util.matching.Regex = [Cc]ontent-
scala> headerAndBody
res2: String =
"Content-Type:application/smil
Content-ID:0.smil
content-transfer-encoding:binary
<smil><head>
"
This fails
scala> headerAndBody.lines.filter(x => regex.pattern.matcher(x).matches).toList
res4: List[String] = List()
but the "related" cases work as expected:
scala> headerAndBody.lines.filter(x => regex.pattern.matcher("Content-").matches).toList
res5: List[String] = List(Content-Type:application/smil, Content-ID:0.smil, content-transfer-encoding:binary, <smil><head>)
and:
scala> headerAndBody.lines.filter(x => x.startsWith("Content-")).toList
res8: List[String] = List(Content-Type:application/smil, Content-ID:0.smil)
what am I doing wrong in
x => regex.pattern.matcher(x).matches
since it returns an empty List??
The reason for the failure with the first line is that you use the java.util.regex.Matcher.matches() method that requires a full string match.
To fix that, use the Matcher.find() method that searches for the match anywhere inside the input string and use the "^[Cc]ontent-" regex (note that the ^ symbol will force the match to appear at the start of the string).
Note that this line of code does not work as you expect:
headerAndBody.lines.filter(x => regex.pattern.matcher("Content-").matches).toList
You run the regex check against the pattern Content-, and it is always true (that is why you get all the lines in the result).
See this IDEONE demo:
val headerAndBody = "Content-Type:application/smil\nContent-ID:0.smil\ncontent-transfer-encoding:binary\n<smil><head>"
val regex = "^[Cc]ontent-".r
val s1 = headerAndBody.lines.filter(x => regex.pattern.matcher(x).find()).toList
println(s1)
val s2 = headerAndBody.lines.filter(x => regex.pattern.matcher("Content-").matches).toList
print (s2)
Results (the first is the fix, and the second shows that your second line of code fails):
List(Content-Type:application/smil, Content-ID:0.smil, content-transfer-encoding:binary)
List(Content-Type:application/smil, Content-ID:0.smil, content-transfer-encoding:binary, <smil><head>)
Your regexp should match all line but not only first sub-string.
val regex = "[Cc]ontent-.*".r

regular expression matching string in scala

I have a string like this
result: String = /home/administrator/com.supai.common-api-1.8.5-DEV- SNAPPSHOT/com/a/infra/UserAccountDetailsMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV- SNAPSHOT/com/a/infra/UserAccountDetailsMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserAccountMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserAccountMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserOverridenFunctionMetaDataMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserOverridenFunctionMetaDataMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserOverridenPermissionMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserOverridenPermissionMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserRoleMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/UserRoleMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV- SNAPSHOT/com/a/infra/VendorAddressMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/VendorAddressMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/reactore/infra/VendorContactMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/reactore/infra/VendorContactMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/reactore/infra/VendorMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/VendorMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WeekMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WeekMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WorkflowMetadataMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WorkflowMetadataMetaData.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WorkflowNotificationMetaData$.class
/home/administrator/com.supai.common-api-1.8.5-DEV-SNAPSHOT/com/a/infra/WorkflowNotificationMetaData.class
/home/a/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/a/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
/home/common/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/raghav/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/sysadmin/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/tmp/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
regex: scala.util.matching.Regex = (\\/([u|s|r])\\/([s|h|a|r|e]))
x: scala.util.matching.Regex.MatchIterator = empty iterator`
and out of this how can I get only this part /usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jarand this part can be anywhere in the string, how can I achieve this, I tried using regular expression in Scala but don't know how to use forward slashes, so anybody plz explain how to do this in scala.
What is your search criteria? Your pattern seems to be wrong.
In your rexexp, I see u|s|r which means to search for either u, or s or r . See here for more information
how can I get only this part
/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jarand
this part can be anywhere in the string
If you are looking for a path, see the below example:
scala> val input = """/home/common/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
| /home/raghav/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
| /home/sysadmin/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
| /home/tmp/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
| /home/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
| /home/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
| /usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar"""
input: String =
/home/common/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/raghav/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/sysadmin/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/tmp/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/usr/share/common-api/lib/com.supai.common-api-1.3-SNAPSHOT.jar
/home/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
scala> val myRegExp = "/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar".r
myRegExp: scala.util.matching.Regex = /usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
scala> val myRegExp2 = "helloWorld.jar".r
myRegExp2: scala.util.matching.Regex = helloWorld.jar
scala> (myRegExp findAllIn input) foreach( println)
/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
/usr/share/common-api/lib/com.supai.common-api-1.8.5-DEV-SNAPSHOT.jar
scala> (myRegExp2 findAllIn input) foreach( println)
scala>

Scala regex pattern match groups different from that using findAllIn

I find that the groups extracted by Pattern-matching on regex's in Scala are different from those extracted using findAllIn function.
1) Here is an example of extraction using pattern match -
scala> val fullRegex = """(.+?)=(.+?)""".r
fullRegex: scala.util.matching.Regex = (.+?)=(.+?)
scala> val x = """a='b'"""
x: String = a='b'
scala> x match { case fullRegex(l,r) => println( l ); println(r) }
a
'b'
2) And here is an example of extraction using the findAllIn function -
scala> fullRegex.findAllIn(x).toArray
res4: Array[String] = Array(a=')
I was expecting the returned Array using findAllIn to be Array(a, 'b'). Why is it not so?
This is because you have not specified to what extent the second lazy match should go. So after = it consumes just one character and stops as it is in lazy mode.
See here.
https://regex101.com/r/dU7oN5/10
Change it to .+?=.+ to get full array
In particular, the pattern match's use of unapplySeq uses Matcher.matches, while findAllIn uses Matcher.find. matches tries to match entire input.
scala> import java.util.regex._
import java.util.regex._
scala> val p = Pattern compile ".+?"
p: java.util.regex.Pattern = .+?
scala> val m = p matcher "hello"
m: java.util.regex.Matcher = java.util.regex.Matcher[pattern=.+? region=0,5 lastmatch=]
scala> m.matches
res0: Boolean = true
scala> m.group
res1: String = hello
scala> m.reset
res2: java.util.regex.Matcher = java.util.regex.Matcher[pattern=.+? region=0,5 lastmatch=]
scala> m.find
res3: Boolean = true
scala> m.group
res4: String = h
scala>

Extract numbers from string with rich string magic

I want to extract a list of ID of a string pattern in the following:
{(2),(4),(5),(100)}
Note: no leading or trailing spaces.
The List can have up to 1000 IDs.
I want to use rich string pattern matching to do this. But I tried for 20 minutes with frustration.
Could anyone help me to come up with the correct pattern? Much appreciated!
Here's brute force string manipulation.
scala> "{(2),(4),(5),(100)}".replaceAll("\\(", "").replaceAll("\\)", "").replaceAll("\\{","").replaceAll("\\}","").split(",")
res0: Array[java.lang.String] = Array(2, 4, 5, 100)
Here's a regex as #pst noted in the comments. If you don't want the parentheses change the regular expression to """\d+""".r.
val num = """\(\d+\)""".r
"{(2),(4),(5),(100)}" findAllIn res0
res33: scala.util.matching.Regex.MatchIterator = non-empty iterator
scala> res33.toList
res34: List[String] = List((2), (4), (5), (100))
"{(2),(4),(5),(100)}".split ("[^0-9]").filter(_.length > 0).map (_.toInt)
Split, where char is not part of a number, and only convert non-empty results.
Might be modified to include dots or minus signs.
Use Extractor object:
object MyList {
def apply(l: List[String]): String =
if (l != Nil) "{(" + l.mkString("),(") + ")}"
else "{}"
def unapply(str: String): Some[List[String]] =
Some(
if (str.indexOf("(") > 0)
str.substring(str.indexOf("(") + 1, str.lastIndexOf(")")) split
"\\p{Space}*\\)\\p{Space}*,\\p{Space}*\\(\\p{Space}*" toList
else Nil
)
}
// test
"{(1),(2)}" match { case MyList(l) => l }
// res23: List[String] = List(1, 2)