Regex matching for number throws MatchError

I was trying to split a time string in hrs:min format using a regex, but it fails with a MatchError.
What is going on here? I am using Scala 2.10.
scala> val minsecs = """\d+:\d+""".r
minsecs: scala.util.matching.Regex = \d+:\d+
scala> val minsecs(m,s) = "03:45"
scala.MatchError: 03:45 (of class java.lang.String)
at .<init>(<console>:8)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:731)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:980)
at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:570)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:601)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:745)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:790)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:702)
at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:566)
at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:573)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:576)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:867)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:822)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:822)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:822)
at scala.tools.nsc.interpreter.ILoop.main(ILoop.scala:889)
at xsbt.ConsoleInterface.run(ConsoleInterface.scala:57)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sbt.compiler.AnalyzingCompiler.call(AnalyzingCompiler.scala:73)
at sbt.compiler.AnalyzingCompiler.console(AnalyzingCompiler.scala:64)
at sbt.Console.console0$1(Console.scala:23)
at sbt.Console$$anonfun$apply$2$$anonfun$apply$1.apply$mcV$sp(Console.scala:24)
at sbt.TrapExit$.executeMain$1(TrapExit.scala:33)
at sbt.TrapExit$$anon$1.run(TrapExit.scala:42)

You need to use two capturing groups in order to extract values:
val minsecs = """(\d+):(\d+)""".r
val minsecs(m,s) = "03:45"
Note the added parentheses around each \d+.
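If the input may not match, a match expression avoids the MatchError altogether. A minimal sketch, assuming you want the two parts back as strings; the names TimeParse, MinSecs and parse are made up for illustration:

```scala
import scala.util.matching.Regex

object TimeParse {
  // Two capturing groups, one per value we want to extract.
  val MinSecs: Regex = """(\d+):(\d+)""".r

  // Hypothetical helper: returns None instead of throwing MatchError
  // when the whole input does not match the pattern.
  def parse(s: String): Option[(String, String)] = s match {
    case MinSecs(m, sec) => Some((m, sec))
    case _               => None
  }
}
```

The extractor only succeeds when the number of pattern variables equals the number of capturing groups, which is why the group-less version in the question fails.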

Related

Regex pattern with [:] returns as invalid

I'm searching a text for a specific pattern using regex, and it seems to work fine until I want to include a ":".
The function I use to find the text is:
func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range, in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}
The example array I use to test the pattern is:
var textstringarray = ["10:50 - 13:40","ABC"]
And here is the loop that checks the different items:
for myString in textstringarray {
    let matched2 = matches(for: "[0-9][0-9][:][0-9][0-9] [-] [0-9][0-9][:][0-9][0-9]", in: myString)
    if !matched2.isEmpty {
        print(matched2)
    }
}
I expect it to return only the first item, but the Playground log only says:
invalid regex: The value “[0-9][0-9][:][0-9][0-9] [-] [0-9][0-9][:][0-9][0-9]” is invalid
So far I have figured out that the problem is the second [:], because when I delete it everything works fine. Does anyone have an idea what I could do?
Thanks a lot
Maybe it thinks [: ... :] is an invalid POSIX character class? Seems like a bug to me.
Does [\:] fix it? (Though there's no need to use a character class for a single character; you could just have : instead of [:].)

Unpack all groups from regex match

There is a very nice way to unpack matched groups from regex:
scala> val regex = "(first):(second)".r
regex: scala.util.matching.Regex = (first):(second)
scala> val regex(a, b) = "first:second"
a: String = first
b: String = second
Unfortunately, this throws an exception when there is no match:
scala> val regex(a, b) = "first:third"
scala.MatchError: first:third (of class java.lang.String)
at .<init>(<console>:10)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:744)
In this case I could use findFirstMatchIn to get None if there is no match:
scala> val result = regex.findFirstMatchIn("first:third")
result: Option[scala.util.matching.Regex.Match] = None
But when there is a match, I want something as good as the first variant with unpacking. Now I have to deal with this:
scala> val result = regex.findFirstMatchIn("first:second")
result: Option[scala.util.matching.Regex.Match] = Some(first:second)
What I came up with is this:
scala> val content = result.get
content: scala.util.matching.Regex.Match = first:second
scala> 1 to content.groupCount map content.group
res0: scala.collection.immutable.IndexedSeq[String] = Vector(first, second)
Is there any better way to get all groups from regex match object (ideally as succinct as unpacking in the first code snippet in this question)?
They came up with Groups for that:
scala> regex findFirstMatchIn "first:second" map { case Regex.Groups(a,b) => (a,b) }
res8: Option[(String, String)] = Some((first,second))
I think that's the same now as:
scala> regex findFirstMatchIn "first:second" map { case regex(a,b) => (a,b) }
res9: Option[(String, String)] = Some((first,second))
since it doesn't recompute the match.
You could use pattern matching.
Something along these lines (untested):
val regex = "(first):(second)".r
val myString = "first:second"
myString match {
  case regex(first, second) => Some((first, second)) // do something with the groups
  case _ => None
}
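Both suggestions above can be wrapped so that a failed match gives None rather than an exception. This is only a sketch; the names Unpack, Pair, whole and anywhere are invented for the example:

```scala
import scala.util.matching.Regex

object Unpack {
  val Pair: Regex = "(first):(second)".r

  // Matching a String against the regex requires the whole string to match.
  def whole(s: String): Option[(String, String)] = s match {
    case Pair(a, b) => Some((a, b))
    case _          => None
  }

  // findFirstMatchIn searches anywhere in the input; Regex.Groups
  // then unpacks the capturing groups of the Match that was found.
  def anywhere(s: String): Option[(String, String)] =
    Pair.findFirstMatchIn(s).collect { case Regex.Groups(a, b) => (a, b) }
}
```

The difference matters for inputs with surrounding text: whole rejects them, while anywhere still finds the embedded match.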

Why does this clojure hello world throw a NullPointerException?

This is my very first program:
(println "hello, what is your name?")
(let [myname (read-line)]
((println (str "hello " myname))))
It kinda works:
hello, what is your name?
Joel
hello Joel
Exception in thread "main" java.lang.NullPointerException, compiling:(/home/joel/workspace/coolstuff/clojure/hello.clj:1:38)
at clojure.lang.Compiler.load(Compiler.java:7142)
at clojure.lang.Compiler.loadFile(Compiler.java:7086)
at clojure.main$load_script.invoke(main.clj:274)
at clojure.main$script_opt.invoke(main.clj:336)
at clojure.main$main.doInvoke(main.clj:420)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.lang.Var.invoke(Var.java:379)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.Var.applyTo(Var.java:700)
at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
at user$eval3.invoke(hello.clj:3)
at clojure.lang.Compiler.eval(Compiler.java:6703)
at clojure.lang.Compiler.load(Compiler.java:7130)
... 9 more
Why does it throw an exception?
((println (str "hello " myname)))
...is running the thing returned by println as a function.
println returns nil (null on the JVM). Hence, calling its return value as a function throws a NullPointerException.
Take out the extra parentheses:
(println (str "hello " myname))

Matching and replacing all occurrences of backslash in a string

I use the following in GWT to find each backslash in a string and replace it with \\.
String name = "\path\item";
name = RegExp.compile("/\\/g").replace(name, "\\\\");
But it does not work: for name=\path\item it returns name=\path\item unchanged.
OK, I followed the recommendation of Thomas Broyer, and the first, RegExp.compile("\\", "g").replace(bgPath, "\\\\"), gives:
Caused by: com.google.gwt.core.client.JavaScriptException: (SyntaxError): trailing \ in regular expression
at com.google.gwt.dev.shell.BrowserChannelServer.invokeJavascript(BrowserChannelServer.java:237)
at com.google.gwt.dev.shell.ModuleSpaceOOPHM.doInvoke(ModuleSpaceOOPHM.java:132)
at com.google.gwt.dev.shell.ModuleSpace.invokeNative(ModuleSpace.java:561)
at com.google.gwt.dev.shell.ModuleSpace.invokeNativeObject(ModuleSpace.java:269)
at com.google.gwt.dev.shell.JavaScriptHost.invokeNativeObject(JavaScriptHost.java:91)
at com.google.gwt.regexp.shared.RegExp$.compile(RegExp.java)
at com.ait.gwt.authtool.client.ui.TicketViewer.<init>(TicketViewer.java:197)
at com.ait.gwt.authtool.client.AuthTool.onViewTicketBtnClicked(AuthTool.java:1942)
at com.ait.gwt.authtool.client.AuthTool.onMessageReceived(AuthTool.java:1995)
at com.ait.gwt.authtool.client.events.MessageReceivedEvent.dispatch(MessageReceivedEvent.java:44)
at com.ait.gwt.authtool.client.events.MessageReceivedEvent.dispatch(MessageReceivedEvent.java:1)
at com.google.gwt.event.shared.GwtEvent.dispatch(GwtEvent.java:1)
at com.google.web.bindery.event.shared.SimpleEventBus.doFire(SimpleEventBus.java:193)
at com.google.web.bindery.event.shared.SimpleEventBus.fireEvent(SimpleEventBus.java:88)
at com.google.gwt.event.shared.SimpleEventBus.fireEvent(SimpleEventBus.java:52)
and the second, bgPath.replaceAll("\\", "\\\\"), gives:
Caused by: java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
^
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.lang.String.replaceAll(Unknown Source)
at com.ait.gwt.authtool.client.ui.TicketViewer.<init>(TicketViewer.java:198)
But when I type: bgPath = bgPath.replaceAll(Pattern.quote("\"), Matcher.quoteReplacement("\\"));
it works normally(!!), as it gives: [INFO] [gwt_app] - !!! bgPath=Background\\Cartoon\\image
RegExp.compile is the equivalent of new RegExp in JS, so the argument is not a regexp literal. Your code should read RegExp.compile("\\", "g").
But for this particular case, name.replace("\\", "\\\\") should be enough.
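To compare the literal and the regex-based replacement on the plain JVM, here is a small Scala sketch using java.lang.String and java.util.regex. It does not go through GWT's RegExp class, so take it only as an illustration of the escaping levels; Backslashes, doubleLiteral and doubleRegex are made-up names:

```scala
import java.util.regex.{Matcher, Pattern}

object Backslashes {
  // String.replace(CharSequence, CharSequence) is a literal replacement:
  // "\\" in source code is one backslash, "\\\\" is two.
  def doubleLiteral(s: String): String = s.replace("\\", "\\\\")

  // replaceAll treats both arguments specially, so the pattern is quoted
  // with Pattern.quote and the replacement with Matcher.quoteReplacement.
  def doubleRegex(s: String): String =
    s.replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"))
}
```

Both helpers turn each single backslash into two; the literal version is shorter because no regex escaping is involved.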

NullPointerException while working with stateful PartialFunction and collectFirst

Consider this (very ugly code):
object ExternalReferences2 {
import java.util.regex._
implicit def symbol2string(sym: Symbol) = sym.name
object Mapping {
def fromXml(mapping: scala.xml.NodeSeq) = {
new Mapping(mapping \ 'vendor text,
mapping \ 'match text,
mapping \ 'format text)
}
}
case class Mapping(vendor: String,
matches: String,
format: String) extends PartialFunction[String, String] {
private val pattern = Pattern.compile(matches)
private var _currentMatcher: Matcher = null
private def currentMatcher =
{ println("Getting matcher: " + _currentMatcher); _currentMatcher }
private def currentMatcher_=(matcher: Matcher) =
{ println("Setting matcher: " + matcher); _currentMatcher = matcher }
def isDefinedAt(entity: String) =
{ currentMatcher = pattern.matcher(entity); currentMatcher.matches }
def apply(entity: String) = apply
def apply = {
val range = 0 until currentMatcher.groupCount()
val groups = range
map (currentMatcher.group(_))
filterNot (_ == null)
map (_.replace('.', '/'))
format.format(groups: _*)
}
}
val config =
<external-links>
<mapping>
<vendor>OpenJDK</vendor>
<match>{ """^(javax?|sunw?|com.sun|org\.(ietf\.jgss|omg|w3c\.dom|xml\.sax))(\.[^.]+)+$""" }</match>
<format>{ "http://download.oracle.com/javase/7/docs/api/%s.html" }</format>
</mapping>
</external-links>
def getLinkNew(entity: String) =
(config \ 'mapping)
collectFirst({ case m => Mapping.fromXml(m)})
map(_.apply)
def getLinkOld(entity: String) =
(config \ 'mapping).view
map(m => Mapping.fromXml(m))
find(_.isDefinedAt(entity))
map(_.apply)
}
I tried to improve the getLinkOld method by using collectFirst as shown in getLinkNew, but I always get a NullPointerException because _currentMatcher is still set to null:
scala> ExternalReferences2.getLinkNew("java.util.Date")
Getting matcher: null
java.lang.NullPointerException
at ExternalReferences2$Mapping.apply(<console>:32)
at ExternalReferences2$$anonfun$getLinkNew$2.apply(<console>:58)
at ExternalReferences2$$anonfun$getLinkNew$2.apply(<console>:58)
at scala.Option.map(Option.scala:131)
at ExternalReferences2$.getLinkNew(<console>:58)
at .<init>(<console>:13)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
while it works perfectly with getLinkOld.
What is the problem here?
Your matcher is created as a side effect in isDefinedAt. Passing side-effecting functions to routines such as map is usually a recipe for disaster, but that is not even what happens here. Your code requires isDefinedAt to have been called just before apply, with the same argument. That makes your code very fragile, and that is what you should change.
Clients of PartialFunction do not have to follow that protocol in general. Imagine for instance
if (f.isDefinedAt(x) && f.isDefinedAt(y)) {fx = f(x); fy = f(y)}.
And here the code that calls apply is not even yours, but the collection classes', so you do not control what happens.
Your specific problem in getLinkNew is that Mapping's isDefinedAt is simply never called. The PartialFunction argument of collectFirst is {case m => ...}. The isDefinedAt that is called is the one of this function. As m is an irrefutable pattern, it is always true, and collectFirst will always return the first element if there is one. That the partial function returns another partial function (the Mapping), which happens not to be defined at m, is irrelevant.
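The effect of the irrefutable pattern can be seen on a toy example with plain integers instead of XML nodes. This is just a sketch, and all names in it (CollectFirstDemo, halveEven, wrap) are invented:

```scala
object CollectFirstDemo {
  // A partial function that is defined only for even numbers.
  val halveEven: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n / 2 }

  // `case _ => ...` is irrefutable, so this outer function is defined for
  // every input; it merely returns the inner partial function.
  val wrap: PartialFunction[Int, PartialFunction[Int, Int]] = { case _ => halveEven }

  val xs = List(1, 3, 4)

  // Here collectFirst consults halveEven itself, so 1 and 3 are skipped.
  val direct: Option[Int] = xs.collectFirst(halveEven)

  // Here collectFirst only consults wrap, which accepts the first element.
  val wrapped: Option[PartialFunction[Int, Int]] = xs.collectFirst(wrap)

  // The inner function was never asked whether it is defined at xs.head.
  val innerDefinedAtHead: Option[Boolean] = wrapped.map(_.isDefinedAt(xs.head))
}
```

In getLinkNew the Mapping plays the role of halveEven: it comes back wrapped in Some, but nothing has checked that it is defined for the entity.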
Edit - Possible workaround
A very light change would be to check whether a matcher is available and create it if it is not. Better, also keep the entity string that was used to create it, so that you can check it is the proper one. This should make the side effect benign as long as there is no multithreading. By the way, do not use null; use Option, so the compiler will not let you ignore the possibility that it may be None.
var _currentMatcher: Option[(String, Matcher)] = None
def currentMatcher(entity: String): Matcher = _currentMatcher match {
  // reuse the cached matcher only if it was built for the same entity
  case Some((e, m)) if e == entity => m
  case _ =>
    val m = pattern.matcher(entity)
    _currentMatcher = Some((entity, m))
    m
}
Edit again. Stupid me
Sorry, the so-called workaround indeed makes the class safer, but it does not make the collectFirst solution work. Again, the case m => partial function is always defined (note: entity does not even appear in your getLinkNew code, which should be worrying). The problem is that one would need a PartialFunction of a NodeSeq (not of entity, which will be known to the function, but not passed as an argument). isDefinedAt will be called, then apply. The pattern and the matcher depend on the NodeSeq, so they cannot be created beforehand, but only in isDefinedAt and/or apply. In the same spirit, you can cache what is computed in isDefinedAt to reuse in apply. This is definitely not pretty:
def linkFor(entity: String) = new PartialFunction[NodeSeq, String] {
  var _matcher: Option[(String, Matcher)] = None
  def matcher(regexp: String): Matcher = _matcher match {
    // reuse the cached matcher only if it was built from the same regexp
    case Some((r, m)) if r == regexp => m
    case _ =>
      val m = Pattern.compile(regexp).matcher(entity)
      _matcher = Some((regexp, m))
      m
  }
  def isDefinedAt(mapping: NodeSeq) = {
    matcher((mapping \ "match").text).matches
  }
  def apply(mapping: NodeSeq) = {
    // call matcher(...), it is likely to reuse previous matcher, build result
    ???
  }
}
You use that with (config \ "mapping").collectFirst(linkFor(entity))
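If caching between isDefinedAt and apply feels too fragile, one alternative (not what the code above does) is to compute an Option per mapping and lift that into a partial function with Function.unlift, so no state is shared between calls. The sketch below is heavily simplified: it assumes a plain case class instead of the XML nodes, and it formats the whole entity rather than the matched groups. Links, Mapping, template, linkFor and getLink are all hypothetical names.

```scala
import java.util.regex.Pattern

object Links {
  // Hypothetical plain-data stand-in for one <mapping> element.
  final case class Mapping(vendor: String, matches: String, template: String) {
    private val pattern = Pattern.compile(matches)

    // Stateless: the matcher is a local value, so callers need not
    // follow any particular call order.
    def linkFor(entity: String): Option[String] =
      if (pattern.matcher(entity).matches) Some(template.format(entity.replace('.', '/')))
      else None
  }

  // Function.unlift turns Mapping => Option[String] into a PartialFunction
  // that is defined exactly where the Option is a Some.
  def getLink(mappings: Seq[Mapping], entity: String): Option[String] =
    mappings.collectFirst(Function.unlift((m: Mapping) => m.linkFor(entity)))
}
```

With this shape, collectFirst skips mappings whose pattern does not match and returns the link from the first one that does.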