Use PullParser to deserialize Range in Crystal - crystal-lang

I'm trying to convert some JSON, { "range": {"start": 1, "stop": 10} }, into a Range object equivalent to Range.new(1, 10).
It seems that to do this in my Foo struct I need a custom converter (see below) that uses JSON::PullParser to consume each token. I tried the following to understand how the pull parser is supposed to be used, but it seems to expect everything to be a String and chokes on the first Int it finds. So this code isn't useful, but it illustrates what I'm confused about:
require "json"
module RangeConverter
  def self.from_json(pull : JSON::PullParser)
    pull.read_object do |key, key_location|
      puts key # => puts `start` then chokes on the `int`
      # Expected String but was Int at 1:22
    end
    Range.new(1,2)
  end
end
struct Foo
  include JSON::Serializable
  @[JSON::Field(converter: RangeConverter)]
  property range : Range(Int32, Int32)
end
Foo.from_json %({"range": {"start": 1, "stop": 10}})
The only way I was able to figure this out was to just read the raw json string and work with it directly but it feels like I'm side-stepping the parser because I don't understand it. The following works:
require "json"
module RangeConverter
  def self.from_json(pull : JSON::PullParser)
    h = Hash(String, Int32).from_json(pull.read_raw)
    Range.new(h["start"], h["stop"])
  end
end
struct Foo
  include JSON::Serializable
  @[JSON::Field(converter: RangeConverter)]
  property range : Range(Int32, Int32)
end
Foo.from_json %({"range": {"start": 1, "stop": 10}})
So how am I actually supposed to be using the Parser here?

The answer by Oleh Prypin is great. As he said, the second approach is good except that it allocates a Hash so it consumes extra memory.
Instead of a Hash you can use a NamedTuple, which is allocated on the stack, so it's much more efficient. This is a good use case for such a type:
require "json"
module RangeConverter
  def self.from_json(pull : JSON::PullParser)
    tuple = NamedTuple(start: Int32, stop: Int32).new(pull)
    tuple[:start]..tuple[:stop]
  end
end
struct Foo
  include JSON::Serializable
  @[JSON::Field(converter: RangeConverter)]
  property range : Range(Int32, Int32)
end
p Foo.from_json %({"range": {"start": 1, "stop": 10}})
An alternative to NamedTuple is to use a plain struct with getters, which is what record is for:
require "json"
record JSONRange, start : Int32, stop : Int32 do
  include JSON::Serializable
  def to_range
    start..stop
  end
end
module RangeConverter
  def self.from_json(pull : JSON::PullParser)
    JSONRange.new(pull).to_range
  end
end
struct Foo
  include JSON::Serializable
  @[JSON::Field(converter: RangeConverter)]
  property range : Range(Int32, Int32)
end
p Foo.from_json %({"range": {"start": 1, "stop": 10}})

Your latter option isn't bad at all. It just reuses the implementation from a Hash but it's fully workable and composable. The only downside is it needs to allocate and then discard that Hash.
Based on this sample I deduce that you're expected to call .begin_object? first. But actually that's just a nicety for error detection. The main thing is that you're also supposed to explicitly read ("consume") the values, based on this sample. In the code below this is represented with Int32.new(pull).
require "json"
module RangeConverter
  def self.from_json(pull : JSON::PullParser)
    start = stop = nil
    unless pull.kind.begin_object?
      raise JSON::ParseException.new("Unexpected pull kind: #{pull.kind}", *pull.location)
    end
    pull.read_object do |key, key_location|
      case key
      when "start"
        start = Int32.new(pull)
      when "stop"
        stop = Int32.new(pull)
      else
        raise JSON::ParseException.new("Unexpected key: #{key}", *key_location)
      end
    end
    raise JSON::ParseException.new("No start", *pull.location) unless start
    raise JSON::ParseException.new("No stop", *pull.location) unless stop
    Range.new(start, stop)
  end
end
struct Foo
  include JSON::Serializable
  @[JSON::Field(converter: RangeConverter)]
  property range : Range(Int32, Int32)
end
p Foo.from_json %({"range": {"start": 1, "stop": 10}})
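To see why skipping the value read produces "Expected String but was Int", here is a toy model of a pull parser, written in Python purely for illustration. ToyPullParser, read_range and read_keys_only are hypothetical names, and the token list is a simplified stand-in for a real JSON token stream; none of this is Crystal's JSON::PullParser API.

```python
class ToyPullParser:
    """Hypothetical, much-simplified token reader; not Crystal's JSON::PullParser."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def _next(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def read_int(self):
        token = self._next()
        if not isinstance(token, int):
            raise ValueError(f"Expected Int but was {type(token).__name__}")
        return token

    def read_object(self):
        """Yield each key; the caller must consume the value before continuing."""
        if self._next() != "{":
            raise ValueError("Expected begin_object")
        while self.tokens[self.pos] != "}":
            key = self._next()
            if not isinstance(key, str):
                raise ValueError(f"Expected String but was {type(key).__name__}")
            yield key
        self.pos += 1  # consume the closing "}"


# Simplified token stream for {"start": 1, "stop": 10}
TOKENS = ["{", "start", 1, "stop", 10, "}"]


def read_range(tokens):
    pull = ToyPullParser(tokens)
    values = {}
    for key in pull.read_object():
        values[key] = pull.read_int()  # consuming the value is the caller's job
    return values["start"], values["stop"]


def read_keys_only(tokens):
    pull = ToyPullParser(tokens)
    # Never consumes the values, so the parser finds an int where a key should be.
    return [key for key in pull.read_object()]
```

In this model read_range(TOKENS) gives (1, 10), while read_keys_only(TOKENS) fails on the second iteration, which mirrors the error in the question.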

Related

Crystal: a good way for method options

I need to pass some options to a method, some of these options are optional (something like Object destructuring in JS).
My current code:
def initialize(arg1 : String, options = {} of Symbol => String)
  opt = MyClass.get_option options, :opt1
  @opt1 = !opt.empty? ? opt : "Def value"
  opt = MyClass.get_option options, :opt2
  @opt2 = !opt.empty? ? opt : "False"
  # ...
end
def self.get_option(options, key : Symbol)
  (options && options[key]?) ? options[key].strip : ""
end
And I call it: MyClass.new "Arg", { opt2: "True", opt4: "123" }
It works but I'm looking for a better way. It would be useful to set the type of each option and to have default values directly in the function signature.
Using a NamedTuple seems a good way but I had problems with optional values - options : NamedTuple( opt1: String, opt2: Bool, opt3: String, opt4: Int ) | Nil = nil
Another way that I tried is with a struct but it seems to complicate the situation.
Crystal has optional and named method arguments as core language features, and does not require writing special code for handling the arguments. See the official documentation about Method arguments. In particular, here is an example:
def method(arg1 : String, *, opt1 = "Def value", opt2 = false)
The asterisk is not always needed, it only ensures that the following optional arguments can only be passed by name:
method("test", opt1: "other value", opt2: false)

mm/DD/YYYY or mmDDYYYY RegEx Validation

I have a RegEx that validates a date coming in. What I want it to allow:
MM/dd/YYYY
M/d/YYYY
MM-dd-YYYY
M-d-YYYY
MM.dd.YYYY
M.d.YYYY
MMddYYYY
And a few other variants.
Here's my expression:
^((0[1-9]|1[012])[- /.]?(0[1-9]|[12][0-9]|3[01])[- /.]?(19|20)\d\d)|((((0?[13578])|(1[02]))[- /.]?((0?[1-9])|([12][0-9])|(3[01]))|((0?[469])|(11))[- /.]?((0?[1-9])|([12][0-9])|(30))|(0?[2])[- /.]?((0?[1-9])|([1][0-9])|([2][0-8])))[- /.]?(19\d{2}|20\d{2}))|(((0?[2]))[- /.]?((0?[1-9])|([12][0-9]))[- /.]?((19|20)(04|08|[2468][048]|[13579][26])|2000))$
Most of these work, but the formats that I do not want to be accepted are MdYYYY, MMdYYYY, and MddYYYY.
I want the RegEx to be the only thing changed because it's being called in multiple spots for the same reason, limiting the amount of code I need to adjust.
I'm calling this RegEx from this Case statement which is in my custom TextBoxPlus.ascx:
Case TextBoxPlusType.DateOnlyMMDDYYYY
WatermarkText = "mmddyyyy"
ValidationExpression = "^((0[1-9]|1[012])[- /.]?(0[1-9]|[12][0-9]|3[01])[- /.]?(19|20)\d\d)|((((0?[13578])|(1[02]))[- /.]?((0?[1-9])|([12][0-9])|(3[01]))|((0?[469])|(11))[- /.]?((0?[1-9])|([12][0-9])|(30))|(0?[2])[- /.]?((0?[1-9])|([1][0-9])|([2][0-8])))[- /.]?(19\d{2}|20\d{2}))|(((0?[2]))[- /.]?((0?[1-9])|([12][0-9]))[- /.]?((19|20)(04|08|[2468][048]|[13579][26])|2000))$"
ErrorMessage = "Please enter a valid date format<br><b>mm/dd/yyyy<br>mmddyyyy</b>"
This is on the actual aspx.vb page calling TextBoxPlus (my custom control):
If (Not (Date.TryParseExact(IssueDate.Text, "MMddyyyy", System.Globalization.DateTimeFormatInfo.InvariantInfo, Globalization.DateTimeStyles.None, New Date))) Then
If (Not (Date.TryParseExact(IssueDate.Text, "MM/dd/yyyy", System.Globalization.DateTimeFormatInfo.InvariantInfo, Globalization.DateTimeStyles.None, New Date))) Then
showIfBadDate.Visible = True
BadDate_AM.Show()
Else
IssueDate_ = Date.ParseExact(IssueDate.Text, "MM/dd/yyyy", System.Globalization.DateTimeFormatInfo.InvariantInfo)
End If
Else
IssueDate_ = Date.ParseExact(IssueDate.Text, "MMddyyyy", System.Globalization.DateTimeFormatInfo.InvariantInfo)
End If
If you're using it in a few places, it would be best to use a function to determine the validity of the strings as dates in the acceptable formats:
Option Infer On
Option Strict On
Imports System.Globalization
Module Module1
Function IsValidDate(s As String) As Boolean
Dim validFormats = {"MM/dd/yyyy", "M/d/yyyy", "MM-dd-yyyy", "M-d-yyyy", "MM.dd.yyyy", "M.d.yyyy", "MMddyyyy"}
Dim dt As DateTime
Dim ci As New CultureInfo("en-US")
Return DateTime.TryParseExact(s, validFormats, ci, DateTimeStyles.None, dt)
End Function
Sub Main()
Dim stringsToTry = {"01/31/2016", "1/31/2016", "01-31-2016", "1-9-2016", "01.31.2016", "1.9.2016", "01312016", "112016", "1212016", "1122016"}
For Each s In stringsToTry
Console.WriteLine("{0,-10}: {1}", s, IsValidDate(s))
Next
Console.ReadLine()
End Sub
End Module
Outputs:
01/31/2016: True
1/31/2016 : True
01-31-2016: True
1-9-2016  : True
01.31.2016: True
1.9.2016  : True
01312016  : True
112016    : False
1212016   : False
1122016   : False
With a small change, you could get the function to return a Nullable(Of DateTime) if it is desirable to get the parsed date if it exists.
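The "return the parsed date if it exists" idea can be sketched in a language-neutral way. The version below is in Python, not VB.NET, and is only an illustration: parse_date is a hypothetical name, the restriction to years 1900-2099 is taken from the question's expression, and requiring the same separator on both sides is my assumption. A regex is used for the shape because Python's strptime is lenient about zero padding, and the calendar check is left to the date constructor.

```python
import re
from datetime import date
from typing import Optional

# Either MMddYYYY with no separators (month and day zero-padded), or
# M/d/YYYY with one consistent separator (-, / or .) and optional padding.
_DATE_RE = re.compile(
    r"(?:(?P<m1>0[1-9]|1[0-2])(?P<d1>0[1-9]|[12]\d|3[01])(?P<y1>(?:19|20)\d\d)"
    r"|(?P<m2>0?[1-9]|1[0-2])(?P<sep>[-/.])(?P<d2>0?[1-9]|[12]\d|3[01])"
    r"(?P=sep)(?P<y2>(?:19|20)\d\d))"
)


def parse_date(text: str) -> Optional[date]:
    """Return the parsed date, or None if the text is not an accepted format."""
    match = _DATE_RE.fullmatch(text)
    if match is None:
        return None
    month = int(match["m1"] or match["m2"])
    day = int(match["d1"] or match["d2"])
    year = int(match["y1"] or match["y2"])
    try:
        return date(year, month, day)  # rejects impossible dates such as 02/30
    except ValueError:
        return None
```

Callers then test the result against None instead of calling a separate validity check followed by a second parse.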

Scala Map[Regex, String] collectFirst error

I am trying to automatically convert a string to Date based on regex matches. My code thus far is as below:
package be.folks.date
import java.util.Date
import scala.util.matching.Regex
import org.joda.time.format.DateTimeFormat
class StringToDate(underlying:String) {
val regmap : Map[Regex, String] = Map(
("""\d\d-\d\d-\d\d\d\d""".r, "dd-MM-yyyy"),
("""\d\d-\w\w\w-\d\d\d\d""".r, "dd-MMM-yyyy")
)
def toDate() : Date = {
DateTimeFormat.forPattern((regmap collectFirst { case (_(underlying) , v) => v } get)).parseDateTime(underlying).toDate()
}
}
object StringToDate {
implicit def +(s:String) = new StringToDate(s)
}
However, I am getting an error for "_" - ) expected but found (.
How do I correct this?
I'm not sure I understand your syntax to apply the Regex. Maybe, in toDate, you wanted:
regmap collectFirst {
case (pattern , v) if((pattern findFirstIn underlying).nonEmpty) => v}
I also would not use get to extract the string from the option, as it throws an exception if no matching regex is found. I don't know how you want to manage that case in your code so I can't give you suggestions.
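For comparison, the "first matching pattern, without throwing when nothing matches" shape looks like this in Python (not Scala). FORMATS and to_date are hypothetical names, and the format strings are Python's strptime codes rather than Joda-Time patterns; note that %b depends on the current locale.

```python
import re
from datetime import datetime
from typing import Optional

# (pattern, strptime format) pairs, tried in order.
FORMATS = [
    (re.compile(r"\d\d-\d\d-\d\d\d\d"), "%d-%m-%Y"),
    (re.compile(r"\d\d-[A-Za-z]{3}-\d\d\d\d"), "%d-%b-%Y"),
]


def to_date(text: str) -> Optional[datetime]:
    """Return the parsed datetime, or None when no pattern fits."""
    # next(..., None) plays the role of collectFirst returning an Option.
    fmt = next((f for pattern, f in FORMATS if pattern.fullmatch(text)), None)
    if fmt is None:
        return None
    try:
        return datetime.strptime(text, fmt)
    except ValueError:  # right shape, impossible date (e.g. 99-99-2015)
        return None
```

The caller decides what a missing match means, which is the decision the answer leaves open.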

Pretty String Manipulation

I have the following string which I wish to extract parts from:
<FONT COLOR="GREEN">201 KAR 2:340.</FONT>
In this particular case, I wish to extract the numbers 201,2, and 340, which I will later use to concatenate to form another string:
http://www.lrc.state.ky.us/kar/201/002/340reg.htm
I have a solution, but it is not easily readable, and it seems rather clunky. It involves using the mid function. Here it is:
intTitle = CInt(Mid(strFontTag,
                    InStr(strFontTag, ">") + 1,
                    (InStr(strFontTag, "KAR") - InStr(strFontTag, ">")) - 3))
I would like to know if perhaps there is a better way to approach this task. I realize I could make some descriptive variable names, like intPosOfEndOfOpeningFontTag to describe what the first InStr function does, but it still feels clunky to me.
Should I be using some sort of split function, or regex, or some more elegant way that I have not come across yet? I have been manipulating strings in this fashion for years, and I just feel there must be a better way. Thanks.
<FONT[^>]*>[^\d]*(\d+)[^\d]*(\d+):(\d+)[^\d]*</FONT>
The class
Imports System
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.Xml
Imports System.Xml.Linq
Imports System.Linq
Public Class clsTester
'methods
Public Sub New()
End Sub
Public Function GetTitleUsingRegEx(ByVal fpath$) As XElement
'use this function if your input is not well-formed XML
Dim result As New XElement(<result/>)
Try
Dim q = Regex.Matches(File.ReadAllText(fpath), Me.titPattern1, RegexOptions.None)
For Each mt As Match In q
Dim t As New XElement(<title/>)
t.Add(New XAttribute("name", mt.Groups("name").Value))
t.Add(New XAttribute("num1", mt.Groups("id_1").Value))
t.Add(New XAttribute("num2", mt.Groups("id_2").Value))
t.Add(New XAttribute("num3", mt.Groups("id_3").Value))
t.Add(mt.Value)
result.Add(t)
Next mt
Return result
Catch ex As Exception
result.Add(<error><%= ex.ToString %></error>)
Return result
End Try
End Function
Public Function GetTitleUsingXDocument(ByVal fpath$) As XElement
'use this function if your input string is well-formed
Dim result As New XElement(<result/>)
Try
Dim q = XElement.Load(fpath).Descendants().Where(Function(c) Regex.IsMatch(c.Name.LocalName, "(?is)^font$")).Where(Function(c) Regex.IsMatch(c.Value, Me.titPattern2, RegexOptions.None))
For Each nd As XElement In q
Dim s = Regex.Match(nd.Value, Me.titPattern2, RegexOptions.None)
Dim t As New XElement(<title/>)
t.Add(New XAttribute("name", s.Groups("name").Value))
t.Add(New XAttribute("num1", s.Groups("id_1").Value))
t.Add(New XAttribute("num2", s.Groups("id_2").Value))
t.Add(New XAttribute("num3", s.Groups("id_3").Value))
t.Add(nd.Value)
result.Add(t)
Next nd
Return result
Catch ex As Exception
result.Add(<error><%= ex.ToString %></error>)
Return result
End Try
End Function
'fields
Private titPattern1$ = "(?is)(?<=<font[^<>]*>)(?<id_1>\d+)\s+(?<name>[a-z]+)\s+(?<id_2>\d+):(?<id_3>\d+)(?=\.?</font>)"
Private titPattern2$ = "(?is)^(?<id_1>\d+)\s+(?<name>[a-z]+)\s+(?<id_2>\d+):(?<id_3>\d+)\.?$"
End Class
The usage
Sub Main()
Dim y = New clsTester().GetTitleUsingRegEx("C:\test.htm")
If y.<error>.Count = 0 Then
Console.WriteLine(String.Format("Result from GetTitleUsingRegEx:{0}{1}", vbCrLf, y.ToString))
Else
Console.WriteLine(y...<error>.First().Value)
End If
Console.WriteLine("")
Dim z = New clsTester().GetTitleUsingXDocument("C:\test.htm")
If z.<error>.Count = 0 Then
Console.WriteLine(String.Format("Result from GetTitleUsingXDocument:{0}{1}", vbCrLf, z.ToString))
Else
Console.WriteLine(z...<error>.First().Value)
End If
Console.ReadLine()
End Sub
Hope this helps.
regex pattern: <FONT[^>]*>.*?(\d+).*?(\d+).*?(\d+).*?<\/FONT>
I think @Jean-François Corbett has it right.
Hide it away in a function and never look back
Change your code to this:
intTitle = GetCodesFromColorTag("<FONT COLOR=""GREEN"">201 KAR 2:340.</FONT>")
Create a new function:
Public Function GetCodesFromColorTag(FontTag As String) As Integer
    'Extracts the number between the end of the opening tag and "KAR"
    Return CInt(Mid(FontTag, InStr(FontTag, ">") + 1,
                    (InStr(FontTag, "KAR") - InStr(FontTag, ">")) - 3))
End Function
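The same "hide it in a function" advice combines well with the regex answers above. Here is a rough sketch in Python (not VB), where kar_url and the names title, chapter and section are hypothetical. Padding only the middle number to three digits is an assumption inferred from the single example URL in the question (2 becomes 002).

```python
import re
from typing import Optional

# Three captured numbers: "201 KAR 2:340." -> 201, 2, 340
_TAG_RE = re.compile(
    r"<FONT[^>]*>\s*(\d+)\s+KAR\s+(\d+):(\d+)\.?\s*</FONT>",
    re.IGNORECASE,
)


def kar_url(font_tag: str) -> Optional[str]:
    """Build the regulation URL from a FONT tag, or return None if it does not match."""
    match = _TAG_RE.search(font_tag)
    if match is None:
        return None
    title, chapter, section = (int(group) for group in match.groups())
    # Assumption: only the middle number is zero-padded to three digits.
    return f"http://www.lrc.state.ky.us/kar/{title}/{chapter:03d}/{section}reg.htm"
```

For the tag in the question this yields http://www.lrc.state.ky.us/kar/201/002/340reg.htm.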

NullPointerException while working with stateful PartialFunction and collectFirst

Consider this (very ugly code):
object ExternalReferences2 {
import java.util.regex._
implicit def symbol2string(sym: Symbol) = sym.name
object Mapping {
def fromXml(mapping: scala.xml.NodeSeq) = {
new Mapping(mapping \ 'vendor text,
mapping \ 'match text,
mapping \ 'format text)
}
}
case class Mapping(vendor: String,
matches: String,
format: String) extends PartialFunction[String, String] {
private val pattern = Pattern.compile(matches)
private var _currentMatcher: Matcher = null
private def currentMatcher =
{ println("Getting matcher: " + _currentMatcher); _currentMatcher }
private def currentMatcher_=(matcher: Matcher) =
{ println("Setting matcher: " + matcher); _currentMatcher = matcher }
def isDefinedAt(entity: String) =
{ currentMatcher = pattern.matcher(entity); currentMatcher.matches }
def apply(entity: String) = apply
def apply = {
val range = 0 until currentMatcher.groupCount()
val groups = range
map (currentMatcher.group(_))
filterNot (_ == null)
map (_.replace('.', '/'))
format.format(groups: _*)
}
}
val config =
<external-links>
<mapping>
<vendor>OpenJDK</vendor>
<match>{ """^(javax?|sunw?|com.sun|org\.(ietf\.jgss|omg|w3c\.dom|xml\.sax))(\.[^.]+)+$""" }</match>
<format>{ "http://download.oracle.com/javase/7/docs/api/%s.html" }</format>
</mapping>
</external-links>
def getLinkNew(entity: String) =
(config \ 'mapping)
collectFirst({ case m => Mapping.fromXml(m)})
map(_.apply)
def getLinkOld(entity: String) =
(config \ 'mapping).view
map(m => Mapping.fromXml(m))
find(_.isDefinedAt(entity))
map(_.apply)
}
I tried to improve the getLinkOld method by using collectFirst as shown in getLinkNew, but I always get a NullPointerException because _currentMatcher is still set to null
scala> ExternalReferences2.getLinkNew("java.util.Date")
Getting matcher: null
java.lang.NullPointerException
at ExternalReferences2$Mapping.apply(<console>:32)
at ExternalReferences2$$anonfun$getLinkNew$2.apply(<console>:58)
at ExternalReferences2$$anonfun$getLinkNew$2.apply(<console>:58)
at scala.Option.map(Option.scala:131)
at ExternalReferences2$.getLinkNew(<console>:58)
at .<init>(<console>:13)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
while it works perfectly with getLinkOld.
What is the problem here?
Your matcher is created as a side effect in isDefinedAt. Passing side-effecting functions to routines such as map is usually a recipe for disaster, but this is not even what happens here. Your code requires isDefinedAt to have been called just before apply, with the same argument. That makes your code very fragile, and that is what you should change.
Clients of PartialFunction do not have to follow that protocol in general. Imagine for instance
if (f.isDefinedAt(x) && f.isDefinedAt(y)) {fx = f(x); fy = f(y)}.
And here the code that calls apply is not even yours, but the collection classes', so you do not control what happens.
Your specific problem in getLinkNew is that the Mapping's isDefinedAt is simply never called. The PartialFunction argument of collectFirst is {case m => ...}. The isDefinedAt that is called is the isDefinedAt of this function. As m is an irrefutable pattern, it is always true, and collectFirst will always return the first element if there is one. That the partial function returns another partial function (the Mapping) which happens not to be defined at m, is irrelevant.
Edit - Possible workaround
A very light change would be to check whether a matcher is available and create it if it is not. Better, keep the entity string that was used to create it too, so that you can check it is the proper one. This should make the side effect benign as long as there is no multithreading. By the way, do not use null, use Option, so the compiler will not let you ignore the possibility that it may be None.
var _currentMatcher: Option[(String, Matcher)] = None
def currentMatcher(entity: String): Matcher = _currentMatcher match {
  case Some((e, m)) if e == entity => m
  case _ => {
    val m = pattern.matcher(entity)
    _currentMatcher = Some((entity, m))
    m
  }
}
Edit again. Stupid me
Sorry, the so-called workaround does make the class safer, but it does not make the collectFirst solution work. Again, the case m => partial function is always defined (note: entity does not even appear in your getLinkNew code, which should be worrying). The problem is that one would need a PartialFunction of a NodeSeq (not of entity, which will be known to the function, but not passed as an argument). isDefinedAt will be called, then apply. The pattern and the matcher depend on the NodeSeq, so they cannot be created beforehand, but only in isDefinedAt and/or apply. In the same spirit, you can cache what is computed in isDefinedAt to reuse in apply. This is definitely not pretty:
def linkFor(entity: String) = new PartialFunction[NodeSeq, String] {
  var _matcher: Option[(String, Matcher)] = None
  def matcher(regexp: String): Matcher = _matcher match {
    case Some((r, m)) if r == regexp => m
    case _ => {
      val m = Pattern.compile(regexp).matcher(entity)
      _matcher = Some((regexp, m))
      m
    }
  }
  def isDefinedAt(mapping: NodeSeq) = {
    matcher((mapping \ "match").text).matches
  }
  def apply(mapping: NodeSeq): String = {
    // call matcher(...), it is likely to reuse previous matcher, build result
    ???
  }
}
You use that with (config \ "mapping").collectFirst(linkFor(entity))
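A way to avoid the cached state altogether is to do the match and build the result in a single step, returning "nothing" when the pattern does not apply. The sketch below shows that shape in Python rather than Scala. MAPPINGS and link_for are hypothetical names, the pattern and format string are copied from the question's config, and replacing dots with slashes in the whole entity is my reading of what the original apply produces for its first %s.

```python
import re
from typing import Optional, Sequence, Tuple

# (match pattern, format string) pairs, as in the question's <mapping> elements.
MAPPINGS = [
    (
        r"^(javax?|sunw?|com.sun|org\.(ietf\.jgss|omg|w3c\.dom|xml\.sax))(\.[^.]+)+$",
        "http://download.oracle.com/javase/7/docs/api/%s.html",
    ),
]


def link_for(entity: str, mappings: Sequence[Tuple[str, str]] = MAPPINGS) -> Optional[str]:
    """Return the link from the first mapping whose pattern matches, else None."""
    for pattern, fmt in mappings:
        if re.match(pattern, entity):
            # Matching and building happen together, so no matcher is kept around
            # between an "is it defined?" call and a later "apply" call.
            return fmt % entity.replace(".", "/")
    return None
```

Because nothing is stored between calls, there is no ordering protocol for callers to get wrong.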