FParsec styles: demonstrate the differences between combinator and monadic style?

I am new to F# (about two months in). I recently finished the FParsec tutorial and started looking for more examples. The more I read, the more confused I became, and then I started to see references to styles. I looked for more styles and came up with this list:
Combinator style
Monadic style
Arrow style
Direct style
Can someone list all of the styles and explain and demonstrate how each one works with a common problem, e.g. parse
“(abc
(b CDEF
(de 1 E)
(f 234)
)
(h 3)
(jkl H)
)”
into
[Lower "abc";
Group[Lower "b"; Upper "CDEF";
Group [Lower "de"; Number "1"; Upper "E"];
Group [Lower "f"; Number "234"]];
Group [Lower "h"; Number "3"];
Group [Lower "jkl"; Upper "H"]
]
Using
type out =
    | Lower of string
    | Upper of string
    | Number of string
    | Group of out list
EDIT
I picked up combinator and monadic style from a comment in FParsec and a delimiter based syntax
Direct style always appears as part of the phrase "Direct Style Monadic Parser".
Arrow style appears in "Parsec: Direct Style Monadic Parser Combinators For The Real World"; I haven't read all of it.
EDIT
Per suggestion
Combinator style
open FParsec
type out =
    | Lower of string
    | Upper of string
    | Number of string
    | Group of out list
type Parser = Parser<out, unit>
// Assumption: ws is not defined in this listing; the FParsec tutorial defines it like this
let ws = spaces
let isUpper = fun c -> isAsciiUpper c
let upper : Parser =
    many1Satisfy isUpper .>> ws
    |>> fun x -> Upper(x)
let isLower = fun c -> isAsciiLower c
let lower : Parser =
    many1Satisfy isLower .>> ws
    |>> fun x -> Lower(x)
let isNumber = fun c -> isDigit c
let number : Parser =
    many1Satisfy isNumber .>> ws
    |>> fun x -> Number(x)
let groupRef, groupImpl = createParserForwardedToRef()
let item : Parser =
    lower <|> upper <|> number <|> groupRef
let items =
    many item .>> ws
    |>> fun x -> Group(x)
do groupImpl := between (pchar '(') (pchar ')') items .>> ws
let test () =
    match run groupRef "(abc (b CDEF (de 1 E) (f 234)) (h 3) (jkl H) )" with
    | Success(result, _, _) -> printf "Success: %A" result
    | Failure(errorMsg, _, _) -> printf "Failure: %s" errorMsg

Monadic style
open FParsec
type out =
    | Lower of string
    | Upper of string
    | Number of string
    | Group of out list
type Parser = Parser<out, unit>
// Assumption: ws is not defined in this listing; the FParsec tutorial defines it like this
let ws = spaces
let isUpper = fun c -> isAsciiUpper c
let upper : Parser = parse {
    let! x = many1Satisfy isUpper
    do! ws
    return Upper(x)
}
let isLower = fun c -> isAsciiLower c
let lower = parse {
    let! x = many1Satisfy isLower
    do! ws
    return Lower(x)
}
let isNumber = fun c -> isDigit c
let number = parse {
    let! x = many1Satisfy isNumber
    do! ws
    return Number(x)
}
let groupRef, groupImpl = createParserForwardedToRef()
let group = parse {
    let! x = groupRef
    do! ws
    return x
}
let item =
    lower <|> upper <|> number <|> group
let items = parse {
    let! x = many item
    do! ws
    return Group(x)
}
do groupImpl := between (pchar '(') (pchar ')') items
let test () =
    match run group "(abc (b CDEF (de 1 E) (f 234)) (h 3) (jkl H) )" with
    | Success(result, _, _) -> printf "Success: %A" result
    | Failure(errorMsg, _, _) -> printf "Failure: %s" errorMsg
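
To make the difference between the two styles visible without any library, here is a minimal sketch in OCaml (not F#, and not FParsec). The parser type, `return`, `>>=`, `|>>`, `<<`, `many1_satisfy` and `ws` are all hand-rolled, hypothetical stand-ins for the FParsec primitives used above; the point is only that the combinator-style and monadic-style definitions of `lower` describe the same parser.

```ocaml
(* Toy parser type (an assumption for this sketch, not FParsec's):
   consume a char list and return the value plus the remaining input. *)
type 'a parser = char list -> ('a * char list) option

type out = Lower of string

let return x : 'a parser = fun input -> Some (x, input)

(* Monadic bind: run p, then hand its result to f. *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
  fun input ->
    match p input with
    | None -> None
    | Some (x, rest) -> f x rest

(* Combinators defined via bind; they play the roles of |>> and .>> *)
let ( |>> ) p f = p >>= fun x -> return (f x)
let ( << ) p q = p >>= fun x -> q >>= fun _ -> return x

(* One or more characters satisfying pred, returned as a string. *)
let many1_satisfy pred : string parser = fun input ->
  let rec take acc = function
    | c :: rest when pred c -> take (c :: acc) rest
    | rest -> (List.rev acc, rest)
  in
  match take [] input with
  | ([], _) -> None
  | (cs, rest) -> Some (String.of_seq (List.to_seq cs), rest)

(* Skip zero or more spaces. *)
let ws : unit parser = fun input ->
  let rec skip = function ' ' :: rest -> skip rest | rest -> rest in
  Some ((), skip input)

let is_lower c = c >= 'a' && c <= 'z'
let explode s = List.of_seq (String.to_seq s)

(* Combinator style: parsers are glued together with operators only. *)
let lower_combinator : out parser =
  (many1_satisfy is_lower << ws) |>> fun s -> Lower s

(* Monadic style: each intermediate result is named, then combined. *)
let lower_monadic : out parser =
  many1_satisfy is_lower >>= fun s ->
  ws >>= fun () ->
  return (Lower s)
```

Both definitions give the same result on the same input, e.g. `lower_combinator (explode "abc  D")` and `lower_monadic (explode "abc  D")`. The F# `parse { ... }` computation expression is syntactic sugar over the same kind of bind chain as `lower_monadic`.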

Related

OCaml cast string to list of tuples

I have the file "example.dat" with the text "[(1,2); (3,4); (5,6)]". I need to get a list of tuples from it. I know how to get one from a list of ints.
# let f line = List.map int_of_string line;;
# open Printf
  let file = "example.dat"
  let () =
    let ic = open_in file in
    try
      let line = input_line ic in
      f line;
      flush stdout;
      close_in ic
    with e ->
      close_in_noerr ic;
      raise e;;
How must I change my functions?
Given a list of strings that represent ints, your function f returns a list of ints. It doesn't return a list of tuples.
You don't say whether you want to verify that the input has some kind of proper form. If you want to verify that it has the form of (say) a list of type (int * int) list in OCaml, this is a parsing problem that would take some work.
If you just want to extract the parts of the input line that look like ints, you can use regular expression processing from the Str module:
# let re = Str.regexp "[^0-9]+" in
Str.split re "[(1,2); (37,4); (5,6)]";;
- : string list = ["1"; "2"; "37"; "4"; "5"; "6"]
Then you can rewrite your function f to collect up each pair of ints into a tuple. I don't see a good way to use List.map for this. You might have to write your own recursive function or use List.fold_left.
Update
I will write you a function that changes a list of values into a list of pairs. I hope this isn't for a school assignment, in which case you should be figuring this out for yourself.
let rec mkpairs l =
  match l with
  | [] | [_] -> []
  | a :: b :: rest -> (a, b) :: mkpairs rest
As you can see, this function silently discards the last element of the list if the list has an odd number of elements.
This function is not tail recursive. So that's something you could think about improving.
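
One possible tail-recursive variant, as a sketch: it accumulates the pairs in reverse and flips the list at the end. The name `mkpairs_tr` is made up for this example, and it keeps the same behaviour of silently dropping a trailing odd element.

```ocaml
(* Tail-recursive variant of mkpairs: build the result in reverse,
   then reverse once at the end. *)
let mkpairs_tr l =
  let rec go acc = function
    | a :: b :: rest -> go ((a, b) :: acc) rest
    | [] | [_] -> List.rev acc
  in
  go [] l

let () =
  (* The strings are what the Str.split call above produces. *)
  let ints = List.map int_of_string ["1"; "2"; "37"; "4"; "5"; "6"] in
  List.iter (fun (a, b) -> Printf.printf "(%d, %d)\n" a b) (mkpairs_tr ints)
```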
(* array_string is assumed to hold the input, e.g. "[(1,2); (3,4); (5,6)]" *)
let open Genlex in
let open Stream in
let lexer = make_lexer ["["; "("; ","; ")"; ";"; "]";] in
let stream = lexer (of_string array_string) in
let fail () = failwith "Malformed string" in
let parse_tuple acc = match next stream with
  | Int first -> ( match next stream with
    | Kwd "," -> ( match next stream with
      | Int second -> ( match next stream with
        | Kwd ")" -> (first, second) :: acc
        | _ -> fail () )
      | _ -> fail () )
    | _ -> fail () )
  | _ -> fail ()
in
let rec parse_array acc =
  match next stream with
  | Kwd "(" -> parse_array (parse_tuple acc)
  | Kwd ";" -> parse_array acc
  | Kwd "]" -> acc
  | _ -> fail ()
in
try
  match next stream with
  | Kwd "[" -> List.rev (parse_array [])
  | _ -> fail ()
with Stream.Failure -> fail ();;

Operations on contiguous elements in F# [duplicate]

I need to extract the sequence of equal chars in a text.
For example:
The string "aaaBbbcccccccDaBBBzcc11211" should be converted to a list of strings like
["aaa";"B";"bb";"ccccccc";"D";"a";"BBB";"z";"cc";"11";"2";"11"].
This is my solution so far:
let groupSequences (text:string) =
    let toString chars =
        System.String(chars |> Array.ofList)
    let rec groupSequencesRecursive acc chars = seq {
        match (acc, chars) with
        | [], c :: rest ->
            yield! groupSequencesRecursive [c] rest
        | _, c :: rest when acc.[0] <> c ->
            yield (toString acc)
            yield! groupSequencesRecursive [c] rest
        | _, c :: rest when acc.[0] = c ->
            yield! groupSequencesRecursive (c :: acc) rest
        | _, [] ->
            yield (toString acc)
        | _ ->
            yield ""
    }
    text
    |> List.ofSeq
    |> groupSequencesRecursive []
groupSequences "aaaBbbcccccccDaBBBzcc11211"
|> Seq.iter (fun x -> printfn "%s" x)
|> ignore
I'm an F# newbie.
Can this solution be improved?
Here is a completely generic implementation:
let group xs =
    let folder x = function
        | [] -> [[x]]
        | (h::t)::ta when h = x -> (x::h::t)::ta
        | acc -> [x]::acc
    Seq.foldBack folder xs []
This function has the type seq<'a> -> 'a list list when 'a : equality, so works not only on strings, but on any (finite) sequence of elements, as long as the element type supports equality comparison.
Used with the input string in the OP, the return value isn't quite in the expected shape:
> group "aaaBbbcccccccDaBBBzcc11211";;
val it : char list list =
[['a'; 'a'; 'a']; ['B']; ['b'; 'b']; ['c'; 'c'; 'c'; 'c'; 'c'; 'c'; 'c'];
['D']; ['a']; ['B'; 'B'; 'B']; ['z']; ['c'; 'c']; ['1'; '1']; ['2'];
['1'; '1']]
Instead of a string list, the return value is a char list list. You can easily convert it to a list of strings using a map:
> group "aaaBbbcccccccDaBBBzcc11211" |> List.map (List.toArray >> System.String);;
val it : System.String list =
["aaa"; "B"; "bb"; "ccccccc"; "D"; "a"; "BBB"; "z"; "cc"; "11"; "2"; "11"]
This takes advantage of the String constructor overload that takes a char[] as input.
As initially stated, this implementation is generic, so can also be used with other types of lists; e.g. integers:
> group [1;1;2;2;2;3;4;4;3;3;3;0];;
val it : int list list = [[1; 1]; [2; 2; 2]; [3]; [4; 4]; [3; 3; 3]; [0]]
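
For comparison, the same right fold can be sketched in OCaml, where `List.fold_right` plays the role of `Seq.foldBack`. The helper `group_string` is a name introduced here for illustration; it converts the string to a char list and back.

```ocaml
(* Group adjacent equal elements using a right fold. *)
let group xs =
  let folder x = function
    | [] -> [[x]]
    | (h :: t) :: ta when h = x -> (x :: h :: t) :: ta
    | acc -> [x] :: acc
  in
  List.fold_right folder xs []

(* Convenience wrapper for strings: string -> string list. *)
let group_string s =
  s |> String.to_seq |> List.of_seq |> group
  |> List.map (fun cs -> String.of_seq (List.to_seq cs))

let () =
  group_string "aaaBbbcccccccDaBBBzcc11211" |> List.iter print_endline
```

Because the fold starts from the right, each element is compared with the head of the group built immediately after it, so runs come out in input order without a final reversal.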
How about using Seq.groupBy?
"aaaBbbcccccccD"
|> Seq.groupBy id
|> Seq.map (snd >> Seq.toArray)
|> Seq.map (fun t -> new string (t))
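If your input order matters, here is a method that works (for non-empty input):
let text = "aaaBbbcccccccDaBBBzcc11211"
text
|> Seq.pairwise
|> Seq.toArray
|> Array.rev
// Seed the fold with the last character: pairwise never yields it as the first element of a pair
|> Array.fold (fun (accum::tail) (ca,cb) -> if ca=cb then System.String.Concat(accum,string ca)::tail else string(ca)::accum::tail) [string text.[text.Length - 1]]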
This one is also based on recursion, though the matching gets away with a smaller number of checks.
let chop (txt:string) =
    let rec chopInner txtArr (word: char[]) (res: List<string>) =
        match txtArr with
        | h::t when word.[0] = h -> chopInner t (Array.append word [|h|]) res
        | h::t when word.[0] <> h ->
            let newWord = word |> (fun s -> System.String s)
            chopInner t [|h|] (List.append res [newWord])
        | [] ->
            let newWord = word |> (fun s -> System.String s)
            (List.append res [newWord])
    let lst = txt.ToCharArray() |> Array.toList
    chopInner lst.Tail [|lst.Head|] []
And the result is as expected:
val text : string = "aaaBbbcccccccDaBBBzcc11211"
> chop text;;
val it : string list =
["aaa"; "B"; "bb"; "ccccccc"; "D"; "a"; "BBB"; "z"; "cc"; "11"; "2"; "11"]
When you're folding, you'll need to carry along both the previous value and the accumulator holding the temporary results. The previous value is wrapped as an option to account for the first iteration. Afterwards, the final result is extracted and reversed.
"aaaBbbcccccccDaBBBzcc11211"
|> Seq.map string
|> Seq.fold (fun state ca ->
    Some ca,
    match state with
    | Some cb, x::xs when ca = cb -> x + ca::xs
    | _, xss -> ca::xss )
    (None, [])
|> snd
|> List.rev
// val it : string list =
// ["aaa"; "B"; "bb"; "ccccccc"; "D"; "a"; "BBB"; "z"; "cc"; "11"; "2"; "11"]
I'm curious why everyone is publishing solutions based on match-with. Why not use plain recursion?
let rec groups i (s:string) =
    let rec next j = if j = s.Length || s.[i] <> s.[j] then j else next(j+1)
    if i = s.Length then []
    else let j = next i in s.Substring(i, j - i) :: (groups j s)
"aaaBbbcccccccDaBBBzcc11211" |> groups 0
val it : string list = ["aaa"; "B"; "bb"; "ccccccc"; "D"; "a"; "BBB"; "z"; "cc"; "11"; "2"; "11"]
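
The index-based idea carries over to OCaml almost unchanged. In this sketch `String.sub` stands in for `Substring`, and the starting index is hidden inside a local helper (`go`, a name chosen here) rather than exposed as a parameter.

```ocaml
(* Split a string into runs of equal characters by scanning indices. *)
let groups s =
  let n = String.length s in
  (* Index of the first position at or after j whose char differs from s.[i]. *)
  let rec next i j = if j = n || s.[i] <> s.[j] then j else next i (j + 1) in
  let rec go i =
    if i = n then []
    else
      let j = next i i in
      String.sub s i (j - i) :: go j
  in
  go 0

let () = groups "aaaBbbcccccccDaBBBzcc11211" |> List.iter print_endline
```

No intermediate character lists are built; each run is cut from the original string in one step.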
As someone else here said:
Know thy fold ;-)
let someString = "aaaBbbcccccccDaBBBzcc11211"
let addLists state elem =
    let (p, ls) = state
    elem,
    match p = elem, ls with
    | _, [] -> [ elem.ToString() ]
    | true, h :: t -> (elem.ToString() + h) :: t
    | false, h :: t -> elem.ToString() :: ls
someString
|> Seq.fold addLists ((char)0, [])
|> snd
|> List.rev

F# Regex matching chain

As I am not completely happy with F#'s regex implementation for my usage, I wanted to implement a so-called regex chain. It basically works as follows:
The given string s will be checked, whether it matches the first pattern. If it does, it should execute a function associated with the first pattern. If it does not, it should continue with the next one.
I tried to implement it as follows:
let RegexMatch ((s : string, c : bool), p : string, f : GroupCollection -> unit) =
    if c then
        let m = Regex.Match(s, p)
        if m.Success then
            f m.Groups
            (s, false)
        else (s, c)
    else (s, c)
("my input text", true)
|> RegexMatch("pattern1", fun g -> ...)
|> RegexMatch("pattern2", fun g -> ...)
|> RegexMatch("pattern3", fun g -> ...)
|> .... // more patterns
|> ignore
The problem is that this code is invalid: the forward-pipe operator does not seem to pipe tuples, or does not like my implementation 'design'.
My question is: Can I fix this code above easily or should I rather implement some other kind of regex chain?
Your function RegexMatch won't support piping, because it has tupled parameters.
First, look at the definition of the pipe:
let (|>) x f = f x
From this, one can clearly see that this expression:
("text", true)
|> RegexMatch("pattern", fun x -> ...)
would be equivalent to this:
RegexMatch("pattern", fun x -> ...) ("text", true)
Does this match your function signature? Obviously not. In your signature, the text/bool pair comes first, and is part of the triple of parameters, together with pattern and function.
To make it work, you need to take the "piped" parameter in curried form and last:
let RegexMatch p f (s, c) = ...
Then you can do the piping:
("input", true)
|> RegexMatch "pattern1" (fun x -> ...)
|> RegexMatch "pattern2" (fun x -> ...)
|> RegexMatch "pattern3" (fun x -> ...)
As an aside, I must note that your approach is not very, ahem, functional. You're basing your whole logic on side effects, which will make your program not composable and hard to test, and probably prone to bugs. You're not reaping the benefits of F#, effectively using it as "C# with nicer syntax".
Also, there are actually well researched ways to achieve what you want. For one, check out Railway-oriented programming (also known as monadic computations).
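
The curried shape suggested above (pattern and handler first, piped state last) can be tried out in isolation. The following is a sketch in OCaml rather than F#, and it deliberately replaces the regex test with a hypothetical `starts_with` prefix check so that it needs no regex library; `try_match` and `log` are likewise names made up for the example.

```ocaml
(* Hypothetical stand-in for a regex test: does s start with prefix? *)
let starts_with prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Curried parameters, with the piped (text, pending) pair last. *)
let try_match prefix f (s, pending) =
  if pending && starts_with prefix s then begin
    f s;
    (s, false)   (* a handler ran: stop trying further patterns *)
  end else
    (s, pending)

(* Records which handlers ran, to make the behaviour observable. *)
let log = ref []

let () =
  ("my input text", true)
  |> try_match "your" (fun _ -> log := "first" :: !log)
  |> try_match "my" (fun _ -> log := "second" :: !log)
  |> try_match "my input" (fun _ -> log := "third" :: !log)
  |> ignore
```

Only the second handler runs: the first prefix does not match, and the third is skipped because the flag has already been cleared. This keeps the side-effecting design of the question; it only demonstrates why the argument order matters for piping.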
To me this sounds like what you are trying to implement is Active Patterns.
Using Active Patterns you can use regular pattern matching syntax to match against RegEx patterns:
let (|RegEx|_|) p i =
    let m = System.Text.RegularExpressions.Regex.Match (i, p)
    if m.Success then
        Some m.Groups
    else
        None
[<EntryPoint>]
let main argv =
    let text = "123"
    match text with
    | RegEx @"\d+" g -> printfn "Digit: %A" g
    | RegEx @"\w+" g -> printfn "Word : %A" g
    | _ -> printfn "Not recognized"
    0
Another approach is to use what Fyodor refers to as Railway Oriented Programming:
type RegexResult<'T> =
    | Found of 'T
    | Searching of string
let lift p f = function
    | Found v -> Found v
    | Searching i ->
        let m = System.Text.RegularExpressions.Regex.Match (i, p)
        if m.Success then
            m.Groups |> f |> Found
        else
            Searching i
[<EntryPoint>]
let main argv =
    Searching "123"
    |> lift @"\d+" (fun g -> printfn "Digit: %A" g)
    |> lift @"\w+" (fun g -> printfn "Word : %A" g)
    |> ignore
    0
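
The two-track idea also works when the handlers return values instead of printing. Here is a sketch in OCaml under some assumptions: the regex match is replaced by plain predicates (`all_chars is_digit`, `all_chars is_word`, both invented for this example), and the type is called `chain` to avoid clashing with OCaml's built-in `result`.

```ocaml
(* Two tracks: a result has been found, or we are still searching. *)
type 'a chain = Found of 'a | Searching of string

(* Try one (test, handler) step; pass an existing result through untouched. *)
let lift test f = function
  | Found v -> Found v
  | Searching s -> if test s then Found (f s) else Searching s

(* Hypothetical predicates standing in for the \d+ and \w+ patterns. *)
let all_chars p s = s <> "" && List.for_all p (List.of_seq (String.to_seq s))
let is_digit c = c >= '0' && c <= '9'
let is_word c =
  is_digit c || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

let classify s =
  Searching s
  |> lift (all_chars is_digit) (fun s -> "Digit: " ^ s)
  |> lift (all_chars is_word) (fun s -> "Word : " ^ s)
  |> function Found msg -> msg | Searching _ -> "Not recognized"

let () = List.iter (fun s -> print_endline (classify s)) ["123"; "abc_1"; "a b"]
```

Since every step returns a value, the chain can be tested without capturing console output, which addresses the composability concern raised in the first answer.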

Trying to get first word from character list

I have a character list [#"h", #"i", #" ", #"h", #"i"] from which I want to get the first word (the first character sequence before a space).
I've written a function which gives me this warning:
stdIn:13.1-13.42 Warning: type vars not generalized because of value
restriction are instantiated to dummy types (X1,X2,...)
Here is my code:
fun next [] = ([], [])
  | next (hd::tl) = if(not(ord(hd) >= 97 andalso ord(hd) <= 122)) then ([], (hd::tl))
    else
      let
        fun getword [] = [] | getword (hd::tl) = if(ord(hd) >= 97 andalso ord(hd) <= 122) then [hd]@getword tl else [];
      in
        next (getword (hd::tl))
      end;
EDIT:
Expected input and output
next [#"h", #"i", #" ", #"h", #"i"] => ([#"h", #"i"], [#" ", #"h", #"i"])
Can anybody help me with this solution? Thanks!
This functionality already exists within the standard library:
val nexts = String.tokens Char.isSpace
val nexts_test = nexts "hi hi hi" = ["hi", "hi", "hi"]
But if you were to build such a function anyway, it seems that you return ([], []) sometimes and a single list at other times. Normally in a recursive function, you can build the result by doing e.g. c :: recursive_f cs, but this is assuming your function returns a single list. If, instead, it returns a tuple, you suddenly have to unpack this tuple using e.g. pattern matching in a let-expression:
let val (x, y) = recursive_f cs
in (c :: x, y + ...) end
Or you could use an extra argument inside a helper function (since the extra argument would change the type of the function) to store the word you're extracting, instead. A consequence of doing that is that you end up with the word in reverse and have to reverse it back when you're done recursing.
fun isLegal c = ord c >= 97 andalso ord c <= 122 (* Only lowercase ASCII letters *)
(* But why not use one of the following:
   fun isLegal c = Char.isAlpha c
   fun isLegal c = not (Char.isSpace c) *)
fun next input =
    let fun extract (c::cs) word =
            if isLegal c
            then extract cs (c::word)
            else (rev word, c::cs)
          | extract [] word = (rev word, [])
    in extract input [] end
val next_test_1 =
    let val (w, r) = next (explode "hello world")
    in (implode w, implode r) = ("hello", " world")
    end
val next_test_2 = next [] = ([], [])
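
The first alternative mentioned above, unpacking the tuple returned by the recursive call, is only hinted at with an elided fragment. A minimal sketch of that variant, written in OCaml rather than Standard ML and using a hypothetical `is_legal` that accepts lowercase ASCII letters, could look like this:

```ocaml
(* Only lowercase ASCII letters count as part of a word. *)
let is_legal c = c >= 'a' && c <= 'z'

(* Returns (first word, remaining input). The recursive result is a pair,
   so it is unpacked with a let-binding before the head is consed on. *)
let rec next = function
  | c :: cs when is_legal c ->
      let (word, rest) = next cs in
      (c :: word, rest)
  | rest -> ([], rest)

let () =
  let (w, r) = next ['h'; 'i'; ' '; 'h'; 'i'] in
  Printf.printf "word has %d chars, %d chars remain\n"
    (List.length w) (List.length r)
```

This version needs no final reversal, at the cost of not being tail recursive.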

automata in OCaml

I am a bit new to OCaml. I want to implement product construction algorithm for automata in OCaml. I am confused how to represent automata in OCaml. Can someone help me?
A clean representation for a finite deterministic automaton would be:
type ('state,'letter) automaton = {
  initial : 'state ;
  final : 'state -> bool ;
  transition : 'letter -> 'state -> 'state ;
}
For instance, an automaton which determines whether a word contains an odd number of 'a' could be represented as such:
let odd = {
  initial = `even ;
  final = (function `odd -> true | _ -> false) ;
  transition = (function
    | 'a' -> (function `even -> `odd | `odd -> `even)
    | _ -> (fun state -> state))
}
Another example is an automaton that accepts only the string "bbb" (yes, these are taken from this online handout):
let bbb = {
  initial = `b0 ;
  final = (function `b3 -> true | _ -> false) ;
  transition = (function
    | 'b' -> (function `b0 -> `b1 | `b1 -> `b2 | `b2 -> `b3 | _ -> `fail)
    | _ -> (fun _ -> `fail))
}
The product of two automata is described mathematically as using the Cartesian product of the state sets as the new state set, with the natural extensions of the final and transition functions over that set:
let product a b = {
  initial = (a.initial, b.initial) ;
  final = (fun (x,y) -> a.final x && b.final y) ;
  transition = (fun c (x,y) -> (a.transition c x, b.transition c y))
}
This product automaton computes the intersection of two languages. You can also use || in lieu of && to implement the union of two languages.
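
To see the product in action, one also needs a way to run an automaton on a word. The sketch below is self-contained and makes a few assumptions: letters are `char`, words are strings, and `accepts`, `union`, `odd_a` and `ends_b` are names introduced here (the two example automata use `bool` states instead of the polymorphic variants above, purely to keep the example short).

```ocaml
type ('state, 'letter) automaton = {
  initial : 'state;
  final : 'state -> bool;
  transition : 'letter -> 'state -> 'state;
}

(* Run the automaton over every character of the word, then test the end state. *)
let accepts a word =
  a.final
    (Seq.fold_left (fun st c -> a.transition c st) a.initial (String.to_seq word))

(* Intersection: both components must end in a final state. *)
let product a b = {
  initial = (a.initial, b.initial);
  final = (fun (x, y) -> a.final x && b.final y);
  transition = (fun c (x, y) -> (a.transition c x, b.transition c y));
}

(* Union: same state space, but either component may accept. *)
let union a b = { (product a b) with final = (fun (x, y) -> a.final x || b.final y) }

(* Accepts words with an odd number of 'a'. *)
let odd_a = {
  initial = false;
  final = (fun s -> s);
  transition = (fun c s -> if c = 'a' then not s else s);
}

(* Accepts words whose last letter is 'b'. *)
let ends_b = {
  initial = false;
  final = (fun s -> s);
  transition = (fun c _ -> c = 'b');
}

let () =
  let both = product odd_a ends_b in
  List.iter
    (fun w -> Printf.printf "%s -> %b\n" w (accepts both w))
    ["ab"; "aab"; "ba"]
```

The `union` construction relies on both automata being complete (a transition is defined for every letter and state), which holds for the function-based representation used here.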