Is it ok to use Lwt.return as the final call in a recursive function?
I have a function that compiles fine but does not run properly, and it looks like the function f below. Please assume that there is no issue with any function passed as g in this example. I am just trying to find out whether it is OK to write a function of the following form, or whether there is a better or simpler (and Lwt-compliant) way of doing this:
let rec f (x : string list) (g : string -> unit Lwt.t) =
  match List.length x with
  | 0 -> Lwt.return ()
  | _ -> g (List.hd x) >>= fun () -> f (List.tl x) g
;;
val f : string list -> (string -> unit Lwt.t) -> unit Lwt.t = <fun>
I am pretty sure that I am doing it wrong. But the actual function I am using is much more complex than this example so I am having a difficult time debugging it.
First of all, the correct way of dealing with lists in OCaml is to deconstruct them with pattern matching, like this:
let rec f (xs : string list) (g : string -> unit Lwt.t) =
  match xs with
  | [] -> return ()
  | x :: xs -> g x >>= fun () -> f xs g
The next step is to notice that you're actually just iterating over a list. There is Lwt_list.iter_s for this:
let f g xs = Lwt_list.iter_s g xs
That can be simplified even more:
let f = Lwt_list.iter_s
That means that you do not even need to write such a function, since it already exists.
And finally, there were no issues with recursion in your original implementation. The function that you've provided was tail recursive.
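To see the shape of the pattern-matching version without installing Lwt, here is a minimal sketch that swaps in a toy stand-in for the monad. This is an assumption made purely for illustration: return and >>= below are NOT Lwt's, they just treat a "promise" as an already-computed value, so only the recursion pattern carries over.

```ocaml
(* Toy stand-in for Lwt (an assumption, not the real library):
   a "promise" is simply an already-computed value. *)
let return x = x
let ( >>= ) m k = k m

(* Same shape as the pattern-matching version above:
   deconstruct the list instead of using List.length / List.hd / List.tl. *)
let rec f xs g =
  match xs with
  | [] -> return ()
  | x :: rest -> g x >>= fun () -> f rest g

(* Collect the strings that g is applied to, in order. *)
let visited =
  let buf = Buffer.create 16 in
  f [ "a"; "b"; "c" ] (fun s -> Buffer.add_string buf s; return ());
  Buffer.contents buf
```

The callback g is applied to each element in list order, and the empty list is the only base case, so no partial function such as List.hd is needed.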
It depends on whether g returns an Lwt thread that is already computed, such as return (), or one that is scheduled and woken up later by the Lwt scheduler. In the former case, it's possible that the call to fun () -> f (List.tl x) g is made right away instead of being scheduled for later, and that could grow the stack depending on what optimizations are happening.
I don't think your code should rely on such tricky behavior. For this particular example, as suggested in ivg's answer, you should use the functions from the Lwt_list module.
It's a good idea to look at the implementation of the Lwt_list module to see how it's done. The same advice goes for the OCaml standard library as well.
I'm trying to work out what specifically lwt is doing in a couple of examples:
If I have:
let%lwt x = f () in
let%lwt y = g () in
return ()
Does this run f then g, or since y doesn't rely on x will it run both in parallel?
That particular example runs f () and g () in sequence, i.e. g () doesn't start until after the promise returned by f () is resolved.
The way to see this is, when looking at
let%lwt x = e in
e'
to realize that e' becomes the body of a callback, that will run only when x is available. So, in the code in the question, Lwt first sees:
(* Do this first. *)
let%lwt x = f () in
(* Do this once x is available. *)
let%lwt y = g () in
return ()
and, once x is available, it is left with
(* Do this next. *)
let%lwt y = g () in
(* Do this once y is available. *)
return ()
To avoid this serialization, call f () and g () first, without any intervening let%lwt, bind variables to the promises x', y' these functions return, and wait on the promises:
let x' = f () in
let y' = g () in
let%lwt x = x' in
let%lwt y = y' in
return ()
(And to answer the title: Lwt does not use data dependencies. I don't think a library could have access to this kind of data dependency information.)
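The difference in ordering can be traced with a small toy model. This is only a sketch under a loud assumption: bind and return below are NOT Lwt's. A "promise" here is an already-resolved value and bind calls its callback immediately, so the model shows when f () and g () get started relative to the callbacks, not real concurrency. The names events, log and trace are made up for the illustration.

```ocaml
(* Toy model (an assumption, not Lwt): promises are already-resolved values. *)
let events = ref []
let log s = events := s :: !events
let bind m k = k m
let return x = x

let f () = log "start f"; 1
let g () = log "start g"; 2

(* Shape of: let%lwt x = f () in let%lwt y = g () in ... *)
let sequential () =
  bind (f ()) (fun x ->
    log "x available";
    bind (g ()) (fun y ->
      log "y available";
      return (x + y)))

(* Shape of: call f () and g () first, then wait on both promises. *)
let started_first () =
  let x' = f () in
  let y' = g () in
  bind x' (fun x ->
    log "x available";
    bind y' (fun y ->
      log "y available";
      return (x + y)))

(* Run a computation and return its events in chronological order. *)
let trace run =
  events := [];
  let _ = run () in
  List.rev !events
```

In the first shape g () is only called inside the callback that receives x; in the second, both calls happen before any callback runs.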
In your code, no, because you are using Lwt.t as a monad, rather than as an applicative.
Monads
You're probably already familiar with asynchronous IO and the functions Lwt.bind : 'a Lwt.t -> ('a -> 'b Lwt.t) -> 'b Lwt.t and Lwt.return : 'a -> 'a Lwt.t. Just in case, though, I will give a brief recap:
Lwt.bind promise callback awaits promise, and upon resolution, calls callback with the result, getting back another promise.
Lwt.return data creates a promise that resolves to data.
A monad is a generic type 'a t that has some function bind : 'a t -> ('a -> 'b t) -> 'b t and some function return : 'a -> 'a t. (These functions must also follow certain laws, but I digress.) Obviously, the type 'a Lwt.t with the functions Lwt.bind and Lwt.return forms a monad.
Monads are a common functional programming pattern when one wants to represent some kind of "effect" or "computation," in this case asynchronous IO. Monads are powerful because the bind function lets later computations depend on the results of earlier ones. If m : 'a t represents some computation that results in 'a, and f : 'a -> 'b t is a function that uses an 'a to perform a computation that results in a 'b, then bind m f makes f depend on the result of m.
In the case of Lwt.bind promise callback, callback depends on the result of promise. The code in callback cannot run until promise is resolved.
When you write
let%lwt x = f () in
let%lwt y = g () in
return ()
you are really writing Lwt.bind (f ()) (fun x -> Lwt.bind (g ()) (fun y -> return ())). Because g () is inside the callback, it is not run until f () is resolved.
Applicatives
A functional programming pattern related to the monad is the applicative. An applicative is a generic type 'a t with a function map : ('a -> 'b) -> 'a t -> 'b t, the function return : 'a -> 'a t, and a function both : 'a t * 'b t -> ('a * 'b) t. Unlike monads, however, applicatives need not have bind : 'a t -> ('a -> 'b t) -> 'b t, meaning that with applicatives alone, later computations cannot depend on previous ones. All monads are applicatives, but not all applicatives are monads.
Because g () does not depend on the result of f (), your code can be rewritten to use both:
let (let*) = bind
let (and*) = both
let* x = f ()
and* y = g () in
return ()
This code translates to bind (both (f ()) (g ())) (fun (x, y) -> return ()). f () and g () appear outside the callback to bind, meaning that they are run immediately and can await in parallel. both combines f () and g () into a single promise.
The (let*) and (and*) operators are new to OCaml 4.08. If you are using an earlier version of OCaml, you can just write the translation directly.
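The same desugaring can be tried without Lwt by using the standard library's option type in its place. This is a sketch under that assumption: Option.bind is the real stdlib function, while both for options is a helper defined here for illustration (it is not Lwt.both). It needs OCaml 4.08 or later.

```ocaml
(* Binding operators over option, standing in for Lwt promises. *)
let ( let* ) = Option.bind

(* Hypothetical helper: combine two independent options into one. *)
let both a b =
  match a, b with
  | Some x, Some y -> Some (x, y)
  | _ -> None

let ( and* ) = both

(* Written with let* ... and* ... *)
let sum a b =
  let* x = a
  and* y = b in
  Some (x + y)

(* The same function with the translation written out by hand:
   both sits in the first argument of bind, outside the callback. *)
let sum_desugared a b =
  Option.bind (both a b) (fun (x, y) -> Some (x + y))
```

Both arguments of both are evaluated before the callback is ever entered, which is the property that lets the Lwt version start f () and g () together.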
Lwt.both in the documentation
The Lwt home page contains a code snippet using let* and and*
"Binding operators" (let* and and*) in the OCaml manual
I'm trying to make this an instance of Monad in Haskell:
data Parser a = Done ([a], String) | Fail String
Now I try this code to make it an instance of Monad:
instance Functor Parser where
    fmap = liftM
instance Applicative Parser where
    pure = return
    (<*>) = ap
instance Monad Parser where
    return xs = Done ([], xs)
    Done (xs, s) >>= f = Done ((concat (map f xs)), s)
But this obviously doesn't work, because the function f in the bind-function is of the type a -> M b. So the (map f xs) function yields a list of M b-things. It should actually make a list of b's. How can I do this in Haskell?
NOTE: The actual error given by GHC 7.10.3 is:
SuperInterpreter.hs:71:27:
Couldn't match expected type `String' with actual type `a'
`a' is a rigid type variable bound by
the type signature for return :: a -> Parser a
at SuperInterpreter.hs:71:5
Relevant bindings include
xs :: a (bound at SuperInterpreter.hs:71:12)
return :: a -> Parser a (bound at SuperInterpreter.hs:71:5)
In the expression: xs
In the first argument of `Done', namely `([], xs)'
SuperInterpreter.hs:72:45:
Couldn't match type `Parser b' with `[b]'
Expected type: a -> [b]
Actual type: a -> Parser b
Relevant bindings include
f :: a -> Parser b (bound at SuperInterpreter.hs:72:22)
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
(bound at SuperInterpreter.hs:72:5)
In the first argument of `map', namely `f'
In the first argument of `concat', namely `(map f xs)'
Failed, modules loaded: none.
leftaroundabout already showed you some of the problems.
Usually I expect a parser to be some kind of function that takes an input String, maybe consumes some of this string, and then returns a result together with the unconsumed input.
Based on this idea you can extend your code to do just this:
import Control.Monad (ap, liftM)

data ParserResult a
    = Done (a, String)
    | Fail String
    deriving Show

-- A parser is a function from the input to a result plus the unconsumed rest.
data Parser a = Parser { run :: String -> ParserResult a }

instance Functor Parser where
    fmap = liftM

instance Applicative Parser where
    pure = return
    (<*>) = ap

instance Monad Parser where
    return a = Parser $ \ xs -> Done (a, xs)
    p >>= f = Parser $ \ xs ->
        case run p xs of
            Done (a, xs') -> run (f a) xs'
            Fail msg      -> Fail msg
a simple example
here is a simple parser that would accept any character:
parseAnyChar :: Parser Char
parseAnyChar = Parser f
    where f (c:xs) = Done (c, xs)
          f ""     = Fail "nothing to parse"
and this is how you would use it:
λ> run parseAnyChar "Hello"
Done ('H',"ello")
λ> run parseAnyChar ""
Fail "nothing to parse"
While it's not completely uncommon to define fmap = liftM and so on, this is a bit backwards IMO. If you first define the more fundamental instances and base the more involved ones on them, things often come out clearer. I'll leave <*> = ap, but transform everything else:
instance Functor Parser where -- Note that you can `derive` this automatically
    fmap f (Done vs rest) = Done (map f vs) rest
    fmap f (Fail err) = Fail err
instance Applicative Parser where
    pure xs = Done ([], xs)
    (<*>) = ap
Now with fmap already present, I can define Monad in the “more mathematical” way: define join instead of >>=.
instance Monad Parser where
    return = pure
    q >>= f = joinParser $ fmap f q
That means you'll work with intuitively handleable concrete values, rather than having to worry about threading a function through a parser. You can therefore see quite clearly what's going on, just write out the recursion:
joinParser :: Parser (Parser a) -> Parser a
joinParser (Fail err) = Fail err
joinParser (Done [] rest) = Done [] rest
joinParser (Done (Fail err : _) _) = Fail err
joinParser (Done (Done v0 rest0 : pss) rest) = ??
At this point you see clearly what Carsten already remarked: your Parser type doesn't really make sense as a parser. Both the inner and outer Done wrappers somehow have rest data; combining it would mean you combine the undone work... this is not what a parser does.
Search the web a bit; there's plenty of material on how to implement parsers in Haskell. If in doubt, look at how some established library does it, e.g. parsec.
When trying to write a simple program for solving a toy SAT problem, I came across the following problem I cannot get my head around.
I have a type variable which is defined as follows:
type prefix =
| Not
| None
type variable =
| Fixed of (prefix * bool)
| Free of (prefix * string)
from which I can build a clause of type variable list and a formula of type clause list. Essentially this boils down to having a formula in either CNF or DNF (this has less to do with the problem).
When now trying to simplify a clause I do the following:
Filter all Fixed variables from the clause which gives a list
Simplify the variables (Fixed(Not, true) => Fixed(None, false))
Now I have a list containing just Fixed variables, which I want to combine into a single Fixed value by doing something like this:
let combine l =
match l with
| [] -> []
| [x] -> [x]
| (* Get the first two variables, OR/AND them
and recurse on the rest of the list *)
How would I achieve my desired behavior in a functional language? My experience in OCaml is not that big, I am rather a beginner.
I tried doing x::xs::rest -> x <||> xs <||> combine rest but this does not work. Where <||> is just a custom operator to OR the variables.
Thanks for your help.
How about using the neat higher order functions already there?
let combine = function
| x::xs -> List.fold_left (<||>) x xs
| [] -> failwith "darn, what do I do if the list is empty?"
For clarification:
List.fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
takes a function that gets the running aggregate and the next element of the list; it returns the new aggregate; then we need an initial value and the list of items to fold over.
The use of your infix operator <||> in brackets makes it a prefix function so we can give it to List.fold_left just like that -- instead of writing (fun a b -> a <||> b).
If you have a neutral element for your <||> operator, let's call it one, we could write it even more concisely:
let combine = List.fold_left (<||>) one
As List.fold_left requires three arguments and we only gave it two, combine here is a function of type variable list -> variable, like the previous one. If you wonder why this works, check out the concept of currying.
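Here is how the two fold-based versions might look end to end. The question never shows how <||> is defined, so the operator below, the helper value, and the neutral element one are all assumptions made for this sketch; they only handle Fixed variables and reject Free ones.

```ocaml
type prefix = Not | None

type variable =
  | Fixed of (prefix * bool)
  | Free of (prefix * string)

(* Hypothetical helper: the truth value of a Fixed variable. *)
let value = function
  | Fixed (Not, b) -> not b
  | Fixed (None, b) -> b
  | Free _ -> invalid_arg "value: free variable"

(* Assumed definition of the OR operator on Fixed variables. *)
let ( <||> ) a b = Fixed (None, value a || value b)

(* Version without a neutral element: the empty list is an error. *)
let combine = function
  | [] -> invalid_arg "combine: empty list"
  | x :: xs -> List.fold_left ( <||> ) x xs

(* Version with a neutral element for OR (false). *)
let one = Fixed (None, false)
let combine_all = List.fold_left ( <||> ) one
```

With these definitions, combine [Fixed (Not, true); Fixed (None, false)] folds false || false, and combine_all [] returns the neutral element instead of failing.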
Here's my attempt:
let rec combine l =
match l with
| [] -> []
| [x] -> [x]
| a :: b :: rest -> combine ((a <||> b) :: rest)
Note you need let rec.
Let us consider the following implementation of the Continuation monad, for CPS-style computations yielding an integer:
module Cont : sig
type 'a t = ('a -> int) -> int
val return : 'a -> 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
val callCC: (('a -> 'b t) -> 'a t) -> 'a t
end = struct
type 'a t = ('a -> int) -> int
let return x =
fun cont -> cont x
let bind m f =
fun cont -> m (fun x -> (f x) cont)
let callCC k =
fun cont -> k (fun x -> (fun _ -> cont x)) cont
end
How can we rewrite the CPS-style implementation of gcd computation (see How to memoize recursive functions?) and especially the memoization to take advantage of the Cont monad?
After defining
let gcd_cont k (a,b) =
let (q, r) = (a / b, a mod b) in
if r = 0 then Cont.return b else k (b,r)
I tried to use the type solver to give me a cue about the type that the memoization function should have:
# let gcd memo ((a,b):int * int) =
Cont.callCC (memo gcd_cont (a,b)) (fun x -> x)
;;
val gcd :
(((int * int -> int Cont.t) -> int * int -> int Cont.t) ->
int * int -> (int -> 'a Cont.t) -> int Cont.t) ->
int * int -> int = <fun>
However, I could not turn this hint into an actual implementation. Is someone able to do this? The logic behind using "callCC" in the memoization function is that if a value is found in the cache, then this is an early exit condition.
I feel like the issue is that in his answer to How to memoize recursive functions?, Michael called CPS style what is not CPS style. In CPS style, the extra continuation argument k is used whenever one wants to return a value - the value is then applied to k instead.
This is not really what we want here, and not what the following code implements:
let gcd_cont k (a,b) =
let (q, r) = (a / b, a mod b) in
if r = 0 then b else k (b,r)
Here, k is not used for returning (b is returned directly), it is used instead of performing a recursive call. This unwinds the recursion: within gcd_cont, one can think of k as gcd_cont itself, just like if let rec was used. Later on, gcd_cont can be turned into a truly recursive function using a fixpoint combinator, that basically "feeds it to itself":
let rec fix f x = f (fix f) x
let gcd = fix gcd_cont
(this is equivalent to the call function that Michael defines)
The difference from defining gcd directly using a let rec is that the version with unwound recursion allows one to "instrument" the recursive calls, as the recursion itself is performed by the fixpoint combinator. This is what we want for memoization: we only want to perform recursion if the result is not in the cache. Thus the definition of a memo combinator.
If the function is defined with a let rec, the recursion is closed at the same time as defining the function, so one cannot instrument the recursive call-sites to insert memoization.
As a side note, the two answers basically implement the same thing: the only difference is the way they implement recursion in the fixpoint combinator: Michael's fixpoint combinator uses let rec, Jackson's one uses a reference, i.e. "Landin's knot" — an alternative way of implementing recursion, if you have references in your language.
Sooo, to conclude, I'd say implementing that in the continuation monad is not really possible / does not really make sense, as the thing was not CPS in the first place.
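To make the "instrument the recursive calls" point concrete, here is a minimal sketch of a plain fixpoint next to a memoizing one. The name memo_fix and the calls counter are assumptions introduced for this example; it is not a reproduction of either answer's memo combinator, just one way to write the idea with a Hashtbl.

```ocaml
(* Counts how many times the body of gcd_cont actually runs. *)
let calls = ref 0

(* gcd with the recursion left open: k stands for "the recursive call". *)
let gcd_cont k (a, b) =
  incr calls;
  let r = a mod b in
  if r = 0 then b else k (b, r)

(* Plain fixpoint: feeds the function to itself. *)
let rec fix f x = f (fix f) x
let gcd = fix gcd_cont

(* Memoizing fixpoint (hypothetical name): every recursive call goes
   through the cache first, which a closed let rec would not allow. *)
let memo_fix f =
  let cache = Hashtbl.create 16 in
  let rec g x =
    match Hashtbl.find_opt cache x with
    | Some y -> y
    | None ->
        let y = f g x in
        Hashtbl.add cache x y;
        y
  in
  g

let gcd_memo = memo_fix gcd_cont
```

Calling gcd_memo twice on the same pair runs the body of gcd_cont only during the first call; the second call is answered from the cache.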
It's possible to create infinite, circular lists using let rec, without needing to resort to mutable references:
let rec xs = 1 :: 0 :: xs ;;
But can I use this same technique to write a function that receives a finite list and returns an infinite, circular version of it? I tried writing
let rec cycle xs =
let rec result = go xs and
go = function
| [] -> result
| (y::ys) -> y :: go ys in
result
;;
But got the following error
Error: This kind of expression is not allowed as right-hand side of `let rec'
Your code has two problems:
result = go xs is in illegal form for let rec
The function tries to create a loop by some computation, which falls into an infinite loop causing stack overflow.
The above code is rejected by the compiler because you cannot write an expression which may cause recursive computation in the right-hand side of let rec (see Limitations of let rec in OCaml).
Even if you fix the issue you still have a problem: cycle does not finish the job:
let rec cycle xs =
  let rec go = function
    | [] -> go xs
    | y::ys -> y :: go ys
  in
  go xs;;
cycle [1;2];;
cycle [1;2] fails due to stack overflow.
In OCaml, let rec can define a looped structure only when its definition is "static" and does not perform any computation. let rec xs = 1 :: 0 :: xs is such an example: (::) is not a function but a constructor, which purely constructs the data structure. On the other hand, cycle performs some code execution to dynamically create a structure and it is infinite. I am afraid that you cannot write a function like cycle in OCaml.
If you want to introduce loops in data, like cycle does, what you can do in OCaml is use a lazy structure to prevent immediate infinite loops, like Haskell's lazy lists, or use mutation to close the loop by substitution. OCaml's lists are neither lazy nor mutable, so you cannot write a function that dynamically constructs looped lists.
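As an illustration of the lazy route, here is a minimal sketch using the standard library's Seq type (available since OCaml 4.07) instead of list. Note the assumption: the result is a lazy sequence, not an 'a list, so it is not a drop-in replacement for the cycle in the question. The helper take is defined here only to inspect a finite prefix.

```ocaml
(* Lazy cycle: each element is produced only when the sequence is forced. *)
let cycle (xs : 'a list) : 'a Seq.t =
  if xs = [] then invalid_arg "cycle";
  let rec go rest () =
    match rest with
    | [] -> go xs ()                    (* wrap around to the start *)
    | y :: ys -> Seq.Cons (y, go ys)
  in
  go xs

(* Helper for the example: the first n elements as an ordinary list. *)
let rec take n s =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

let first_five = take 5 (cycle [ 1; 2 ])
```

Because go takes a unit argument, the recursive call is suspended until someone asks for the next element, so building the sequence terminates.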
If you do not mind using black magic, you could try this code:
let cycle l =
if l = [] then invalid_arg "cycle" else
let l' = List.map (fun x -> x) l in (* copy the list *)
let rec aux = function
| [] -> assert false
| [_] as lst -> (* find the last cons cell *)
(* and set the last pointer to the beginning of the list *)
Obj.set_field (Obj.repr lst) 1 (Obj.repr l')
| _::t -> aux t
in aux l'; l'
Please be aware that using the Obj module is highly discouraged. On the other hand, there are industrial-strength programs and libraries (Coq, Jane Street's Core, Batteries included) that are known to use this sort of forbidden art.
camlspotter's answer is good enough already. I just want to add several more points here.
First of all, regarding the problem of writing a function that receives a finite list and returns an infinite, circular version of it: such a function can be written and will compile, but if you actually call it, it will overflow the stack and never return.
A simple version of what you were trying to do is like this:
let rec circle1 xs = List.rev_append (List.rev xs) (circle1 xs)
val circle1 : 'a list -> 'a list = <fun>
It can be compiled and theoretically it is correct. On [1;2;3], it is supposed to generate [1;2;3;1;2;3;1;2;3;1;2;3;...].
However, of course, it will fail, because it recurses endlessly and eventually overflows the stack.
So why does let rec circle2 = 1::2::3::circle2 work?
Let's see what will happen if you do it.
First, circle2 is a value, and it is a list. Once OCaml has this information, it can create a static address for circle2 with the memory representation of a list.
The memory's real value is 1::2::3::circle2, which is actually Node (1, Node (2, Node (3, circle2))), i.e., a Node with int 1 and the address of a Node with int 2 and the address of a Node with int 3 and the address of circle2. But we already know circle2's address, right? So OCaml just puts circle2's address there.
Everything will work.
This example also shows that an infinite circular list defined like this takes only a finite amount of memory. It does not generate a real infinite list that consumes all memory; instead, when a cycle finishes, it just jumps "back" to the head of the list.
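The "jumps back to the head" behaviour can be checked with physical equality. This is a small sketch; the helper drop is defined here only for the example. Avoid structural equality (=) or List.length on such a list, since those traverse it and would not terminate.

```ocaml
let rec circle2 = 1 :: 2 :: 3 :: circle2

(* Skip the first n cells; safe on a cyclic list because it stops after n steps. *)
let rec drop n l = if n <= 0 then l else drop (n - 1) (List.tl l)

(* After three cells we are at the very same cons cell we started from. *)
let back_at_head = drop 3 circle2 == circle2

(* Indexing simply keeps walking around the cycle. *)
let fifth = List.nth circle2 4
```

Here back_at_head uses ==, which compares addresses rather than contents, so it confirms that the tail of the third cell is the first cell itself and not a copy.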
Let's then go back to the circle1 example. circle1 is a function; yes, it has an address, but we do not need or want it. What we want is the address of the function application circle1 xs. Unlike circle2, it is a function application, which means we need to compute something to get the address. So OCaml will do List.rev xs, then try to get the address of circle1 xs, then repeat, and repeat.
Ok, then why we sometimes get Error: This kind of expression is not allowed as right-hand side of 'let rec'?
From http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3aletrecvalues
the let rec binding construct, in addition to the definition of
recursive functions, also supports a certain class of recursive
definitions of non-functional values, such as
let rec name1 = 1 :: name2 and name2 = 2 :: name1 in expr which
binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic
list 2::1::2::1::… Informally, the class of accepted definitions
consists of those definitions where the defined names occur only
inside function bodies or as argument to a data constructor.
If you use let rec to define a binding, say let rec name, then name may appear on the right-hand side only inside a function body or as an argument to a data constructor.
In the previous two examples, circle1 is in a function body (let rec circle1 = fun xs -> ...) and circle2 is in a data constructor.
If you write let rec circle = circle, it will give an error, as circle is in neither of the two allowed cases. let rec x = let y = x in y won't work either, because again, x is not in a constructor or a function.
Here is also a clear explanation:
https://realworldocaml.org/v1/en/html/imperative-programming-1.html
Section Limitations of let rec