Keyword "as" in SML/NJ - sml

I have recently seen people using as in their SML/NJ programs. The most helpful reference I found is the question "as" keyword in OCaml.
Although OCaml also belongs to the ML family of programming languages, the two languages differ. For example, in the example program given in the previous answer,
let rec compress = function
| a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
| smaller -> smaller;;
My translation of it to SML/NJ is (Please correct me if I have done it wrong)
fun compress (a :: (t as b :: _)) = if a = b then compress t else a :: compress t
| compress smaller = smaller
As you have seen, the pattern (b :: _ as t) has an order different from (t as b :: _) in the second snippet. (Still, their usage is pretty much the same)
For potential answers, I hope they can contain (1) a reference to the as keyword in the official SML/NJ documentation, a course, or a book, and maybe (2) some examples to illustrate its usage. I hope this question can help future users who come across as.

The as keyword is part of the Standard ML definition ('97 revision). See page 79, figure 22 (highlighting mine):
These are called as-patterns in Haskell and pretty much any other language that allows binding an identifier to a (sub-)pattern, but the origin of the name is clearly from ML.
The purpose it serves is to give a name to a pattern or a part of it. For example, we can capture the whole head of a 2-tuple list, while at the same time assigning names to the tuple's values.
fun example1 (list : (int * string) list) =
case list of
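(* `head` will be bound to the tuple value; the parentheses matter, *)
(* because `head as (i, s) :: tail` would bind `head` to the whole list *)
(* `i` will be bound to the tuple's 1st element *)
(* `s` will be bound to the tuple's 2nd element *)
(head as (i, s)) :: tail => ()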
| nil => ()
The other usage appears in record patterns. Notice that at first sight it might give the impression that the as-name is now to the right of the as keyword, but it's not (see the combined example below):
fun example2 (list : { foo: int * string } list) =
case list of
(* `f` will be bound to the value of the `foo` field in the record. *)
{ foo as f } :: tail => ()
(* The above can also be expressed as follows *)
| { foo = f } :: tail => ()
| nil => ()
And a combined example, where you can see that as usage in records is consistent with its usage elsewhere, i.e., the name stays on the left-hand side of the as keyword (here the name is the record label).
fun example3 (list : { foo: int * string } list) =
case list of
head as { foo as (i, s) } :: tail => ()
(* This is valid, too, but `foo` is not a binding. *)
| head as { foo = (i, s) } :: tail => ()
(* Here, `f` is bound to the whole tuple value. *)
| head as { foo = f as (i, s) } :: tail => ()
| nil => ()
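For comparison with the OCaml snippet in the question, here is a minimal OCaml sketch of the same idea. The function names describe and whole_list are made up for illustration. In OCaml the name goes to the right of as, and as reaches over the whole pattern to its left unless parentheses limit it.

```ocaml
(* OCaml analogue of example1: the alias name goes to the RIGHT of `as`. *)
let describe (list : (int * string) list) : string =
  match list with
  | ((i, s) as head) :: tail ->
      (* `head` is the whole tuple; `i` and `s` are its components. *)
      assert (head = (i, s));
      Printf.sprintf "%d:%s (+%d more)" i s (List.length tail)
  | [] -> "empty"

(* Without inner parentheses, `as` covers the entire pattern to its left,
   so `whole` names the complete non-empty list, not just its first element. *)
let whole_list = function
  | (_ :: _ as whole) -> whole
  | [] -> []
```

So the parenthesised OCaml pattern ((i, s) as head) :: tail plays the same role as the parenthesised SML pattern (head as (i, s)) :: tail.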

Related

how to represent a non-empty list type

I'm a big fan of creating data structures that make representing invalid states impossible, so I wanted to ask how I could represent a non-empty list in ReasonML?
Since it's possible to pattern match on lists like [] and [head, ...rest] I thought it would be easy to represent a non empty list, but I haven't found a way yet.
Update: Thanks to the enlightening answers below I was able to come up with something that really strikes my tune:
module List = {
include List;
type nonEmpty('a) = ::('a, list('a));
let foldNonEmpty = (type a, fn, l: nonEmpty(a)) => switch(l) {
| [head, ...tail] => fold_left(fn, head, tail)
};
}
module Number = {
let min = List.foldNonEmpty(Pervasives.min);
let max = List.foldNonEmpty(Pervasives.max);
}
Number.min([]); // illegal :D
Number.min([1]); // legal
Don't know how you guys feel about it, but I think it's awesome. Thanks!
You can also define a new list type without GADT as:
type nonempty('a) =
| First('a)
| ::('a,nonempty('a))
Compared to the GADT solution, you lose some syntactic sugar, because the syntax
let l = [1,2,3,4]
implicitly adds a terminal [] but the [x, ...y] syntax still works
let x = [1, 2, 3, 4, ...First(5)];
let head =
fun
| [a, ...q] => a
| First(a) => a;
let tail =
fun
| [a, ...q] => Some(q)
| First(a) => None;
Otherwise, the encoding
type nonempty_2('a) = { head:'a, more:list('a) };
let x = { head:1, more:[2,3,4,5 ] };
let head = (x) => x.head;
let tail = fun
| {more:[head,...more],_} => Some({head, more})
| {more:[],_} => None;
is even simpler and does not rely on potentially surprising syntactic constructions.
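For readers more at home in OCaml syntax, the record encoding above could be sketched roughly as follows. The names nonempty, of_list, fold, minimum and maximum are my own choices for illustration, not part of any library.

```ocaml
(* A non-empty list is a mandatory first element plus an ordinary list. *)
type 'a nonempty = { head : 'a; more : 'a list }

(* Converting from a plain list can fail, so it returns an option. *)
let of_list = function
  | [] -> None
  | x :: xs -> Some { head = x; more = xs }

(* Folding needs no initial value: the head is always available. *)
let fold f { head; more } = List.fold_left f head more

let minimum ne = fold min ne
let maximum ne = fold max ne
```

Because the head field is required by the type, minimum and maximum cannot be called on an empty value, which is the property the question asks for.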
EDIT: ::, the infix variant constructor
If the part of the definition with :: seems strange, it is because it is a left-over of a corner case of the OCaml syntax. In OCaml,
[x, ... l ]
is written
x :: l
which is itself the infix form of
(::)(x,l)
(This is the same prefix form as for standard operators: 1 + 2 can also be written as
(+)(1,2) (in Reason)
)
And the last form is also the prefix form of [x,...l] in Reason.
In brief, in Reason we have
[x, ... l ] ≡ (::)(x,l)
with the OCaml syntax as the missing link between the two notations.
In other words, :: is an infix constructor (and the only one). With a recent enough version of OCaml, it is possible to define your own version of this infix constructor with
type t = (::) of int * int list
The same construction carries over in Reason as
type t = ::(int, list(int))
Then if you write [a, ...b] it is translated to (::)(a,b) with :: as your newly defined operator. Similarly,
[1,2,3]
is in fact a shortcut for
[1,2,3, ...[]]
So if you define both [] and ::, for instance in this silly example
type alternating('a,'b) =
| []
| ::('a, alternating('b,'a) )
/* here the element of the list can alternate between `'a` and `'b`:*/
let l = [1,"one",2,"two"]
you end up with a syntax for exotic lists, that works exactly the same as
standard lists.
You can use GADT for this use case.
(We can also add a phantom type, see https://blog.janestreet.com/howto-static-access-control-using-phantom-types/, but it isn't mandatory.)
type empty = Empty;
type nonEmpty = NonEmpty;
type t('a, 's) =
| []: t('a, empty)
| ::(('a, t('a, 's))): t('a, nonEmpty);
How to use it
let head: type a. t(a, nonEmpty) => a =
fun
| [x, ..._] => x;
The idea for this type comes from https://sketch.sh/s/yH0MJiujNSiofDWOU85loX/

How to implement OCaml function of type (string*int) list -> (string * int list) list where the output list is a tally of the items in the input

The question I have is how might I transform a list of a string and integer pair to a list of string and int list pairs.
For example, if I have the list [("hello",1) ; ("hi", 1) ; ("hello", 1) ; ("hi", 1) ; ("hey",1 )] then I should get back [("hello",[1;1]) ; ("hi", [1;1]) ; ("hey",[1])]. A previous function I wrote creates the string * int pairs in a list; I now want to group every identical string into a pair whose list of ones has a length equal to the number of times that exact string appeared in a pair in the input list. Sorry if my wording is confusing, but I am quite lost on this function. Below is the code I have written so far:
let transform5 (lst: (string *int) list) : (string *int list) list =
match lst with
| (hd,n)::(tl,n) -> let x,[o] = List.fold_left (fun (x,[o]) y -> if y = x then x,[o]#[1] else
(x,[o])::y,[o]) (hd,[o]) tl in (x,[1])::(tl,[1])
Any help is appreciated!
General advice on how to improve understanding of core concepts:
The code suggests you could use more practice with destructuring and manipulating lists. I recommend reading the chapter on Lists and Patterns in Real World OCaml and spending some time working through the first 20 or so of the 99 OCaml Problems.
Some pointers on the code you've written so far:
I have reorganized your code into a strictly equivalent function, with some annotations indicating problem areas:
let transform5 : (string * int) list -> (string * int list) list =
fun lst ->
let f (x, [o]) y =
if y = x then (* The two branches of this conditional are values of different types *)
(x, [o] # [1]) (* : ('a * int list) *)
else
(x, [o]) :: (y, [o]) (* : ('a * int list) list *)
in
match lst with
| (hd, n) :: (tl, n) -> (* This will only match a list with two tuples *)
let x, [o] = List.fold_left f (hd, [o]) tl (* [o] can only match a singleton list *)
in (x, [1]) :: (tl, [1]) (* Doesn't use the value of o, so that info is lost*)
(* case analysis in match expressions should be exhaustive, but this omits
matches for, [], [_], and (_ :: _ :: _) *)
If you load your code in utop or compile it in a file, you should get a number of warnings and type errors that help indicate problem areas. You can learn a lot by taking up each of those messages one by one and working out what they are indicating.
Refactoring the problem
A solution to your problem using a fold over the input list is probably the right way to go. But writing solutions that use explicit recursion and break the task down into a number of sub-problems can often help study the problem and make the underlying mechanics very clear.
In general, a function of type 'a -> 'b can be understood as a problem:
Given a x : 'a, construct a y : 'b where ...
Our function has type (string * int) list -> (string * int list) list and you
state the problem quite clearly, but I've edited a bit to fit the format:
Given xs : (string * int) list, construct ys: (string * int list) list
where I want to group every string from xs that's the same into a pair
(string * int list) in ys that has a list of ones of a length = to how
many times that exact string appeared in a pair from xs.
We can break this into two sub-problems:
Given xs : (string * int) list, construct ys : (string * int) list list where each y : (string * int) list in ys is a group of the items in xs with the same string.
let rec group : (string * int) list -> (string * int) list list = function
| [] -> []
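| ((s, _) as x) :: xs ->
(* Group by the string only, so pairs with the same string but different ints stay together *)
let (grouped, rest) = List.partition (fun (s', _) -> s' = s) xs in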
(x :: grouped) :: group rest
Given xs : (string * int) list list, construct ys : (string * int list) list where for each group (string, int) list in xs we have one (s : string, n : int list) in ys where s is the string determining the group and n is a list holding all the 1s in the group.
let rec tally : (string * int) list list -> (string * int list) list = function
| [] -> []
| group :: xs ->
match group with
| [] -> tally xs (* This case shouldn't arise, but we match it to be complete *)
| (s, _) :: _ ->
let ones = List.map (fun (_, one) -> one) group in
(s, ones) :: tally xs
The solution to your initial problem will just be the composition of these two sub-problems:
let transform5 : (string * int) list -> (string * int list) list =
fun xs -> (tally (group xs))
Hopefully this is a helpful illustration of one way to go about decomposing these kinds of problems. However, there are some obvious defects with the code I have written: it is inefficient, in that it creates an intermediate data structure and it must iterate through the first list repeatedly to form its groups, before finally tallying up the results. It also resorts to explicit recursion, whereas it would be preferable to use higher order functions to take care of iterating over the lists for us (as you tried in your example). Trying to fix these defects might be instructive.
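As one possible way of addressing those defects, here is a sketch that makes a single pass with List.fold_left and builds no intermediate list of groups. It swaps in a hash table from the standard Hashtbl module to look up groups, which is my own choice and not something the question requires; the name transform5 is kept from the question.

```ocaml
let transform5 (xs : (string * int) list) : (string * int list) list =
  (* Maps each string to the ints seen for it so far, most recent first. *)
  let tbl : (string, int list) Hashtbl.t = Hashtbl.create 16 in
  (* `order` collects each distinct string once, most recently seen first. *)
  let order =
    List.fold_left
      (fun order (s, n) ->
        match Hashtbl.find_opt tbl s with
        | Some ns -> Hashtbl.replace tbl s (n :: ns); order
        | None -> Hashtbl.add tbl s [ n ]; s :: order)
      [] xs
  in
  (* rev_map restores first-seen order; List.rev restores each int list. *)
  List.rev_map (fun s -> (s, List.rev (Hashtbl.find tbl s))) order
```

With the list from the question this should give back [("hello",[1;1]) ; ("hi",[1;1]) ; ("hey",[1])], with groups in the order in which each string first appears.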
Reconsidering our context
Is the problem you've posed in this SO question the best sub-problem from the overall task you are pursuing? Here are two questions that have occurred to me:
Why do you have a (string * int) list where the value of int is always 1 in the first place? Does this actually carry any more information than a string list?
In general, we can represent any n : int by an int list which contains only 1s and has length = n. But why not just use n here?

Generic Sequence over natural numbers

I am trying to create a generic sequence that would behave as follows:
val generic_sequence= fn : (int -> int) -> int seq
That is, it should receive a function as input:
foo: int -> int
and create a sequence that activates foo on all natural numbers.
I wrote the following auxiliary code (works fine):
datatype 'a seq = Nil
| Cons of 'a * (unit-> 'a seq);
fun head (Cons(x,_)) = x;
fun tail (Cons (_,xf)) = xf();
fun naturals k = Cons(k,fn()=>naturals (k+1));
and when I tried implementing the generic sequence I got stuck.
This is where I've got.
fun aux (Cons(x,xf))= (Cons(foo x,(fn=>aux((xf())))));
fun generic_seq foo = (aux (from 0));
I have 2 problems:
It doesn't compile
I am not sure if my approach is correct
Would appreciate some help here.
Ok, I figured it out,
I created a mapq function and it basically did everything for me.
fun mapq f Nil = Nil
| mapq f (Cons (x,xf)) = Cons (f(x), fn() => mapq f (xf()));
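To show how mapq ties back to the original goal, here is a sketch of the whole thing transliterated to OCaml (the thread itself is in SML). The take helper is hypothetical and only there to look at a finite prefix. I am assuming the naturals start at 0, as in the from 0 call in the question.

```ocaml
type 'a seq = Nil | Cons of 'a * (unit -> 'a seq)

(* The infinite sequence k, k+1, k+2, ... *)
let rec naturals k = Cons (k, fun () -> naturals (k + 1))

(* Apply f to every element, delaying the tail just like the input does. *)
let rec mapq f = function
  | Nil -> Nil
  | Cons (x, xf) -> Cons (f x, fun () -> mapq f (xf ()))

(* The generic sequence is then just a map over the naturals. *)
let generic_seq foo = mapq foo (naturals 0)

(* Hypothetical helper: the first n elements as an ordinary list. *)
let rec take n s =
  match n, s with
  | 0, _ | _, Nil -> []
  | n, Cons (x, xf) -> x :: take (n - 1) (xf ())
```

For example, take 5 (generic_seq (fun n -> n * n)) evaluates to [0; 1; 4; 9; 16].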

Streams (aka "lazy lists") and tail recursion

This question uses the following "lazy list" (aka "stream") type:
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)
My question is: how to define a tail-recursive function lcycle that takes a non-empty (and non-lazy) list l as argument, and returns the lazylist corresponding to repeatedly cycling over the elements l. For example:
# ltake (lcycle [1; 2; 3]) 10;;
- : int list = [1; 2; 3; 1; 2; 3; 1; 2; 3; 1]
(ltake is a lazy analogue of List::take; I give one implementation at the end of this post.)
I have implemented several non-tail-recursive versions of lcycle, such as:
let lcycle l =
let rec inner l' =
match l' with
| [] -> raise (Invalid_argument "lcycle: empty list")
| [h] -> Cons (h, fun () -> inner l)
| h::t -> Cons (h, fun () -> inner t)
in inner l
...but I have not managed to write a tail-recursive one.
Basically, I'm running into the problem that lazy evaluation is implemented by constructs of the form
Cons (a, fun () -> <lazylist>)
This means that all my recursive calls happen within such a construct, which is incompatible with tail recursion.
Assuming the lazylist type as defined above, is it possible to define a tail-recursive lcycle? Or is this inherently impossible with OCaml?
EDIT: My motivation here is not to "fix" my implementation of lcycle by making it tail-recursive, but rather to find out whether it is even possible to implement a tail recursive version of lcycle, given the definition of lazylist above. Therefore, pointing out that my lcycle is fine misses what I'm trying to get at. I'm sorry I did not make this point sufficiently clear in my original post.
This implementation of ltake, as well as the definition of the lazylist type above, comes from here:
let rec ltake (Cons (h, tf)) n =
match n with
0 -> []
| _ -> h :: ltake (tf ()) (n - 1)
I don't see much of a problem with this definition. The call to inner is within a function which won't be invoked until lcycle has returned. Thus there is no stack safety issue.
Here's an alternative which moves the empty list test out of the lazy loop:
let lcycle = function
| [] -> invalid_arg "lcycle: empty"
| x::xs ->
let rec first = Cons (x, fun () -> inner xs)
and inner = function
| [] -> first
| y::ys -> Cons (y, fun () -> inner ys) in
first
The problem is that you're trying to solve a problem that doesn't exist. The of_list function will not take any stack space, and this is why lazy lists are so great. Let me try to explain the process. When you apply the of_list function to a non-empty list, it creates a Cons of the head of the list and a closure that captures a reference to the tail of the list. Afterwards it returns immediately. Nothing more. So it takes only a few words of memory, and none of them is on the stack. One word contains the x value, another contains a closure that captures only a pointer to xs.
Then, when you deconstruct this pair, you get the value x, which you can use right away, and the function next, which is the closure that, when invoked, will be applied to the rest of the list and, if it is non-empty, will return another Cons. Note that the previous Cons will already be garbage by then, so memory use does not grow.
If you do not believe this, you can construct an of_list function that never terminates (i.e., cycles over the list) and print it with an iter function. It will run forever without consuming more and more memory.
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)
let of_list lst =
let rec loop = function
| [] -> loop lst
| x :: xs -> Cons (x, fun () -> loop xs) in
loop lst
let rec iter (Cons (a, next)) f =
f a;
iter (next ()) f
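A bounded way to check the claim, instead of an iter that never returns, is to take a large finite prefix. The sketch below reuses the looping of_list idea under the name lcycle and adds an explicit empty-list check, because loop [] would otherwise call itself forever when the input list is empty. The accumulator-based ltake is my own variant of the one in the question, written that way so that the consumer is tail-recursive too.

```ocaml
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)

(* Cycle over a non-empty list; each call to loop returns immediately. *)
let lcycle = function
  | [] -> invalid_arg "lcycle: empty"
  | lst ->
      let rec loop = function
        | [] -> loop lst
        | x :: xs -> Cons (x, fun () -> loop xs)
      in
      loop lst

(* Tail-recursive take: elements are accumulated in reverse, then reversed. *)
let ltake ll n =
  let rec go acc (Cons (h, tf)) n =
    if n <= 0 then List.rev acc else go (h :: acc) (tf ()) (n - 1)
  in
  go [] ll n
```

With these definitions, ltake (lcycle [1; 2; 3]) 10 gives [1; 2; 3; 1; 2; 3; 1; 2; 3; 1], and asking for a million elements should not overflow the stack, since neither function recurses deeply.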

Ocaml List: Implement append and map functions

I'm currently trying to extend a friend's OCaml program. It's a huge collection of functions needed for some data analysis. Since I'm not really an OCaml expert, I'm currently stuck on a (for me) strange List implementation:
type 'a cell = Nil
| Cons of ('a * 'a llist)
and 'a llist = (unit -> 'a cell);;
I've figured out that this implements some sort of "lazy" list, but I have absolutely no idea how it really works. I need to implement an Append and a Map Function based on the above type. Has anybody got an idea how to do that?
Any help would really be appreciated!
let rec append l1 l2 =
match l1 () with
Nil -> l2 |
(Cons (a, l)) -> fun () -> (Cons (a, append l l2));;
let rec map f l =
fun () ->
match l () with
Nil -> Nil |
(* The outer fun () already delays the cell, so this branch returns a cell directly *)
(Cons (a, r)) -> Cons (f a, map f r);;
The basic idea of this implementation of lazy lists is that each computation is encapsulated in a function (the technical term is a closure) via fun () -> x.
The expression x is then only evaluated when the function is applied to () (the unit value, which contains no information).
It might help to note that function closures are essentially equivalent to lazy values:
lazy n : 'a Lazy.t <=> (fun () -> n) : unit -> 'a
force x : 'a <=> x () : 'a
So the type 'a llist is equivalent to
type 'a llist = 'a cell Lazy.t
i.e., a lazy cell value.
A map implementation might make more sense in terms of the above definition
let rec map f lst =
match force lst with
| Nil -> lazy Nil
| Cons (hd,tl) -> lazy (Cons (f hd, map f tl))
Translating that back into closures:
let rec map f lst =
match lst () with
| Nil -> (fun () -> Nil)
| Cons (hd,tl) -> (fun () -> Cons (f hd, map f tl))
Similarly with append
let rec append a b =
match force a with
| Nil -> b
| Cons (hd,tl) -> lazy (Cons (hd, append tl b))
becomes
let rec append a b =
match a () with
| Nil -> b
| Cons (hd,tl) -> (fun () -> Cons (hd, append tl b))
I generally prefer to use the lazy syntax, since it makes it more clear what's going on.
Note, also, that a lazy suspension and a closure are not exactly equivalent. For example,
let x = lazy (print_endline "foo") in
force x;
force x
prints
foo
whereas
let x = fun () -> print_endline "foo" in
x ();
x ()
prints
foo
foo
The difference is that force computes the value of the expression exactly once.
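Putting the Lazy-based pieces together, a self-contained sketch could look like the following. It is a variation on the definitions above, not a copy: here the match sits inside lazy, so nothing at all is forced until a cell is demanded. The helpers of_list, from and take are hypothetical names added only to build and inspect lists.

```ocaml
type 'a cell = Nil | Cons of 'a * 'a llist
and 'a llist = 'a cell Lazy.t

(* Both functions delay even the first force of their argument. *)
let rec map f lst =
  lazy
    (match Lazy.force lst with
     | Nil -> Nil
     | Cons (hd, tl) -> Cons (f hd, map f tl))

let rec append a b =
  lazy
    (match Lazy.force a with
     | Nil -> Lazy.force b
     | Cons (hd, tl) -> Cons (hd, append tl b))

(* Hypothetical helpers for building and observing lazy lists. *)
let rec of_list = function
  | [] -> lazy Nil
  | x :: xs -> lazy (Cons (x, of_list xs))

(* The infinite list n, n+1, n+2, ... *)
let rec from n = lazy (Cons (n, from (n + 1)))

let rec take n lst =
  if n <= 0 then []
  else
    match Lazy.force lst with
    | Nil -> []
    | Cons (hd, tl) -> hd :: take (n - 1) tl
```

For instance, take 3 (map succ (from 0)) gives [1; 2; 3], and take 2 (append (from 0) (of_list [99])) gives [0; 1]: the appended 99 is never reached.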
Yes, the lists can be infinite. The code given in the other answers will append to the end of an infinite list, but there's no program you can write that can observe what is appended following an infinite list.