Are comparisons in OCaml no longer polymorphic?

I have been trying to migrate a codebase from OCaml 4.04 to OCaml 4.10. I have been running into a recurring error: when I compare values that are not integers, I get a type error:
if (total < current) then (...)
90 | if (total < current) then (...)
         ^^^^^
Error: This expression has type float but an expression was expected of type
int
But replacing it with if Float.(total < current) fixes the issue. Am I doing something wrong, or is it that comparisons in OCaml are no longer polymorphic?

It appears that the change in behavior is due to my use of Jane Street Core. OCaml's operators are still polymorphic, but opening Core shadows them with int-only versions, which forces you to use explicitly typed comparisons such as Float.(<).
https://discuss.ocaml.org/t/removing-polymorphic-compare-from-core/2994
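For anyone hitting the same thing, here is a minimal utop session illustrating the behaviour (assuming Core is opened; the shadowed operators default to int, each module such as Float provides its own comparisons, and Poly restores the polymorphic ones):

utop # open Core;;
utop # 1.0 < 2.0;;
Error: This expression has type float but an expression was expected of type
int
utop # Float.(1.0 < 2.0);;
- : bool = true
utop # Poly.(1.0 < 2.0);;
- : bool = true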

Related

Equality operator to compare non int types

I recently updated from OCaml 4.03 to OCaml 4.13 for my project. One change is that I am getting a type error when checking for equality between non-int types. For example, for floats I get this:
Error: This expression has type float but an expression was expected of type int
I can solve this by explicitly using Float.(f0 = f1). But I get the same problem with custom types. E.g.:
utop # type e = X | Y;;
type e = X | Y
utop # let a = X;;
val a : e = X
utop # let b = Y;;
val b : e = Y
utop # X = Y;;
Error: This expression has type e but an expression was expected of type int
What is the correct way of handling this scenario? Stdlib.(a = b) works but feels cumbersome since a polymorphic equality operator is so commonly used.
This is not inherent OCaml behavior. It comes from Jane Street Base (and presumably other Jane Street libraries, such as Core), which shadows some of the built-in polymorphic functions.
The idea is that there are risks involved with the built-in polymorphic comparisons that can be surprising if you aren't careful.
To get the usual OCaml polymorphic comparison operators back you can use the Polymorphic_compare module (called Poly in recent versions of Base). Here is a link to the documentation of Jane Street Base (if that's what you're using): Base at Jane Street
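To make that concrete, here is a small sketch assuming Base is opened; Poly.(=) opts back into polymorphic equality locally, and [@@deriving equal] (which assumes the ppx_compare preprocessor is enabled in your build) generates a monomorphic equality function for the custom type:

open Base

type e = X | Y [@@deriving equal]

(* opt back into the built-in polymorphic equality, just for this expression *)
let p = Poly.(X = Y)    (* false *)

(* or use the monomorphic equality generated for the type *)
let q = equal_e X Y     (* false *)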

C++ parsing expressions, breaking down the order of evaluation

I'm trying to write an expression parser. One part I'm stuck on is breaking down an expression into blocks via its appropriate order of precedence.
I found the order of precedence for C++ operators here. But where exactly do I split the expression based on this?
I have to assume the worst of the user. Here's a really messy over-exaggerated test example:
if (test(s[4]) < 4 && b + 3 < r && a!=b && ((c | e) == (g | e)) ||
r % 7 < 4 * givemeanobj(a & c & e, b, hello(c)).method())
Perhaps it doesn't even evaluate, and if it doesn't I still need to break it down to determine that.
It should break down into blocks of singles and pairs connected by operators. Essentially it breaks down into a tree-structure where the branches are the groupings, and each node has two branches.
Following the order of precedence, the first thing to do would be to evaluate givemeanobj(), but that's an easy one to see. The next would be the multiplication sign. Does that split everything before the * into a separate block, or just the 4? 4 * givemeanobj(...) comes before the <, right? So that's the first grouping?
Is there a straightforward rule to follow for this?
Is there a straightforward rule to follow for this?
Yes, use a parser generator such as ANTLR. You write your language specification formally, and it will generate code which parses all valid expressions (and no invalid ones). ANTLR is nice in that it can give you an abstract syntax tree which you can easily traverse and evaluate.
Or, if the language you are parsing is actually C++, use Clang, which is a proper compiler and happens to be usable as a library as well.
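If you do want to see the mechanics rather than reach for a generator, the standard technique is precedence climbing (also called an operator-precedence or Pratt parser). Below is a minimal sketch, written in OCaml for brevity since the rest of this page is OCaml; the token type and precedence table are made up for illustration, and only left-associative binary operators on integer atoms are handled:

type expr =
  | Num of int
  | Bin of string * expr * expr

type token = TNum of int | TOp of string

(* hypothetical precedence table; a higher number binds tighter *)
let prec = function
  | "*" | "/" | "%" -> 5
  | "+" | "-"       -> 4
  | "<" | ">"       -> 3
  | "==" | "!="     -> 2
  | "&&"            -> 1
  | "||"            -> 0
  | _               -> -1

let rec parse_expr min_prec tokens =
  let lhs, rest = parse_atom tokens in
  climb min_prec lhs rest

and parse_atom = function
  | TNum n :: rest -> Num n, rest
  | _ -> failwith "expected an operand"

and climb min_prec lhs tokens =
  match tokens with
  | TOp op :: rest when prec op >= min_prec ->
      (* left-associative: the right-hand side may only use operators
         that bind strictly tighter than op *)
      let rhs, rest = parse_expr (prec op + 1) rest in
      climb min_prec (Bin (op, lhs, rhs)) rest
  | _ -> lhs, tokens

(* "2 + 3 * 4 < 20" parses as
   Bin ("<", Bin ("+", Num 2, Bin ("*", Num 3, Num 4)), Num 20):
   * binds tighter than +, which binds tighter than <, so the
   multiplication is grouped first, which is exactly the "splitting"
   the question asks about. *)
let tree, _rest =
  parse_expr 0 [TNum 2; TOp "+"; TNum 3; TOp "*"; TNum 4; TOp "<"; TNum 20]

A generated parser does essentially the same thing; the grammar rule for each precedence level plays the role of the prec table.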

Why is my OCaml "=" operator only applied to int?

I use VS Code with the "OCaml and Reason IDE" extension.
Here is my result in utop:
utop # 1. = 1. ;;
Line 1, characters 0-2:
Error: This expression has type float but an expression was expected of type
int
And also for String:
utop # "Me" = "Me";;
Line 1, characters 0-4:
Error: This expression has type string but an expression was expected of type
int
Only plain int comparisons still work:
utop # 2 = 2 ;;
- : bool = true
">" "<" also have the same symptom. I don't know what actually happens. Can anyone help me out ? Thanks a lot!
You are probably using the Jane Street Base library. Perhaps you opened it like this:
open Base;;
Base tries to limit exceptions to functions that have an explicit _exn suffix, so it shadows the built-in polymorphic equality (=), which can raise an exception on some inputs (for example, if you compare structures containing functions).
You can get polymorphic equality back as follows:
let (=) = Poly.(=);;
Or you can use it with a local open: Poly.(x = y).
There are pros and cons to polymorphic comparison.
The consensus seems to be that using monomorphic comparisons (for example, String.equal) is the more robust choice, even though it is less convenient.
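Putting the options side by side, a small sketch assuming Base is opened (names such as same_floats are just for illustration):

open Base

(* Option 1: restore polymorphic (=) for the whole file *)
let ( = ) = Poly.( = )

(* Option 2: restore it only locally, where you need it *)
let same_floats = Poly.(1.0 = 1.0)

(* Option 3 (usually preferred): monomorphic comparisons *)
let same_strings = String.equal "Me" "Me"
let smaller = Float.(1.0 < 2.0)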

Functions with parametrised arity

Is it possible to write functions that return functions whose signatures depend on the arguments passed to the builder function?
Specifically, I am refining an implementation of primitive recursion I wrote. I want factory-like functions that, for numeric parameters, generate a function that works on tuples or lists whose length equals the values passed for those numeric parameters. Currently I am handling cases like pr 1 2 [0,5,13], which is an invalid statement from the perspective of primitive recursion, at runtime through Either:
pr :: (Show a) => Int -> Int -> [a] -> Either String a
pr i k args
| 1 <= i && i <= k && length args == k = Right $ args!!(i-1)
| i <= 0 = Left "first argument of pr needs to be greater or equal to 1"
| k < i = Left "first argument of pr needs to be lesser or equal to the second argument"
| length args /= k = Left $ "pr expected "++(show k)++" arguments, got "++(show $ length args)++": pr "++(concat[show i, " ", show k, " ", show args])
But I would like to somehow catch that case at compile time, as from the perspective of the formal system I want to implement with this, this is a compile time error -- passing more arguments to a function than its domain specifies.
Is this somehow possible, and if not, what would be the correct approach to get compile-time errors for what should be invalid statements?
What you want is a sized vector. It is like a list, but in addition to the type of its elements it is also parametrised by a type-level natural number: its length.
sized-vector package on Hackage is what you need. As it happens, the function you're trying to implement is the last function in this library.
Note that every time you call last you will have to prove to the compiler that its argument vector has size at least 1. You can do this by constructing the vector in the source code (for example, the compiler will understand that 1 :- 2 :- Nil has size 2); if the vector is obtained at runtime, perhaps by conversion from a list, you'll have to write a function that either gives a runtime error when the list is empty or constructs a vector of size at least one, i.e. with type-level size S n for some n.
If you're not familiar with dependently typed programming (a paradigm that includes this and much, much more) I suggest you look through some tutorials first. For example, this post is a good introduction that shows how to implement vectors from scratch and write functions for them.
A word of caution, learning and using dependently typed programming is exciting, addictive, but also time consuming. So if you want to focus on the task at hand, you might like to live with runtime checks for now.
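To give a flavour of the idea without installing anything, here is the same trick sketched in OCaml with GADTs (used here only because the rest of this page is OCaml; the Haskell encoding with DataKinds is analogous, and this is not the sized-vector API):

(* type-level Peano naturals, used only as phantom length indices *)
type zero
type 'n succ

type ('a, 'n) vec =
  | Nil  : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec

(* head only accepts vectors whose length is at least one,
   so applying it to Nil is rejected at compile time *)
let head : type a n. (a, n succ) vec -> a = function
  | Cons (x, _) -> x

let ok = head (Cons (1, Cons (2, Nil)))   (* : int = 1 *)
(* let bad = head Nil                        does not type-check *)

A function like pr would similarly take a vector whose type-level length is tied to its numeric parameters, so passing the wrong number of arguments becomes a type error instead of a Left value at runtime.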

How to split a simple Lisp-like code to tokens in C++?

Basically, the language has 3 list types and 3 fixed-length types, one of which is string.
Detecting the type of a token with regular expressions is simple, but splitting the input into tokens is not that trivial.
Strings are written with double quotes, and a double quote is escaped with a backslash.
EDIT:
Some example code
{
print (sum (1 2 3 4))
if [( 2 + 3 ) < 6] : {print ("Smaller")}
}
The list types:
() are argument lists that are only evaluated when necessary.
[] are special lists to express 2-operand operations in a prettier way.
{} are lists that are always evaluated. The first element is a function name, the second is a list of arguments, and this repeats.
anything : anything [ : anything [: ...]] translates to an argument list whose elements are joined by the :s. This is only for making loops and conditionals look better.
All functions take a single argument. Argument lists can be used for functions that need more. You can force an argument list to evaluate by using different types of eval functions. (There would be an eval function for each list type.)
So, if you understand this, it works very much like Lisp does; it just has different list types for prettifying the code.
EDIT:
@rici
[[2 + 3] < 6] is OK too. As I mentioned, argument lists are evaluated only when necessary. Since < is a function that requires an argument list of length 2, (2 + 3) must be evaluated somehow; otherwise [(2 + 3) < 6] would translate to < (2 + 3) : 6, which equals < (2 + 3 6), which is an invalid argument list for <. But I see your point: it's not trivial how the automatic parsing should work in this case. In the version I described above, [...] evaluates the argument list with a function like eval_as_oplist (...). But I guess you are right, because this way you couldn't use an argument list in the regular way inside a [...], which is problematic even if you have no reason to do so, because it doesn't lead to better code. So [[. . .] . .] is better code, I agree.
Rather than inventing your own "Lisp-like, but simpler" language, you should consider using an existing Lisp (or Scheme) implementation and embedding it in your C++ application.
Although designing your own language and then writing your own parser and interpreter for it is surely good fun, you will have a hard time coming up with something better designed, more powerful, and more efficiently and robustly implemented than, say, Scheme and its numerous implementations.
Chibi Scheme: http://code.google.com/p/chibi-scheme/ is particularly well suited for embedding in C/C++ code, it's very small and fast.
I would suggest using Flex (possibly with Bison) or ANTLR, which has a C++ output target.
Since google is simpler than finding stuff on my own file server, here is someone else's example:
http://ragnermagalhaes.blogspot.com/2007/08/bison-lisp-grammar.html
This example has formatting problems (which can be resolved by viewing the HTML in a text editor) and only supports one type of list, but it should help you get started and certainly shows how to split the items into tokens.
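If you'd rather see the shape of a hand-rolled tokenizer before committing to Flex or Bison, here is a minimal sketch, written in OCaml for brevity since the rest of this page is OCaml (the logic ports line by line to C++); the token names are made up, and only the bracket types, ':', backslash-escaped strings, and generic atoms from your description are handled:

type token =
  | LParen | RParen        (* ( ) *)
  | LBracket | RBracket    (* [ ] *)
  | LBrace | RBrace        (* { } *)
  | Colon
  | Str of string          (* "..." with \" escapes *)
  | Atom of string         (* numbers, symbols, operators *)

let tokenize (src : string) : token list =
  let n = String.length src in
  let buf = Buffer.create 16 in
  let rec go i acc =
    if i >= n then List.rev acc
    else match src.[i] with
      | ' ' | '\t' | '\n' | '\r' -> go (i + 1) acc
      | '(' -> go (i + 1) (LParen :: acc)
      | ')' -> go (i + 1) (RParen :: acc)
      | '[' -> go (i + 1) (LBracket :: acc)
      | ']' -> go (i + 1) (RBracket :: acc)
      | '{' -> go (i + 1) (LBrace :: acc)
      | '}' -> go (i + 1) (RBrace :: acc)
      | ':' -> go (i + 1) (Colon :: acc)
      | '"' -> string_lit (i + 1) acc
      | _   -> atom i acc
  and string_lit i acc =
    (* inside a string literal; a backslash escapes the next character *)
    if i >= n then failwith "unterminated string"
    else match src.[i] with
      | '"' ->
        let s = Buffer.contents buf in
        Buffer.clear buf;
        go (i + 1) (Str s :: acc)
      | '\\' when i + 1 < n ->
        Buffer.add_char buf src.[i + 1];
        string_lit (i + 2) acc
      | c ->
        Buffer.add_char buf c;
        string_lit (i + 1) acc
  and atom i acc =
    (* an atom runs until whitespace or a delimiter character *)
    if i >= n || String.contains " \t\n\r()[]{}:\"" src.[i] then begin
      let s = Buffer.contents buf in
      Buffer.clear buf;
      go i (Atom s :: acc)
    end else begin
      Buffer.add_char buf src.[i];
      atom (i + 1) acc
    end
  in
  go 0 []

Classifying each Atom (number, name, operator) with your regular expressions can then be a separate pass, which keeps the splitting itself simple.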
I believe Boost.Spirit would be suitable for this task provided you could construct a PEG-compatible grammar for the language you're proposing. It's not obvious from the examples as to whether or not this is the case.
More specifically, Spirit has a generalized AST called utree, and there is example code for parsing symbolic expressions (i.e. Lisp syntax) into utree.
You don't have to use utree in order to take advantage of Spirit's parsing and lexing capabilities, but you would have to have your own AST representation. Maybe that's what you want?