Clojure Function Naming Conventions

Context
Suppose I have protocols ICursor, IFoo, and IBar. Then I can have a function named:
(defn IFoo->IBar [foo] ... )
Now, suppose I have a function which takes two arguments
x: ICursor
y: IFoo
and outputs an object of type IBar.
Now, is there any standard way to denote this in a function name? For example, none of the following work:
(defn ICursor,IFoo->IBar [x y] ...)
because "," is treated as space
(defn (ICursor, IFoo)->IBar [x y] ... )
because () is treated as function application.
(defn [ICursor, IFoo]->IBar [x y] ... )
because [] is treated as vector.
Question
Is there a standard way to encode protocol types of arguments in the function name?
Thanks!

I don't think there is any recommended way to do this; what you are asking for is essentially type annotations. There is a project for adding type annotations to Clojure code at this link.
You could use something like (defn ICursor->IFoo->IBar [x y] ...), which denotes that the function takes an ICursor and an IFoo as parameters and returns an IBar: the last type is the return type and everything before it is a parameter type. I am not sure that is a good long-term or idiomatic solution, though, because it leaves no room for the function's actual name :), which matters more than the type annotation.
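One possible alternative, sketched here under assumptions, is to keep a descriptive function name and state the protocol expectations as :pre/:post conditions. The protocol methods (position, foo-value, bar-value), the records, and the name combine are all hypothetical, invented only to make the example runnable.

```clojure
;; Hypothetical protocols and records, defined only for this sketch.
(defprotocol ICursor (position [this]))
(defprotocol IFoo (foo-value [this]))
(defprotocol IBar (bar-value [this]))

(defrecord Cursor [pos] ICursor (position [_] pos))
(defrecord Foo [v] IFoo (foo-value [_] v))
(defrecord Bar [v] IBar (bar-value [_] v))

;; A descriptive name, with the argument and return protocols
;; checked at run time instead of being encoded in the name.
(defn combine
  "Takes an ICursor and an IFoo, returns an IBar."
  [x y]
  {:pre  [(satisfies? ICursor x) (satisfies? IFoo y)]
   :post [(satisfies? IBar %)]}
  (->Bar (+ (position x) (foo-value y))))

;; The arrow-style name is a legal symbol too, so it can be an alias.
(def ICursor->IFoo->IBar combine)
```

Calling (combine (->Cursor 2) (->Foo 3)) yields a Bar, while passing something that does not satisfy the protocols fails the precondition with an AssertionError (as long as assertions are enabled).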

Related

Simple question re: Passing a parameterized datatype in SML

I just spent an embarrassing amount of time figuring out that if you're passing a parameterized datatype into a higher-order function in SML, it needs to be in brackets (); so, for example:
fun f1 p = f2 p will work when called like this (for example): f1(Datatype(parameter)) but will not work if called like f1 Datatype(parameter). I'm sure there's a very simple reason why, but I'm not quite clear. Is it something like, the datatype and parameter are "seen" as 2 things by the function if not in brackets? Thanks!
It's important to realize how functions work in SML. Functions take a single argument and return a single value. That is simple to understand, but in practice a function often needs more than one value as input. There are two ways of achieving this:
Tuples
A function can take one value that contains multiple values in the form of a tuple. This is very common in SML. Consider for instance:
fun add (x, y) = x + y
Here (x, y) is a tuple inferred to be composed of two ints.
Currying
A function takes one argument and returns one value. But functions are values in SML, so a function can return a function.
fun add x = fn y => x + y
Or just:
fun add x y = x + y
This is common in OCaml, but less common in SML.
Function Application
Function application in SML takes the form of functionName argument. When a tuple is involved, it looks like: functionName (arg1, arg2). But the space can be elided: functionName(arg1, arg2).
Even when tuples are not involved, we can put parentheses around any value. So calling a function with a single argument can look like: functionName argument, functionName (argument), or functionName(argument).
Your Question
f1(Datatype(parameter))
This parses the way you expect.
f1 Datatype(parameter)
This parses as f1 Datatype parameter. Function application is left-associative, so this means (f1 Datatype) parameter: f1 is treated as a curried function applied first to Datatype and then to parameter.

Cannot have first parameter as optional

Can you have a function with an optional parameter as the first and only parameter?
For example:
(defn foo [& bar] (if (= bar) 1 2))
In a function definition, & bar means that all the remaining arguments are collected into a sequence. There is no guarantee about its size: it may hold one or more items, or it may be nil when no arguments are passed.
A better approach for a single optional argument is to define the function with two arities, zero and one:
(defn foo
  ([] (foo 12))
  ([bar] (if (= bar 12) 1 2)))
In this example, if you call the first function definition, with zero arity, it will simply call the second 1-arity function definition with a default value of 12.
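The same zero-or-one-arity pattern, as a small sketch with a hypothetical greet function and a made-up default value:

```clojure
;; Hypothetical example: one optional argument via two arities.
(defn greet
  ;; Zero arguments: delegate to the 1-arity body with a default.
  ([] (greet "world"))
  ;; One argument: the real implementation.
  ([who] (str "Hello, " who "!")))

(greet)        ; uses the default
(greet "Rich") ; uses the supplied value
```

Calling it with two or more arguments throws an ArityException, which is the guarantee that & bar does not give you.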
This will work:
(defn foo [& bar]
  (if (seq bar) 1 2))
Using seq is the idiomatic way to test for 'not empty'. Do not confuse it with seq?, which means something quite different: it tests whether its argument is a sequence at all, not whether it has any elements (for instance, (seq? '()) is true).
The important point is that bar is a sequence inside the function body. If you want to pass its items on as separate arguments again, use apply, or destructure the sequence in the function itself.
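A short sketch of those three points: testing rest args with seq, spreading them again with apply, and destructuring them. The function names are hypothetical.

```clojure
;; seq returns nil for no arguments, so it works as a "not empty" test.
(defn any-args? [& bar]
  (if (seq bar) :some :none))

;; apply spreads the collected arguments back out into a call.
(defn sum-all [& nums]
  (apply + nums))

;; Destructuring the rest args gives a single optional argument;
;; x is nil when the caller passes nothing.
(defn value-or-default [& [x]]
  (if (nil? x) :default x))
```

Note that the destructuring version cannot tell "no argument" apart from an explicit nil, and it silently ignores any extra arguments.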

Difference between Record constructor and positional factory function

Suppose I define a record called Node: (defrecord Node [tag attributes children]).
After this definition, according to the docstring of defrecord a factory function called ->Node is defined, as well as another factory function map->Node and a Java class constructor Node..
I'm wondering what exactly the difference is between the positional factory function ->Node and the constructor Node., apart from the normal differences between a Java class constructor / method on the one hand and a clojure function on the other (by normal differences I'm thinking things like the fact that functions are first-class in Clojure while methods are not).
(Update: see end of this answer for a note on primitive field types vs. parameter types of the ctor vs. parameter types of the factory.)
The positional factory just calls the constructor directly. The only interesting thing beyond that concerns records / types with large numbers of fields, namely over 20, which is the maximum number of positional arguments a Clojure function can accept. For those, making the constructor call is slightly more involved, since some arguments have to be unpacked from the rest-args seq. Positional factories as emitted by defrecord and deftype handle that correctly, and moreover check that the correct number of arguments is supplied, throwing an appropriate exception if not.
This is documented in the docstring for the private function clojure.core/build-positional-factory; say (doc clojure.core/build-positional-factory) at the REPL to read it, or (source clojure.core/build-positional-factory) to see the source.
The end result looks roughly like this:
;; positional factory for a type with up to 20 fields
(defn ->Foo
  "Construct a Foo."
  [x y z]
  (new Foo x y z))
;; positional factory for a type with many fields
(defn ->Bar
  "Construct a Bar."
  [a b c d e f g h i j k l m n o p q r s t & overage]
  (if (= (count overage) 2)
    (new Bar a b c d e f g h i j k l m n o p q r s t
         (nth overage 0) (nth overage 1))
    (throw
      (clojure.lang.ArityException.
        (+ 20 (count overage)) (name '->Bar)))))
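To see the practical side of this with the Node record from the question, here is a small sketch. The sample field values are made up for illustration.

```clojure
(defrecord Node [tag attributes children])

;; Both forms build an equal record.
(def via-factory (->Node :div {} []))
(def via-ctor    (Node. :div {} []))

;; ->Node is an ordinary function, so it can be passed to apply/map;
;; the constructor would need a wrapper such as #(Node. %1 %2 %3).
(def nodes
  (map #(apply ->Node %) [[:p {} []] [:span {} []]]))

;; map->Node takes a map instead; omitted fields end up nil.
(def partial-node (map->Node {:tag :img}))
```

Here (= via-factory via-ctor) is true, and (:children partial-node) is nil.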
A note on parameter types:
Not sure if this falls under the rubric of "normal differences", so I'll mention it explicitly: deftype / defrecord introduced classes may have fields of primitive types, in which case the corresponding parameters of the constructor will also be of primitive types. However, as of Clojure 1.5.1, the positional factories always take all-Object arguments, even if technically they could be declared as primitive-accepting functions (that is, if the primitive types involved are long and/or double and there are at most four positional parameters).
#{}, the empty set. Apart from the differences you've explicitly said you're not interested in, there are no other differences. ->Foo exists specifically because functions are more friendly than constructors.

Defining a function with "extra" parenthesis

Can anyone explain to me why
((fn ([x] x)) 1)
works and returns 1? (There's one "extra" set of parentheses after the fn.) Shouldn't it be the following?
((fn [x] x) 1)
Additionally,
((fn (([x] x))) 1)
(2 "extra" sets of parentheses) fails with a "CompilerException System.ArgumentException: Parameter declaration([x] x) should be a vector". Why?
Thanks!
The extra set of parentheses allows you to define a function with several arities, that is, one that accepts different numbers of arguments. The following example defines a function that can take either one argument or two arguments:
(defn foo
([x] x)
([x y] (+ x y)))
You can see this as defining two functions under a single name. The appropriate one is called depending on the number of arguments you provide.
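A minimal sketch of that dispatch, using a hypothetical describe function with three arities:

```clojure
;; Each parenthesized clause is one arity: a parameter vector plus a body.
(defn describe
  ([] "no arguments")
  ([x] (str "one argument: " x))
  ([x y] (str "two arguments: " x " and " y)))

(describe)      ; picks the 0-arity clause
(describe 1)    ; picks the 1-arity clause
(describe 1 2)  ; picks the 2-arity clause
```

The clause is chosen purely by argument count; the argument values and types play no part.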
If you define a function with a fixed number of arguments, the two following forms are equivalent:
(defn bar ([x] x))
and
(defn baz [x] x)
With this in mind you can understand the compiler exception. You are trying to define a function as follows:
(defn qux
(([x] x)))
When you use the extra set of parentheses, Clojure expects the first element inside them to be a vector (within brackets). However, in this case the first element is ([x] x), which is a list and not a vector. This is the error you get.
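The two working forms from the question can be checked side by side. This sketch binds the results to hypothetical names so they are easy to inspect at the REPL:

```clojure
;; One arity written in the multi-arity style: a list holding [params] body.
(def wrapped ((fn ([x] x)) 1))

;; The shorthand for a single arity: params and body directly after fn.
(def plain ((fn [x] x) 1))

;; The wrapped style is what makes a second arity possible.
(def two-arity ((fn ([x] x) ([x y] (+ x y))) 1 2))
```

wrapped and plain are both 1, and two-arity is 3. Wrapping the clause in yet another list, as in (fn (([x] x))), puts a list where the parameter vector is expected, hence the compiler error.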

No type constructor for record types?

I was translating the following Haskell code to OCaml:
data NFA q s = NFA
{ initialState :: q
, isAccepting :: q -> Bool
, transition :: q -> s -> [q]
}
Initially I tried a very literal translation:
type ('q,'s) nfa = NFA of { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
...and of course this gives a syntax error because the type constructor part, "NFA of" isn't allowed. It has to be:
type ('q,'s) nfa = { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
That got me to wondering why this is so. Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
type ('q, 's) dfa = NFA of ('q * ('q->bool) * ( 'q -> 's -> 'q list) )
Why would you want a type constructor for record types, except because that's your habit in Haskell?
In Haskell, records are not exactly first-class constructs: they are more like syntactic sugar on top of tuples. You can define record field names, use them as accessors, and do partial record updates, but that desugars into access by position in plain tuples. The constructor name is therefore necessary to tell one record from another after desugaring: if you had no constructor name, two records with different field names but the same field types would desugar into equivalent types, which would be a bad thing.
In OCaml, records are a primitive notion and they have their own identity. Therefore, they don't need a head constructor to distinguish them from tuples or records of the same field types. I don't see why you would like to add a head constructor, as this is more verbose without giving more information or helping expressivity.
Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
Be careful! There is no tuple in the example you show, only a sum type with multiple parameters. Foo of bar * baz is not the same thing as Foo of (bar * baz): the former constructor has two parameters, and the latter constructor has only one parameter, which is a tuple. This distinction is made for performance reasons (in memory, the two parameters are packed together with the constructor tag, while the tuple creates an indirection pointer). Using tuples instead of multiple parameters is slightly more flexible: you can match as both Foo (x, y) -> ... or Foo p -> ..., the latter not being available to multi-parameter constructors.
There is no asymmetry between tuples and records, in that neither of them has a special status in the sum type construction, which is only a sum of constructors of arbitrary arity. That said, it is easier to use tuples as parameter types for the constructors, as tuple types don't have to be declared to be used. E.g., some people have asked for the ability to write
type foo =
| Foo of { x : int; y : int }
| Bar of { z : foo list }
instead of the current
type foo = Foo of foo_t | Bar of bar_t
and foo_t = { x : int; y : int }
and bar_t = { z : foo list }
Your request is a particular (and not very interesting) case of this question. However, even with such shorthand syntax, there would still be one indirection pointer between the constructor and the data, making this style unattractive for performance-conscious programs -- what could be useful is the ability to have named constructor parameters.
PS: I'm not saying that Haskell's choice of desugaring records into tuples is a bad thing. By translating one feature into another, you reduce redundancy and overlap between concepts. That said, I personally think it would be more natural to desugar tuples into records (with numerical field names, as done in e.g. Oz). In programming language design, there are often no "good" and "bad" choices, only different compromises.
You can't have it because the language doesn't support it. It would actually be an easy fit into the type system or the data representation, but it would be a small additional complication in the compiler, and it hasn't been done. So yes, you have to choose between naming the constructor and naming the arguments.
Note that record label names are tied to a particular type, so e.g. {initialState=q} is a pattern of type ('q, 's) nfa; you can't usefully reuse the label name in a different type. So naming the constructor is only really useful when the type has multiple constructors; then, if you have a lot of arguments to a constructor, you may prefer to define a separate type for the arguments.
I believe there's a patch floating around for this feature, but I don't know if it's up-to-date for the latest OCaml version or anything, and it would require anyone using your code to have that patch.