Why define the unit natural transformation for a monad - isn't this implied by the definition of monad being an endofunctor?

A monad is defined as an endofunctor on a category C. Let's say C has the types int and bool and other constructed types as objects. Now let's think about the list monad defined over this category.
By its very definition, List is then an endofunctor: it maps (can this be interpreted as a function?) the type int to List[int] and bool to List[bool], and it maps (again, a function?) a morphism int -> bool to a morphism
List[int] -> List[bool]
So far, that kind of makes sense. But what throws me into deep confusion is the additional natural transformations that need to accompany it:
a. Unit, which transforms int into List[int] (doesn't the definition of the List functor already imply this? This is my major confusion.)
b. Does the List functor always have to be understood as mapping int to List[int], never int to List[bool]?
c. Is the unit natural transformation from int to List[int] different from the mapping from int to List[int] implied by defining List as a functor? I guess this just restates my earlier question.

Unit is a natural transformation from the Identity functor on C to List; in general, a natural transformation a: F => G between two parallel functors F,G : X -> Y consists of
for each object x: X of the domain, a morphism a_x : Fx -> Gx
plus a naturality condition relating the action of F and G on morphisms
You should think of a natural transformation as a way of "going" from F to G. Applying this to your unit-for-List situation: Unit specifies, for each type X, a function Unit_X : X -> List[X], and this just views an instance of your type as a List[X] instance with one element.
I don't understand exactly what you're asking in b., but with respect to c., they're completely different things. There is no map from int to List[int] implied by the definition; what the definition gives you is, for each map f : X -> Y, a map List(f) : List[X] -> List[Y]. What Unit gives you is a way of viewing any type X as a particular kind of list of X's, namely those with one element.
Hope it helps. From the List[] notation you use, maybe you come from a Scala/Java background; if that's the case, you may find this intro to category theory in Scala interesting: http://www.weiglewilczek.com/blog/?p=2760
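A small OCaml sketch (OCaml is used here just for concreteness) may make the difference tangible: the functor's action on morphisms is map, which only ever takes lists as input, while unit is a separate operation that embeds a plain value into a list. Neither is implied by the other.

```ocaml
(* unit (often called "return" or "pure") embeds a value in a
   one-element list: X -> List[X]. *)
let unit x = [x]

(* The functor's action on morphisms: given f : X -> Y it yields
   List(f) : X list -> Y list. It never takes a plain X as input. *)
let fmap f xs = List.map f xs

(* Naturality of unit: fmap f (unit x) = unit (f x). *)
let () =
  let f = string_of_int in
  assert (fmap f (unit 42) = unit (f 42))
```

Note that fmap cannot be applied to a bare int at all; only unit bridges from X to List[X].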

Well, what is really confusing is that a functor F between categories A and B is defined as:
a mapping:
F maps an object A to F(A) --- does that mean something like new List()? If not, why not?
and F maps each morphism f : A -> B to F(f) : F(A) -> F(B)
This is how I see them defined in the books. Point #1 above (F maps A to F(A)) reads to me like a morphism that converts A into F(A). If that is the case, why do we need the unit natural transformation to go from A to F(A)?
What is very curious is that the functor definition uses the word "map" but not the word "morphism". I see that A to F(A) is not called a morphism but a map.

Related

Simple question re: Passing a parameterized datatype in SML

I just spent an embarrassing amount of time figuring out that if you're passing a parameterized datatype into a higher-order function in SML, it needs to be in parentheses; so, for example:
fun f1 p = f2 p will work when called like this (for example): f1(Datatype(parameter)) but will not work if called like f1 Datatype(parameter). I'm sure there's a very simple reason why, but I'm not quite clear on it. Is it that the datatype and parameter are "seen" as two things by the function if not in parentheses? Thanks!
It's important to realize how functions work in SML. Functions take a single argument, and return a single value. This is very easy to understand but it's very often practically necessary for a function to take more than one value as input. There are two ways of achieving this:
Tuples
A function can take one value that contains multiple values in the form of a tuple. This is very common in SML. Consider for instance:
fun add (x, y) = x + y
Here (x, y) is a tuple inferred to be composed of two ints.
Currying
A function takes one argument and returns one value. But functions are values in SML, so a function can return a function.
fun add x = fn y => x + y
Or just:
fun add x y = x + y
This is common in OCaml, but less common in SML.
Function Application
Function application in SML takes the form of functionName argument. When a tuple is involved, it looks like: functionName (arg1, arg2). But the space can be elided: functionName(arg1, arg2).
Even when tuples are not involved, we can put parentheses around any value. So calling a function with a single argument can look like: functionName argument, functionName (argument), or functionName(argument).
Your Question
f1(Datatype(parameter))
This parses the way you expect.
f1 Datatype(parameter)
This parses as f1 Datatype parameter, which is a curried function f1 applied to the arguments Datatype and parameter.
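The same precedence rule shows up in OCaml, which makes for a quick sanity check. A sketch with a hypothetical Wrap constructor (the names are illustrative, not from the question):

```ocaml
type t = Wrap of int

let unwrap (Wrap n) = n

(* [unwrap Wrap 3] would parse as [(unwrap Wrap) 3] and be rejected,
   because application associates to the left. The parentheses group
   the constructor with its argument first. *)
let three = unwrap (Wrap 3)

let () = assert (three = 3)
```

Application being left-associative juxtaposition is what makes the parentheses mandatory in both languages.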

Currying: practical implications

My understanding of the problem comes from Hailperin et al.'s "Concrete Abstractions". I gather that currying translates the evaluation of a function that takes several arguments into the evaluation of a sequence of functions, each taking a single argument. The semantic difference between the two approaches (can I call them that?) is clear to me, but I'm sure I haven't grasped their practical implications.
Please consider, in Ocaml:
# let foo x y = x * y;;
foo : int -> int -> int = <fun>
and
# let foo2 (x, y) = x * y;;
foo2 : int * int -> int = <fun>
The results will be the same for the two functions.
But, practically, what does make the two functions different? Readability? Computational efficiency? My lack of experience fails to give to this problem an adequate reading.
First of all, I would like to stress that, thanks to compiler optimizations, the two functions above will be compiled into the same assembly code. Without those optimizations the cost of currying would be too high: applying a curried function would require allocating one closure per argument.
In practice, a curried function is useful for defining partial applications. For example, cf.,
let double = foo 2
let double2 x = foo2 (2,x)
Another implication is that in the curried form you do not need to allocate temporary tuples for the arguments; in the example above, the function double2 creates an unnecessary tuple (2,x) every time it is called.
Finally, the curried form actually simplifies reasoning about functions: instead of having N families of N-ary functions, we have only unary functions. That allows functions to be typed uniformly; for example, the type 'a -> 'b fits any function, e.g., int -> int, int -> int -> int, etc. Without currying, we would be required to build the number of arguments into the type of a function, with all the negative consequences.
With the first implementation you can define, for example,
let double = foo 2
The second implementation cannot be partially applied in this way.
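A minimal sketch of the practical payoff: a partially applied curried function can be handed directly to a higher-order function, while the tupled version needs a wrapper lambda (which also builds a fresh tuple on every call).

```ocaml
let foo x y = x * y          (* curried: int -> int -> int *)
let foo2 (x, y) = x * y      (* tupled:  int * int -> int *)

(* [foo 2] is already a function int -> int, so it slots straight in. *)
let doubled = List.map (foo 2) [1; 2; 3]

(* The tupled version needs a wrapper that allocates (2, y) per call. *)
let doubled' = List.map (fun y -> foo2 (2, y)) [1; 2; 3]

let () = assert (doubled = [2; 4; 6] && doubled = doubled')
```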

Type inference in SML

I'm currently learning SML and I have a question about something I have no name for. Let's call it a "type alias" for the moment. Suppose I have the following datatype definition:
datatype 'a stack = Stack of 'a list;
I now want to add an explicit "empty stack" value. I can do this by adding it to the datatype:
datatype 'a stack = emptystack | Stack of 'a list;
Now I can pattern match a function like "push":
fun push (emptystack) (e:'a) = Stack([e])
| push (Stack(list):'a stack) (e:'a) = Stack(e::list);
The problem here is that Stack([]) and emptystack are different, but I want them to be the same. Every time SML encounters a Stack([]), it should "know" that it is emptystack (in the case of push, it should then use the emptystack match).
Is there a way to achieve this?
The short answer is: No, it is not possible.
You can create type aliases with the code
type number = int
val foo : number -> int -> number =
fn a => fn b => a+b
val x : int = foo 1 3;
val y : number = foo 1 3;
However, as the name says, it only works for types. Your question asks about value constructors, for which there is no such syntax.
Such an aliasing is not possible in SML.
Instead, you should design your datatypes to be unambiguous in their representation, if that is what you desire.
You'd probably be better suited with something that resembles the definition of 'a list more:
datatype 'a stack = EmptyStack | Stack of 'a * 'a stack;
This has the downside of not letting you use the list functions on it, but you do get an explicit empty stack constructor.
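Sketched in OCaml (the SML version is analogous; constructor names are illustrative), the point is that the empty stack now has exactly one representation, so no value aliasing is needed:

```ocaml
(* There is no separate Stack [] form to alias with EmptyStack. *)
type 'a stack = EmptyStack | Stack of 'a * 'a stack

let push e s = Stack (e, s)

let rec size = function
  | EmptyStack -> 0
  | Stack (_, rest) -> 1 + size rest

let () = assert (size (push 1 (push 2 EmptyStack)) = 2)
```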
Since what you want is for one value emptystack to be synonymous with another value Stack [], you could call what you are looking for "value aliases". Values that are compared with the built-in operator = or pattern matching will not allow for aliases.
You can achieve this by creating your own equality operator, but you will lose the ability to use the built-in = (since Standard ML does not support custom operator overloading) as well as the ability to pattern match on the value constructors of your type.
Alternatively, you can construct a normal form for your type and always compare the normal form. Whenever practically feasible, follow Sebastian's suggestion of no ambiguity. There might be situations in which an unambiguous algebraic type will be much more complex than a simpler one that allows the same value to be represented in different ways.
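The normal-form idea can be sketched in OCaml (again analogous to the SML case; the helper names are hypothetical): collapse the ambiguous representation before comparing, at the cost of giving up the built-in = on raw values.

```ocaml
type 'a stack = EmptyStack | Stack of 'a list

(* Rewrite Stack [] to its canonical form before any comparison. *)
let normalize = function
  | Stack [] -> EmptyStack
  | s -> s

let equal_stack a b = normalize a = normalize b

let () =
  assert (equal_stack (Stack []) EmptyStack);
  assert (not (equal_stack (Stack [1]) EmptyStack))
```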

No type constructor for record types?

I was translating the following Haskell code to OCaml:
data NFA q s = NFA
{ initialState :: q
, isAccepting :: q -> Bool
, transition :: q -> s -> [q]
}
Initially I tried a very literal translation:
type ('q,'s) nfa = NFA of { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
...and of course this gives a syntax error because the type constructor part, "NFA of" isn't allowed. It has to be:
type ('q,'s) nfa = { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
That got me to wondering why this is so. Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
type ('q, 's) dfa = NFA of ('q * ('q->bool) * ( 'q -> 's -> 'q list) )
Why would you want a type constructor for record types, except because that's your habit in Haskell?
In Haskell, records are not exactly first-class constructs: they are more like syntactic sugar on top of tuples. You can define record field names, use them as accessors, and do partial record update, but that desugars into access by position in plain tuples. The constructor name is therefore necessary to tell one record from another after desugaring: if you had no constructor name, two records with different field names but the same field types would desugar into equivalent types, which would be a bad thing.
In OCaml, records are a primitive notion and they have their own identity. Therefore, they don't need a head constructor to distinguish them from tuples or records of the same field types. I don't see why you would like to add a head constructor, as this is more verbose without giving more information or helping expressivity.
Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
Be careful! There is no tuple in the example you show, only a sum type with multiple parameters. Foo of bar * baz is not the same thing as Foo of (bar * baz): the former constructor has two parameters, while the latter has only one parameter, which is a tuple. This distinction is made for performance reasons (in memory, the two parameters are packed together with the constructor tag, while the tuple creates an indirection pointer). Using a tuple instead of multiple parameters is slightly more flexible: you can match either as Foo (x, y) -> ... or as Foo p -> ..., the latter not being available to multi-parameter constructors.
There is no asymmetry between tuples and records in that none of them has a special status in the sum type construction, which is only a sum of constructors of arbitrary arity. That said, it is easier to use tuples as parameter types for the constructors, as tuple types don't have to be declared to be used. Eg. some people have asked for the ability to write
type foo =
| Foo of { x : int; y : int }
| Bar of { z : foo list }
instead of the current
type foo = Foo of foo_t | Bar of bar_t
and foo_t = { x : int; y : int }
and bar_t = { z : foo list }
Your request is a particular (and not very interesting) case of this question. However, even with such shorthand syntax, there would still be one indirection pointer between the constructor and the data, making this style unattractive for performance-conscious programs -- what could be useful is the ability to have named constructor parameters.
PS: I'm not saying that Haskell's choice of desugaring records into tuples is a bad thing. By translating one feature into another, you reduce redundancy/overlapping of concepts. That said, I personally think it would be more natural to desugar tuples into records (with numerical field names, as done in eg. Oz). In programming language design, there are often no "good" and "bad" choices, only different compromises.
You can't have it because the language doesn't support it. It would actually be an easy fit into the type system or the data representation, but it would be a small additional complication in the compiler, and it hasn't been done. So yes, you have to choose between naming the constructor and naming the arguments.
Note that record label names are tied to a particular type, so e.g. {initialState=q} is a pattern of type ('q, 's) nfa; you can't usefully reuse the label name in a different type. So naming the constructor is only really useful when the type has multiple constructors; then, if you have a lot of arguments to a constructor, you may prefer to define a separate type for the arguments.
I believe there's a patch floating around for this feature, but I don't know if it's up-to-date for the latest OCaml version or anything, and it would require anyone using your code to have that patch.
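For completeness, the plain-record translation works fine in practice. A small sketch (the example NFA and helper names are mine, not from the question; List.concat_map assumes OCaml 4.10+):

```ocaml
type ('q, 's) nfa = {
  initialState : 'q;
  isAccepting : 'q -> bool;
  transition : 'q -> 's -> 'q list;
}

(* An NFA over int states and char symbols accepting words that end in 'b'. *)
let ends_in_b = {
  initialState = 0;
  isAccepting = (fun q -> q = 1);
  transition = (fun _ c -> if c = 'b' then [0; 1] else [0]);
}

(* Advance the whole set of current states by one input symbol. *)
let step m states c =
  List.sort_uniq compare (List.concat_map (fun q -> m.transition q c) states)

let accepts m input =
  let final = List.fold_left (step m) [m.initialState] input in
  List.exists m.isAccepting final

let () =
  assert (accepts ends_in_b ['a'; 'b']);
  assert (not (accepts ends_in_b ['b'; 'a']))
```

No constructor name is ever needed: the record type's field names identify it.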

Selectively disable subsumption in Scala? (correctly type List.contains)

List("a").contains(5)
Because an Int can never be contained in a list of String, this should generate an error at compile-time, but it does not.
It wastefully and silently tests every String contained in the list for equality to 5, which can never be true ("5" never equals 5 in Scala).
This has been named "the 'contains' problem". Some have implied that if a type system cannot correctly type such semantics, why go through the extra effort of enforcing types at all. So I consider it an important problem to solve.
The type parametrization B >: A of List.contains inputs any type that is a supertype of the type A (the type of the elements contained in the list).
trait List[+A] {
def contains[B >: A](x: B): Boolean
}
This type parametrization is necessary because the +A declares that the list is covariant on the type A, thus A can't be used in the contravariant position, i.e. as the type of an input parameter. Covariant lists (which must be immutable) are much more powerful for extension than invariant lists (which can be mutable).
A is String in the problematic example above, but Int is not a supertype of String, so what happened? Scala's implicit subsumption decided that Any is a mutual supertype of both String and Int.
The creator of Scala, Martin Odersky, suggested that a fix would be to limit the input type B to only those types that have an equality typeclass instance, which Any doesn't have.
trait List[+A] {
def contains[B >: A : Eq](x: B): Boolean
}
But that doesn't solve the problem, because two types (where the input type is not a supertype of the type of the list's elements) might have a mutual supertype that is a subtype of Any, i.e. also a subtype of Eq. Thus it would compile without error, and the incorrectly typed semantics would remain.
Disabling implicit subsumption everywhere is not an ideal solution either, because implicit subsumption is why the following example of subsumption to Any works. And we wouldn't want to be forced to use type casts when the receiving site (e.g. passing as a function argument) has correctly typed semantics for a mutual supertype (which might not even be Any).
trait List[+A] {
def ::[B >: A](x: B): List[B]
}
val x : List[Any] = List("a", 5) // see[1]
[1] List.apply calls the :: operator.
So my question is what is the best fix to this problem?
My tentative conclusion is that implicit subsumption should be turned off at the definition site where the semantics are otherwise not typed correctly. I will be providing an answer that shows how to turn off implicit subsumption at the method definition site. Are there alternative solutions?
Please note this problem is general, and not isolated to lists.
UPDATE: I have filed an improvement request and started a scala discussion thread on this. I have also added comments under Kim Stebel's and Peter Schmitz's answers showing that their answers have erroneous functionality. Thus there is no solution. Also at the aforementioned discussion thread, I explained why I think soc's answer is not correct.
This sounds good in theory, but falls apart in real life in my opinion.
equals is not based on types, and contains builds on top of that.
That's why code like 1 == BigInt(1) works and returns the result most people would expect.
In my opinion it doesn't make sense to make contains more strict than equals.
If contains were made more strict, code like List[BigInt](1,2,3) contains 1 would stop working entirely.
I don't think “unsafe” or “not type safe” are the right terms here, by the way.
Why not use an equality typeclass?
scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> class EQ[A](a1:A) { def ===(a2:A) = a1 == a2 }
defined class EQ
scala> implicit def toEQ[A](a1:A) = new EQ(a1)
toEQ: [A](a1: A)EQ[A]
scala> l exists (1===)
res7: Boolean = true
scala> l exists ("1"===)
<console>:14: error: type mismatch;
found : java.lang.String => Boolean
required: Int => Boolean
l exists ("1"===)
^
scala> List("1","2")
res9: List[java.lang.String] = List(1, 2)
scala> res9 exists (1===)
<console>:14: error: type mismatch;
found : Int => Boolean
required: java.lang.String => Boolean
res9 exists (1===)
I think you misunderstand Martin's solution, it is not B <: Eq, it is B : Eq, which is a shortcut for
def contains[B >: A](x: B)(implicit ev: Eq[B])
And Eq[X] would then contain a method
def areEqual(a: X, b: X): Boolean
This is not the same as moving the equals method of Any a little lower in the hierarchy, which would indeed solve none of the problems of having it in Any.
In my library extension I use:
class TypesafeEquals[A](val a: A) {
def =*=(x: A): Boolean = a == x
def =!=(x: A): Boolean = a != x
}
implicit def any2TypesafeEquals[A](a: A) = new TypesafeEquals(a)
class RichSeq[A](val seq: Seq[A]) {
...
def containsSafely(a: A): Boolean = seq exists (a =*=)
...
}
implicit def seq2RichSeq[A](s: Seq[A]) = new RichSeq(s)
So I avoid calling contains.
The examples use L instead of List or SeqLike because, for this solution to apply to the preexisting contains method of those collections, it would require a change to the existing library code. One of my goals is the best way to do equality, not the best compromise for interoperating with the current libraries (although backwards compatibility needs to be considered). Additionally, my other goal is that this answer be generally applicable for any method that wants to selectively disable the implicit subsumption feature of the Scala compiler for any reason, not necessarily tied to equality semantics.
case class L[+A]( elem: A )
{
def contains[B](x: B)(implicit ev: A <:< B) = elem == x
}
The above generates an error as desired, assuming the desired semantics for List.contains is the input should be equal to and a supertype of the contained element.
L("a").contains(5)
error: could not find implicit value for parameter ev: <:<[java.lang.String,Int]
L("a").contains(5)
^
The error is not generated when implicit subsumption is not required.
scala> L("a").contains(5 : Any)
res0: Boolean = false
scala> L("a").contains("")
res1: Boolean = false
This disables implicit subsumption (selectively, at the method definition site) by requiring the input parameter type B to be the same as the type of the argument passed in (i.e. not implicitly subsumable with A), and then separately requiring implicit evidence that B is a supertype of A.
UPDATE May 03, 2012: The code above is not complete, as is shown below that turning off all subsumption at the method definition-site does not give the desired result.
class Super
defined class Super
class Sub extends Super
defined class Sub
L(new Sub).contains(new Super)
res2: Boolean = false
L(new Super).contains(new Sub)
error: could not find implicit value for parameter ev: <:<[Super,Sub]
L(new Super).contains(new Sub)
^
The only way to get the desired form of subsumption, is to also cast at the method (call) use-site.
L(new Sub).contains(new Super : Sub)
error: type mismatch;
found : Super
required: Sub
L(new Sub).contains(new Super : Sub)
^
L(new Super).contains(new Sub : Super)
res3: Boolean = false
Per soc's answer, the current semantics of List.contains is that the input should be equal to, but not necessarily a supertype of, the contained element. This assumes List.contains promises only that any matched item equals the input, not that it is a (subtype or) copy of an instance of the input. The current universal equality interface, Any.equals : Any => Boolean, is unityped, so equality doesn't enforce a subtyping relationship. If this is the desired semantics for List.contains, subtyping relationships can't be employed to optimize the compile-time semantics, e.g. by disabling implicit subsumption, and we are stuck with the potential semantic inefficiencies that degrade the runtime performance of List.contains.
While I will be studying and thinking more about equality and contains, afaics my answer remains valid for the general purpose of selectively disabling implicit subsumption at the method definition site.
My thought process is also ongoing holistically w.r.t. the best model of equality.
Update: I added a comment below soc's answer, so I now think his point is not relevant. Equality should always be based on a subtyping relationship, which afaics is what Martin Odersky is proposing for the new equality overhaul (see also his version of contains). Any ad hoc polymorphic equivalence (e.g. BigInt(1) == 1) can be handled with implicit conversions. I explained in my comment below didierd's answer that without my improvement below, afaics Martin's proposed contains would have a semantic error, whereby a mutual implicitly subsumed supertype (other than Any) would select the wrong implicit instance of Eq (if one exists, and otherwise raise an unnecessary compile error). My solution disables implicit subsumption for this method, which is the correct semantics for the subtyped argument of Eq.eq.
trait Eq[A]
{
def eq(x: A, y: A) = x == y
}
implicit object EqInt extends Eq[Int]
implicit object EqString extends Eq[String]
case class L[+A]( elem: A )
{
def contains[B](x: B)(implicit ev: A <:< B, eq: Eq[B]) = eq.eq(x, elem)
}
L("a").contains("")
Note that Eq.eq can optionally be replaced in the implicit object (not overridden, because there is no virtual inheritance; see below).
Note that as desired, L("a").contains(5 : Any) no longer compiles, because Any.equals is no longer used.
We can abbreviate.
case class L[+A]( elem: A )
{
def contains[B : Eq](x: B)(implicit ev: A <:< B) = implicitly[Eq[B]].eq(x, elem)
}
Add: The x == y must be a virtual inheritance call, i.e. x.== should be declared override, because there is no virtual inheritance in the Eq typeclass. The type parameter A is invariant (because A is used in the contravariant position, as an input parameter of Eq.eq). Then we can define an implicit object on an interface (a.k.a. trait).
Thus, the Any.equals override must still check if the concrete type of the input matches. That overhead can't be removed by the compiler.
I think I have a legitimate solution to at least part of the problem posted here - I mean, the issue with List("1").contains(1):
https://docs.google.com/document/d/1sC42GKY7WvztXzgWPGDqFukZ0smZFmNnQksD_lJzm20/edit