Haskell - Why is Alternative implemented for List - list

I have read some of this post Meaning of Alternative (it's long)
What lead me to that post was learning about Alternative in general. The post gives a good answer to why it is implemented the way it is for List.
My question is:
Why is Alternative implemented for List at all?
Is there perhaps an algorithm that uses Alternative and a List might be passed to it so define it to hold generality?
I thought because Alternative by default defines some and many, that may be part of it but What are some and many useful for contains the comment:
To clarify, the definitions of some and many for the most basic types such as [] and Maybe just loop. So although the definition of some and many for them is valid, it has no meaning.
In the "What are some and many useful for" link above, Will gives an answer to the OP that may contain the answer to my question, but at this point in my Haskelling, the forest is a bit thick to see the trees.
Thanks

There's something of a convention in the Haskell library ecology that if a thing can be an instance of a class, then it should be an instance of the class. I suspect the honest answer to "why is [] an Alternative?" is "because it can be".
...okay, but why does that convention exist? The short answer there is that instances are sort of the one part of Haskell that succumbs only to whole-program analysis. They are global, and if there are two parts of the program that both try to make a particular class/type pairing, that conflict prevents the program from working right. To deal with that, there's a rule of thumb that any instance you write should live in the same module either as the class it's associated with or as the type it's associated with.
Since instances are expected to live in specific modules, it's polite to define those instances whenever you can -- since it's not really reasonable for another library to try to fix up the fact that you haven't provided the instance.

Alternative is useful when viewing [] as the nondeterminism-monad. In that case, <|> represents a choice between two programs and empty represents "no valid choice". This is the same interpretation as for e.g. parsers.
some and many does indeed not make sense for lists, since they try iterating through all possible lists of elements from the given options greedily, starting from the infinite list of just the first option. The list monad isn't lazy enough to do even that, since it might always need to abort if it was given an empty list. There is however one case when both terminates: When given an empty list.
Prelude Control.Applicative> many []
[[]]
Prelude Control.Applicative> some []
[]
If some and many were defined as lazy (in the regex sense), meaning they prefer short lists, you would get out results, but not very useful, since it starts by generating all the infinite number of lists with just the first option:
Prelude Control.Applicative> some' v = liftA2 (:) v (many' v); many' v = pure [] <|> some' v
Prelude Control.Applicative> take 100 . show $ (some' [1,2])
"[[1],[1,1],[1,1,1],[1,1,1,1],[1,1,1,1,1],[1,1,1,1,1,1],[1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,"
Edit: I believe the some and many functions corresponds to a star-semiring while <|> and empty corresponds to plus and zero in a semiring. So mathematically (I think), it would make sense to split those operations out into a separate typeclass, but it would also be kind of silly, since they can be implemented in terms of the other operators in Alternative.

Consider a function like this:
fallback :: Alternative f => a -> (a -> f b) -> (a -> f e) -> f (Either e b)
fallback x f g = (Right <$> f x) <|> (Left <$> g x)
Not spectacularly meaningful, but you can imagine it being used in, say, a parser: try one thing, falling back to another if that doesn't work.
Does this function have a meaning when f ~ []? Sure, why not. If you think of a list's "effects" as being a search through some space, this function seems to represent some kind of biased choice, where you prefer the first option to the second, and while you're willing to try either, you also tag which way you went.
Could a function like this be part of some algorithm which is polymorphic in the Alternative it computes in? Again I don't see why not. It doesn't seem unreasonable for [] to have an Alternative instance, since there is an implementation that satisfies the Alternative laws.
As to the answer linked to by Will Ness that you pointed out: it covers that some and many don't "just loop" for lists. They loop for non-empty lists. For empty lists, they immediately return a value. How useful is this? Probably not very, I must admit. But that functionality comes along with (<|>) and empty, which can be useful.

Related

Why would I ever want to use Maybe instead of a List?

Seeing as the Maybe type is isomorphic to the set of null and singleton lists, why would anyone ever want to use the Maybe type when I could just use lists to accomodate absence?
Because if you match a list against the patterns [] and [x] that's not an exhaustive match and you'll get a warning about that, forcing you to either add another case that'll never get called or to ignore the warning.
Matching a Maybe against Nothing and Just x however is exhaustive. So you'll only get a warning if you fail to match one of those cases.
If you choose your types such that they can only represent values that you may actually produce, you can rely on non-exhaustiveness warnings to tell you about bugs in your code where you forget to check for a given a case. If you choose more "permissive" types, you'll always have to think about whether a warning represents an actual bug or just an impossible case.
You should strive to have accurate types. Maybe expresses that there is exactly one value or that there is none. Many imperative languages represent the "none" case by the value null.
If you chose a list instead of Maybe, all your functions would be faced with the possibility that they get a list with more than one member. Probably many of them would only be defined for one value, and would have to fail on a pattern match. By using Maybe, you avoid a class of runtime errors entirely.
Building on existing (and correct) answers, I'll mention a typeclass based answer.
Different types convey different intentions - returning a Maybe a represents a computation with the possiblity of failing while [a] could represent non-determinism (or, in simpler terms, multiple possible return values).
This plays into the fact that different types have different instances for typeclasses - and these instances cater to the underlying essence the type conveys. Take Alternative and its operator (<|>) which represents what it means to combine (or choose) between arguments given.
Maybe a Combining computations that can fail just means taking the first that is not Nothing
[a] Combining two computations that each had multiple return values just means concatenating together all possible values.
Then, depending on which types your functions use, (<|>) would behave differently. Of course, you could argue that you don't need (<|>) or anything like that, but then you are missing out on one of Haskell's main strengths: it's many high-level combinator libraries.
As a general rule, we like our types to be as snug fitting and intuitive as possible. That way, we are not fighting the standard libraries and our code is more readable.
Lisp, Scheme, Python, Ruby, JavaScript, etc., manage to get along with just one type each, which you could represent in Haskell with a big sum type. Every function handling a JavaScript (or whatever) value must be prepared to receive a number, a string, a function, a piece of the document object model, etc., and throw an exception if it gets something unexpected. People who program in typed languages like Haskell prefer to limit the number of unexpected things that can occur. They also like to express ideas using types, making types useful (and machine-checked) documentation. The closer the types come to representing the intended meaning, the more useful they are.
Because there are an infinite number of possible lists, and a finite number of possible values for the Maybe type. It perfectly represents one thing or the absence of something without any other possibility.
Several answers have mentioned exhaustiveness as a factor here. I think it is a factor, but not the biggest one, because there is a way to consistently treat lists as if they were Maybes, which the listToMaybe function illustrates:
listToMaybe :: [a] -> Maybe a
listToMaybe [] = Nothing
listToMaybe (a:_) = Just a
That's an exhaustive pattern match, which rules out any straightforward errors.
The factor I'd highlight as bigger is that by using the type that more precisely models the behavior of your code, you eliminate potential behaviors that would be possible if you used a more general alternative. Say for example you have some context in your code where you uses a type of the form a -> [b], though the only correct alternatives (given your program's specification) are empty or singleton lists. Try as hard as you may to enforce the convention that this context should obey that rule, it's still possible that you'll mess up and:
Somehow a function used in that context will produce a list of two or more items;
And somehow a function that uses the results produced in that context will observe whether the lists have two or more items, and behave incorrectly in that case.
Example: some code that expects there to be no more than one value will blindly print the contents of the list and thus print multiple items when only one was supposed to be.
But if you use Maybe, then there really must be either one value or none, and the compiler enforces this.
Even though isomorphic, e.g. QuickCheck will run slower because of the increase in search space.

Monad: Why does Identity matter, what's going to happen if there's no such special member in a set?

I'm trying to learn the concept of monad, I'm watching this excellent video Brian Beckend trying to explain what is monad.
When he talks about monoid, it's a collection of types, it has a rule of composition, and this composition has to obey 2 rules:
associative: x # (y # z ) = (x # y) # z
a special member in the collection: x # id = x and id # x = x
I'm using # symbol representing composition. id means the special member.
The second point is what I'm trying to understand. why does this matter ? what if there's no such special member ?
When I learn new concept, I always try to relate these abstract concept to some other concrete things, so that I can fully understand and learn them by heart.
So what I'm trying to relate monad and monoid to is lego. So all the building blocks in a lego set forms a collection. and the composition rule is composite them into new shape of building blocks. and it's obvious the composition obey the first rule: associative. But there's no special building block which can composite with other building block and get the same back. So it fails to obey the second rule.
But lego is still highly composable. What has been missing or lack when lego fails to obey the second rule ? What is the consequence ?
Or put it this way, comparing to other monoid which obey all those rules. What feature does other monoid has but lego doesn't ?
A monoid without an identity element is called a semigroup and its still a fine and useful construct. It just gives us something different. Consider, for example, a fold on a list. We can do this by mapping every element of a list to a monoid and then composing them all. But if you only have a semigroup, you can't fold on a possibly empty list.
Consider another example -- the integers greater than zero, versus the integers greater than or equal to zero. In the latter case we have a monoid, since zero is literally our zero element. So I can solve for example, the equation "5 + x = 5". In the former case, with a semigroup, I can't solve that equation. Or I can say "you have no apples, I then give you five apples, how many do you have?" In a world without zero, we have to assume everyone starts with some apples to begin with! So, for the same reasons having a zero lying around is important with numbers, it is handy to have a "generalized zero" hanging around with more abstract algebraic structures.
(Note this doesn't mean one or the other is "better" -- just that they are different, and the extra structure, when available, can come in handy. Also note that there is a universal way to turn a semigroup into a monoid by adding a zero element, so since all semigroup results lift into the 'completed' results on monoids, it tends to be more convenient, typically, to just treat things in terms of the latter.)
The empty Lego could be considered as id but then you will have to accept that empty space is Lego. But yes if you don't want id like #sclv wrote, it would be a semigroup.

Erlang: Printing a List with a name always in front of it

I just started learning Erlang so please bear with me if this question seems a little simple.
Hi guys. I've been thinking about it for a while but nothing I come up with seems to be working.
I am writing an Erlang function that is supposed to take a list as an argument then print the list with my name in front of it. For the purposes of this question, let's say my name is "James".
If I type in testmodule:NameInFront("Legible", "Hey", "Think").
Erlang should return ["James", "Legible", "Hey", "Think"]
This is the code I have so far:
-module(testmodule).
-export([NameInFront/1]).
NameInFront(List)-> ["James"]++[List].
It works just fine when I type in just one word, which I guess it the fault of the NameInFront/1 part but I want it to be able to handle any amount of words I type in. Anyone know how I can get my function to handle multiple inputs? Thank you very much.
I'm not quite sure what you mean: whether you want your function to be variadic (take a flexible number of arguments), or you are having trouble getting your lists to join together properly.
Variadic functions are not the way Erlang works. FunctionName/Arity defines the concrete identity of a function in Erlang (discussed here). So our way of having a function take multiple arguments is to make one (or more) of the arguments a list:
print_terms(Terms) -> io:format("~tp~n", [Terms]).
The io:format/2 function itself actually takes a list as its second function, which is how it deals with a variable number of arguments:
print_two_things(ThingOne, ThingTwo) ->
io:format("~tp~n~tp~n", [ThingOne, ThingTwo]).
In your case you want to accept a list of things, add your name to it, and print it out. This is one way to do it.
name_in_front(ListOfStrings) ->
NewList = ["James" | ListOfStrings],
io:format("~p~n", [NewList]).
Using the ++ operator is another (which is actually a different syntax for a recursive operation which expands to the exact same thing, ):
name_in_front(ListOfStrings) ->
NewList = ["James"] ++ ListOfStrings,
io:format("~tp~n", [NewList]).
But that's a little silly, because it is intended to join two strings together in a simple way, and in this case it makes the syntax look weird.
Yet another way would be to more simply write a function that take two arguments and accomplishes the same thing:
any_name_in_front(Name, ListOfThings) ->
io:format("~tp~n", [[Name | ListOfThings]]).
The double [[]] is because io:format/2 takes a list as its second argument, and you want to pass a list of one thing (itself a list) into a single format substitution slot (the "~tp" part).
One thing to note is that capitalization matters in Erlang. It has a meaning. Module and function names are atoms, which are not the same thing as variables. For this reason they must be lowercase, and because they must be lowercase to start with the convention is to use underscores between words instead of usingCamelCase. Because, well, erlangIsNotCpp.
Play around in the shell a bit with the simple elements of the function you want, and once you have them ironed out write it into a source file and give it a try.

What to use property testing for

I'd like to know what is the property testing aiming for, what is it's sweet point, where it should be used. Let' have an example function that I want to test:
f :: [Integer] -> [Integer]
This function, f, takes a list of numbers and will square the odd numbers and filter out the even numbers. I can state some properties about the function, like
Given a list of even numbers, return empty list.
Given a list of odd numbers, the result list will have the same size as input.
Given that I have a list of even numbers and a list of odd numbers, when I join them, shuffle and pass to the function, the length of the result will be the length of the list of odd numbers.
Given I provide a list of positive odd numbers, then each element in the result list at the same index will be greater than in the original list
Given I provide a list of odd numbers and even numbers, join and shuffle them, then I will get a list, where each number is odd
etc.
None of the properties test, that the function works for the simplest case, e.g. I can make a simple case, that will pass these properties if I implement the f incorrectly:
f = fmap (+2) . filter odd
So, If I want to cover some simple case, It looks like I either need to repeat a fundamental part of the algorithm in the property specification, or I need to use value based testing. The first option, that I have, to repeat the algorithm may be useful, If I plan to improve the algorithm if I plan to change it's implementation, for speed for example. In this way, I have a referential implementation, that I can use to test again.
If I want to check, that the algorithm doesn't fail for some trivial cases and I don't want to repeat the algorithm in the specification, it looks like I need some unit testing. I would write for example these checks:
f ([2,5]) == [25]
f (-8,-3,11,1) == [9,121,1]
Now I have a lot more confidence it the algorithm.
My question is, is the property based testing meant to replace the unit testing, or is it complementary? Is there some general idea, how to write the properties, so they are useful or it just totally depends on the understanding of the logic of the function? I mean, can one tell that writing the properties in some way is especially beneficial?
Also, should one strive to make the properties test every part of the algorithm? I could put the squaring out of the algorithm, and then test it elsewhere, let the properties test just the filtering part, which it looks like, that it covers it well.
f :: (Integer -> Integer) -> [Integer] -> [Integer]
f g = fmap g . filter odd
And then I can pass just Prelude.id and test the g elsewhere using unit testing.
How about the following properties:
For all odd numbers in the source list, its square is element of the result list.
For all numbers in the result list, there is a number in the source list whose square it is.
By the way, odd is easier to read than \x -> x % 2 == 1
Reference algorithm
It's very common to have a (possibly inefficient) reference implementation and test against that. In fact, that's one of the most common quickcheck strategies when implementing numeric algorithms. But not every part of the algorithm needs one. Sometimes there are some properties that characterize the algorithm completely.
Ingo's comment is spot on in that regard: These properties determine the results of your algorithm (up to order and duplicates). To recover order and duplicates you can modify the properties to include "in the resulting list truncated after the position of the source element" and vice versa in the other property.
Granularity of tests
Of course, given Haskell's composability it's nice to test each reasonable small part of an algorithm by itself. I trust e.g. \x -> x*x and filter odd as reference without looking twice.
Whether there should be properties for each part is not as clear as you might inline that part of the algorithm later and thus make the properties moot. Due to Haskell's laziness that's not a common thing to do, but it happens.

What's the real purpose of `ignore` function in OCaml?

There is an ignore function in OCaml.
val ignore : 'a -> unit
Discard the value of its argument and return (). For instance,
ignore(f x) discards the result of the side-effecting function f. It
is equivalent to f x; (), except that the latter may generate a
compiler warning; writing ignore(f x) instead avoids the warning.
I know what this function will do, but don't get the point of using it.
Anyone can explain or give an example for when we have to use it?
You basically answered your own question. You don't ever have to use it. The point is precisely to avoid the warning. If you write f x; (), the compiler assumes you probably did something wrong. Probably you thought f x returns unit because you rarely want to ignore non-unit values.
However, sometimes that's not true, and you really want to ignore even non-unit values. Writing ignore (f x) documents the fact that you know f x returns something, but you are deliberately ignoring it.
Note that in real code f x might be something more complex, so the chances of you being wrong about the return type of f x are reasonably high. One example is partial application. Consider f : int -> int -> unit. You might accidentally write f 1, forgetting the second argument, and the warning will help you. Another example is if you do open Async, then many functions from the Standard Library change from returning unit to returning unit Deferred.t. Especially when first starting to use Async, it is quite likely that you'll accidentally think the semicolon operator is appropriate in places that you really need to use monadic bind.
As a complement to Ashish Agarwal's answer (because judging from your comment you don't seem very convinced) :
Imagine that I have a function that has side effects, and returns a value indicating something about the computation. Then, if I'm interested in how the computation went, I will need its return value. However, if I don't care about this and simply want the side effects to take place, I would use ignore.
Dumb example : let's say you have a function which sorts an array and returns Was_already_sorted or Was_not_sorted depending on the initial state of the array. Then if for some reason I'm interested in knowing how often my array was sorted, I might need the return value of this function. If not, I will ignore it.
I agree that this is a dumb example. And probably that in many cases there would be better ways to deal with the problem than using ignore (I've just noticed that I never use ignore). If you're really passionate about this, you could try to find examples of use of this function in real-life code (maybe in the source-code of software such as Unison?).
Also, note that you can use let _ = f x to the same end.