Im learning Erlang
suppose I have two lists
[{a,a,a,b,c},{d,d,a,a,b},{a,b,c,d,e}]
[{{a,a,a,a,a},10},{{a,a,a,a},6},{{a,a,a},4}]
after Patten Match, expected result {a,a,a,b,c} because it can match {{a,a,a},4}
I tried lists:keysearch and lists:member, but cannot get expected result
any suggestion?
thanks
Matching is not consumptive. Your mental model of how matching works is confusing set operations (like set subtraction and intersection) with comparison and assignment. Your concept of set operations might also benefit from some review.
Erlang's matching is only for assignment and assertion (a sort of comparison). If we match an unbound variable (never used before) against any value, the variable will be bound (assigned) that value:
Foo = {a,b,c}.
Now Foo and {a,b,c} can be used interchangeably. This is pure symbolic assignment like in math class, not a "variable" in the sense of other languages where variables are "storage boxes for values".
If we use the = operator against any value and this now bound symbol Foo, we are doing a check comparison (an assertion) not an assignment. Foo can't mean anything other than {a,b,c} in the current context, so trying to assign it any different value causes an exception, but simply stating that {a,b,c} is {a.b.c} is correct and still yields {a,b,c} (and since Foo is now a symbol for {a,b,c} it can appear on either side and still the statement is correct).
Doing
{a,b,c} = {a,b,c}.
or
{a,b,c} = Foo.
or
Foo = {a,b,c}.
returns {a,b,c}, and does not raise an exception because all we did here was assert that {a,b,c} is indeed {a,b,c}.
If I want just the first value assigned, I can match another way:
{Bar,_,_} = {a,b,c}.
Now Bar represents a, and the _ values are ignored (completely skipped). The original {a,b,c} has not been changed. This is also true if we do:
{_,Baz,_} = Foo.
Now Baz represents b, and Foo still represents {a,b,c}. And that's about it. When it comes to lists, like [{a,b,c}, {1,2,3}] we can still do matching, but because of the nature of lists we will check a piece at a time (try this in the interpreter):
Spam = [{a,b,c}, {1,2,3}].
[Boo | _] = [{a,b,c}, {1,2,3}].
Now Boo represents {a,b,c}, and Spam still represents its original list.
That's about all there is to matching. The magical thing about Erlang's pattern matching is not how it works, its how many places provide natural opportunities for pattern matching, and how this winds up naturally solving a huge number of problems that require procedural checks or direct assignment operations in other languages (cond, function parameters, =, message reception, etc.).
Set and list operations are not the same thing as pattern matching in Erlang. I suggest going through some basic learning material first, like some of the many good beginner tutorials and Learn You Some Erlang.
Related
I have read some of this post Meaning of Alternative (it's long)
What lead me to that post was learning about Alternative in general. The post gives a good answer to why it is implemented the way it is for List.
My question is:
Why is Alternative implemented for List at all?
Is there perhaps an algorithm that uses Alternative and a List might be passed to it so define it to hold generality?
I thought because Alternative by default defines some and many, that may be part of it but What are some and many useful for contains the comment:
To clarify, the definitions of some and many for the most basic types such as [] and Maybe just loop. So although the definition of some and many for them is valid, it has no meaning.
In the "What are some and many useful for" link above, Will gives an answer to the OP that may contain the answer to my question, but at this point in my Haskelling, the forest is a bit thick to see the trees.
Thanks
There's something of a convention in the Haskell library ecology that if a thing can be an instance of a class, then it should be an instance of the class. I suspect the honest answer to "why is [] an Alternative?" is "because it can be".
...okay, but why does that convention exist? The short answer there is that instances are sort of the one part of Haskell that succumbs only to whole-program analysis. They are global, and if there are two parts of the program that both try to make a particular class/type pairing, that conflict prevents the program from working right. To deal with that, there's a rule of thumb that any instance you write should live in the same module either as the class it's associated with or as the type it's associated with.
Since instances are expected to live in specific modules, it's polite to define those instances whenever you can -- since it's not really reasonable for another library to try to fix up the fact that you haven't provided the instance.
Alternative is useful when viewing [] as the nondeterminism-monad. In that case, <|> represents a choice between two programs and empty represents "no valid choice". This is the same interpretation as for e.g. parsers.
some and many does indeed not make sense for lists, since they try iterating through all possible lists of elements from the given options greedily, starting from the infinite list of just the first option. The list monad isn't lazy enough to do even that, since it might always need to abort if it was given an empty list. There is however one case when both terminates: When given an empty list.
Prelude Control.Applicative> many []
[[]]
Prelude Control.Applicative> some []
[]
If some and many were defined as lazy (in the regex sense), meaning they prefer short lists, you would get out results, but not very useful, since it starts by generating all the infinite number of lists with just the first option:
Prelude Control.Applicative> some' v = liftA2 (:) v (many' v); many' v = pure [] <|> some' v
Prelude Control.Applicative> take 100 . show $ (some' [1,2])
"[[1],[1,1],[1,1,1],[1,1,1,1],[1,1,1,1,1],[1,1,1,1,1,1],[1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,"
Edit: I believe the some and many functions corresponds to a star-semiring while <|> and empty corresponds to plus and zero in a semiring. So mathematically (I think), it would make sense to split those operations out into a separate typeclass, but it would also be kind of silly, since they can be implemented in terms of the other operators in Alternative.
Consider a function like this:
fallback :: Alternative f => a -> (a -> f b) -> (a -> f e) -> f (Either e b)
fallback x f g = (Right <$> f x) <|> (Left <$> g x)
Not spectacularly meaningful, but you can imagine it being used in, say, a parser: try one thing, falling back to another if that doesn't work.
Does this function have a meaning when f ~ []? Sure, why not. If you think of a list's "effects" as being a search through some space, this function seems to represent some kind of biased choice, where you prefer the first option to the second, and while you're willing to try either, you also tag which way you went.
Could a function like this be part of some algorithm which is polymorphic in the Alternative it computes in? Again I don't see why not. It doesn't seem unreasonable for [] to have an Alternative instance, since there is an implementation that satisfies the Alternative laws.
As to the answer linked to by Will Ness that you pointed out: it covers that some and many don't "just loop" for lists. They loop for non-empty lists. For empty lists, they immediately return a value. How useful is this? Probably not very, I must admit. But that functionality comes along with (<|>) and empty, which can be useful.
I have a function which will fail if there has being any change on the term/list it is using since the generation of this term/list. I would like to avoid to check that each parameter still the same. So I had thought about each time I generate the term/list to perform a CRC or something similar. Before making use of it I would generate again the CRC so I can be 99,9999% sure the term/list still the same.
Going to a specfic answer, I am programming in Erlang, I am thinking on using a function of the following type:
-spec(list_crc32(List :: [term()]) -> CRC32 :: integer()).
I use term, because it is a list of terms, (erlang has already a default fast CRC libraries but for binary values). I have consider to use "erlang:crc32(term_to_binary(Term))", but not sure if there could be a better approach.
What do you think?
Regards, Borja.
Without more context it is a little bit difficult to understand why you would have this problem, particularly since Erlang terms are immutable -- once assigned no other operation can change the value of a variable, not even in the same function.
So if your question is "How do I quickly assert that true = A == A?" then consider this code:
A = generate_list()
% other things in this function happen
A = A.
The above snippet will always assert that A is still A, because it is not possible to change A like you might do in, say, Python.
If your question is "How do I assert that the value of a new list generated exactly the same value as a different known list?" then using either matching or an actual assertion is the fastest way:
start() ->
A = generate_list(),
assert_loop(A).
assert_loop(A) ->
ok = do_stuff(),
A = generate_list(),
assert_loop(A).
The assert_loop/1 function above is forcing an assertion that the output of generate_list/0 is still exactly A. There is no telling what other things in the system might be happening which may have affected the result of that function, but the line A = generate_list() will crash if the list returned is not exactly the same value as A.
In fact, there is no way to change the A in this example, no matter how many times we execute assert_loop/1 above.
Now consider a different style:
compare_loop(A) ->
ok = do_stuff(),
case A =:= generate_list() of
true -> compare_loop(A);
false -> terminate_gracefully()
end.
Here we have given ourselves the option to do something other than crash, but the effect is ultimately the same, as the =:= is not merely a test of equality, it is a match test meaning that the two do not evaluate to the same values, but that they actually match.
Consider:
1> 1 == 1.0.
true
2> 1 =:= 1.0.
false
The fastest way to compare two terms will depend partly on the sizes of the lists involved but especially on whether or not you expect the assertion to pass or fail more often.
If the check is expected to fail more often then the fastest check is to use an assertion with =, an equivalence test with == or a match test with =:= instead of using erlang:phash2/1. Why? Because these tests can return false as soon as a non-matching element is encountered -- and if this non-match occurs near the beginning of the list then a full traverse of both lists is avoided entirely.
If the check is expected to pass more often then something like erlang:phash2/1 will be faster, but only if the lists are long, because only one list will be fully traversed each iteration (the hash of the original list is already stored). It is possible, though, on a short list that a simple comparison will still be faster than computing a hash, storing it, computing another hash, and then comparing the hashes (obviously). So, as always, benchmark.
A phash2 version could look like:
start() ->
A = generate_list(),
Hash = erlang:phash2(A),
assert_loop(Hash).
assert_loop(Hash) ->
ok = do_stuff(),
Hash = erlang:phash2(generate_list()),
loop(Hash).
Again, this is an assertive loop that will crash instead of exit cleanly, so it would need to be adapted to your needs.
The basic mystery still remains, though: in a language with immutable variables why is it that you don't know whether something will have changed? This is almost certainly a symptom of an underlying architectural problem elsewhere in the program -- either that or simply a misunderstanding of immutability in Erlang.
Seeing as the Maybe type is isomorphic to the set of null and singleton lists, why would anyone ever want to use the Maybe type when I could just use lists to accomodate absence?
Because if you match a list against the patterns [] and [x] that's not an exhaustive match and you'll get a warning about that, forcing you to either add another case that'll never get called or to ignore the warning.
Matching a Maybe against Nothing and Just x however is exhaustive. So you'll only get a warning if you fail to match one of those cases.
If you choose your types such that they can only represent values that you may actually produce, you can rely on non-exhaustiveness warnings to tell you about bugs in your code where you forget to check for a given a case. If you choose more "permissive" types, you'll always have to think about whether a warning represents an actual bug or just an impossible case.
You should strive to have accurate types. Maybe expresses that there is exactly one value or that there is none. Many imperative languages represent the "none" case by the value null.
If you chose a list instead of Maybe, all your functions would be faced with the possibility that they get a list with more than one member. Probably many of them would only be defined for one value, and would have to fail on a pattern match. By using Maybe, you avoid a class of runtime errors entirely.
Building on existing (and correct) answers, I'll mention a typeclass based answer.
Different types convey different intentions - returning a Maybe a represents a computation with the possiblity of failing while [a] could represent non-determinism (or, in simpler terms, multiple possible return values).
This plays into the fact that different types have different instances for typeclasses - and these instances cater to the underlying essence the type conveys. Take Alternative and its operator (<|>) which represents what it means to combine (or choose) between arguments given.
Maybe a Combining computations that can fail just means taking the first that is not Nothing
[a] Combining two computations that each had multiple return values just means concatenating together all possible values.
Then, depending on which types your functions use, (<|>) would behave differently. Of course, you could argue that you don't need (<|>) or anything like that, but then you are missing out on one of Haskell's main strengths: it's many high-level combinator libraries.
As a general rule, we like our types to be as snug fitting and intuitive as possible. That way, we are not fighting the standard libraries and our code is more readable.
Lisp, Scheme, Python, Ruby, JavaScript, etc., manage to get along with just one type each, which you could represent in Haskell with a big sum type. Every function handling a JavaScript (or whatever) value must be prepared to receive a number, a string, a function, a piece of the document object model, etc., and throw an exception if it gets something unexpected. People who program in typed languages like Haskell prefer to limit the number of unexpected things that can occur. They also like to express ideas using types, making types useful (and machine-checked) documentation. The closer the types come to representing the intended meaning, the more useful they are.
Because there are an infinite number of possible lists, and a finite number of possible values for the Maybe type. It perfectly represents one thing or the absence of something without any other possibility.
Several answers have mentioned exhaustiveness as a factor here. I think it is a factor, but not the biggest one, because there is a way to consistently treat lists as if they were Maybes, which the listToMaybe function illustrates:
listToMaybe :: [a] -> Maybe a
listToMaybe [] = Nothing
listToMaybe (a:_) = Just a
That's an exhaustive pattern match, which rules out any straightforward errors.
The factor I'd highlight as bigger is that by using the type that more precisely models the behavior of your code, you eliminate potential behaviors that would be possible if you used a more general alternative. Say for example you have some context in your code where you uses a type of the form a -> [b], though the only correct alternatives (given your program's specification) are empty or singleton lists. Try as hard as you may to enforce the convention that this context should obey that rule, it's still possible that you'll mess up and:
Somehow a function used in that context will produce a list of two or more items;
And somehow a function that uses the results produced in that context will observe whether the lists have two or more items, and behave incorrectly in that case.
Example: some code that expects there to be no more than one value will blindly print the contents of the list and thus print multiple items when only one was supposed to be.
But if you use Maybe, then there really must be either one value or none, and the compiler enforces this.
Even though isomorphic, e.g. QuickCheck will run slower because of the increase in search space.
I'm trying to learn the concept of monad, I'm watching this excellent video Brian Beckend trying to explain what is monad.
When he talks about monoid, it's a collection of types, it has a rule of composition, and this composition has to obey 2 rules:
associative: x # (y # z ) = (x # y) # z
a special member in the collection: x # id = x and id # x = x
I'm using # symbol representing composition. id means the special member.
The second point is what I'm trying to understand. why does this matter ? what if there's no such special member ?
When I learn new concept, I always try to relate these abstract concept to some other concrete things, so that I can fully understand and learn them by heart.
So what I'm trying to relate monad and monoid to is lego. So all the building blocks in a lego set forms a collection. and the composition rule is composite them into new shape of building blocks. and it's obvious the composition obey the first rule: associative. But there's no special building block which can composite with other building block and get the same back. So it fails to obey the second rule.
But lego is still highly composable. What has been missing or lack when lego fails to obey the second rule ? What is the consequence ?
Or put it this way, comparing to other monoid which obey all those rules. What feature does other monoid has but lego doesn't ?
A monoid without an identity element is called a semigroup and its still a fine and useful construct. It just gives us something different. Consider, for example, a fold on a list. We can do this by mapping every element of a list to a monoid and then composing them all. But if you only have a semigroup, you can't fold on a possibly empty list.
Consider another example -- the integers greater than zero, versus the integers greater than or equal to zero. In the latter case we have a monoid, since zero is literally our zero element. So I can solve for example, the equation "5 + x = 5". In the former case, with a semigroup, I can't solve that equation. Or I can say "you have no apples, I then give you five apples, how many do you have?" In a world without zero, we have to assume everyone starts with some apples to begin with! So, for the same reasons having a zero lying around is important with numbers, it is handy to have a "generalized zero" hanging around with more abstract algebraic structures.
(Note this doesn't mean one or the other is "better" -- just that they are different, and the extra structure, when available, can come in handy. Also note that there is a universal way to turn a semigroup into a monoid by adding a zero element, so since all semigroup results lift into the 'completed' results on monoids, it tends to be more convenient, typically, to just treat things in terms of the latter.)
The empty Lego could be considered as id but then you will have to accept that empty space is Lego. But yes if you don't want id like #sclv wrote, it would be a semigroup.
For an assignment, I have to create a type inference relation. here's the approach I used
tuples([]).
tuples(_|_).
type(tuples([]),tuples([])).
type(tuples(X|T),tuples(Y|Z)) :- type(tuples(T),tuples(Z)),type(X,Y).
I have already defined the type relation for all possible terms required for my assignment where y is the type of X in type(X,Y). For defining types of n-tuples, I used the approach similiar to the one used for appending lists.
But prolog always returns false when I ask
?-type(tuples([3,str]),Z)
or even
?-type(tuples([3,str]),tuples(Z))
or
?-type(tuples([3,str,4,abc,5,6]),Z)
i.e a list of length n, the answer returned is false.
Nothing changed even when I revered the sub-rules in the last rule.
tuples([]).
tuples(_|_).
type(tuples([]),tuples([])).
type(tuples(X|T),tuples(Y|Z)) :- type(X,Y),type(tuples(T),tuples(Z)).
I am not asking for alternative approaches to type of tuples to help me in my assignment but I can't figure out why this approach is not working.
It looks like your definition of a tuple is a List with length 2.
This rule does not check for that:
tuples(_|_).
What you probably want is this:
tuples([_,_]).
If you want it to check for any length list, use:
tuples([_|_]).
In the latter rule, the first wildcard represents the first item in the list (the head) and the second wildcard represents the rest of the list (the tail).