Need to match a tuple from a list in OCaml

I need to create a recursive function assoc (d,k,l) that takes a triple (d,k,l), where l is a list of key-value pairs [(k1,v1);(k2,v2);...], and finds the first ki that equals k. If such a ki is found, then vi is returned. Otherwise, the default value d is returned. It needs to be tail recursive.
Here is the function that I made:
let rec assoc (d,k,l) = match l with
|[]-> d
|(a,b)::t ->
if (a==k) then b
else assoc(d,k,t)
;;
My logic here is to take the head of the list l and, if the first part of the tuple matches k, return the second part of the tuple.
If not, I want to call the function again on the tail of the list so that it checks each element. If the entire list is traversed down to the empty list without finding a match, I want to return d. For some reason, it always returns d no matter what list I give it. What could be the reason for that?
Here's some sample output it should give:
# assoc (-1,"jeff",[("sorin",85);("jeff",23);("moose",44)]);;
- : int = 23
# assoc (-1,"bob",[("sorin",85);("jeff",23);("moose",44);("margaret",99)]);;
- : int = -1
Mine returns -1 for both.

Don't use == for comparisons. It's a special-purpose "physical equality". Use = for comparisons.
(Other than this your code looks excellent.)
Comparison operators are defined in the Pervasives module. Here are the descriptions of = (the normal equality comparison) and == (the physical equality comparison):
e1 = e2 tests for structural equality of e1 and e2. Mutable structures (e.g. references and arrays) are equal if and only if their current contents are structurally equal, even if the two mutable objects are not the same physical object. Equality between functional values raises Invalid_argument. Equality between cyclic data structures may not terminate.
e1 == e2 tests for physical equality of e1 and e2. On mutable types such as references, arrays, byte sequences, records with mutable fields and objects with mutable instance variables, e1 == e2 is true if and only if physical modification of e1 also affects e2. On non-mutable types, the behavior of ( == ) is implementation-dependent; however, it is guaranteed that e1 == e2 implies compare e1 e2 = 0.
Generally speaking, you shouldn't use == in your code unless you want the specific (weak) equality test described here. Or, just don't use it :-)

Related

Stroustrup on comparisons, errata?

In Stroustrup's C++ 4th Ed, page 891, where the properties of comparisons are described, he explains that the function cmp can be represented by less than (<) for a strict weak ordering. I'm confused by his explanation of "Transitivity of equivalence", which reads as follows:
Transitivity of equivalence: Define equiv(x,y) to be
!(cmp(x,y)||cmp(y,x)). If equiv(x,y) and equiv(y,z), then equiv(x,z).
The last rule is the one that allows us to define equality (x==y) as !(cmp(x,y)||cmp(y,x)) if we need ==.
Should this instead be defined as follows?
cmp is <= and equiv(x,y) = (cmp(x,y) && cmp(y,x))
Appreciate your guidance.
This is not errata.
equiv(x,y) := !(cmp(x,y)||cmp(y,x))
x := x
y := x
substituting in:
!((x < x) || (x < x))
!((false) || (false))
!(false)
true
Can this instead be defined as follows?
cmp is <= and equiv(x,y) = (cmp(x,y) && cmp(y,x))
Yes, that also gives you consistent definitions.
Should this instead be defined as follows?
It isn't better than the definitions we use, so I'd suggest no, mostly because there's loads of existing code written for the current definition.
Instead of the current definition of Compare
Compare is a set of requirements expected by some of the standard library facilities from the user-provided function object types.
The return value of the function call operation applied to an object of a type satisfying Compare, when contextually converted to bool, yields true if the first argument of the call appears before the second in the strict weak ordering relation induced by this type, and false otherwise.
For all a, cmp(a,a)==false
If cmp(a,b)==true then cmp(b,a)==false
If cmp(a,b)==true and cmp(b,c)==true then cmp(a,c)==true
It would instead be
The return value of the function call operation applied to an object of a type satisfying Compare, when contextually converted to bool, yields false if the first argument of the call appears after the second in the strict weak ordering relation induced by this type, and true otherwise.
For all a, cmp(a,a)==true
If cmp(a,b)==false then cmp(b,a)==true
If cmp(a,b)==true and cmp(b,c)==true then cmp(a,c)==true
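The two ways of deriving equivalence discussed above can be compared directly. The following is only a sketch: the names equiv_strict, equiv_nonstrict and definitions_agree are made up for illustration, and int stands in for any totally ordered type.

```cpp
#include <cassert>
#include <functional>

// equiv as in the quoted rule: built from a strict ordering such as <.
template <class T, class Cmp = std::less<T>>
bool equiv_strict(const T& x, const T& y, Cmp cmp = Cmp{}) {
    return !(cmp(x, y) || cmp(y, x));
}

// The alternative from the question: built from a non-strict ordering such as <=.
template <class T, class Cmp = std::less_equal<T>>
bool equiv_nonstrict(const T& x, const T& y, Cmp cmp = Cmp{}) {
    return cmp(x, y) && cmp(y, x);
}

// Hypothetical helper: checks that both definitions agree on every pair in [0, n).
inline bool definitions_agree(int n) {
    for (int x = 0; x < n; ++x) {
        for (int y = 0; y < n; ++y) {
            if (equiv_strict(x, y) != equiv_nonstrict(x, y)) return false;
        }
    }
    return true;
}
```

For int both formulations pick out the same pairs, which is why neither is more correct than the other. The standard library simply fixes the strict one, so a comparator passed to it has to behave like <.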

Rigorous proof of the following C++ code's property?

Take the following C++14 code snippet:
unsigned int f(unsigned int a, unsigned int b){
if(a>b)return a;
return b;
}
Statement: the function f returns the maximum of its arguments.
Now, the statement is "obviously" true, yet I failed to prove it rigorously with respect to the ISO/IEC 14882:2014(E) specification.
First: I cannot state the property in a formal way.
A formalized version could be:
For every statement s, when the abstract machine (which is defined in the spec.) is in state P and s looks like "f(expr_a,expr_b)" and 'f' in s is resolved to the function in question, s(P).return=max(expr_a(P).return, expr_b(P).return).
Here for a state P and expression s, s(P) is the state of the machine after evaluation of s.
Question: What would be a correctly formalized version of the statement? How to prove the statement using the properties imposed by the above mentioned specification? For each deductive step please reference the applicable snippet from the standard allowing said step (the number of the segment is enough).
Edit: Maybe formalized in Coq
Please excuse my approximate, aging mathematical knowledge.
The maximum for a closed subset of the natural numbers (BN) can be defined as follows:
Max:(BN,BN) -> BN
(x ∊ BN)(a ∊ BN)(b ∊ BN)(x = Max(a,b)) => ( x=a & a>b | x=b )
where the symbols have their usual mathematical meaning.
Your function could be rewritten as follows, where UN is the set of unsigned int values:
f:(UN,UN) -> UN
(x ∊ UN)(a ∊ UN)(b ∊ UN)(x = f(a,b)) => ( x=a && a>b || x=b )
Where symbol = is operator==(unsigned int,unsigned int), etc...
So the question reduces to whether the standard specifies that the mathematical structure formed by the unsigned integers with the C++ arithmetic and comparison operators is isomorphic to the mathematical structure formed by a closed subset of N with the usual arithmetic operations and relations. I think the answer is yes, and this is expressed in plain English:
C++14 standard,[expr.rel]/5 (Relational Operators)
If both operands (after conversions) are of arithmetic or enumeration type, each of the operators shall yield true if the specified relationship is true and false if it is false.
C++14 standard, [basic.fundamental]/4 (Fundamental Types)
Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
Then you could also prove that ({true,false},&&,||) is isomorphic to Boolean arithmetic by analysing the text in [expr.log.and] and [expr.log.or].
I do not think that you should go further than showing that there is this isomorphism because further would mean demonstrating axioms.
It appears to me that the easiest solution is to prove this backwards. If the first argument to f is the maximum argument, prove that the first argument is returned (fairly easy - the maximum argument a is by definition bigger than b). If the second argument is the maximum argument, prove that the second argument is returned. If the two are equal, show that there is no unique maximum element, so the second argument is still a maximum argument.
Finally, prove that these three options are exhaustive. If a unique maximum argument is passed, it must be passed either as the first or the second argument, since f is binary.
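The case analysis above can be mirrored in code, as a sanity check rather than a proof. This sketch assumes C++14 or later (so that f can be constexpr); is_maximum and holds_for_all_below are hypothetical helper names, and f is the function from the question with constexpr added.

```cpp
#include <cassert>
#include <limits>

// The function from the question, unchanged apart from constexpr.
constexpr unsigned int f(unsigned int a, unsigned int b) {
    if (a > b) return a;
    return b;
}

// Hypothetical statement of the property: r is an upper bound of a and b,
// and r is one of the two arguments.
constexpr bool is_maximum(unsigned int r, unsigned int a, unsigned int b) {
    return r >= a && r >= b && (r == a || r == b);
}

// One compile-time check per case of the argument above.
static_assert(f(7u, 3u) == 7u, "first argument is the maximum");
static_assert(f(3u, 7u) == 7u, "second argument is the maximum");
static_assert(f(5u, 5u) == 5u, "equal arguments: either one is a maximum");
static_assert(is_maximum(f(0u, std::numeric_limits<unsigned int>::max()),
                         0u, std::numeric_limits<unsigned int>::max()),
              "boundary values");

// Runtime check of the property for every pair of values below n.
inline bool holds_for_all_below(unsigned int n) {
    for (unsigned int a = 0; a < n; ++a) {
        for (unsigned int b = 0; b < n; ++b) {
            if (!is_maximum(f(a, b), a, b)) return false;
        }
    }
    return true;
}
```

Checks like these only cover the sampled values. The general statement still rests on the clauses of the standard that say > yields true exactly when the relationship holds.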
I am unsure about what you want. Looking at a previous version, N3337, we can easily see that almost everything is specified:
a and b start with the calling values (Function 5.2.2 - 4)
Calling a function executes the compound statement of the function body (Obvious, but where?)
The statements are normally executed in order (Statements 6)
If-statements execute the first sub-statement if condition is true (The If Statement 6.4.1)
Relations actually work as expected (Relation operators 5.9 - 5)
The return-statement returns the value to the caller of the function (The return statement 6.6.3)
However, you attempt to start with f(expr_a, expr_b); and evaluating the arguments to f potentially requires a lot more; especially since they are not sequenced - and could be any function-call.

Does compare work for all types?

Let's consider a type t and two variables x,y of type t.
Will the call compare x y be valid for any type t? I couldn't find any counterexample.
The polymorphic compare function works by recursively exploring the structure of values, providing an ad-hoc total ordering on OCaml values, used to define structural equality tested by the polymorphic = operator.
It is, by design, not defined on functions and closures, as observed by @antron. The recursive nature of the definition implies that structural equality is not defined on values containing a function or a closure. This recursive nature also implies that the compare function is not defined on recursive values, as @antron mentions as well.
Structural equality, and therefore the compare function and the comparison operators, is not aware of structure invariants and cannot be used to compare (mildly) advanced data structures such as Sets, Maps, Hashtbls and so on. If comparison of these structures is desired, a specialised function has to be written, which is why Set and Map define such a function.
When defining your own structures, a good rule of thumb is to distinguish between
concrete types, which are defined only in terms of primitive types and other concrete types. Concrete types should not be used for structures whose processing expects some invariants, because it is easy to create arbitrary values of this type breaking these invariants. For these types, the polymorphic comparison function and operators are appropriate.
abstract types, whose concrete definition is hidden. For these types, it is best to provide specialised comparison function. The mixture library defines a compare mixin that can be used to derive comparison operators from the implementation of a specialised compare function. Its use is illustrated in the README.
It doesn't work for function types:
# compare (fun x -> x) (fun x -> x);;
Exception: Invalid_argument "equal: functional value".
Likewise, it won't (generally) work for other types whose values can contain functions:
# type t = A | B of (int -> int);;
type t = A | B of (int -> int)
# compare A A;;
- : int = 0
# compare (B (fun x -> x)) A;;
- : int = 1
# compare (B (fun x -> x)) (B (fun x -> x));;
Exception: Invalid_argument "equal: functional value".
It also doesn't (generally) work for recursive values:
# type t = {self : t};;
type t = { self : t; }
# let rec v = {self = v};;
val v : t = {self = <cycle>}
# let rec v' = {self = v'};;
val v' : t = {self = <cycle>}
# compare v v;;
- : int = 0
# compare v v';;
(* Does not terminate. *)
These cases are also listed in the documentation for compare in Pervasives.

Frequency Trees

I am currently revising for my programming exam (I am new to programming) and I came across an exercise asking me to implement a function which "take a frequency tree and list of bits to a value in the Frequency tree and return the remaining bits in a list".
The part that I have trouble understanding is the type that I was given:
FreqTree a -> [bit] -> (a,[bit])
What does the (a, [bit]) actually mean? Is a just a value?
Thanks heaps
The type (a, b) is a tuple or pair containing both an a and a b. In Haskell lowercase types are actually type variables, or unknowns. If they are written in a type then it means that the type is invariant to the actual type that variable represents.
If we read the description of this function carefully we can see it is reflective of the type:
take a frequency tree and list of bits to a value in the Frequency tree and return the remaining bits in a list
A function type like
a -> b -> c
can be read as a function from an a and a b to a c. In fact, to further cement the notion of "and" we had before, we can write an equivalent function of type
(a, b) -> c
repeating the idea that tuple types should be read as "and". This conversion is called uncurry (its inverse, from the tuple form back to the two-argument form, is curry)
uncurry :: (a -> b -> c) -> ((a, b) -> c)
uncurry f (a, b) = f a b
Using this notion, the description of the function translates to
(FreqTree, ListOfBits) -> (ValueInFreqTree, ListOfRemainingBits)
take
a frequency tree *and* list of bits
to
a value in the Frequency tree *and* the remaining bits in a list
From here we just do a little pattern matching against the type given
(FreqTree , ListOfBits) -> (ValueInFreqTree, ListOfRemainingBits)
FreqTree -> ListOfBits -> (ValueInFreqTree, ListOfRemainingBits)
FreqTree a -> [bit] -> (a , [bit] )
The first step above is currying (curry :: ((a, b) -> c) -> a -> b -> c), which turns a function on a pair into a function of two arguments, and the second step compares our expected type to the type given. Here we can see some good parity: list of bits maps to [bit] and FreqTree maps to FreqTree a.
So the final bit is to figure out how that type variable a works. This requires understanding the meaning of the parameterized type FreqTree a. Since frequency trees might contain any type of thing, caring not about its particular form but only about the ability to compute its frequency, it's nice to parameterize the type by its value. You might write
FreqTree value
where, again, the lowercase name represents a type variable. In fact, we can do this substitution in the previous type as well, value for a
FreqTree value -> [bit] -> (value, [bit])
and now perhaps this type has taken its most clear form. Given a FreqTree containing unknown types marked as value and a list of bit, we return a particular value and another list of bit.

less or less_equal using set

We can pass a function as <(less) operator to STL data structures such as set, multiset, map, priority_queue, ...
Is there a problem if our function acts like <=(less_equal)?
Yes, there is a problem.
Formally, the comparison function must define a strict weak ordering, and <= does not do that.
More specifically, < is also used to determine equivalence (x and y are equivalent iff !(x < y) && !(y < x)). This does not work for <=: with that operator, your set would believe that objects are never equivalent.
From Effective STL, Item 21: Always have comparison functions return false for equal values.
Create a set where less_equal is the comparison type, then insert 10 into the set:
set<int, less_equal<int> > s; // s is sorted by "<="
s.insert(10); //insert the value 10
Now try inserting 10 again:
s.insert(10);
For this call to insert, the set has to figure out whether 10 is already present. We know that it is, but the set is dumb as toast, so it has to check. To make it easier to understand what happens when the set does this, we'll call the 10 that was initially inserted 10A and the 10 that we're trying to insert 10B. The set runs through its internal data structures looking for the place to insert 10B. It ultimately has to check 10B to see if it's the same as 10A. The definition of "the same" for associative containers is equivalence, so the set tests to see whether 10B is equivalent to 10A. When performing this test, it naturally uses the set's comparison function. In this example, that's operator<=, because we specified less_equal as the set's comparison function, and less_equal means operator<=. The set thus checks to see whether this expression is true:
!(10A <= 10B) && !(10B <= 10A) // test 10A and 10B for equivalence
Well, 10A and 10B are both 10, so it's clearly true that 10A <= 10B. Equally clearly, 10B <= 10A. The above expression thus simplifies to
!(true) && !(true)
and that simplifies to
false && false
which is simply false. That is, the set concludes that 10A and 10B are not equivalent, hence not the same, and it thus goes about inserting 10B into the container alongside 10A. Technically, this action yields undefined behavior, but the nearly universal outcome is that the set ends up with two copies of the value 10, and that means it's not a set any longer. By using less_equal as our comparison type, we've corrupted the container! Furthermore, any comparison function where equal values return true will do the same thing. Equal values are, by definition, not equivalent!
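The effect described in the excerpt can be reproduced without relying on undefined behavior by modelling the equivalence test by hand. This is only a sketch: insert_unique is a hypothetical toy operation on a std::vector, not how std::set is implemented, and equivalent and size_after_inserting_10_twice are likewise made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Equivalence as associative containers define it:
// neither value orders before the other.
template <class Cmp>
bool equivalent(int x, int y, Cmp cmp) {
    return !cmp(x, y) && !cmp(y, x);
}

// Hypothetical toy insert: append x unless an equivalent element is present.
template <class Cmp>
bool insert_unique(std::vector<int>& v, int x, Cmp cmp) {
    for (int y : v) {
        if (equivalent(x, y, cmp)) return false;  // already present
    }
    v.push_back(x);
    return true;
}

// Inserts the value 10 twice and reports how many elements were stored.
template <class Cmp>
std::size_t size_after_inserting_10_twice(Cmp cmp) {
    std::vector<int> v;
    insert_unique(v, 10, cmp);
    insert_unique(v, 10, cmp);
    return v.size();
}
```

With std::less<int> the second insert is rejected and one element is stored. With std::less_equal<int> the equivalence test is false even for 10 and 10, so both copies end up stored.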
There is indeed a problem.
The comparison function should satisfy strict weak ordering which <= does not.