What is Clojure's equivalent of Scala's maxBy function? - clojure

Scala's TraversableOnce has maxBy:
maxBy[B](f: (A) ⇒ B)(implicit cmp: Ordering[B]): A
Finds the first element which yields the largest value measured by function f.
Does Clojure have something similar?

The closest thing seems to be max-key:
(max-key k x) (max-key k x y) (max-key k x y & more)
Returns the x for which (k x), a number, is greatest.
The name makes it sound like it only works with maps, but k can be any function.
The only thing missing is that k must return a number, whereas Scala's can handle anything with an Ordering instance.
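For comparison, here is a minimal sketch of the same idea in Python (used only for illustration; the sample list and variable names are made up for this example). Python's built-in max accepts a key function whose results only need to be mutually comparable, which is closer to Scala's Ordering-based maxBy than to a numbers-only key:

```python
# Hypothetical sample data, used only for illustration.
words = ["pear", "fig", "banana", "cherry"]

# Key returns a number: pick the longest word. On ties, max returns the
# first maximal element it encounters ("banana" comes before "cherry").
by_length = max(words, key=len)

# Key returns a non-numeric but comparable value (a tuple):
# compare by last letter first, then by length.
by_last_letter = max(words, key=lambda w: (w[-1], len(w)))

print(by_length)
print(by_last_letter)
```

Here the tuple key plays the role that an Ordering[B] instance plays in Scala.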

How can I maintain a counter when using map on a list? [duplicate]

Horner's rule is used to simplify the process of evaluating a polynomial at specific variable values. https://rosettacode.org/wiki/Horner%27s_rule_for_polynomial_evaluation#Standard_ML
I've easily applied the method using SML, to a one variable polynomial, represented as an int list:
fun horner coeffList x = foldr (fn (a, b) => a + b * x) (0.0) coeffList
This works fine. We can then call it using:
- val test = horner [1.0, 2.0, 3.0] 2.0;
> val test = 17.0 : real
Where [1.0, 2.0, 3.0] is the list representing the polynomial coefficients, 2.0 is the value of the variable x, and 17.0 is the result of evaluating the polynomial.
My problem is as such: We have a two variable polynomial represented by an (int list list). The nth item in a high-level list will represent all the polynomial terms containing y^n, and the mth item in a low-level list will represent all the polynomial terms containing x^m.
For example: [[2],[3,0,0,3],[1,2]] is the polynomial
( 2(x^0)(y^0) ) +
( 3(x^0)(y^1) + 0(x^1)(y^1) + 0(x^2)(y^1) + 3(x^3)(y^1) ) +
( 1(x^0)(y^2) + 2(x^1)(y^2) )
The function needs to return the value of the polynomial at the specified x and y.
I've tried various methods using the mlton compiler.
First I tried a nested foldr function:
fun evalXY (z::zs) x y =
foldr
(fn (s, li:list) =>
s + ((foldr (fn(a, b) => a + b*x) 0 li)*y)
)
0
z:zs
You can see that I'm trying to use "s" as an accumulator, like "a" was used in the single variable example. Since each element being processed by foldr needs to be "foldr'ed" itself, I call foldr again in the function describing the outer foldr. I know that this inner foldr works fine; I proved it above. My problem seems to be that I can't access the element of the list that the outer foldr is on to pass that list into the inner foldr. See where I use li in the inner foldr; that's my issue.
Then I tried applying my single variable function to map. I came across the same issue:
fun evalXY (z::zs) x y =
map
(foldr (fn(a, b) => a + b*x) 0 ???)
z:zs
With this attempt, I know that I'm getting back a list of ints. I put in an int list list, in which the inner lists were processed and returned to the outer list as ints by foldr. After this I would foldr again to apply the y value to the polynomial.
The function here should look like: fn evalXY : (int list list) * int * int -> ... -> int list
I am new to SML, so maybe I'm missing something fundamental here. I know this is a functional programming language, so I'm trying to accumulate values instead of altering different variables.
You're very close. Let's begin by formalizing the problem. Given coefficients C as a nested list like you indicated, you want to evaluate
$$\sum_i \sum_j C_{i,j} \, x^j y^i$$
Notice that you can pull out the $y^i$ factors from the inner sum, to get
$$\sum_i y^i \sum_j C_{i,j} \, x^j$$
Look closely at the inner sum. This is just a polynomial on variable x with coefficients given by $C_i$, the i-th inner list. In SML, we can write the inner sum in terms of your horner function as
fun sumj Ci = horner Ci x
Let's go a step further and define
$$D_i = \sum_j C_{i,j} \, x^j$$
In SML, this is val D = map sumj C. We can now write the original polynomial in terms of D:
$$\sum_i D_i \, y^i$$
It should be clear that this is just another instance of horner, since we have a polynomial with coefficients $D_i$. In SML, the value of this polynomial is
horner D y
...and we're done!
Here's the final code:
fun horner2 C x y =
  let
    fun sumj Ci = horner Ci x
    val D = map sumj C
  in
    horner D y
  end
Isn't that nice? All we need is multiple applications of Horner's method, and map.
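As a cross-check outside SML, the same two-step evaluation can be sketched in Python. This is only an illustrative translation; the names horner and horner2 mirror the SML above, and rows is a made-up parameter name for the nested coefficient list:

```python
from functools import reduce

def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) with a right fold, like the SML foldr."""
    # Folding from the right: start at the highest-degree coefficient.
    return reduce(lambda acc, a: a + acc * x, reversed(coeffs), 0.0)

def horner2(rows, x, y):
    """rows[i] holds the coefficients of the terms containing y**i."""
    d = [horner(row, x) for row in rows]  # corresponds to: val D = map sumj C
    return horner(d, y)                   # corresponds to: horner D y

print(horner([1.0, 2.0, 3.0], 2.0))
print(horner2([[2], [3, 0, 0, 3], [1, 2]], 2, 3))
```

For the example polynomial [[2],[3,0,0,3],[1,2]] at x = 2, y = 3, the inner evaluations give D = [2, 27, 5], and evaluating D at y = 3 gives 2 + 27*3 + 5*9 = 128.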
Your second approach seems to be on the right track. If you have already defined horner, what you need to do is to apply horner to the result of mapping horner applied to inner list x over the outer list, something like:
fun evalXY coeffLists x y = horner (map (fn coeffList => horner coeffList x) coeffLists) y
You could replace the two calls to horner by the corresponding folds, but it would be much less readable.
Note that if you reverse the order of the two parameters in horner then you can shorten evalXY:
fun horner x coeffList = foldr (fn (a, b) => a + b * x) (0.0) coeffList
fun evalXY x y coeffLists = horner y (map (horner x) coeffLists)
The point being that the way currying works, if you use this second order then horner x is already a function of coeffList so you no longer need the anonymous function fn coeffList => horner coeffList x. The moral of the story is that when defining a curried function, you should think carefully about the order of the parameters since it will make some partial applications easier to create than others.
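The same point about parameter order can be illustrated in Python, which has no automatic currying but offers functools.partial. This is a sketch under that assumption; eval_xy and rows are hypothetical names chosen to mirror the SML:

```python
from functools import partial, reduce

def horner(x, coeffs):
    """Horner evaluation with x as the FIRST parameter."""
    return reduce(lambda acc, a: a + acc * x, reversed(coeffs), 0.0)

def eval_xy(x, y, rows):
    # partial(horner, x) fixes the first argument, so it is already a
    # function of the coefficient list, like `horner x` in SML.
    inner = list(map(partial(horner, x), rows))
    return horner(y, inner)

print(eval_xy(2, 3, [[2], [3, 0, 0, 3], [1, 2]]))
```

Had x been the second parameter, partial would need a keyword argument or a lambda, which is the Python counterpart of the anonymous function fn coeffList => horner coeffList x.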
By the way, SML is fussy. In your discussion of horner you said that you would call it like horner list 2. It would need to be horner list 2.0. Similarly, in your second attempt, using 0 rather than 0.0 is problematic.

If statements in Racket

I am trying to construct a function "number-crop" which takes three arguments x a b. If x is to the left of the closed interval [a, b] on the number line, then return a. If x is to the right of the interval, then return b. Otherwise, just return x. This is what I have:
(define (number-crop x a b)
(if (max x a b) x b)
(if (min x a b) x a))
I am returned with the error, "define: expected only one expression for the function body, but found 1 extra part". I am new to Racket so I am still trying to understand how if statements work within the language.
Scheme/Racket if expressions always have exactly one condition and exactly two branches. Since they are expressions, not statements, this makes them very useful, and they function much like the conditional “ternary” operator in languages in the C family. However, when you have multiple conditions, you likely want something closer to if...else if chains, which is provided via the cond form.
The cond form is just like if, except with the ability to have any number of “clauses” which are each determined by a single condition. Using cond, your number-crop function would look like this:
(define (number-crop x a b)
  (cond
    [(< x a) a]
    [(> x b) b]
    [else x]))
(Note that else is special inside of cond—it replaces the condition for the last clause, and it always runs if the other cases fail.)
This would likely work, but if you already have access to min and max, you don’t actually need to branch at all! If you use those two functions together, you can write number-crop with neither if nor cond:
(define (number-crop x a b)
(min (max a x) b))
This works because composing min and max clamps the value to the given range: (max a x) raises x to at least a, and min then caps the result at b. It assumes that a is the lower bound and b is the upper bound, i.e. a <= b.
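The min/max composition is not specific to Racket. As a quick sanity check, here is the same clamp sketched in Python (the function name number_crop simply mirrors the Racket one), exercised on the three cases from the question:

```python
def number_crop(x, a, b):
    """Clamp x to the closed interval [a, b]; assumes a <= b."""
    return min(max(a, x), b)

print(number_crop(-1, 0, 10))  # x left of the interval
print(number_crop(11, 0, 10))  # x right of the interval
print(number_crop(5, 0, 10))   # x inside the interval
```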
In Scheme (Racket), functions are defined to return one thing. In your case it is clear: the result of the operation you describe. However, Scheme is different from most imperative languages in several respects. For example, if you look at your expression inside the define, it contains two expressions, one after the other. This contradicts the "one expression that calculates the function" assumption in Scheme.
Moreover, even if you write it in an imperative language, you'd use nested ifs, that you can of course use here. Something along the lines of:
(define (number-crop x a b)
(if (= x (max x a b))
b
(if (= x (min x a b))
a
x)))

Remove real element from list - SML

I have written the following code:
fun remove_element(nil, elem) = raise Empty
| remove_element(hd::tl, elem) = if(hd=elem) then tl else hd::remove_element(tl, elem);
but that function (which removed element elem from list) works for int. I need to make it work for real numbers, but I can't do it. I have tried a lot of ways of rewriting the function and also I used :real but these bring me errors.
Any suggestions?
Thank you
The accepted answer should have allowed you to finish your assignment, so I will show two other approaches for variations of your problem without worrying about doing your homework for you. As Kevin Johnson said, it isn't possible to directly compare two reals. It is possible to do so indirectly since a=b if and only if a<=b and b<=a. Often this is a bug, especially if the list in question is of numbers produced by numerical computations. But -- there are some situations where it makes sense to compare reals for equality so you should certainly be able to do so as long as you are clear that this is what you want. This leads to the following modification of your code:
fun remove_real([],x:real) = []
  | remove_real(y::ys,x) =
      if (y <= x andalso y >= x) then
        remove_real(ys,x)
      else
        y::remove_real(ys,x);
A few points:
1) I changed it to remove all occurrences of the element from the list rather than just the first occurrence. This involved changing the basis case to returning the empty list since [] with y removed is just [] rather than an error situation. Also, rather than simply returning the tail if the element is found I return the recursive call applied to the tail to remove any additional occurrences later on. You could easily modify the code to make it closer to your original code.
2) I needed to put the explicit type annotation x:real so that SML could infer that the list was of type real list rather than type int list.
3) I replaced nil by [] for aesthetic reasons
4) I replaced your pattern hd::tl by y::ys. For one thing, hd and tl are built-in functions -- I see no reason to bind those identifiers to anything else, even if it is just local to a function definition. For another thing, the less visual clutter in a pattern the better.
5) I made more use of white space. Partially a matter of taste, but I think that fairly complicated clauses (like your second line) should be split across multiple lines.
If you want to go the route of including an error tolerance for comparing reals, I think that it makes most sense to include the tolerance as an explicit parameter. I find |x-y| < e to be more natural than two inequalities. The top-level abs is overloaded and defaults to int when nothing else fixes the argument type (Real.abs is the real-specific version), so the absolute value can also be written out by hand. If x - y is real then the expression
if x - y < 0.0 then y - x else x - y
returns the absolute value of x - y (it flips the sign in the case that it is negative). As an added bonus -- the comparison with 0.0 rather than 0 is all that SML needs to infer the type. This leads to:
fun remove_elem([],x,tol) = []
  | remove_elem(y::ys,x,tol) =
      if (if x - y < 0.0 then y - x else x - y) < tol then
        remove_elem(ys,x,tol)
      else
        y::remove_elem(ys,x,tol);
Typical output:
- remove_real([2.0, 3.1, 3.14, 3.145, 3.14], 3.14);
val it = [2.0,3.1,3.145] : real list
- remove_elem([2.0, 3.1, 3.14, 3.145, 3.14], 3.14,0.01);
val it = [2.0,3.1] : real list
- remove_elem([2.0, 3.1, 3.14, 3.145, 3.14], 3.14,0.001);
val it = [2.0,3.1,3.145] : real list
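The tolerance-based version translates directly to other languages. A minimal Python sketch, where remove_elem mirrors the SML name and data is just the sample list from the output above:

```python
def remove_elem(items, x, tol):
    """Drop every y with |x - y| < tol; keep everything else, in order."""
    return [y for y in items if abs(x - y) >= tol]

data = [2.0, 3.1, 3.14, 3.145, 3.14]
print(remove_elem(data, 3.14, 0.01))
print(remove_elem(data, 3.14, 0.001))
```

As in the SML runs, 3.145 is removed with a tolerance of 0.01 (it differs from 3.14 by about 0.005) and kept with a tolerance of 0.001.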
The issue is here: hd=elem
In SML, real is not an equality type, so = cannot be applied to two reals. More generally, floating-point values are subject to rounding error, so an exact comparison is rarely what you want.
Instead, pick a small tolerance lambda and test whether hd lies in an interval around elem: elem - lambda < hd andalso elem + lambda > hd

Clojure: how is map different from comp?

map takes a function and a list and applies the function to every element of the list. e.g.,
(map f [x1 x2 x3])
;= [(f x1) (f x2) (f x3)]
Mathematically, a list is a partial function from the natural numbers ℕ. If x : ℕ → X is some list, and f : X → Y is some function, then map takes the pair (f, x) to the list f○x : ℕ → Y. Therefore, map and comp return the same value, at least in the simple case.
However, when we apply map with more than one argument, there's something more complex going on. Consider the example:
(map f [x1 x2 x3] [y1 y2 y3])
;= [(f x1 y1) (f x2 y2) (f x3 y3)]
Here, we have two lists x : ℕ → X and y : ℕ → Y with the same domain, and a function of type f : X → (Y → Z). In order to evaluate on the tuple (f, x, y), map has to do some more work behind the scenes.
First, map constructs the diagonal product list diag(x, y) : ℕ → X × Y, which is defined by diag(x, y)(n) = (x(n), y(n)).
Second, map uncurries the function to curry⁻¹(f) : X × Y → Z. Finally, map composes these operations to get curry⁻¹(f) ○ diag(x, y) : ℕ → Z.
My question is: does this pattern generalize? Namely, suppose that we have three lists x : ℕ → X, y : ℕ → Y and z : ℕ → Z, and a function f : X → (Y → (Z → W)). Does map send the tuple (f, x, y, z) to the list curry⁻²(f) ○ diag(x, y, z) : ℕ → W?
It seems that the question title has little to do with the question actually asked in the body; I'll try to address both issues.
The Clojure side
As evidenced by examples like (map inc [1 2 3]) and (comp inc [1 2 3]) -- both of which, incidentally, make perfect sense in Clojure -- the Clojure functions map and comp operate completely differently even in the one sequence case. map simply does not treat its sequence arguments as functions in the software sense of callable objects, whereas comp treats all of its arguments in this way; map returns a compound datum, whereas comp does not; the value returned by comp is callable as a function, whereas map's return values are not; etc.
(Other functional languages similarly have separate "map" and "compose" higher-order functions; in Haskell, these are map (and the more general fmap) and (.).)
Notably, map performs no actual in-memory tupling-up of arguments to its input function and does not apply any deschönfinkelizing / uncurrying transformation to the input function.
The mathematical side
The pattern does of course generalize fine, though it's worth keeping in mind that what's a function of what etc. -- under the hood of the model, as it were -- depends on the choice of representation which tends to be arbitrary. Finite sequences can be represented perfectly well as (total) functions with finite ordinals as domains, or as Kuratowski tuples, or in the way which you describe where you don't care about your lists not necessarily being "gapless" etc. Depending on the representational choices, the concept of natural numbers might not enter the picture at all, the objects representing lists may or may not look like functions whose codomain is a superset of the set of the list's entries etc.
I don't know if it helps, but:
Clojure doesn't have currying, like Haskell. It does have partial function application, but it's not the same as currying.
Clojure's map is more like zipWith, zipWith3, etc in Haskell
Technically yes, map could be viewed as composing functions like this, though in practice it introduces some overhead that comp does not.
map produces a lazy sequence whose elements are only computed when the result is finally read. So it returns a sequence, not strictly the result your type expression implies. It also adds the overhead of sequences and changes the evaluation order because it is lazy and chunked.
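The zipWith-like behaviour and the contrast with composition can be seen in a small Python sketch. Python's built-in map also accepts several sequences and is lazy, though it is not chunked like Clojure's; compose is a hypothetical helper written for this example, not a standard function:

```python
from operator import add

xs = [1, 2, 3]
ys = [10, 20, 30]

# map with two sequences walks them in lockstep, like zipWith.
lazy = map(add, xs, ys)      # a lazy iterator; nothing is computed yet
via_map = list(lazy)         # forcing it produces the elements
via_zip = [add(x, y) for x, y in zip(xs, ys)]

def compose(f, g):
    """Hypothetical helper: returns the function x -> f(g(x))."""
    return lambda *args: f(g(*args))

# Composition yields a callable, not a sequence.
inc_then_double = compose(lambda n: n * 2, lambda n: n + 1)

print(via_map, via_zip, inc_then_double(3))
```

The result of map is data to be consumed, while the result of compose is something to be called, which is the practical difference described above.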

Defining an "arg max" like function over finite sets, and proving some of its properties, and avoiding a detour via lists

I'm working with a custom implementation of vectors as functions whose domain is a finite "index set" of natural numbers, and whose image is of some type on which one can define a maximum, usually real. E.g. I could have a two-dimensional vector v with v 1 = 2.7 and v 3 = 4.2.
On such vectors I'd like to define an "arg max" like operator, which tells me the index of the maximum component, 3 in the example of v above. I'm saying "the" index because the "arg max" like operator will additionally accept a tie-breaking function to be applied to components with equal values. (The background is bids in auctions.)
I know that Max on finite sets is defined using fold1 (of which I do not yet understand how it works). I tried this, which was accepted in itself, but then didn't work for the other things I wanted to do:
fun arg_max_tb :: "index_set ⇒ tie_breaker ⇒ (real vector) ⇒ nat"
where "arg_max_tb N t v = fold1
(λ x y . if (v x > v y) then x (* component values differ *)
else if (v x = v y ∧ t x y) then x (* tie-breaking needed *)
else y) N"
Note that furthermore I would like to prove certain properties of my "arg max" like operator, which will likely require induction. I know that there is the rule finite_ne_induct for induction over finite sets. OK, but I would also like to be able to define my operator in such a way that it can be evaluated (e.g. when trying with concrete finite sets), but evaluating
value "arg_max_tb {1::nat} (op >) (nth [27::real, 42])"
with expected return value 1 gives me the following error:
Wellsortedness error
(in code equation arg_max_tb ?n ?t ?v \equiv
fold1 (\lambda x y. if ord_real_inst.less_real (?v y) (?v x) then ...) ?n):
Type nat not of sort enum
No type arity nat :: enum
Therefore I resorted to converting my finite sets to lists. On lists I have been able to define the operator, and to prove some of its properties (can share the code if it's of interest) by induction using list_nonempty_induct.
The working list-based definition looks as follows:
fun arg_max_l_tb :: "(nat list) ⇒ tie_breaker ⇒ (real vector) ⇒ nat"
where "arg_max_l_tb [] t v = 0"
(* in practice we will only call the function
with lists of at least one element *)
| "arg_max_l_tb [x] t v = x"
| "arg_max_l_tb (x # xs) t v =
(let y = arg_max_l_tb xs t v in
if (v x > v y) then x (* component values differ *)
else if (v x = v y ∧ t x y) then x (* tie-breaking needed *)
else y)"
fun arg_max_tb :: "index_set ⇒ tie_breaker ⇒ (real vector) ⇒ nat"
where "arg_max_tb N t v = arg_max_l_tb (sorted_list_of_set N) t v"
I didn't succeed in directly defining a function over the constructors of a finite set. The following doesn't work:
fun arg_max_tb :: "index_set ⇒ tie_breaker ⇒ (real vector) ⇒ participant"
where "arg_max_tb {} t b = 0"
| "arg_max_tb {x} t b = x"
| "arg_max_tb (insert x S) t b =
(let y = arg_max_tb S t b in
if (b x > b y) then x
else if (b x = b y ∧ t x y) then x
else y)"
It gives me the error message
Malformed definition:
Non-constructor pattern not allowed in sequential mode.
⋀t b. arg_max_tb {} t b = 0
Could this be because the list constructors are defined as a datatype, whereas finite sets are merely defined as an inductive scheme?
Whatever – do you know of a way of defining this function over finite sets? Either by writing it down directly, or by some fold-like utility function?
Folding over a finite set requires that the result is independent of the order in which the elements of the set are visited, because sets are unordered. Most lemmas about fold1 f therefore assume that the folding operation f is left-commutative, i.e., f a (f b x) = f b (f a x) for all a, b, x.
The function that you supply to fold1 in your first definition does not satisfy this because the tie-breaking function is an arbitrary predicate. For example, take the tie-breaking function %v v'. True. Hence, if you want to stick to this definition, you will first have to find sufficient conditions on the tie-breaking function and thread this assumption through all your lemmata.
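The order dependence can be demonstrated outside Isabelle with a small Python experiment. This is only an illustration: a left fold over every permutation stands in for "folding in some order", and the names make_step, fold_results, and the sample vector v are made up for the example:

```python
from functools import reduce
from itertools import permutations

def make_step(v, t):
    """Build the folding operation from the question for vector v and tie-breaker t."""
    def step(x, y):
        if v[x] > v[y]:
            return x                      # component values differ
        if v[x] == v[y] and t(x, y):
            return x                      # tie-breaking needed
        return y
    return step

def fold_results(indices, v, t):
    """Set of results obtained by folding over every visiting order."""
    step = make_step(v, t)
    return {reduce(step, order) for order in permutations(indices)}

v = {1: 4.2, 2: 4.2, 3: 2.7}              # indices 1 and 2 tie for the maximum
always_true = lambda x, y: True           # the tie-breaker %v v'. True
lower_index = lambda x, y: x < y          # a linear order on indices

print(fold_results([1, 2, 3], v, always_true))
print(fold_results([1, 2, 3], v, lower_index))
```

With the always-true tie-breaker the result depends on the visiting order (both 1 and 2 occur), whereas the linear-order tie-breaker gives the single result 1 for every order.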
Your working solution based on a sorted list of the elements avoids this commutativity problem. Your last suggestion with pattern matching on {}, {x} and insert x S does not work for several reasons. First, fun can only pattern-match on datatype constructors, so you would have to use function instead; this explains the error message. But then, you also have to prove the equations do not overlap and you will therefore run into the same problem with commutativity again. Additionally, you will not be able to prove termination because S might be infinite.
The well-sortedness error for code generation comes from the setup for fold1. fold1 f A is defined as THE x. fold1Set f A x where fold1Set f A x holds iff x is the result of folding f over A in some order of the elements. To check that all the results are the same, the generated code naively tests for all possible values of x whether fold1Set f A x holds. If it indeed finds just one such value, then it returns that value. Otherwise, it raises an exception. In your case, x is an index, i.e., of type nat which infinitely many values inhabit. Hence, exhaustive testing is not possible. Technically, this translates as nat not being an instance of the type class enum.
Normally, you derive specialised code equations for everything that you define in terms of fold1. See the code generator tutorial on program refinement.
This question really consists of multiple questions.
Defining a function on finite sets
fold / fold1
The usual recursion combinator is Finite_Set.fold (or fold1). However, to be able to prove anything about fold f z S, the result must be independent of the order in which f is applied to the elements of S.
If f is associative and commutative, you can use Finite_Set.ab_semigroup_mult.fold1_insert and Finite_Set.fold1_singleton to get simp rules for fold1 f S and you should be able to use finite_ne_induct as your induction rule.
Note that the function (I'll call it f) you give to fold1 is only commutative if t is a linear order:
fun arg_max_tb :: "index_set ⇒ tie_breaker ⇒ (real vector) ⇒ nat"
where "arg_max_tb N t v = fold1
(λ x y . if (v x > v y) then x (* component values differ *)
else if (v x = v y ∧ t x y) then x (* tie-breaking needed *)
else y) N"
This is not covered by the existing lemmas on fold1, so you either need to prove a generalized variant of Finite_Set.ab_semigroup_mult.fold1_insert or insert an additional tie-breaker, e.g.
else if (v x = v y ∧ ~t x y ∧ ~t y x ∧ x < y) then x
If t is a linear order, you will be able to remove this additional tie-breaker from the simp rules. Note that this additional tie-breaker is basically what you get from using sorted_list_of_set.
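To make the connection with sorted_list_of_set concrete, the list-based definition from the question can be sketched in Python. This is an illustrative translation only: sorted stands in for sorted_list_of_set, and v and t are passed as plain callables:

```python
from operator import gt, lt

def arg_max_l_tb(xs, t, v):
    """Right-to-left scan mirroring the recursive list definition."""
    if not xs:
        return 0                 # mirrors the [] case of the definition
    best = xs[-1]                # the singleton case: [x] gives x
    for x in reversed(xs[:-1]):
        # x wins if its value is larger, or on a tie if t prefers it.
        if v(x) > v(best) or (v(x) == v(best) and t(x, best)):
            best = x
    return best

def arg_max_tb(index_set, t, v):
    # Sorting fixes one visiting order, so the result is well defined
    # even when t alone does not decide every tie.
    return arg_max_l_tb(sorted(index_set), t, v)

bids = [27.0, 42.0]
print(arg_max_tb({1}, gt, bids.__getitem__))
```

Because the elements are always visited in sorted order, an undecided tie is resolved by position in that order, which is the effect of the extra x < y condition above.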
THE / SOME
Your arg_max_tb selects one element of a list with certain properties. This can also be defined directly with the constructs THE x. P x or SOME x. P x (choice operators). The former selects the unique element satisfying the property P (if no unique element exists, the result is undefined), the latter selects some element satisfying the property P (if no such element exists, the result is undefined). Both work for infinite lists, too.
These are often preferable if you don't need executable code.
Getting an executable function
Functions defined by recursion (i.e. primrec, fun or function) are executable by default (if all functions used in their definition are executable, too). THE and SOME can in general only be executed for enumerable domains (this is the error message you got from value -- nat is not enumerable, as it is not finite).
However, you can always give an alternative definition of your function to the code generator. See the Tutorial on Function Definitions, in particular the section about refinement.
If you prefer the formulation with choice operators for proving, but also like your function to be executable, the easiest way might be to prove that the definitions of arg_max_tb via choice and sorted_list_of_set are equivalent. Then you can use the [code_unfold] attribute to replace the definition by choice with the (executable) definition by sorted_list_of_set.