Given a partially ordered set, remove all smaller items - clojure

I'm struggling to find a beautiful, idiomatic way to write a function
(defn remove-smaller
[coll partial-order-fn]
___
)
where partial-order-fn takes two arguments and returns -1, 0 or 1 if they are comparable (respectively smaller, equal, bigger), or nil otherwise.
The result of remove-smaller should be coll with every item removed that is smaller than some other item in coll.
Example: Suppose we define a partial order such that numbers are compared normally, letters too, but a letter and a number are not comparable:
1 < 2, a < t, but 2 and a are not comparable
Then we would have:
(remove-smaller [1 9 a f 3 4 z])
==> [9 z]

(defn partial-compare [x y]
  (when (= (type x) (type y))
    (compare x y)))

(defn remove-smaller [coll partial-order-fn]
  (filter
    (fn [x] (every? #(let [p (partial-order-fn x %)]
                       (or (nil? p) (>= p 0)))
                    coll))
    coll))

(defn -main []
  (remove-smaller [1 9 \a \f 3 4 \z] partial-compare))
This outputs (9 \z), which is correct unless you want the return value to be of the same type as coll.

In practice I might just use tom's answer, since no algorithm can guarantee better than O(n^2) worst-case performance and it's easy to read. But if performance matters, choosing an algorithm that is always O(n^2) isn't good if you can avoid it. The solution below never re-iterates over items which are already known not to be maxes, and can therefore be as good as O(n) if the set turns out to actually be totally ordered. (Of course, this relies on the transitivity of the ordering relation, but since you call this a partial order that's implied.)
(defn remove-smaller [cmp coll]
  (reduce (fn [maxes x]
            (let [[acc keep-x]
                  ,,(reduce (fn [[acc keep-x] [max diff]]
                              (cond (neg? diff) [(conj acc max) false]
                                    (pos? diff) [acc keep-x]
                                    :else [(conj acc max) keep-x]))
                            [[] true], (map #(list % (or (cmp x %) 0))
                                            maxes))]
              (if keep-x
                (conj acc x)
                acc)))
          (), coll))
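For example, calling it with the partial-compare function from the first answer (note that this version takes the comparison function as its first argument); tracing the code by hand gives:
(remove-smaller partial-compare [1 9 \a \f 3 4 \z])
;=> [9 \z]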

(def data [1 9 \a \f 3 4 \z])

(defn my-fn [x y]
  (when (= (type x) (type y))
    (compare x y)))

(defn remove-smaller [coll partial-order-fn]
  (mapv #(->> % (sort partial-order-fn) last) (vals (group-by type coll))))

(remove-smaller data my-fn)
;=> [9 \z]
The order of the remaining items may differ from that of the input collection, but there is no ordering between the 'partitions' anyway.


Sum equal adjacent integers

Test case:
(def coll [1 2 2 3 4 4 4 5])
(def p-coll (partition 2 1 coll))
;; ((1 2) (2 2) (2 3) (3 4) (4 4) (4 4) (4 5))
Expected output:
(2 2 4 4 4) => 16
Here is what I am trying to implement: start with the vector v = [0]. Take each pair; if the first element of the pair is equal to the last element of the vector, or if the elements of the pair are equal, add the first item of the pair to v. (Finally, reduce v to its sum.) The code below handles the "elements of the pair are equal" part, but not the first part (thus I get (0 2 4 4)). I guess the reason is that the elements are only added to v at the very end. My questions:
What is the way to compare an element with the last selected element?
What other idiomatic ways are there to implement what I am trying to achieve?
How else can I approach the problem?
(let [s [0]]
  (concat s (map #(first %)
                 (filter #(or (= (first %) (first s)) (= (first %) (second %))) p-coll))))
You are on the right track with partitioning the data here. But there
is a nicer way to do that. You can use (partition-by identity coll)
to group consecutive, same elements.
Then just keep the groups with more than one element and sum them all up.
E.g.
(reduce
  (fn [acc xs]
    (+ acc (apply + xs)))
  0
  (filter
    next
    (partition-by identity coll)))
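To see what this is doing, here are the intermediate steps on the sample coll (outputs worked out by hand):
(partition-by identity coll)                 ;=> ((1) (2 2) (3) (4 4 4) (5))
(filter next (partition-by identity coll))   ;=> ((2 2) (4 4 4))
The reduce then sums the remaining groups: 4 + 12 = 16.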
Starting out from your initial partition, with p-coll being like you described above (i.e. a list of pairs), and v being the vector [0], you can do the following:
(reduce
  (fn [vect [a b]]
    (if (or (= a b) (= a (last vect)))
      (conj vect a)
      vect))
  v p-coll)
;; => [0 2 2 4 4 4]
We start from the vector [0], and reduce p-coll by processing its elements one by one. If an element satisfies one of the two conditions you specified, then we conj it onto the initial vector. Otherwise, we leave the vector as is.
Finally, you can use apply + to sum the resulting vector and get your final answer.
Generally, when you need to process a collection (here, p-coll) and some partial answer (here, the vector v) into some sort of final answer (here, the vector [0 2 2 4 4 4]), reduce is the most idiomatic and purely functional approach. After having identified those components, it's just a matter of coming up with the appropriate function to put them together.
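Concretely, the final summing step is just:
(apply + [0 2 2 4 4 4])
;=> 16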
Another approach (less idiomatic, but easier to understand from a procedural standpoint) would be to use an atom for the vector v, and keep growing it as you process the list with doseq:
(def v (atom [0]))

(doseq [[a b] p-coll]
  (when (or (= a b) (= a (last @v)))
    (swap! v conj a)))

(println @v)
;; => [0 2 2 4 4 4]
A solution using only flatten and map:
(defn consecutives [lst]
  (flatten (map (fn [[x y z]] (cond (= x y z) [z]
                                    (= y z) [y z]
                                    :else []))
                (map vector lst (rest lst) (rest (rest lst))))))
Purely tail-recursive solution
which "keeps in memory" previous and previous-previous element.
(defn consecutives
  [lst]
  (loop [lst lst
         acc []
         last-val nil
         last-last-val nil]
    (cond (empty? lst) acc
          :else (recur (rest lst)
                       (if (= (first lst) last-val)
                         (conj (if (= last-val last-last-val)
                                 acc
                                 (conj acc last-val))
                               (first lst))
                         acc)
                       (first lst)
                       last-val))))
(consecutives coll)
;; => [2 2 4 4 4]

How to make reduce more readable in Clojure?

A reduce call has its f argument first. Visually speaking, this is often the biggest part of the form.
e.g.
(reduce
  (fn [[longest current] x]
    (let [tail (last current)
          next-seq (if (or (not tail) (> x tail))
                     (conj current x)
                     [x])
          new-longest (if (> (count next-seq) (count longest))
                        next-seq
                        longest)]
      [new-longest next-seq]))
  [[][]]
  col)
The problem is, the val argument (in this case [[][]]) and col argument come afterward, below, and it's a long way for your eyes to travel to match those with the parameters of f.
It would look more readable to me if it were in this order instead:
(reduceb val col
(fn [x y]
...))
Should I implement this macro, or am I approaching this entirely wrong in the first place?
You certainly shouldn't write that macro, since it is easily written as a function instead. I'm not super keen on writing it as a function, either, though; if you really want to pair the reduce with its last two args, you could write:
(-> (fn [x y]
      ...)
    (reduce init coll))
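For instance, with a toy summing function (my own illustration, not from the answer):
(-> (fn [acc x] (+ acc x))
    (reduce 0 [1 2 3 4]))
;=> 10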
Personally when I need a large function like this, I find that a comma actually serves as a good visual anchor, and makes it easier to tell that two forms are on that last line:
(reduce (fn [x y]
          ...)
        init, coll)
Better still is usually to not write such a large reduce in the first place. Here you're combining at least two steps into one rather large and difficult step, by trying to find all at once the longest decreasing subsequence. Instead, try splitting the collection up into decreasing subsequences, and then take the largest one.
(defn decreasing-subsequences [xs]
  (lazy-seq
    (cond (empty? xs) []
          (not (next xs)) (list xs)
          :else (let [[x & [y :as more]] xs
                      remainder (decreasing-subsequences more)]
                  (if (> y x)
                    (cons [x] remainder)
                    (cons (cons x (first remainder)) (rest remainder)))))))
Then you can replace your reduce with:
(apply max-key count (decreasing-subsequences xs))
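For instance, on a small made-up input (results worked out by hand from the code above):
(decreasing-subsequences [9 7 8 2 1])
;=> ((9 7) (8 2 1))
(apply max-key count (decreasing-subsequences [9 7 8 2 1]))
;=> (8 2 1)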
Now, the lazy function is not particularly shorter than your reduce, but it is doing one single thing, which means it can be understood more easily; also, it has a name (giving you a hint as to what it's supposed to do), and it can be reused in contexts where you're looking for some other property based on decreasing subsequences, not just the longest. You can even reuse it more often than that, if you replace the > in (> y x) with a function parameter, allowing you to split up into subsequences based on any predicate. Plus, as mentioned it is lazy, so you can use it in situations where a reduce of any sort would be impossible.
Speaking of ease of understanding, as you can see I misunderstood what your function is supposed to do when reading it. I'll leave as an exercise for you the task of converting this to strictly-increasing subsequences, where it looked to me like you were computing decreasing subsequences.
You don't have to use reduce or recursion to get the descending (or ascending) sequences. Here we are returning all the descending sequences in order from longest to shortest:
(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])

(defn descending-sequences [xs]
  (->> xs
       (partition 2 1)
       (map (juxt (fn [[x y]] (> x y)) identity))
       (partition-by first)
       (filter ffirst)
       (map #(let [xs' (mapcat second %)]
               (take-nth 2 (cons (first xs') xs'))))
       (sort-by (comp - count))))

(descending-sequences in)
;;=> ((7 6 5 4 3 2) (3 2 1 0 -1) (7 6))
(partition 2 1) gives every possible comparison and partition-by allows you to mark out the runs of continuous decreases. At this point you can already see the answer and the rest of the code is removing the baggage that is no longer needed.
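For reference, the first step on in looks like this:
(partition 2 1 in)
;;=> ((3 2) (2 1) (1 0) (0 -1) (-1 2) (2 7) (7 6) (6 7) (7 6) (6 5) (5 4) (4 3) (3 2))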
If you want the ascending sequences instead then you only need to change the > to a <:
;;=> ((-1 2 7) (6 7))
If, as in the question, you only want the longest sequence, then put a first as the last function call in the thread-last macro. Alternatively, replace the sort-by with:
(apply max-key count)
For maximum readability you can name the operations:
(defn greatest-continuous [op xs]
  (let [op-pair? (fn [[x y]] (op x y))
        take-every-second #(take-nth 2 (cons (first %) %))
        make-canonical #(take-every-second (apply concat %))]
    (->> xs
         (partition 2 1)
         (partition-by op-pair?)
         (filter (comp op-pair? first))
         (map make-canonical)
         (apply max-key count))))
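For example, reusing in from above (the results follow from the outputs shown earlier):
(greatest-continuous > in)
;;=> (7 6 5 4 3 2)
(greatest-continuous < in)
;;=> (-1 2 7)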
I feel your pain...they can be hard to read.
I see 2 possible improvements. The simplest is to write a wrapper similar to the Plumatic Plumbing defnk style:
(fnk-reduce {:fn   (fn [state val] ... <new state value>)
             :init []
             :coll some-collection})
so the function call has a single map arg, where each of the 3 pieces is labelled & can come in any order in the map literal.
Another possibility is to just extract the reducing fn and give it a name. This can be either internal or external to the code expression containing the reduce:
(let [glommer (fn [state value] (into state value))]
  (reduce glommer #{} some-coll))
or possibly
(defn glommer [state value] (into state value))

(reduce glommer #{} some-coll)
As always, anything that increases clarity is preferred. If you haven't noticed already, I'm a big fan of Martin Fowler's idea of Introduce Explaining Variable refactoring. :)
I will apologize in advance for posting a longer solution to something where you wanted more brevity/clarity.
We are in the new age of Clojure transducers, and it appears that your solution was passing the "longest" and "current" forward for record-keeping. Rather than passing that state forward, a stateful transducer would do the trick.
(def longest-decreasing
  (fn [rf]
    (let [longest (volatile! [])
          current (volatile! [])
          tail    (volatile! nil)]
      (fn
        ([] (rf))
        ([result] (transduce identity rf result))
        ([result x] (do (if (or (nil? @tail) (< x @tail))
                          (if (> (count (vswap! current conj (vreset! tail x)))
                                 (count @longest))
                            (vreset! longest @current))
                          (vreset! current [(vreset! tail x)]))
                        @longest))))))
Before you dismiss this approach, realize that it just gives you the right answer and you can do some different things with it:
(def coll [2 1 10 9 8 40])
(transduce longest-decreasing conj coll) ;; => [10 9 8]
(transduce longest-decreasing + coll) ;; => 27
(reductions (longest-decreasing conj) [] coll) ;; => ([] [2] [2 1] [2 1] [2 1] [10 9 8] [10 9 8])
Again, I know that this may appear longer, but the potential to compose this with other transducers might be worth the effort (not sure if my arity-1 implementation breaks that?).
I believe that iterate can be a more readable substitute for reduce. For example here is the iteratee function that iterate will use to solve this problem:
(defn step-state-hof [op]
  (fn [{:keys [unprocessed current answer]}]
    (let [[x y & more] unprocessed]
      (let [next-current (if (op x y)
                           (conj current y)
                           [y])
            next-answer (if (> (count next-current) (count answer))
                          next-current
                          answer)]
        {:unprocessed (cons y more)
         :current next-current
         :answer next-answer}))))
current is built up until it becomes longer than answer, in which case a new answer is created. Whenever the condition op is not satisfied we start again building up a new current.
iterate itself returns an infinite sequence, so needs to be stopped when the iteratee has been called the right number of times:
(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])

(->> (iterate (step-state-hof >) {:unprocessed (rest in)
                                  :current (vec (take 1 in))})
     (drop (- (count in) 2))
     first
     :answer)
;;=> [7 6 5 4 3 2]
Often you would use a drop-while or take-while to short-circuit just when the answer has been obtained. We could do that here; however, there is no short-circuiting required, as we know in advance that the inner function of step-state-hof needs to be called (- (count in) 1) times. That is one less than the count because it is processing two elements at a time. Note that first is forcing the final call.
I wanted this order for the form:
reduce
val, col
f
I was able to figure out that this technically satisfies my requirements:
> (apply reduce
    (->>
      [0 [1 2 3 4]]
      (cons
        (fn [acc x]
          (+ acc x)))))
10
But it's not the easiest thing to read.
This looks much simpler:
> (defn reduce< [val col f]
    (reduce f val col))
nil
> (reduce< 0 [1 2 3 4]
    (fn [acc x]
      (+ acc x)))
10
(< is shorthand for "parameters are rotated left"). Using reduce<, I can see what's being passed to f by the time my eyes get to the f argument, so I can just focus on reading the f implementation (which may get pretty long). Additionally, if f does get long, I no longer have to visually check the indentation of the val and col arguments to determine that they belong to the reduce symbol way farther up. I personally think this is more readable than binding f to a symbol before calling reduce, especially since fn can still accept a name for clarity.
This is a general solution, but the other answers here provide many good alternative ways to solve the specific problem I gave as an example.

Make (map f c1 c2) map (count c1) times, even if c2 has less elements

When doing
(map f [0 1 2] [:0 :1])
f will get called twice, with the arguments being
0 :0
1 :1
Is there a simple yet efficient way, i.e. without producing more intermediate sequences etc., to make f get called for every value of the first collection, with the following arguments?
0 :0
1 :1
2 nil
Edit: Addressing the question by @fl00r in the comments.
The actual use case that triggered this question needed map to always work exactly (count first-coll) times, regardless if the second (or third, or ...) collection was longer.
It's a bit late in the game now and somewhat unfair after having accepted an answer, but if a good answer gets added that only does what I specifically asked for - mapping (count first-coll) times - I would accept that.
You could do:
(map f [0 1 2] (concat [:0 :1] (repeat nil)))
Basically, pad the second coll with an infinite sequence of nils. map stops when it reaches the end of the first collection.
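For example, using vector as a stand-in for f:
(map vector [0 1 2] (concat [:0 :1] (repeat nil)))
;=> ([0 :0] [1 :1] [2 nil])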
An (eager) loop/recur form that walks to the end of the longest collection:
(loop [c1 [0 1 2] c2 [:0 :1] o []]
  (if (or (seq c1) (seq c2))
    (recur (rest c1) (rest c2) (conj o (f (first c1) (first c2))))
    o))
Or you could write a lazy version of map that did something similar.
A general lazy version, as suggested by Alex Miller's answer, is
(defn map-all [f & colls]
  (lazy-seq
    (when-not (not-any? seq colls)
      (cons
        (apply f (map first colls))
        (apply map-all f (map rest colls))))))
For example,
(map-all vector [0 1 2] [:0 :1])
;([0 :0] [1 :1] [2 nil])
You would probably want to specialise map-all for one and two collections.
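A two-collection specialisation might look something like this (map-all-2 is a hypothetical name; this is a sketch rather than part of the original answer):
(defn map-all-2 [f c1 c2]
  ;; like map-all, but skips the apply and the map over colls for the 2-arg case
  (lazy-seq
    (when (or (seq c1) (seq c2))
      (cons (f (first c1) (first c2))
            (map-all-2 f (rest c1) (rest c2))))))

(map-all-2 vector [0 1 2] [:0 :1])
;=> ([0 :0] [1 :1] [2 nil])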
Just for fun: this could easily be done with Common Lisp's do macro. We could implement it in Clojure and do this (and much more fun stuff) with it:
(defmacro cl-do [clauses [end-check result] & body]
  (let [clauses (map #(if (coll? %) % (list %)) clauses)
        bindings (mapcat (juxt first second) clauses)
        nexts (map #(nth % 2 (first %)) clauses)]
    `(loop [~@bindings]
       (if ~end-check
         ~result
         (do
           ~@body
           (recur ~@nexts))))))
and then just use it for mapping (notice it can operate on more than 2 colls):
(defn map-all [f & colls]
  (cl-do ((colls colls (map next colls))
          (res [] (conj res (apply f (map first colls)))))
         ((every? empty? colls) res)))
In the REPL:
user> (map-all vector [1 2 3] [:a :s] '[z x c v])
;;=> [[1 :a z] [2 :s x] [3 nil c] [nil nil v]]

Clojure: Find even numbers in a vector

I am coming from a Java background trying to learn Clojure. As the best way of learning is by actually writing some code, I took a very simple example of finding even numbers in a vector. Below is the piece of code I wrote:
(defn even-vector-2 [input]
  (def output [])
  (loop [x input]
    (if (not= (count x) 0)
      (do
        (if (= (mod (first x) 2) 0)
          (do
            (def output (conj output (first x)))))
        (recur (rest x)))))
  output)
This code works, but it is lame that I had to use a global symbol to make it work. The reason I had to use the global symbol is that I wanted to change the state of the symbol every time I find an even number in the vector. let doesn't allow me to change the value of the symbol. Is there a way this can be achieved without using global symbols / atoms?
The idiomatic solution is straightforward:
(filter even? [1 2 3])
; -> (2)
For your educational purposes an implementation with loop/recur
(defn filter-even [v]
  (loop [r []
         [x & xs :as v] v]
    (if (seq v)                 ;; if current v is not empty
      (if (even? x)
        (recur (conj r x) xs)   ;; bind r to r with x, bind v to rest
        (recur r xs))           ;; leave r as is
      r)))                      ;; terminate by not calling recur, return r
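A quick check (result worked out by hand from the loop above):
(filter-even [1 2 3 4 5 6])
;=> [2 4 6]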
The main problem with your code is you're polluting the namespace by using def. You should never really use def inside a function. If you absolutely need mutability, use an atom or similar object.
Now, for your question. If you want to do this the "hard way", just make output a part of the loop:
(defn even-vector-3 [input]
  (loop [[n & rest-input] input ; Deconstruct the head from the tail
         output []]             ; Output is just looped with the input
    (if n                       ; n will be nil if the list is empty
      (recur rest-input
             (if (= (mod n 2) 0)
               (conj output n)
               output))         ; Adding nothing since the number is odd
      output)))
Rarely is explicit looping necessary though. This is a typical case for a fold: you want to accumulate a list that's a variable-length version of another list. This is a quick version:
(defn even-vector-4 [input]
  (reduce            ; Reducing the input into another list
    (fn [acc n]
      (if (= (rem n 2) 0)
        (conj acc n)
        acc))
    []               ; This is the initial accumulator.
    input))
Really though, you're just filtering a list. Just use the core's filter:
(filter #(= (rem % 2) 0) [1 2 3 4])
Note, filter is lazy.
Try
#(filterv even? %)
if you want to return a vector or
#(filter even? %)
if you want a lazy sequence.
If you want to combine this with more transformations, you might want to go for a transducer:
(filter even?)
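For example, composed with another transformation (a made-up illustration):
(into [] (comp (filter even?) (map inc)) [1 2 3 4])
;=> [3 5]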
If you wanted to write it using loop/recur, I'd do it like this:
(defn keep-even
  "Accepts a vector of numbers, returning a vector of the even ones."
  [input]
  (loop [result []
         unused input]
    (if (empty? unused)
      result
      (let [curr-value (first unused)
            next-result (if (is-even? curr-value)
                          (conj result curr-value)
                          result)
            next-unused (rest unused)]
        (recur next-result next-unused)))))
This gets the same result as the built-in filter function.
Take a look at filter, even? and vec
check out http://cljs.info/cheatsheet/
(defn even-vector-2 [input] (vec (filter even? input)))
If you want a lazy solution, filter is your friend.
Here is a simple non-lazy solution (loop/recur can be avoided when you just apply the same function to every element, without needing fine-grained control):
(defn keep-even-numbers
  [coll]
  (reduce
    (fn [agg nb]
      (if (zero? (rem nb 2)) (conj agg nb) agg))
    [] coll))
If you like mutability for "fun", here is a solution with a temporary mutable collection:
(defn mkeep-even-numbers
  [coll]
  (persistent!
    (reduce
      (fn [agg nb]
        (if (zero? (rem nb 2)) (conj! agg nb) agg))
      (transient []) coll)))
...which is slightly faster!
mod would be better than rem if you extend the odd/even definition to negative integers.
You can also replace [] with whatever collection you want; here it's a vector.
In Clojure, you generally don't need to write a low-level loop with loop/recur. Here is a quick demo.
(ns tst.clj.core
  (:require
    [tupelo.core :as t]))
(t/refer-tupelo)

(defn is-even?
  "Returns true if x is even, otherwise false."
  [x]
  (zero? (mod x 2)))

; quick sanity checks
(spyx (is-even? 2))
(spyx (is-even? 3))

(defn keep-even
  "Accepts a vector of numbers, returning a vector of the even ones."
  [input]
  (into []   ; forces result into vector, eagerly
    (filter is-even? input)))

; demonstrate on [0 1 2...9]
(spyx (keep-even (range 10)))
with result:
(is-even? 2) => true
(is-even? 3) => false
(keep-even (range 10)) => [0 2 4 6 8]
Your project.clj needs the following for spyx to work:
:dependencies [
  [tupelo "0.9.11"]
]

Map a function on every two elements of a list

I need a function that maps a function only on every other element, e.g.
(f inc '(1 2 3 4))
=> '(2 2 4 4)
I came up with:
(defn flipflop [f l]
  (loop [k l, b true, r '()]
    (if (empty? k)
      (reverse r)
      (recur (rest k)
             (not b)
             (conj r (if b
                       (f (first k))
                       (first k)))))))
Is there a prettier way to achieve this?
(map #(% %2)
     (cycle [f identity])
     coll)
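For the example in the question, with f being inc, that is:
(map #(% %2)
     (cycle [inc identity])
     [1 2 3 4])
;=> (2 2 4 4)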
It's a good idea to look at Clojure's higher level functions before using loop and recur.
user=> (defn flipflop
         [f coll]
         (mapcat #(apply (fn ([a b] [(f a) b])
                             ([a] [(f a)]))
                         %)
                 (partition-all 2 coll)))
#'user/flipflop
user=> (flipflop inc [1 2 3 4])
(2 2 4 4)
user=> (flipflop inc [1 2 3 4 5])
(2 2 4 4 6)
user=> (take 11 (flipflop inc (range))) ; demonstrating laziness
(1 1 3 3 5 5 7 7 9 9 11)
this flipflop doesn't need to reverse the output, it is lazy, and I find it much easier to read.
The function uses partition-all to split the list into pairs of two items, and mapcat to join a series of two element sequences from the calls back into a single sequence.
The function uses apply, plus multiple arities, in order to handle the case where the final element of the partitioned collection is a singleton (the input was odd in length).
Also, since you want to apply the function to items at specific indices in the collection (even indices in this case), you could use map-indexed, like this:
(defn flipflop [f coll]
  (map-indexed #(if (even? %1) (f %2) %2) coll))
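On the example input this gives the same result (worked out by hand):
(flipflop inc '(1 2 3 4))
;=> (2 2 4 4)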
While amalloy's solution is the one to go with, you could simplify your loop/recur solution a bit:
(defn flipflop [f l]
  (loop [k l, b true, r []]
    (if (empty? k)
      r
      (recur (rest k)
             (not b)
             (conj r ((if b f identity) (first k)))))))
This uses a couple of common tricks:
If an accumulated list comes out in the wrong order, use a vector instead.
Where possible, factor out common elements in a conditional.