Remove duplicate elements from two sequences


I am wondering how to remove duplicate elements from two sequences and combine the two sequences. For instance,
user=>(remove-dup [1 4 7 10 16] [2 7 18 4])
(1 2 10 18 16)
My code is:
(defn remove-dup [l1 l2]
  (let [list (concat l1 l2)]
    (loop [l list res '()]
      (if (>= (second (first (frequencies l))) 2)
        (recur (rest l) res)
        (recur (rest l) (conj res (first (first l))))))))
But when I run the code, I get the error message:
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:528)
How can I fix this code? Thanks!

Your error is here:
(first (first l))
Remember, l is the sequence of all the elements you haven't handled yet. For instance, in the first iteration of the loop, l might look like this:
(1 4 7 10 16 2 7 18 4)
You can see from this that (first l) would be 1, so (first (first l)) would be trying to treat a number as a sequence, which doesn't work.
If you replace (first (first l)) with just (first l), you'll get a NullPointerException because you don't have a base case: what should you do when l is empty? You might do something like this (where ,,, is a placeholder for your current if expression):
(if (empty? l)
  res
  ,,,)
However, if we try to use the method now, we still don't get the right result:
(remove-dup [1 4 7 10 16] [2 7 18 4])
;=> (4 18 7 2 16 10 1)
Hrm.
I could try to fiddle with your code some more to get it to work, but there's a better way to solve this problem. Since you're trying to remove duplicates and you don't care about order, the functions in clojure.set are the right tool for the job here. I would write remove-dup like this:
(require '[clojure.set :as set])
(defn remove-dup [c1 c2]
  (let [[s1 s2] (map set [c1 c2])]
    (seq (set/difference (set/union s1 s2) (set/intersection s1 s2)))))
Example:
(remove-dup [1 4 7 10 16] [2 7 18 4])
;=> (1 2 16 10 18)
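If the order of the inputs matters, the frequencies idea from the question can still be made to work by computing the counts once, up front, and filtering against them. The following is a minimal sketch; the name remove-dup-ordered is hypothetical, and note one difference from the set-based version: an element repeated within a single sequence is dropped as well.

```clojure
;; Hypothetical order-preserving variant (not from the original answer).
;; Counts every element once, then keeps only those seen exactly once.
(defn remove-dup-ordered [l1 l2]
  (let [items (concat l1 l2)
        freqs (frequencies items)]   ; map of element -> number of occurrences
    (filter #(= 1 (freqs %)) items)))

(remove-dup-ordered [1 4 7 10 16] [2 7 18 4])
;=> (1 10 16 2 18)
```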

There are a number of fatal errors in your code:
The thing that breaks it is (first (first l)): since l is a list of numbers, it throws an error when you try to take the first item of a number.
But there are more important ones:
First of all, even if your code were correct, it has no case that ends the loop, so it would lead to an infinite loop (or an exception of some kind). Second is your misunderstanding of how frequencies is used. You can't rely on the order of its results, since it returns an unordered map (not to mention that it is being called in every loop iteration, which is really bad for performance).
This is how I would do it with a single pass over the collections in a loop:
(defn unique [coll1 coll2]
  (let [items (concat coll1 coll2)]
    (loop [res #{}
           seen #{}
           [x & xs :as items] items]
      (cond ;; if there are no items left to check, returning result
        (empty? items) res
        ;; if we've already seen the first item of a coll, remove it from the resulting set
        (seen x) (recur (disj res x) seen xs)
        ;; otherwise mark it as seen, and add it to the result set
        :else (recur (conj res x) (conj seen x) xs)))))
in repl:
user> (unique [1 4 7 10 16] [2 7 18 4])
#{1 2 16 10 18}

(defn remove-dupl [l1 l2]
(let [rmdup (fn [l1 l2] (remove (set l1) l2))]
(concat (rmdup l1 l2) (rmdup l2 l1))))
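A quick check of remove-dupl at the REPL, as a sketch: (set l1) is used as a predicate, so remove drops from l2 every element that also occurs in l1, and the second call does the same in the other direction. Unlike the set-based answers, the result keeps the input order, with the survivors of the second sequence listed first.

```clojure
(defn remove-dupl [l1 l2]
  ;; rmdup keeps the elements of its second argument that are absent from the first
  (let [rmdup (fn [l1 l2] (remove (set l1) l2))]
    (concat (rmdup l1 l2) (rmdup l2 l1))))

(remove-dupl [1 4 7 10 16] [2 7 18 4])
;=> (2 18 1 10 16)
```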

Try this solution:
(defn remove-dup [l1 l2]
  (let [ls (concat l1 l2)]
    (loop [l (frequencies ls) res '()]
      (if (empty? l) res
        (if (>= (second (first l)) 2)
          (recur (rest l) res)
          (recur (rest l) (cons (first (first l)) res)))))))

The others have found your errors. I'd like to look at what you are trying to do.
Given that
the order is not important and
you are removing duplicate elements
this is the set operation exclusive or (XOR).
It is not included in clojure.set. We can either, as Sam Estep does, define it in terms of the operations we have, or write it more directly ourselves:
(defn exclusive-or [sa sb]
(if (<= (count sa) (count sb))
(reduce
(fn [ans a]
(if (contains? sb a)
(disj ans a)
(conj ans a)))
sb
sa)
(recur sb sa)))
We can then define
(defn remove-dup [xs ys]
  (exclusive-or (set xs) (set ys)))
For example,
(remove-dup [1 4 7 10 16] [2 7 18 4]) ;#{1 2 10 16 18}
Edited to correct error in exclusive-or.
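As a sanity check, the hand-written exclusive-or can be compared against the definition in terms of clojure.set operations. This is only a sketch that repeats the function so the snippet is self-contained; the sets a and b are the example inputs from the question.

```clojure
(require '[clojure.set :as set])

(defn exclusive-or [sa sb]
  (if (<= (count sa) (count sb))
    ;; walk the smaller set, toggling each of its members in the larger one
    (reduce (fn [ans a]
              (if (contains? sb a)
                (disj ans a)
                (conj ans a)))
            sb
            sa)
    (recur sb sa)))

(let [a #{1 4 7 10 16}
      b #{2 7 18 4}]
  (= (exclusive-or a b)
     (set/difference (set/union a b) (set/intersection a b))))
;=> true
```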

Related

Map a function on every two elements of a list

I need a function that maps a function only on every other element, e.g.
(f inc '(1 2 3 4))
=> '(2 2 4 4)
I came up with:
(defn flipflop [f l]
(loop [k l, b true, r '()]
(if (empty? k)
(reverse r)
(recur (rest k)
(not b)
(conj r (if b
(f (first k))
(first k)))))))
Is there a prettier way to achieve this ?

(map #(% %2)
(cycle [f identity])
coll)
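Wrapped in a function, the expression above might look like the following sketch. The idea is that cycle produces the endless sequence f, identity, f, identity, ..., and map applies each of those functions to the element in the same position of coll.

```clojure
(defn flipflop [f coll]
  ;; %1 is the function taken from the cycle, %2 the matching element of coll
  (map #(%1 %2)
       (cycle [f identity])
       coll))

(flipflop inc '(1 2 3 4))
;=> (2 2 4 4)
```

Because map stops at the end of the shortest input, the infinite cycle is harmless, and the result stays lazy.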

It's a good idea to look at Clojure's higher level functions before using loop and recur.
user=> (defn flipflop
[f coll]
(mapcat #(apply (fn ([a b] [(f a) b])
([a] [(f a)]))
%)
(partition-all 2 coll)))
#'user/flipflop
user=> (flipflop inc [1 2 3 4])
(2 2 4 4)
user=> (flipflop inc [1 2 3 4 5])
(2 2 4 4 6)
user=> (take 11 (flipflop inc (range))) ; demonstrating laziness
(1 1 3 3 5 5 7 7 9 9 11)
this flipflop doesn't need to reverse the output, it is lazy, and I find it much easier to read.
The function uses partition-all to split the list into pairs of two items, and mapcat to join a series of two element sequences from the calls back into a single sequence.
The function uses apply, plus multiple arities, in order to handle the case where the final element of the partitioned collection is a singleton (the input was odd in length).

Also, since you want to apply the function to the items at specific indices in the collection (the even indices in this case), you could use map-indexed, like this:
(defn flipflop [f coll]
(map-indexed #(if (even? %1) (f %2) %2) coll))

Whereas amalloy's solution is the one, you could simplify your loop - recur solution a bit:
(defn flipflop [f l]
(loop [k l, b true, r []]
(if (empty? k)
r
(recur (rest k)
(not b)
(conj r ((if b f identity) (first k)))))))
This uses a couple of common tricks:
If an accumulated list comes out in the wrong order, use a vector instead.
Where possible, factor out common elements in a conditional.

Clojure: map function with updatable state

What is the best way of implementing map function together with an updatable state between applications of function to each element of sequence? To illustrate the issue let's suppose that we have a following problem:
I have a vector of numbers. I want a new sequence where each element is multiplied by 2 and then increased by the number of 10s in the sequence up to and including the current element. For example, from:
[20 30 40 10 20 10 30]
I want to generate:
[40 60 80 21 41 22 62]
Without adding the count of 10s, the solution can be formulated using a high level of abstraction:
(map #(* 2 %) [20 30 40 10 20 10 30])
Having a count to update forced me to go back to basics, and the solution I came up with is:
(defn my-update-state [x state]
  (if (= x 10) (+ state 1) state))
(defn my-combine-with-state [x state]
  (+ x state))
(defn map-and-update-state [vec fun state update-state combine-with-state]
  (when-not (empty? vec)
    (let [[f & other] vec
          g (fun f)
          new-state (update-state f state)]
      (cons (combine-with-state g new-state)
            (map-and-update-state other fun new-state update-state combine-with-state)))))
(map-and-update-state [20 30 40 50 10 20 10 30] #(* 2 %) 0 my-update-state my-combine-with-state)
My question: is this the appropriate/canonical way to solve the problem, or have I overlooked some important concepts/functions?
PS:
The original problem is walking an AST (abstract syntax tree) and generating a new AST together with updating a symbol table, so please keep that in mind when proposing a solution to the problem above.
I do not worry about blowing up the stack, so replacement with loop+recur is not my concern here.
Is using global Vars or Refs instead of passing state as an argument a definite no-no?

You can use reduce to accumulate a pair consisting of the number of 10s seen so far and the current vector of results:
(defn map-update [v]
(letfn [(update [[ntens v] i]
(let [ntens (+ ntens (if (= 10 i) 1 0))]
[ntens (conj v (+ ntens (* 2 i)))]))]
(second (reduce update [0 []] v))))
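The same reduce pattern can be generalized so that the state handling is passed in, which is closer to the AST and symbol-table use case mentioned in the question. This is a minimal sketch, not a standard function: the names map-with-state and step are hypothetical, and step is assumed to take the current state and an element and to return a pair [new-state output].

```clojure
;; Hypothetical helper: threads a state through a map-like traversal.
;; step: (fn [state x]) -> [new-state y]
(defn map-with-state [step init coll]
  (second
    (reduce (fn [[state out] x]
              (let [[new-state y] (step state x)]
                [new-state (conj out y)]))
            [init []]
            coll)))

(map-with-state
  (fn [ntens x]
    (let [n (if (= x 10) (inc ntens) ntens)]   ; update the count of 10s first
      [n (+ n (* 2 x))]))                      ; then combine it with the doubled element
  0
  [20 30 40 10 20 10 30])
;=> [40 60 80 21 41 22 62]
```

Like the answer above, this version is eager and builds a vector; it does not return a lazy sequence.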

To count the number of 10s seen so far, you can do
(defn count-10 [col]
  (reductions + (map #(if (= % 10) 1 0) col)))
Example:
user=> (count-10 [1 2 10 20 30 10 1])
(0 0 1 1 1 2 2)
And then a simple map for the final result:
(map + col col (count-10 col))
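Put together as one function, the two steps might look like this sketch (the name twice-plus-tens is hypothetical). Adding col to itself doubles each element, and the third sequence contributes the running count of 10s.

```clojure
(defn count-10 [col]
  ;; running total of how many 10s have been seen up to each position
  (reductions + (map #(if (= % 10) 1 0) col)))

(defn twice-plus-tens [col]
  (map + col col (count-10 col)))

(twice-plus-tens [20 30 40 10 20 10 30])
;=> (40 60 80 21 41 22 62)
```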

Reduce and reductions are good ways to traverse a sequence keeping a state. If you feel your code is not clear you can always use recursion with loop/recur or lazy-seq like this
(defn twice-plus-ntens
([coll] (twice-plus-ntens coll 0))
([coll ntens]
(lazy-seq
(when-let [s (seq coll)]
(let [item (first s)
new-ntens (if (= 10 item) (inc ntens) ntens)]
(cons (+ (* 2 item) new-ntens)
(twice-plus-ntens (rest s) new-ntens)))))))
Have a look at the source code of map by evaluating this at your REPL:
(source map)
I've skipped the chunked-sequence optimization and multiple-collection support.
You can make it a higher-order function this way:
(defn map-update
([mf uf coll] (map-update mf uf (uf) coll))
([mf uf val coll]
(lazy-seq
(when-let [s (seq coll)]
(let [item (first s)
new-status (uf item val)]
(cons (mf item new-status)
(map-update mf uf new-status (rest s))))))))
(defn update-f
([] 0)
([item status]
(if (= item 10) (inc status) status)))
(defn map-f [item status]
(+ (* 2 item) status))
(map-update map-f update-f in)

The most appropriate way is to use a function with state:
(into
  []
  (map
    (let [mem (atom 0)]
      (fn [val]
        (when (== val 10) (swap! mem inc))
        (+ @mem (* val 2)))))
  [20 30 40 10 20 10 30])
Also see the standard function memoize.

Clojure - sort function

I am trying to write a recursive sort function that sorts a list from low to high (duh). I am currently getting output, just not the correct output. Here is my code:
(defn sort/predicate [pred loi]
(if (empty? loi)
()
(if (= (count loi) 1)
(cons (first loi) (sort pred (rest loi)))
(if (pred (first loi) (first (rest loi)))
(cons (first loi) (sort pred (rest loi)))
(if (pred (first (rest loi)) (first loi))
(cons (first (rest loi)) (sort pred (cons (first loi) (rest (rest loi)))))
(cons (first loi) (sort pred (rest loi))))))))
Basically, I compare the first two elements in the list and, if the first element is smaller I cons it with the result of comparing the next two elements of the list. If the second element of the list is smaller, I cons the second element with the result of sorting the first two elements of the cons of the first element and everything after the second element (sorry if that's hard to follow). Then, when there is only one element left in the list, I throw it on the end and return it. However, there is a bug along the way somewhere because I should get the following:
>(sort/predicate < '(8 2 5 2 3))
(2 2 3 5 8)
but instead, I get:
>(sort/predicate < '(8 2 5 2 3))
(2 5 2 3 8)
I'm pretty new to clojure, so any help is greatly appreciated. Also, I would like to keep my code roughly the same (I don't want to use a sorting function that already exists). Thanks

I don't think this is a very efficient way to sort, but I tried to stay true to your intention:
(defn my-sort [cmp-fn [x & xs]]
(cond
(nil? x) '()
(empty? xs) (list x)
:else (let [[y & ys :as s] (my-sort cmp-fn xs)]
(if (cmp-fn x y)
(cons x s)
(cons y (my-sort cmp-fn (cons x ys)))))))
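To see why the code in the question returns (2 5 2 3 8), it helps to notice that it performs only a single bubble pass: each step moves the larger of two neighbours one place to the right and never revisits the front of the list. The sketch below isolates that pass and then repeats it until nothing changes. The names bubble-pass and bubble-sort are hypothetical, and pred is assumed to be a strict comparison such as < or >; with <= equal neighbours would keep swapping and the loop would not end.

```clojure
;; One left-to-right pass: swaps neighbours that are out of order according to pred.
(defn bubble-pass [pred coll]
  (if-let [[x & xs] (seq coll)]
    (if-let [[y & ys] (seq xs)]
      (if (pred y x)
        (cons y (bubble-pass pred (cons x ys)))   ; y belongs before x: emit y, carry x along
        (cons x (bubble-pass pred xs)))           ; already in order: emit x
      (list x))
    ()))

;; Repeat the pass until the sequence stops changing.
(defn bubble-sort [pred coll]
  (let [pass (bubble-pass pred coll)]
    (if (= pass coll)
      pass
      (recur pred pass))))

(bubble-pass < '(8 2 5 2 3))
;=> (2 5 2 3 8)
(bubble-sort < '(8 2 5 2 3))
;=> (2 2 3 5 8)
```

bubble-pass is not tail-recursive, so very long inputs can exhaust the stack; it is meant to illustrate the missing repetition, not to be efficient.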

;; merge sort implementation - recursive sort without stack consuming
(defn merge-sort
([v comp-fn]
(if (< (count v) 2) v
(let [[left right] (split-at (quot (count v) 2) v)]
(loop [result []
sorted-left (merge-sort left comp-fn)
sorted-right (merge-sort right comp-fn)]
(cond
(empty? sorted-left) (into result sorted-right)
(empty? sorted-right) (into result sorted-left)
:else (if (comp-fn 0 (compare (first sorted-left) (first sorted-right)))
(recur (conj result (first sorted-left)) (rest sorted-left) sorted-right)
(recur (conj result (first sorted-right)) sorted-left (rest sorted-right))))))))
([v] (merge-sort v >)))

clojure.core/sort is implemented on top of Java's array sort and is more general:
user=> (sort '(8 2 5 2 3))
(2 2 3 5 8)
user=> (sort > '(8 2 5 2 3))
(8 5 3 2 2)
user=> (source sort)
(defn sort
"Returns a sorted sequence of the items in coll. If no comparator is
supplied, uses compare. comparator must implement
java.util.Comparator. If coll is a Java array, it will be modified.
To avoid this, sort a copy of the array."
{:added "1.0"
:static true}
([coll]
(sort compare coll))
([^java.util.Comparator comp coll]
(if (seq coll)
(let [a (to-array coll)]
(. java.util.Arrays (sort a comp))
(seq a))
())))
nil
user=>

Partition a seq by a "windowing" predicate in Clojure

I would like to "chunk" a seq into subseqs the same as partition-by, except that the function is not applied to each individual element, but to a range of elements.
So, for example:
(gather (fn [a b] (> (- b a) 2))
[1 4 5 8 9 10 15 20 21])
would result in:
[[1] [4 5] [8 9 10] [15] [20 21]]
Likewise:
(defn f [a b] (> (- b a) 2))
(gather f [1 2 3 4]) ;; => [[1 2 3] [4]]
(gather f [1 2 3 4 5 6 7 8 9]) ;; => [[1 2 3] [4 5 6] [7 8 9]]
The idea is that I apply the start of the list and the next element to the function, and if the function returns true we partition the current head of the list up to that point into a new partition.
I've written this:
(defn gather
[pred? lst]
(loop [acc [] cur [] l lst]
(let [a (first cur)
b (first l)
nxt (conj cur b)
rst (rest l)]
(cond
(empty? l) (conj acc cur)
(empty? cur) (recur acc nxt rst)
((complement pred?) a b) (recur acc nxt rst)
:else (recur (conj acc cur) [b] rst)))))
and it works, but I know there's a simpler way. My question is:
Is there a built in function to do this where this function would be unnecessary? If not, is there a more idiomatic (or simpler) solution that I have overlooked? Something combining reduce and take-while?
Thanks.

Original interpretation of question
We (all) seemed to have misinterpreted your question as wanting to start a new partition whenever the predicate held for consecutive elements.
Yet another, lazy, built on partition-by
(defn partition-between [pred? coll]
(let [switch (reductions not= true (map pred? coll (rest coll)))]
(map (partial map first) (partition-by second (map list coll switch)))))
(partition-between (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))
Actual Question
The actual question asks us to start a new partition whenever pred? holds for the beginning of the current partition and the current element. For this we can just rip off partition-by with a few tweaks to its source.
(defn gather [pred? coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
run (cons fst (take-while #((complement pred?) fst %) (next s)))]
(cons run (gather pred? (seq (drop (count run) s))))))))
(gather (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))
(gather (fn [a b] (> (- b a) 2)) [1 2 3 4])
;=> ((1 2 3) (4))
(gather (fn [a b] (> (- b a) 2)) [1 2 3 4 5 6 7 8 9])
;=> ((1 2 3) (4 5 6) (7 8 9))

Since you need information from the previous or next element in addition to the one you are currently deciding on, a partition into pairs combined with a reduce could do the trick in this case.
This is what I came up with after some iterations:
(defn gather [pred s]
(->> (partition 2 1 (repeat nil) s) ; partition the sequence and if necessary
; fill the last partition with nils
(reduce (fn [acc [x :as s]]
(let [n (dec (count acc))
acc (update-in acc [n] conj x)]
(if (apply pred s)
(conj acc [])
acc)))
[[]])))
(gather (fn [a b] (when (and a b) (> (- b a) 2)))
[1 4 5 8 9 10 15 20 21])
;= [[1] [4 5] [8 9 10] [15] [20 21]]
The basic idea is to make partitions of as many elements as the predicate function takes, filling the last partition with nils if necessary. The function then reduces over the partitions, adding the first element of each partition to the current group and, if the predicate is met, starting a new group. Since the last partition could have been filled with nils, the predicate has to be modified to cope with them.
Two possible improvements to this function would be to let the user:
Define the value used to fill the last partition, so the reducing function could check whether any of the elements in the partition is this value.
Specify the arity of the predicate, thus allowing the grouping to take into account the current and the next n elements.

I wrote this some time ago in useful:
(defn partition-between [split? coll]
(lazy-seq
(when-let [[x & more] (seq coll)]
(lazy-loop [items [x], coll more]
(if-let [[x & more] (seq coll)]
(if (split? [(peek items) x])
(cons items (lazy-recur [x] more))
(lazy-recur (conj items x) more))
[items])))))
It uses lazy-loop, which is just a way to write lazy-seq expressions that look like loop/recur, but I hope it's fairly clear.
I linked to a historical version of the function, because later I realized there's a more general function that you can use to implement partition-between, or partition-by, or indeed lots of other sequential functions. These days the implementation is much shorter, but it's less obvious what's going on if you're not familiar with the more general function I called glue:
(defn partition-between [split? coll]
(glue conj []
(fn [v x]
(not (split? [(peek v) x])))
(constantly false)
coll))
Note that both of these solutions are lazy, which at the time I'm writing this is not true of any of the other solutions in this thread.
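For readers without the useful library, the first version can be approximated with plain lazy-seq and a local helper. This is a sketch under the assumption that lazy-loop and lazy-recur behave like a loop whose body is wrapped in lazy-seq; the helper name step is hypothetical. As in the answer, split? receives a two-element vector of the previous and the current item.

```clojure
(defn partition-between [split? coll]
  (lazy-seq
    (when-let [[x & more] (seq coll)]
      (letfn [(step [items coll]
                (lazy-seq
                  (if-let [[x & more] (seq coll)]
                    (if (split? [(peek items) x])
                      (cons items (step [x] more))    ; close the current group, start a new one
                      (step (conj items x) more))     ; keep growing the current group
                    [items])))]                       ; input exhausted: emit the last group
        (step [x] more)))))

(partition-between (fn [[a b]] (> (- b a) 2))
                   [1 4 5 8 9 10 15 20 21])
;=> ([1] [4 5] [8 9 10] [15] [20 21])
```

Because each group is only built when it is requested, the function also works on infinite input as long as a split eventually occurs.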

Here is one way, with steps split up. It can be narrowed down to fewer statements.
(def l [1 4 5 8 9 10 15 20 21])
(defn reduce_fn [f x y]
(cond
(f (last (last x)) y) (conj x [y])
:else (conj (vec (butlast x)) (conj (last x) y)) )
)
(def reduce_fn1 (partial reduce_fn #(> (- %2 %1) 2)))
(reduce reduce_fn1 [[(first l)]] (rest l))
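Narrowed down to a single function, the steps above could look like the following sketch. The name gather-consecutive is hypothetical; peek and pop are used in place of last and butlast because they work in constant time on vectors. Like the code above, it compares each element with the last element of the current group.

```clojure
(defn gather-consecutive [pred coll]
  (if-let [[x & xs] (seq coll)]
    (reduce (fn [groups y]
              (if (pred (peek (peek groups)) y)
                (conj groups [y])                              ; start a new group
                (conj (pop groups) (conj (peek groups) y))))   ; extend the last group
            [[x]]
            xs)
    []))

(gather-consecutive #(> (- %2 %1) 2) [1 4 5 8 9 10 15 20 21])
;=> [[1] [4 5] [8 9 10] [15] [20 21]]
```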

keep-indexed is a wonderful function. Given a function f and a vector lst,
(keep-indexed (fn [idx it] (if (apply f it) idx))
              (partition 2 1 lst))
(0 2 5 6)
this returns the indices after which you want to split. Let's increment them and tack a 0 at the front:
(cons 0 (map inc (.....)))
(0 1 3 6 7)
Partition these to get ranges:
(partition 2 1 nil (....))
((0 1) (1 3) (3 6) (6 7) (7))
Now use these to generate subvecs:
(map (partial apply subvec lst) ....)
([1] [4 5] [8 9 10] [15] [20 21])
Putting it all together:
(defn gather
[f lst]
(let [indices (cons 0 (map inc
(keep-indexed (fn [idx it]
(if (apply f it) idx))
(partition 2 1 lst))))]
(map (partial apply subvec (vec lst))
(partition 2 1 nil indices))))
(gather #(> (- %2 %) 2) '(1 4 5 8 9 10 15 20 21))
([1] [4 5] [8 9 10] [15] [20 21])

Clojure: finding sequential items from a sequence

In a Clojure program, I have a sequence of numbers:
(2 3 4 6 8 1)
I want to find the longest sub-sequence where the items are sequential:
(2 3 4)
I am assuming that it will involve (take-while ...) or (reduce ...).
Any ideas?
Clarification: I need the longest initial list of sequential items. Much easier, I'm sure. Thanks for the solutions to the more difficult problem I initially posed.

If you are only interested in the longest initial sequence, it's a 1-liner:
(defn longest-initial-sequence [[x :as s]]
(take-while identity (map #(#{%1} %2) s (iterate inc x))))
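The one-liner pairs each element with the value it would have if the run were unbroken (x, x+1, x+2, ...) and stops at the first mismatch. A more spelled-out sketch of the same idea follows; the name longest-initial-run is hypothetical.

```clojure
(defn longest-initial-run [coll]
  (if-let [[x] (seq coll)]
    (->> (map vector coll (iterate inc x))                          ; [actual expected] pairs
         (take-while (fn [[actual expected]] (= actual expected)))  ; stop at the first gap
         (map first))
    ()))

(longest-initial-run [2 3 4 6 8 1])
;=> (2 3 4)
```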

Taking into account the OP's comment on the question -- which completely changes the game! -- this can be written very simply:
(let [doubletons (partition 2 1 [1 2 3 5 6])
increment? (fn increment? [[x y]]
(== (inc x) y))]
(cons (ffirst doubletons)
(map second (take-while increment? doubletons))))
;; returns (1 2 3)
Note that this is actually lazy. I expect it not to hold onto the head of doubletons thanks to locals clearing. Another version:
(cons (first [1 2 3 5 6])
(map second (take-while increment? (partition 2 1 [1 2 3 5 6]))))
The original version of the question is more fun, though! :-) A super-simple solution to that could be built using the above, but of course that would be significantly less performant than using reduce. I'll see if I have anything substantially different from zmila's and dnolen's solutions -- and yet still reasonably performant -- to add to that part of this thread later. (Not very likely, I guess.)

Answer to original:
(defn conj-if-sequential
([] [])
([a] a)
([a b] (let [a (if (vector? a) a [a])]
(if (= (inc (last a)) b)
(conj a b)
a))))
(reduce conj-if-sequential [2 3 4 6 8 1])
A more generic solution for those interested:
(defn sequential-seqs
([] [])
([a] a)
([a b] (let [n (last (last a))]
(if (and n (= (inc n) b))
(update-in a [(dec (count a))] conj b)
(conj a [b])))))
(defn largest
([] nil)
([a] a)
([a b] (if (> (count b) (count a)) b a)))
(reduce largest (reduce sequential-seqs [] [2 3 4 6 8 1 4 5 6 7 8 9 13]))
I think this is much better.

(defn find-max-seq [lst]
(let [[f & r] lst,
longest-seq (fn [a b] (if (> (count a) (count b)) a b)),
[last-seq max-seq] (reduce
(fn [ [[prev-num & _ :as cur-seq] max-seq] cur-num ]
(if (== (inc prev-num) cur-num)
[(conj cur-seq cur-num) max-seq]
[(list cur-num) (longest-seq cur-seq max-seq)]
))
[(list f) ()]
r)]
(reverse (longest-seq last-seq max-seq))))
(find-max-seq '(2 3 4 6 8 1)) ; ==> (2 3 4)
(find-max-seq '(3 2 3 4 6 8 9 10 11)) ; ==> (8 9 10 11)
