Better Sequence Duplicate Remover - clojure

I made this function to remove consecutive duplicates, but I wanted to know if there was a better or shorter way to express it using distinct or something like that.
(defn duplicates
  [s]
  (reduce
    #(if-not (= (last %1) %2)
       (conj %1 %2) %1)
    [] s))

clojure-1.7.0-alpha1 has a correct version of this function under the name dedupe.
The one you quoted returns its input sequence without consecutive duplicates. Almost certainly unintentionally, it also drops all nil values that appear at the start of the input sequence.
#(if-not (= (last %1) %2)
   (conj %1 %2)
   %1)
The lambda to reduce says: If the last element of the accumulator (%1) is unequal to the next input element (%2), add it to the accumulator, otherwise return the accumulator.
Because (last []) evaluates to nil it will never add nil values while the accumulator is empty. I leave fixing that as an exercise to the reader:
Make sure that duplicates returns the expected result [nil true nil] for input [nil nil true true nil].
Note: When operating with a vector, using peek performs significantly better than last.
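One possible way to do the exercise, offered as a sketch rather than as the answerer's intended solution: compare against the previous element only when the accumulator is non-empty, and use peek instead of last. The parameter names acc and x are illustrative.

```clojure
(defn duplicates
  "Removes consecutive duplicates from s, returning a vector."
  [s]
  (reduce
    (fn [acc x]
      ;; Compare only against a real previous element; with an empty
      ;; accumulator there is nothing to compare to, so always conj.
      (if (and (seq acc) (= (peek acc) x))
        acc
        (conj acc x)))
    []
    s))

(duplicates [nil nil true true nil]) ;; => [nil true nil]
```

Because (seq acc) is nil only for the empty vector, a leading nil is kept while later runs of nil are still collapsed.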
EDIT (since you edited your question): distinct returns each value of the input sequence only once. Unlike set, it returns a lazy sequence.
A more idiomatic way to write duplicates/dedupe is the one that A. Webb posted as a comment (since it is also lazy). Otherwise, fixing the lambda to work correctly with an empty accumulator as its input and using peek instead of last would be more idiomatic.
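The comment referred to here is not reproduced in this thread. A common lazy formulation, assumed to be in the same spirit, groups runs of equal elements with partition-by and keeps the first element of each run. The name dedupe-lazy is made up for this sketch, which also contrasts the result with distinct.

```clojure
(defn dedupe-lazy
  "Lazily removes consecutive duplicates from s."
  [s]
  (map first (partition-by identity s)))

(dedupe-lazy [1 1 2 2 1 1])            ;; => (1 2 1)  only consecutive repeats removed
(distinct [1 1 2 2 1 1])               ;; => (1 2)    every repeat removed
(dedupe-lazy [nil nil true true nil])  ;; => (nil true nil)
(take 3 (dedupe-lazy (cycle [1 1 2]))) ;; => (1 2 1)  works on an infinite input
```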
Instead of fixing the lambda, in clojure-1.7.0-alpha1 you would use the dedupe transducer for eager evaluation, e.g.:
(into [] (dedupe) [nil nil true true nil])
-> [nil true nil]
Or for lazy evaluation:
(dedupe [nil nil true true nil])
-> (nil true nil)

The anonymous function inside the reduce conjs an element to a sequence if it's different from the last element. You could rewrite it like this:
(defn maybe-conj [s e]
  (if (= (last s) e)
    s
    (conj s e)))
Then you could rewrite compress-sequence as:
(defn compress-sequence [s]
  (reduce maybe-conj [] s))
What this will do is go through each element of a sequence, and append it to the initial empty vector only if it's different from the last currently in that vector. The output will be a vector of the initial sequence with any runs removed. Example:
(compress-sequence [1 1 1 1 1 2 3 4 5 5 5 5 6 6 6 5 6 6 7 8 9])
evaluates to
[1 2 3 4 5 6 5 6 7 8 9]

Related

Clojure manually find nth element in a sequence

I am a newbie to Clojure (and functional programming, for that matter) and I was trying to do some basic problems. I was trying to find the nth element in a sequence without recursion.
So, something like
(my-nth '(1 2 3 4) 2) => 3
I had a hard time looping through the list and returning when I found the nth element. I tried a bunch of different ways, and the code that I ended up with is
(defn sdsu-nth
  [input-list n]
  (loop [cnt n tmp-list input-list]
    (if (zero? cnt)
      (first tmp-list)
      (recur (dec cnt) (pop tmp-list)))))
This gives me an exception which says "can't pop from empty list".
I don't need code, but if someone could point me in the right direction it would really help!
You are using the function pop, which has different behavior for different data structures.
user> (pop '(0 1 2 3 4))
(1 2 3 4)
user> (pop [0 1 2 3 4])
[0 1 2 3]
user> (pop (map identity '(0 1 2 3 4)))
ClassCastException clojure.lang.LazySeq cannot be cast to clojure.lang.IPersistentStack clojure.lang.RT.pop (RT.java:640)
Furthermore, you are mixing calls to pop with calls to first. When iterating, use peek/pop or first/rest as pairs; mixing the two can lead to unexpected results. first/rest are the lowest common denominator: if you want to generalize over various sequential types, use those, and they will coerce their argument to a sequence if they can.
user> (first "hello")
\h
user> (first #{0 1 2 3 4})
0
user> (first {:a 0 :b 1 :c 2})
[:c 2]
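To make the pairing concrete, here is a small sketch comparing the two pairs on a list and on a vector. The names as-list and as-vec are made up for the example.

```clojure
(def as-list '(0 1 2 3 4))
(def as-vec  [0 1 2 3 4])

;; peek/pop work at the "stack end": the front of a list, the back of a vector.
[(peek as-list) (pop as-list)]   ;; => [0 (1 2 3 4)]
[(peek as-vec)  (pop as-vec)]    ;; => [4 [0 1 2 3]]

;; first/rest always work from the front, whatever the collection type.
[(first as-list) (rest as-list)] ;; => [0 (1 2 3 4)]
[(first as-vec)  (rest as-vec)]  ;; => [0 (1 2 3 4)]
```

Mixing first with pop on a vector therefore keeps reading the front while removing from the back, which is why the index appears to be ignored.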
With your function, replacing pop with rest, we get the expected results:
user> (defn sdsu-nth
        [input-list n]
        (loop [cnt n tmp-list input-list]
          (if (zero? cnt)
            (first tmp-list)
            (recur (dec cnt) (rest tmp-list)))))
#'user/sdsu-nth
user> (sdsu-nth (map identity '(0 1 2 3 4)) 2)
2
user> (sdsu-nth [0 1 2 3 4] 2)
2
user> (sdsu-nth '(0 1 2 3 4) 2)
2
user> (sdsu-nth "01234" 2)
\2
Given a list list_nums, take the first n + 1 elements and return the last of them, which is the nth element.
(fn [list_nums n] (last (take (inc n) list_nums)))
and alternatively:
#(last (take (inc %2) %1))
proof:
(= (#(last (take (inc %2) %1)) '(4 5 6 7) 2) 6) ;; => true
What you would really want to do is use the built-in nth function as it does exactly what you're asking:
http://clojuredocs.org/clojure_core/clojure.core/nth
However, since you're learning, this is still a good exercise. Your code actually works for me. Make sure you're giving it a list and not a vector: pop does something different with vectors (it returns the vector without the last item rather than the first).
Your code works fine for lists as long as the supplied index is less than the length of the sequence (you've implemented a zero-indexed nth). You get this error when tmp-list becomes empty before cnt reaches zero.
It does not work so well with vectors:
user> (sdsu-nth [1 2 3 4] 2)
;; => 1
user> (sdsu-nth [10 2 3 4] 2)
;; => 10
It seems to return the element at index 0 for every supplied index. As noisesmith noticed, this happens because pop works differently for vectors because of their internal structure. For vectors, pop removes elements from the end, and first then returns the first value of whatever vector remains.
How to fix: use rest instead of pop, to remove differences in behavior of your function when applied to lists and vectors.
(fn [xs n]
  (if (= n 0)
    (first xs)
    (recur (rest xs) (dec n))))
One more way that I thought of doing this, making it truly non-recursive (i.e. without loop/recur), is
(defn sdsu-nth
  [input-list n]
  (if (zero? (count input-list))
    (throw (Exception. "IndexOutOfBoundsException"))
    (if (>= n (count input-list))
      (throw (Exception. "IndexOutOfBoundsException"))
      (if (neg? n)
        (throw (Exception. "IndexOutOfBoundsException"))
        (last (take (+ n 1) input-list))))))
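The three nested checks can be collapsed, and the count calls avoided, by dropping n elements and checking whether anything is left. This is only a sketch: the name checked-nth is made up here, and throwing java.lang.IndexOutOfBoundsException itself (rather than a generic Exception carrying that name as its message) is a choice of this example, not something the thread prescribes.

```clojure
(defn checked-nth
  "Returns the element at zero-based index n of coll, or throws if n is out of bounds."
  [coll n]
  ;; A negative n yields nil here; otherwise skip the first n elements lazily.
  (let [tail (when-not (neg? n) (drop n coll))]
    (if (seq tail)
      (first tail)
      (throw (IndexOutOfBoundsException. (str "index " n " is out of bounds"))))))

(checked-nth '(1 2 3 4) 2) ;; => 3
(checked-nth [1 2 3 4] 2)  ;; => 3
```

Since drop is lazy, at most n + 1 elements are realized, so this also works on sequences whose length is unknown.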

How do I use nth correctly here?

I'm pretty new to this, so I apologize if this seems trivial.
When trying to use nth, I get an IndexOutOfBoundsException because the size of lst could be less than 2. How can I fix this?
(defn invert-helper [lst]
  (list (nth lst 1) (first lst)))
thanks!
The nth function has a 3-arity option for the case where you don't want an exception thrown for being out of bounds. You provide, as the 3rd argument, the value you want returned in case the index is out of bounds. This avoids the inefficiency of first checking for the length and then doing nth in uncounted sequences.
user=> (nth [1 2 3] 5 nil)
nil
user=> (nth [1 2 3] 5 ::not-found)
:user/not-found
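Applied to the function from the question, that might look like the sketch below. Choosing nil as the not-found value is an assumption of this example; any sentinel would do.

```clojure
(defn invert-helper
  [lst]
  ;; (nth lst 1 nil) yields nil instead of throwing when lst has fewer than 2 items.
  (list (nth lst 1 nil) (first lst)))

(invert-helper '(1 2)) ;; => (2 1)
(invert-helper '(1))   ;; => (nil 1)
(invert-helper '())    ;; => (nil nil)
```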

repeatedly apply a function until test no longer yields true

I wrote this code to nest a function n times and am trying to extend the code to handle a test. Once the test returns nil, the loop is stopped. The output should be a vector containing the elements that tested true. Is it simplest to add a while loop in this case? Here is a sample of what I've written:
(defn nester [a inter f]
  (loop [level inter expr a]
    (if (= level 0) expr
      (if (> level 0) (recur (dec level) (f expr))))))
An example input would be the integer 2, and I want to nest the inc function until the output is greater than 6. The output should be [2 3 4 5 6 7].
(defn nester [a inter f test-fn]
  (loop [level inter
         expr a]
    (if (or (zero? level)
            (nil? (test-fn expr)))
      expr
      (recur (dec level)
             (f expr)))))
If you also accept false (in addition to nil) from your test-fn, you could compose this more lazily:
(defn nester [a inter f test-fn]
  (->> (iterate f a)
       (take (inc inter))
       (drop-while test-fn)
       first))
EDIT: The above answered your initial question. Now that you have completely changed the meaning of your question:
If you want to generate a vector of all iterations of a function f over a value n with a predicate p:
(defn nester [f n p]
  (->> (iterate f n)
       (take-while p)
       vec))
(nester inc 2 (partial > 8)) ;; predicate "until the output is greater than six"
                             ;; translated to "as long as 8 is greater than
                             ;; the output"
=> [2 3 4 5 6 7]
To "nest" or iterate a function over a value, Clojure has the iterate function. For example, (iterate inc 2) can be thought of as an infinite lazy list [2, (inc 2), (inc (inc 2)), (inc (inc (inc 2))) ...] (I use the [] brackets not to denote a "list"--in fact, they represent a "vector" in Clojure terms--but to avoid confusion with () which can denote a data list or an s-expression that is supposed to be a function call--iterate does not return a vector). Of course, you probably don't want an infinite list, which is where the lazy part comes in. A lazy list will only give you what you ask it for. So if you ask for the first ten elements, that's what you get:
user> (take 10 (iterate inc 2))
> (2 3 4 5 6 7 8 9 10 11)
Of course, you could try to ask for the whole list, but be prepared to either restart your REPL, or dispatch in a separate thread, because this call will never end:
user> (iterate inc 2)
> (2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
=== Shutting down REPL ===
=== Starting new REPL at C:\Users\Omnomnomri\Clojure\user ===
Clojure 1.5.0
user>
Here, I'm using clooj, and this is what it looks like when I restart my REPL. Anyways, that's all just a tangent. The point is that iterate answers the core of your question. The other part, stopping upon some test condition, involves take-while. As you might imagine, take-while is a lot like take, only instead of stopping after some number of elements, it stops upon some test condition (or in Clojure parlance, a predicate):
user> (take-while #(< % 10) (iterate inc 2))
> (2 3 4 5 6 7 8 9)
Note that take-while is exclusive with its predicate test, so that here, once a value fails the test (of being less than 10), it excludes that value and only includes the previous values in the returned result. At this point, solving your example is pretty straightforward:
user> (take-while #(< % 7) (iterate inc 2))
> (2 3 4 5 6)
And if you need it to be a vector, wrap the whole thing in a call to vec:
user> (vec (take-while #(< % 7) (iterate inc 2)))
> [2 3 4 5 6]
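The question's expected output, [2 3 4 5 6 7], also contains the first value that is greater than 6, which take-while alone leaves out. One way to keep that value, shown here as a sketch with the made-up name iterate-until, is to split the iteration at the first value that satisfies the stop condition and append that one value.

```clojure
(defn iterate-until
  "Collects x, (f x), (f (f x)), ... up to and including the first value
  for which done? returns true. Never returns if done? is never satisfied."
  [f x done?]
  (let [[before after] (split-with (complement done?) (iterate f x))]
    (vec (concat before (take 1 after)))))

(iterate-until inc 2 #(> % 6)) ;; => [2 3 4 5 6 7]
```

split-with is lazy on both halves, so only as much of the infinite iterate sequence is realized as is needed to reach the stopping value.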

Writing a sequence-maybe monad using clojure.algo.monads

I need help with writing a 'sequence-maybe-m' (a monad that combines the behaviour of a sequence monad with a maybe monad).
The rule should be:
If any of the inputs are nil, then the whole expression fails.
Otherwise, evaluate the body like a sequence monad would do.
(domonad sequence-maybe-m [a [1 2 3] b [1 2 3]] (+ a b))
;; => (2 3 4 3 4 5 4 5 6)
(domonad sequence-maybe-m [a [1 2 3] b nil] (+ a b))
;; => nil
(domonad sequence-maybe-m [a [1 2 3] b (range a)] (+ a b))
;; => (1 2 3 3 4 5) same as 'for'
(domonad sequence-maybe-m [a [1 2 3] b [1 nil 3]] (+ a b))
;; => nil
It'll be a bonus if it is compatible with the clojure.algo.monads library:
(defmonad sequence-maybe-m
  [m-result <...>
   m-bind   <...>
   m-zero   <...>
   m-plus   <...>
  ])
where <...> are functions.
; helper function for nil-ness
(defn nil-or-has-nil? [xs] (or (nil? xs) (some nil? xs)))
; the actual monad
(defmonad sequence-maybe-m
  [m-result (fn [v] [v])                      ; lift any value into a sequence
   m-bind   (fn [mv f]                        ; given a monadic value and a function
              (if (nil-or-has-nil? mv)        ; if any nil,
                nil                           ; result in nil
                (let [result (map f mv)]      ; map over valid input seq
                  (if (some nil? result)      ; if any nils result
                    nil                       ; return nil
                    (apply concat result))))) ; else flatten resulting seq
   m-plus   (fn [& mvs]                       ; given a sequence of mvs
              (if (some nil-or-has-nil? mvs)  ; if any nil,
                nil                           ; result in nil
                (apply concat mvs)))          ; otherwise, join seqs
   m-zero   []])                              ; empty seq is identity for concatenation
The only point really worth watching out for here is the second nil check in the m-bind, (some nil? result). The first check is expected: passed a monadic value, m-bind has to determine whether it's nil-ish and should immediately result in nil. The second checks the results of the computation: if it failed (producing any nil), then the overall result must be nil (as opposed to, say, the empty list resulting from (apply concat [nil nil ...])).
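To see the two checks at work without pulling in the library, the bind logic can be sketched with plain functions. The names unit and bind are stand-ins for m-result and m-bind, and the nested calls are hand-written on the assumption that domonad expands to roughly this shape.

```clojure
(defn nil-or-has-nil? [xs]
  (or (nil? xs) (some nil? xs)))

(defn unit [v] [v])                  ; stand-in for m-result

(defn bind [mv f]                    ; stand-in for m-bind
  (if (nil-or-has-nil? mv)           ; first check: is the input nil-ish?
    nil
    (let [result (map f mv)]
      (if (some nil? result)         ; second check: did any step fail?
        nil
        (apply concat result)))))

(bind [1 2 3] (fn [a] (bind [1 2 3] (fn [b] (unit (+ a b))))))
;; => (2 3 4 3 4 5 4 5 6)
(bind [1 2 3] (fn [a] (bind [1 nil 3] (fn [b] (unit (+ a b))))))
;; => nil
(bind [1 2 3] (fn [a] (bind (range a) (fn [b] (unit (+ a b))))))
;; => (1 2 3 3 4 5)
```

In the second call the inner bind returns nil for every a, so the outer result is (nil nil nil), and only the second check turns that into nil.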
The output of domonad must be a monadic value, in the case of sequence-m that means it must be a sequence. Asking for an output of nil breaks that and you do not have a monad.
What you are probably looking for is adding "maybe" to the sequence monad directly using monad transformers, which is quite easy to do and is described here: http://clojuredocs.org/clojure_contrib/1.2.0/clojure.contrib.monads/maybe-t.
You will want to write
(def sequence-maybe-m (maybe-t sequence-m))
where maybe-t adds the "maybe" to the sequence monad. Using this will make
(domonad sequence-maybe-m [a [1 2 3] b [1 nil 3]] (+ a b))
yield
(2 nil 4 3 nil 5 4 nil 6)
which is valid output for a monad of this type. If you need to cancel out results that have nil in them, just use some nil? on the output of the monad to check them.
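That post-check could be sketched as below, applied to the literal output quoted in this answer so that the example does not depend on the monads library. The helper name all-or-nothing is hypothetical.

```clojure
(defn all-or-nothing
  "Returns results unchanged, or nil if any element of results is nil."
  [results]
  (when-not (some nil? results)
    results))

(all-or-nothing '(2 nil 4 3 nil 5 4 nil 6)) ;; => nil
(all-or-nothing '(2 3 4 3 4 5 4 5 6))       ;; => (2 3 4 3 4 5 4 5 6)
```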
Binding nil to b as you ask for in your example
(domonad sequence-maybe-m [a [1 2 3] b nil] (+ a b))
does not make sense either, since nil is not a sequence. In the transformed monad, the return value would be the empty list (). It would be more appropriate to bind [nil] to b, then you would get (nil nil nil).
It helps to remember that monads are used to compose functions of the same signature and can themselves be part of such a composition, so they must yield monadic values (in this case, sequences) themselves and in their body any binding must also be with a monadic value.

function for finding if x is a multiple of y

Look at the function below. I want to pass a vector of factors and test if any of the elements in the vector is a factor of x. How do I do that?
(defn multiple?
  "Takes a seq of factors, and returns true if x is multiple of any factor."
  ([x & factors] (for [e m] ))
  ([x factor] (= 0 (rem x factor))))
You could try using some and map:
(defn multiple? [x & factors]
  (some zero? (map #(rem x %) factors)))
Also, some returns nil if all tests fail. If you need it to actually return false, you could put a true? in there:
(defn multiple? [x & factors]
  (true? (some zero? (map #(rem x %) factors))))
Note that some short-circuits and map is lazy, so multiple? stops as soon as a match is found. e.g. the following code tests against the sequence 1,2,3,4,....
=> (apply multiple? 10 (map inc (range)))
true
Obviously this computation can only terminate if multiple? doesn't test against every number in the sequence.
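As a variation on the true? wrapper, boolean is another way to coerce the nil from some into false; the sketch below also folds the map into the predicate. This is a stylistic alternative, not a correction of the version above.

```clojure
(defn multiple?
  "Returns true if x is a multiple of any of the given factors, else false."
  [x & factors]
  (boolean (some #(zero? (rem x %)) factors)))

(multiple? 10 3 4)                      ;; => false
(multiple? 10 3 4 5 6)                  ;; => true
(apply multiple? 10 (map inc (range)))  ;; => true, stops at the first factor, 1
```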
You can solve it only using some.
=> (defn multiple? [x factors]
     (some #(zero? (rem x %)) factors))
#'user/multiple?
=> (= true (multiple? 10 [3 4]))
false
=> (= true (multiple? 10 [3 4 5 6]))
true
some stops at the first element of factors that is a factor of x.
Try this, using explicit tail recursion:
(defn multiple?
  "if any of the elements in the vector is a factor of x"
  [x factors]
  (loop [factors factors]
    (cond (empty? factors) false
          (zero? (rem x (first factors))) true
          :else (recur (rest factors)))))
The advantages of the above solution include: it will stop as soon as it finds if any of the elements in the vector is a factor of x, without iterating over the whole vector; it's efficient and runs in constant space thanks to the use of tail recursion; and it returns directly a boolean result, no need to consider the case of returning nil. Use it like this:
(multiple? 10 [3 4])
=> false
(multiple? 10 [3 4 5 6])
=> true
If you want to avoid explicitly passing a vector (calling the procedure like this: (multiple? 10 3 4 5 6)), then simply add a & to the parameter list, just as in the question.
A more Clojurian way is to write a more general-purpose function: instead of answering a true/false question, it returns all factors of x. And because sequences are lazy, it is almost as efficient if you only want to find out whether the result is empty.
(defn factors [x & fs]
  (for [f fs :when (zero? (rem x f))] f))
(factors 5 2 3 4)
=> ()
(factors 6 2 3 4)
=> (2 3)
Then you can answer your original question by simply using empty? (note that it returns true when x is not a multiple of any of the candidates):
(empty? (factors 5 2 3 4))
=> true
(empty? (factors 6 2 3 4))
=> false
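Since empty? answers the inverse question, a multiple? with the original polarity can be layered on top using seq, which returns nil for an empty sequence. This is a sketch; the definition of multiple? in terms of factors is an assumption of this example rather than something the answer spelled out.

```clojure
(defn factors [x & fs]
  (for [f fs :when (zero? (rem x f))] f))

(defn multiple?
  "Returns true if x is a multiple of at least one of fs."
  [x & fs]
  ;; seq is nil when no candidate divides x; boolean turns that into false.
  (boolean (seq (apply factors x fs))))

(multiple? 5 2 3 4) ;; => false
(multiple? 6 2 3 4) ;; => true
```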