Clojure: How to generate a 'trie'? - clojure

Given the following...
(def inTree
'((1 2)
(1 2 3)
(1 2 4 5 9)
(1 2 4 10 15)
(1 2 4 20 25)))
How would you transform it to this trie?
(def outTrie
'(1
(2 ()
(3 ())
(4 (5
(9 ()))
(10
(15 ()))
(20
(25 ()))))))

Here's a cleaned up solution. This fixes a bug in Brian's add-to-trie method, which currently depends on the seqs being inserted in increasing-length order. It also allows querying the trie by prefix, which is a common use case.
Note the memory usage here is higher since it stores the values in the leaf nodes of the trie so you can perform searches.
(defn add-to-trie [trie x]
  (assoc-in trie x (merge (get-in trie x) {:val x :terminal true})))
(defn in-trie?
  "Returns true if the value x exists in the specified trie."
  [trie x]
  (:terminal (get-in trie x) false))
(defn prefix-matches
  "Returns a list of matches with the prefix specified in the trie specified."
  [trie prefix]
  (keep :val (tree-seq map? vals (get-in trie prefix))))
(defn build-trie
  "Builds a trie over the values in the specified seq coll."
  [coll]
  (reduce add-to-trie {} coll))
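A short usage sketch of the functions above, assuming the five input seqs from the question (the names in-tree and trie are only illustrative). The nodes are plain hash maps, so the order of the prefix-matches results is not guaranteed; the example therefore compares them as a set.

```clojure
(defn add-to-trie [trie x]
  (assoc-in trie x (merge (get-in trie x) {:val x :terminal true})))

(defn in-trie? [trie x]
  (:terminal (get-in trie x) false))

(defn prefix-matches [trie prefix]
  (keep :val (tree-seq map? vals (get-in trie prefix))))

(defn build-trie [coll]
  (reduce add-to-trie {} coll))

;; Illustrative data: the seqs from the question.
(def in-tree '((1 2) (1 2 3) (1 2 4 5 9) (1 2 4 10 15) (1 2 4 20 25)))
(def trie (build-trie in-tree))

(in-trie? trie '(1 2))     ; => true
(in-trie? trie '(1 2 4))   ; => false, (1 2 4) is only a prefix

;; Every stored seq that starts with (1 2 4), order unspecified.
(set (prefix-matches trie '(1 2 4)))
;; => #{(1 2 4 5 9) (1 2 4 10 15) (1 2 4 20 25)}
```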

Lists are very clumsy here, not to mention inefficient. In Clojure it's more idiomatic to use vectors and hash-maps and sets when appropriate. Using hash-maps:
(def in-tree
'((1 2)
(1 2 3)
(1 2 4 5 9)
(1 2 4 10 15)
(1 2 4 20 25)))
(defn add-to-trie [trie x]
  (assoc-in trie `(~@x :terminal) true))
(defn in-trie? [trie x]
  (get-in trie `(~@x :terminal)))
If you wanted it to print sorted you could use sorted-maps instead, but you'd have to write your own version of assoc-in that used sorted maps the whole way down. In any case:
user> (def trie (reduce add-to-trie {} in-tree))
#'user/trie
user> trie
{1 {2 {4 {20 {25 {:terminal true}}, 10 {15 {:terminal true}}, 5 {9 {:terminal true}}}, 3 {:terminal true}, :terminal true}}}
user> (in-trie? trie '(1 2))
true
user> (in-trie? trie '(1 2 4))
nil
user> (in-trie? trie '(1 2 4 20 25))
true
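A minimal sketch of the sorted-map variant mentioned above. The names node-key-compare, sorted-assoc-in and add-to-sorted-trie are hypothetical helpers, not part of clojure.core. One wrinkle: a plain sorted-map cannot hold both numbers and the :terminal keyword, because compare throws on mixed types, so this sketch assumes a custom comparator that sorts keywords before everything else.

```clojure
(defn node-key-compare
  "Orders keyword keys (such as :terminal) before all other keys."
  [a b]
  (compare [(if (keyword? a) 0 1) a]
           [(if (keyword? b) 0 1) b]))

(defn sorted-assoc-in
  "Like assoc-in, but every level it creates is a sorted map."
  [m [k & ks] v]
  (let [m (or m (sorted-map-by node-key-compare))]
    (if ks
      (assoc m k (sorted-assoc-in (get m k) ks v))
      (assoc m k v))))

(defn add-to-sorted-trie [trie x]
  (sorted-assoc-in trie (concat x [:terminal]) true))

(def in-tree '((1 2) (1 2 3) (1 2 4 5 9) (1 2 4 10 15) (1 2 4 20 25)))
(def sorted-trie (reduce add-to-sorted-trie nil in-tree))

;; Children now come back in key order.
(keys (get-in sorted-trie [1 2 4]))   ; => (5 10 20)
```

The vector trick in the comparator works because the first elements (0 or 1) already differ whenever a keyword meets a number, so the two are never compared directly.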

As a general approach, here's what I would do:
Write a few functions to create a trie and to insert new elements into a trie.
Create a new trie.
Iterate through the input list and insert each element into the trie.
This problem lends itself very well to a recursive implementation. I would aim for that if possible.
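The steps above could be sketched recursively as follows. This assumes a nested-map representation with a :terminal marker rather than the list layout from the question, and the names insert and make-trie are illustrative.

```clojure
(defn insert
  "Returns trie with the keys in path added and the last node marked :terminal."
  [trie path]
  (if-let [[x & more] (seq path)]
    ;; Recurse into (or create) the child for the first key.
    (update trie x insert more)
    ;; Path exhausted: mark this node as the end of an inserted seq.
    (assoc trie :terminal true)))

(defn make-trie
  "Creates a new trie and inserts each element of seqs into it."
  [seqs]
  (reduce insert {} seqs))

(def in-tree '((1 2) (1 2 3) (1 2 4 5 9) (1 2 4 10 15) (1 2 4 20 25)))

(make-trie in-tree)
;; => {1 {2 {:terminal true,
;;           3 {:terminal true},
;;           4 {5 {9 {:terminal true}},
;;              10 {15 {:terminal true}},
;;              20 {25 {:terminal true}}}}}}
```

The recursion depth equals the length of the inserted seq, which is fine for short keys like these.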

I'm sure there is a prettier way (there was! See Brian's answer, which is better):
(defn find-in-trie
"Finds a sub trie that matches an item, eg:
user=> (find-in-trie '(1 (2) (3 (2))) 3)
(3 (2))"
[tr item]
(first (for [ll (rest tr) :when (= (first ll) item)] ll)))
(defn add-to-trie
"Returns a new trie, the result of adding se to tr, eg:
user=> (add-to-trie nil '(1 2))
(1 (2))"
[tr se]
(cond
(empty? se) tr
(empty? tr) (add-to-trie (list (first se)) (rest se))
:else (if-let [st (find-in-trie tr (first se))]
(cons (first tr)
(cons (add-to-trie st (rest se))
(filter (partial not= st) (rest tr))))
(cons (first tr)
(cons (add-to-trie (list (first se)) (rest se))
(rest tr))))))
(def in '((1 2)
(1 2 3)
(1 2 4 5 9)
(1 2 4 10 15)
(1 2 4 20 25)))
(reduce add-to-trie '(nil) in)
-> (nil (1 (2 (4 (20 (25)) (10 (15)) (5 (9))) (3))))
Note that I've chosen to use nil as the root node and have not bothered keeping empty lists to signify no children. Actually doing it this way is not correct as it does not preserve substring identity.

Related

Clojure - using map recursively

If I have a list, I can use map to apply a function to each item of the list.
(map sqrt (list 1 4 9))
(1 2 3)
I can also use map in front of a list of lists:
(map count (list (list 1 2 3) (list 4 5)))
(3 2)
Now is there a way to apply sqrt to each number in the list of lists? I want to start from
(list (list 1 4 9) (list 16 25))
and obtain
((1 2 3)(4 5))
However, the following does not seem to work,
(map (map sqrt) (list (list 1 4 9) (list 16 25)))
nor the following.
(map (fn [x] (map sqrt x)) (list (list 1 4 9) (list 16 25)))
Why? (And how do I solve this?)
Your second to last version "nearly" works. Clojure has no automatic
currying, so (map sqrt) is not a partial application. Instead, (map sqrt)
returns a transducer: a function that takes one argument and returns a
function with three different arities. Running your code therefore gives
you back a function for each list of numbers.
To make that work, you can use partial:
user=> (map (partial map sqrt) (list (list 1 4 9) (list 16 25)))
((1 2 3) (4 5))
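A small sketch of the difference, assuming sqrt is defined in terms of Math/sqrt (the question does not show its definition, so the results here are doubles):

```clojure
;; Assumed definition; the question uses sqrt without showing one.
(defn sqrt [x] (Math/sqrt x))

(def nested (list (list 1 4 9) (list 16 25)))

;; One-argument map returns a transducer (a function), not a sequence.
(fn? (map sqrt))                        ; => true

;; Ways to apply sqrt to every number in the inner lists:
(map (partial map sqrt) nested)         ; => ((1.0 2.0 3.0) (4.0 5.0))
(map #(map sqrt %) nested)              ; same result, anonymous function
(map #(sequence (map sqrt) %) nested)   ; same result, using the transducer explicitly
```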
And of course there is the obligatory specter answer:
user=> (transform [ALL ALL] sqrt '((1 4 9)(16 25)))
((1 2 3) (4 5))
You can write recursive function for this task:
(defn deep-map [f seq1]
(cond (empty? seq1) nil
(sequential? (first seq1))
(cons (deep-map f (first seq1))
(deep-map f (rest seq1)))
:else (cons (f (first seq1))
(deep-map f (rest seq1)))))
Example:
(deep-map #(Math/sqrt %) '((1 4 9) (16 25 36)))
=> ((1.0 2.0 3.0) (4.0 5.0 6.0))
Or you can use clojure.walk and function postwalk:
(clojure.walk/postwalk
#(if (number? %) (Math/sqrt %) %)
'((1 4 9) (16 25 36)))
=> ((1.0 2.0 3.0) (4.0 5.0 6.0))
The function map is closely related to the function for, which I think is sometimes easier to use. Here is how I would solve this problem:
(let [matrix [[1 4 9]
[16 25]]
result (vec (for [row matrix]
(vec (for [num row]
(Math/sqrt num)))))]
result)
with result:
result =>
[[1.0 2.0 3.0]
[4.0 5.0]]
If you remove the two (vec ...) bits, you'll see the same values, but as lazy sequences rather than vectors, since for normally returns a lazy sequence.
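@MartinPuda's answer is right.
Here is an accumulator-passing version written in tail-call style (note that Clojure only eliminates tail calls made with recur, so very long or deeply nested inputs can still overflow the stack):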
(defn map* [f sq & {:keys [acc] :or {acc '()}}]
(cond (empty? sq) (vec (reverse acc))
(sequential? (first sq)) (map* f
(rest sq)
:acc (cons (map* f (first sq)) acc))
:else (map* f (rest sq) :acc (cons (f (first sq)) acc))))
By Lisp tradition, functions that recurse into a nested structure like this are named fnname* (marked by an asterisk at the end).
acc accumulates the result nested tree which is constructed by cons.
In your case this would be:
(map* #(Math/sqrt %) (list (list 1 4 9) (list 16 25)))
Test with:
(map* (partial + 1) '[1 2 [3 4 [5] 6] 7 [8 [9]]])
;; => [2 3 [4 5 [6] 7] 8 [9 [10]]]

Sum equal adjacent integers

Test case:
(def coll [1 2 2 3 4 4 4 5])
(def p-coll (partition 2 1 coll))
;; ((1 2) (2 2) (2 3) (3 4) (4 4) (4 4) (4 5))
Expected output:
(2 2 4 4 4) => 16
Here is what I am trying to implement: Start with vector v [0]. Take each pair; if the first element of the pair is equal to the last element of the vector, or if the elements of the pair are equal, add the first item of the pair to v. (And finally reduce v to its sum.) The code below handles the case where the elements of the pair are equal, but not the first case. (Thus I get (0 2 4 4).) I guess the reason is that the elements are added to v at the very end. My questions:
What is the way to compare an element with the last selected element?
What other idiomatic ways are there to implement what I am trying to achieve?
How else can I approach the problem?
(let [s [0]]
(concat s (map #(first %)
(filter #(or (= (first %) (first s)) (= (first %) (second %))) p-coll))))
You are on the right track with partitioning the data here. But there
is a nicer way to do that. You can use (partition-by identity coll)
to group consecutive, same elements.
Then just keep the ones with more than one element and sum them all up.
E.g.
(reduce
(fn [acc xs]
(+ acc (apply + xs)))
0
(filter
next
(partition-by identity coll)))
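To make the intermediate steps visible, here is the same idea unrolled on the test case from the question (the threaded form at the end is just an equivalent way to write it):

```clojure
(def coll [1 2 2 3 4 4 4 5])

;; Group runs of equal consecutive elements.
(partition-by identity coll)
;; => ((1) (2 2) (3) (4 4 4) (5))

;; next returns nil for a one-element seq, so filter drops the singletons.
(filter next (partition-by identity coll))
;; => ((2 2) (4 4 4))

;; Flatten the remaining runs and add them up.
(->> coll
     (partition-by identity)
     (filter next)
     (apply concat)
     (reduce +))
;; => 16
```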
Starting out from your initial partition, with p-coll being like you described above (i.e. a list of pairs), and v being the vector [0], you can do the following:
(reduce
(fn [vect [a b]]
(if (or (= a b) (= a (last vect)))
(conj vect a)
vect))
v p-coll)
;; => [0 2 2 4 4 4]
We start from the vector [0], and reduce p-coll by processing its elements one by one. If an element satisfies one of the two conditions you specified, then we conj it onto the initial vector. Otherwise, we leave the vector as is.
Finally, you can use apply + to sum the resulting vector and get your final answer.
Generally, when you need to process a collection (here, p-coll) and some partial answer (here, the vector v) into some sort of final answer (here, the vector [0 2 2 4 4 4]), reduce is the most idiomatic and purely functional approach. After having identified those components, it's just a matter of coming up with the appropriate function to put them together.
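Put together as a self-contained sketch (the function name keep-adjacent is made up for illustration). It uses peek rather than last, on the assumption that the accumulator is always a vector, since peek reads the end of a vector in constant time:

```clojure
(def coll [1 2 2 3 4 4 4 5])
(def p-coll (partition 2 1 coll))

(defn keep-adjacent
  "Reduces pairs into v, keeping the first item of a pair when the pair's
  items are equal or when that item equals the last item kept so far."
  [v pairs]
  (reduce (fn [vect [a b]]
            (if (or (= a b) (= a (peek vect)))
              (conj vect a)
              vect))
          v
          pairs))

(keep-adjacent [0] p-coll)             ; => [0 2 2 4 4 4]
(apply + (keep-adjacent [0] p-coll))   ; => 16
```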
Another approach (less idiomatic, but easier to understand from a procedural standpoint) would be to use an atom for the vector v, and keep growing it as you process the list with doseq:
(def v (atom [0]))
(doseq [[a b] p-coll]
  (when (or (= a b) (= a (last @v)))
    (swap! v conj a)))
(println @v)
;; => [0 2 2 4 4 4]
A solution only with flatten and map:
(defn consecutives [lst]
(flatten (map (fn [[x y z]] (cond (= x y z) [z]
(= y z) [y z]
:else []))
(map #'vector lst (rest lst) (rest (rest lst))))))
Purely tail-recursive solution
which "keeps in memory" previous and previous-previous element.
(defn consecutives
[lst]
(loop [lst lst
acc []
last-val nil
last-last-val nil]
(cond (empty? lst) acc
:else (recur (rest lst)
(if (= (first lst) last-val)
(conj (if (= last-val last-last-val)
acc
(conj acc last-val))
(first lst))
acc)
(first lst)
last-val))))
(consecutives coll)
;; => [2 2 4 4 4]

Easy way to change specific list item in list

In Clojure, I want to replace a specific item (itself a list) in a list with another one.
Here is my structure:
(def myLis '((3 3) (5 5) (5 5)))
(def som '(4 4))
I want to replace the second element in myLis with som.
The result should be '((3 3) (4 4) (5 5))
This is a basic example. I have a few hundred items in myLis.
I tried assoc and update-in, but they do not work on lists.
When I try assoc and update-in:
(update-in myLis [1] som)
(assoc myLis 1 som)
(assoc-in myLis [1] som)
I get an error like this:
clojure.lang.PersistentList cannot be cast to clojure.lang.Associative
How can I quickly change the nth element in this structure (a list of lists)?
As pointed out in the clojure bible (Clojure Programming):
Because [lists] are linked lists, they do not support efficient random
access; thus, nth on a list will run in linear time (as opposed to
constant time when used with vectors, arrays, and so on), and get does
not support lists at all because doing so would not align with get’s
objective of sublinear efficiency.
So in order to replace an element of a list you will have to traverse all the elements up to it, thus running longer the further your item is in the list, and rebuild the list with the elements before it, the new item and all the elements after it (rest). Alternatively, turn the list into a vector, use update-in and back into a list if you absolutely have to use lists.
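The vector round trip described above might look like this. The name replace-nth is hypothetical; assoc on the intermediate vector also gives an out-of-bounds check for free.

```clojure
(defn replace-nth
  "Returns a list like coll with the item at index i replaced by x.
  Copies coll into a vector first, so it is linear in the size of coll."
  [coll i x]
  (apply list (assoc (vec coll) i x)))

(def myLis '((3 3) (5 5) (5 5)))
(def som '(4 4))

(replace-nth myLis 1 som)   ; => ((3 3) (4 4) (5 5))
```

If the data can simply live in a vector from the start, (assoc v 1 som) alone does the job and the conversions disappear.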
However, if you can, it would be worth seeing if you can use sequences in your code rather than lists, and thus you can interchangeably use vectors or other abstractions that are more efficient for the processing you are performing over them.
A trivial function that would meet the basics of your requirement with lists however, is:
(defn list-update-in [l i x]
(let [newlist (take i l)
newlist (concat newlist (list x))
newlist (concat newlist (drop (+ 1 i) l))]
newlist))
user> (list-update-in '((1 2) (2 3) (3 4)) 1 '(8 9))
((1 2) (8 9) (3 4))
There are no out-of-bounds checks in this function.
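A variant with an explicit bounds check could be sketched like this (list-update-in-checked is an illustrative name, and throwing IndexOutOfBoundsException is one possible choice of behavior, not something the original function does):

```clojure
(defn list-update-in-checked
  "Like list-update-in, but throws when i is not a valid index into l."
  [l i x]
  (when-not (< -1 i (count l))
    (throw (IndexOutOfBoundsException.
            (str "index " i " is out of bounds for a list of " (count l)))))
  (let [[before after] (split-at i l)]
    ;; after starts with the item being replaced, so skip it with rest.
    (concat before (list x) (rest after))))

(list-update-in-checked '((1 2) (2 3) (3 4)) 1 '(8 9))
;; => ((1 2) (8 9) (3 4))
```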
Adding an additional answer using loop/recur to realign the OP's own solution to be more lisp like.
(defn list-update-in-recur [l i x]
(loop [new-data [] old-list l]
(if (seq old-list)
(if (= (count new-data) i)
(recur (conj new-data x) (rest old-list))
(recur (conj new-data (first old-list)) (rest old-list)))
(apply list new-data))))
user> (list-update-in-recur '((1 2) (2 3) (3 4)) 1 '(8 9))
((1 2) (8 9) (3 4))
A few points to note:
It's written as a function, there are no 'def' values to set any global value. The final result is the return of the function (apply list new-data)
Arguments initialise the loop, the size of the growing list is used to determine if we want to replace the nth item or not (no index variables)
The passed in list becomes the initial old-list value which reduces in size each iteration, and the exit condition is simply if there are any more elements left in it using the test (seq old-list), which returns false/nil when it is empty.
Because we conj onto a vector (which appends at the end), the elements stay in order and nothing needs to be reversed; the vector is converted to a list as the last step.
I've replaced nth with first and rest which are more efficient and don't have to traverse entire lists every iteration.
This is still very inefficient and only provided as a learning exercise.
You should normally use vectors like [1 2 3] in preference to lists like '(1 2 3) for most purposes. In Clojure, a list is normally used for a function call like (+ 1 2), while for data literals vectors like [1 2 3] are normally used.
Here is code showing 2 options that work.
Main code:
(ns clj.core
(:require
[tupelo.core :as t]
))
(t/refer-tupelo)
(def myLis [ [3 3] [5 5] [5 5] ] )
(def som [4 4] )
(spyx (assoc myLis 1 som))
(spyx (assoc-in myLis [1] som))
(defn -main [& args]
(println "-main"))
Result:
~/clj > lein run
(assoc myLis 1 som) => [[3 3] [4 4] [5 5]]
(assoc-in myLis [1] som) => [[3 3] [4 4] [5 5]]
You need this in project.clj to make (spyx ...) work:
:dependencies [
[tupelo "0.9.9"]
...
Update 2016-11-2:
If you really want to keep everything in a list, you can use replace-at from the Tupelo library. It works like this:
(def myLis '( (3 3) (5 5) (5 5) ) )
(def vec-1 [4 4] )
(def list-1 '(4 4) )
(spyx (t/replace-at myLis 1 vec-1 ))
(spyx (t/replace-at myLis 1 list-1))
(spyx (apply list (t/replace-at myLis 1 list-1)))
with result
> lein run
(t/replace-at myLis 1 vec-1) => [(3 3) [4 4] (5 5)]
(t/replace-at myLis 1 list-1) => [(3 3) (4 4) (5 5)]
(apply list (t/replace-at myLis 1 list-1)) => ((3 3) (4 4) (5 5))
The first 2 examples show that the new element can be anything, such as the vector [4 4] or the list (4 4). Also, notice that replace-at always returns a vector result. If you want the final result to be a list as well, you need to use (apply list <some-collection>).
My solution with lists:
(def newL '())
(def i 1)
(loop [k (- (count myLis) 1)]
(when (> k -1)
(cond
(= k i) (def newL (conj newL som))
:else (def newL (conj newL (nth myLis k)))
)
(recur (- k 1))
)
)

Partition a seq by a "windowing" predicate in Clojure

I would like to "chunk" a seq into subseqs the same as partition-by, except that the function is not applied to each individual element, but to a range of elements.
So, for example:
(gather (fn [a b] (> (- b a) 2))
[1 4 5 8 9 10 15 20 21])
would result in:
[[1] [4 5] [8 9 10] [15] [20 21]]
Likewise:
(defn f [a b] (> (- b a) 2))
(gather f [1 2 3 4]) ;; => [[1 2 3] [4]]
(gather f [1 2 3 4 5 6 7 8 9]) ;; => [[1 2 3] [4 5 6] [7 8 9]]
The idea is that I apply the start of the list and the next element to the function, and if the function returns true we partition the current head of the list up to that point into a new partition.
I've written this:
(defn gather
[pred? lst]
(loop [acc [] cur [] l lst]
(let [a (first cur)
b (first l)
nxt (conj cur b)
rst (rest l)]
(cond
(empty? l) (conj acc cur)
(empty? cur) (recur acc nxt rst)
((complement pred?) a b) (recur acc nxt rst)
:else (recur (conj acc cur) [b] rst)))))
and it works, but I know there's a simpler way. My question is:
Is there a built in function to do this where this function would be unnecessary? If not, is there a more idiomatic (or simpler) solution that I have overlooked? Something combining reduce and take-while?
Thanks.
Original interpretation of question
We (all) seemed to have misinterpreted your question as wanting to start a new partition whenever the predicate held for consecutive elements.
Yet another, lazy, built on partition-by
(defn partition-between [pred? coll]
(let [switch (reductions not= true (map pred? coll (rest coll)))]
(map (partial map first) (partition-by second (map list coll switch)))))
(partition-between (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))
Actual Question
The actual question asks us to start a new partition whenever pred? holds for the beginning of the current partition and the current element. For this we can just rip off partition-by with a few tweaks to its source.
(defn gather [pred? coll]
(lazy-seq
(when-let [s (seq coll)]
(let [fst (first s)
run (cons fst (take-while #((complement pred?) fst %) (next s)))]
(cons run (gather pred? (seq (drop (count run) s))))))))
(gather (fn [a b] (> (- b a) 2)) [1 4 5 8 9 10 15 20 21])
;=> ((1) (4 5) (8 9 10) (15) (20 21))
(gather (fn [a b] (> (- b a) 2)) [1 2 3 4])
;=> ((1 2 3) (4))
(gather (fn [a b] (> (- b a) 2)) [1 2 3 4 5 6 7 8 9])
;=> ((1 2 3) (4 5 6) (7 8 9))
Since the decision for each element depends on a neighboring element (the previous or the next one), partitioning the sequence into pairs and reducing over them could do the trick in this case.
This is what I came up with after some iterations:
(defn gather [pred s]
(->> (partition 2 1 (repeat nil) s) ; partition the sequence and if necessary
; fill the last partition with nils
(reduce (fn [acc [x :as s]]
(let [n (dec (count acc))
acc (update-in acc [n] conj x)]
(if (apply pred s)
(conj acc [])
acc)))
[[]])))
(gather (fn [a b] (when (and a b) (> (- b a) 2)))
[1 4 5 8 9 10 15 20 21])
;= [[1] [4 5] [8 9 10] [15] [20 21]]
The basic idea is to make partitions with as many elements as the predicate function takes arguments, filling the last partition with nils if necessary. The reducing function adds the first element of each partition to the current group and, if the predicate holds for the partition, starts a new group. Since the last partition may have been filled with nils, the predicate has to be modified to handle them.
Two possible improvements to this function would be to let the user:
Define the value to fill the last partition, so the reducing function could check if any of the elements in the partition is this value.
Specify the arity of the predicate, thus allowing to determine the grouping taking into account the current and the next n elements.
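One way the first improvement might look, as a sketch only: the caller supplies the fill value (the parameter name pad and the function name gather-with-pad are made up here), and the reducing function skips the predicate for the padded pair, so the predicate itself no longer needs nil handling. It assumes the pad value never occurs in the input.

```clojure
(defn gather-with-pad
  "Groups s, starting a new group after x whenever (pred x y) holds for
  the consecutive elements x and y. pad fills the final pair."
  [pred pad s]
  (reduce (fn [acc [x y]]
            (let [acc (update acc (dec (count acc)) conj x)]
              (if (and (not (identical? y pad)) (pred x y))
                (conj acc [])
                acc)))
          [[]]
          (partition 2 1 (repeat pad) s)))

(gather-with-pad (fn [a b] (> (- b a) 2)) ::pad [1 4 5 8 9 10 15 20 21])
;; => [[1] [4 5] [8 9 10] [15] [20 21]]
```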
I wrote this some time ago in useful:
(defn partition-between [split? coll]
(lazy-seq
(when-let [[x & more] (seq coll)]
(lazy-loop [items [x], coll more]
(if-let [[x & more] (seq coll)]
(if (split? [(peek items) x])
(cons items (lazy-recur [x] more))
(lazy-recur (conj items x) more))
[items])))))
It uses lazy-loop, which is just a way to write lazy-seq expressions that look like loop/recur, but I hope it's fairly clear.
I linked to a historical version of the function, because later I realized there's a more general function that you can use to implement partition-between, or partition-by, or indeed lots of other sequential functions. These days the implementation is much shorter, but it's less obvious what's going on if you're not familiar with the more general function I called glue:
(defn partition-between [split? coll]
(glue conj []
(fn [v x]
(not (split? [(peek v) x])))
(constantly false)
coll))
Note that both of these solutions are lazy, which at the time I'm writing this is not true of any of the other solutions in this thread.
Here is one way, with steps split up. It can be narrowed down to fewer statements.
(def l [1 4 5 8 9 10 15 20 21])
(defn reduce_fn [f x y]
(cond
(f (last (last x)) y) (conj x [y])
:else (conj (vec (butlast x)) (conj (last x) y)) )
)
(def reduce_fn1 (partial reduce_fn #(> (- %2 %1) 2)))
(reduce reduce_fn1 [[(first l)]] (rest l))
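Narrowed down to a single function, the same reduce could be sketched as below. The name gather-consecutive is illustrative; peek and pop replace last and butlast on the assumption that the groups are always vectors, which avoids rebuilding the outer vector on every step.

```clojure
(defn gather-consecutive
  "Splits coll into vectors, starting a new one whenever (pred prev y)
  holds for the previous element prev and the current element y."
  [pred coll]
  (if (empty? coll)
    []
    (reduce (fn [groups y]
              (if (pred (peek (peek groups)) y)
                (conj groups [y])
                (conj (pop groups) (conj (peek groups) y))))
            [[(first coll)]]
            (rest coll))))

(gather-consecutive #(> (- %2 %1) 2) [1 4 5 8 9 10 15 20 21])
;; => [[1] [4 5] [8 9 10] [15] [20 21]]
```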
keep-indexed is a wonderful function. Given a function f and a vector lst,
(keep-indexed (fn [idx it] (if (apply f it) idx))
              (partition 2 1 lst))
(0 2 5 6)
this returns the indices after which you want to split. Let's increment them and tack a 0 at the front:
(cons 0 (map inc (.....)))
(0 1 3 6 7)
Partition these to get ranges:
(partition 2 1 nil (....))
((0 1) (1 3) (3 6) (6 7) (7))
Now use these to generate subvecs:
(map (partial apply subvec lst) ....)
([1] [4 5] [8 9 10] [15] [20 21])
Putting it all together:
(defn gather
[f lst]
(let [indices (cons 0 (map inc
(keep-indexed (fn [idx it]
(if (apply f it) idx))
(partition 2 1 lst))))]
(map (partial apply subvec (vec lst))
(partition 2 1 nil indices))))
(gather #(> (- %2 %) 2) '(1 4 5 8 9 10 15 20 21))
([1] [4 5] [8 9 10] [15] [20 21])

Need a similar function to disj but for a list in clojure. Is it overall possible?

I need to traverse a list and do some calculations with every element and the elements excluding that element. For example, having a list (1 2 3 1), I need pairs (1) (2 3 1), (2) (1 3 1), (3) (1 2 1) and (1) (1 2 3).
(...) with every element and the elements excluding that element.
With every element sounds like a map. Excluding that element sounds like a filter. Let's start with the latter.
user=> (filter #(not= % 1) '(1 2 3))
(2 3)
Great. Now let's try to map it over all elements.
user=> (let [coll '(1 2 3)] (map (fn [elem] (filter #(not= % elem) coll)) coll))
((2 3) (1 3) (1 2))
Creating actual pairs is left as an exercise for the reader. Hint: modify the closure used in map.
Keep in mind that the solution suggested above should work fine for short lists but it has a complexity of O(n²). Another issue is the fact that collections with duplicates aren't handled correctly.
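A quick sketch of the duplicate problem, using the list from the question. Filtering by value removes every copy of the element, whereas dropping by position removes only the current one (both function names are made up for this illustration):

```clojure
(defn others-by-value
  "For each element, the rest of coll with every equal element filtered out."
  [coll]
  (map (fn [elem] (filter #(not= % elem) coll)) coll))

(defn others-by-index
  "For each position, coll with only the element at that position removed."
  [coll]
  (map-indexed (fn [i _] (concat (take i coll) (drop (inc i) coll))) coll))

(others-by-value '(1 2 3 1))
;; => ((2 3) (1 3 1) (1 2 1) (2 3))      ; both 1s vanish together

(others-by-index '(1 2 3 1))
;; => ((2 3 1) (1 3 1) (1 2 1) (1 2 3))  ; duplicates preserved
```

The index-based version is still quadratic overall, since it walks the collection once per element.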
Let's take a recursive approach instead. We'll base the recursion on loop and recur.
(defn pairs [coll]
(loop [ahead coll behind [] answer []]
(if (empty? ahead)
answer
(let [[current & remaining] ahead]
(recur remaining
(conj behind current)
(conj answer [(list current)
(concat behind remaining)]))))))
A trivial example:
user=> (pairs '(1 2 3))
[[(1) (2 3)] [(2) (1 3)] [(3) (1 2)]]
A vector with duplicates:
user=> (pairs [1 5 6 5])
[[(1) (5 6 5)] [(5) (1 6 5)] [(6) (1 5 5)] [(5) (1 5 6)]]
Seems a job for a list comprehension:
(defn gimme-pairs [coll]
(for [x coll]
[(list x) (remove #{x} coll)]))
user=> (gimme-pairs [1 2 3])
([(1) (2 3)] [(2) (1 3)] [(3) (1 2)])
I would actually skip creating a list for the single element, which would make the code even easier:
(defn gimme-pairs [coll]
(for [x coll]
[x (remove #{x} coll)]))
user=> (gimme-pairs [1 2 3])
([1 (2 3)] [2 (1 3)] [3 (1 2)])
If you need to keep duplicates then you can use an indexed collection:
(defn gimme-pairs [coll]
(let [indexed (map-indexed vector coll)
remove-index (partial map second)]
(for [[i x] indexed]
[x (remove-index (remove #{[i x]} indexed))])))
user=> (gimme-pairs [1 2 3 1])
([1 (2 3 1)] [2 (1 3 1)] [3 (1 2 1)] [1 (1 2 3)])
(def l [1 2 3])
(first (reduce (fn [[res ll] e]
[(conj res [(list e) (rest ll)])
(conj (vec (rest ll)) e)])
[[] l] l))
=> [[(1) (2 3)] [(2) (3 1)] [(3) (1 2)]]