Clojure partition list of strings, accumulating the result

I'm sorry about the lack of precision in the title; it probably reflects my lack of Clojure experience.
I'm trying to take a large list of strings and convert it into another list of strings, concatenating as I go for as long as the accumulated string stays within some maximum length.
For example, if I have
[ "a" "bc" "def" "ghij" ]
and my max string length is 4, I would walk down the list, accumulating the concatenation until adding the next string would push the accumulated length past 4, and then start the accumulator from scratch. My result would look like:
[ "abc" "def" "ghij" ]
I can't seem to come up with the proper incantation for partition-by, and it's driving me a little crazy. I've been trying to make my accumulator an atom (but can't figure out where to call reset!), and other than that, I can't see where or how to keep track of my accumulated string.
Thanks in advance to anyone taking mercy on me.

(defn catsize [limit strs]
  (reduce (fn [res s]
            (let [base (peek res)]
              (if (> (+ (.length ^String base) (.length ^String s)) limit)
                (conj res s)
                (conj (pop res) (str base s)))))
          (if (seq strs) [(first strs)] [])
          (rest strs)))

Here's my take on this:
(defn collapse [maxlen xs]
  (let [concats (take-while #(< (count %) maxlen) (reductions str xs))]
    (cons (last concats) (drop (count concats) xs))))

(collapse 4 ["a" "bc" "def" "ghij"])
;; => ("abc" "def" "ghij")
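As written, collapse merges only the leading run of strings and returns the remaining items untouched, which happens to be enough for the sample input. A minimal sketch of repeating the same reductions idea over the whole list might look like the following. The name collapse-all is hypothetical, and the sketch assumes that a merged string may be at most limit characters long (<=) and that a single string longer than limit is emitted on its own.

```clojure
(defn collapse-all
  "Greedily merges adjacent strings while the merged length stays <= limit.
  Hypothetical helper; the <= boundary is an assumption."
  [limit xs]
  (lazy-seq
    (when (seq xs)
      (let [;; running concatenations that still fit within the limit
            runs (take-while #(<= (count %) limit) (reductions str xs))
            ;; always consume at least one item, even if it is oversized
            n    (max 1 (count runs))]
        (cons (apply str (take n xs))
              (collapse-all limit (drop n xs)))))))

(collapse-all 4 ["a" "bc" "def" "ghij"])
;; => ("abc" "def" "ghij")
(collapse-all 4 ["a" "b" "c" "d" "ef" "ghij"])
;; => ("abcd" "ef" "ghij")
```

Switching <= to < would reproduce the stricter boundary used by collapse itself.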

This gets pretty close. I'm not sure why you have j at the end of the final string.
(sequence
  (comp
    (mapcat seq)
    (partition-all 3)
    (map clojure.string/join))
  ["a" "bc" "def" "ghij"])
;; => ("abc" "def" "ghi" "j")

Here is how I would do it:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test))

(def bound 4)

(defn catter [strings-in]
  (loop [merged-strs    []
         curr-merge     (first strings-in)
         remaining-strs (rest strings-in)]
    ;(newline) (spyx [merged-strs curr-merge remaining-strs])
    (if (empty? remaining-strs)
      (conj merged-strs curr-merge)
      (let ; try using 'let-spy' instead
        [new-str   (first remaining-strs)
         new-merge (str curr-merge new-str)]
        (if (< (count new-merge) bound)
          (recur merged-strs new-merge (rest remaining-strs))
          (recur (conj merged-strs curr-merge) new-str (rest remaining-strs)))))))

(dotest
  (is= ["abc" "def" "ghij"] (catter ["a" "bc" "def" "ghij"]))
  (is= ["abc" "def" "ghij"] (catter ["a" "b" "c" "def" "ghij"]))
  (is= ["abc" "def" "ghij"] (catter ["a" "b" "c" "d" "ef" "ghij"]))
  (is= ["abc" "def" "ghij"] (catter ["a" "bc" "d" "ef" "ghij"]))
  (is= ["abc" "def" "ghij"] (catter ["a" "bc" "d" "e" "f" "ghij"]))
  (is= ["abc" "def" "gh" "ij"] (catter ["abc" "d" "e" "f" "gh" "ij"]))
  (is= ["abc" "def" "ghi" "j"] (catter ["abc" "d" "e" "f" "ghi" "j"]))
  (is= ["abcdef" "ghi" "j"] (catter ["abcdef" "ghi" "j"]))
  (is= ["abcdef" "ghi" "j"] (catter ["abcdef" "g" "h" "i" "j"])))
You will need to add [tupelo "0.9.71"] to project dependencies.
Update:
If you use spy and let-spy, you can see the process the algorithm uses to arrive at the result. For example:
(catter ["a" "b" "c" "d" "ef" "ghij"]) => ["abc" "def" "ghij"]
-----------------------------------------------------------------------------
strings-in => ["a" "b" "c" "d" "ef" "ghij"]
[merged-strs curr-merge remaining-strs] => [[] "a" ("b" "c" "d" "ef" "ghij")]
new-str => "b"
new-merge => "ab"
[merged-strs curr-merge remaining-strs] => [[] "ab" ("c" "d" "ef" "ghij")]
new-str => "c"
new-merge => "abc"
[merged-strs curr-merge remaining-strs] => [[] "abc" ("d" "ef" "ghij")]
new-str => "d"
new-merge => "abcd"
[merged-strs curr-merge remaining-strs] => [["abc"] "d" ("ef" "ghij")]
new-str => "ef"
new-merge => "def"
[merged-strs curr-merge remaining-strs] => [["abc"] "def" ("ghij")]
new-str => "ghij"
new-merge => "defghij"
[merged-strs curr-merge remaining-strs] => [["abc" "def"] "ghij" ()]
Ran 2 tests containing 10 assertions.
0 failures, 0 errors.

Related

Explanation for combination of cycle drop and take in clojure

I am trying to understand an implementation of rotating a sequence. The answer I found on GitHub is below:
(fn [n coll]
  (take (count coll) (drop (mod n (count coll)) (cycle coll))))
Could you please explain what exactly is happening here?
(take 6 (drop 1 (cycle ["a" "b" "c"])))
("b" "c" "a" "b" "c" "a")
How is this being produced?

From the documentation of cycle:
Returns a lazy (infinite!) sequence of repetitions of the items in coll.
So in your example:
(cycle ["a" "b" "c"])
;; => ["a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" ...]
(toward infinity and beyond)
To cut down an infinite sequence, you have to use take, which takes the first n elements of a sequence. So:
(take 6 (cycle ["a" "b" "c"]))
;; => ["a" "b" "c" "a" "b" "c"]
In your example, just before calling take, you use drop, which skips the first n elements of a sequence. So:
(drop 1 (cycle ["a" "b" "c"]))
;; => ["b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" ...]
(take 6 (drop 1 (cycle ["a" "b" "c"])))
;; => ["b" "c" "a" "b" "c" "a"]
You can learn more about lazy sequences from this chapter of "Clojure for the Brave and True".
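Putting the three pieces together, the rotation from the question can be wrapped in a named function. This is only a sketch: the name rotate and the empty-collection guard are additions that are not part of the quoted snippet. The guard is there because (mod n 0) throws an ArithmeticException.

```clojure
(defn rotate
  "Rotates coll left by n positions; a negative n rotates right.
  Hypothetical helper name; the empty check is an added assumption."
  [n coll]
  (let [len (count coll)]
    (if (zero? len)
      ()
      ;; mod maps any n, including negatives, into the range [0, len)
      (take len (drop (mod n len) (cycle coll))))))

(rotate 2 [1 2 3 4 5])  ;; => (3 4 5 1 2)
(rotate -1 [1 2 3 4 5]) ;; => (5 1 2 3 4)
(rotate 7 [1 2 3 4 5])  ;; => (3 4 5 1 2)
```

Because mod in Clojure takes the sign of the divisor, (mod -1 5) is 4, so rotating by -1 drops four items and starts at the last element.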

Translating vector into map

I've got this list of fields (that's Facebook's Graph API fields list).
["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]
I want to generate a map out of it. The convention is the following: if a vector follows a key, then it is an inner object for that key. The example vector could be represented as a map as:
{"a" "value"
"b" {"c" {"t" "value"} "d" "value"}
"e" {"f" "value"}
"g" "value"}
So I have this solution so far
(defn traverse
  [data]
  (mapcat (fn [[left right]]
            (if (vector? right)
              (let [traversed (traverse right)]
                (mapv (partial into [left]) traversed))
              [[right]]))
          (partition 2 1 (into [nil] data))))

(defn facebook-fields->map
  [fields default-value]
  (->> fields
       (traverse)
       (reduce #(assoc-in %1 %2 nil) {})
       (clojure.walk/postwalk #(or % default-value))))

(let [data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]]
  (facebook-fields->map data "value"))
;; => {"a" "value", "b" {"c" {"t" "value"}, "d" "value"}, "e" {"f" "value"}, "g" "value"}
But it is fat and difficult to follow. I am wondering if there is a more elegant solution.

Here's another way to do it using postwalk for the whole traversal, rather than using it only for default-value replacement:
(defn facebook-fields->map
  [fields default-value]
  (clojure.walk/postwalk
    (fn [v] (if (coll? v)
              (->> (partition-all 2 1 v)
                   (remove (comp coll? first))
                   (map (fn [[l r]] [l (if (coll? r) r default-value)]))
                   (into {}))
              v))
    fields))

(facebook-fields->map ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"] "value")
;; => {"a" "value",
;;     "b" {"c" {"t" "value"}, "d" "value"},
;;     "e" {"f" "value"},
;;     "g" "value"}

Trying to read heavily nested code makes my head hurt. It is worse when the answer is something of a "force-fit" with postwalk, which does things in a sort of "inside out" manner. Also, using partition-all is a bit of a waste, since we need to discard any pairs with two non-vectors.
To me, the most natural solution is a simple top-down recursion. The only problem is that we don't know in advance if we need to remove one or two items from the head of the input sequence. Thus, we can't use a simple for loop or map.
So, just write it as a straightforward recursion, and use an if to determine whether we consume 1 or 2 items from the head of the list.
If the 2nd item is a value, we consume one item and add in :dummy-value to make a map entry.
If the 2nd item is a vector, we recurse and use that as the value in the map entry.
The code:
(ns tst.demo.core
  (:require [clojure.walk :as walk]))

(def data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"])

(defn parse [data]
  (loop [result {}
         data   data]
    (if (empty? data)
      (walk/keywordize-keys result)
      (let [a (first data)
            b (second data)]
        (if (sequential? b)
          (recur
            (into result {a (parse b)})
            (drop 2 data))
          (recur
            (into result {a :dummy-value})
            (drop 1 data)))))))
with result:
(parse data) =>
  {:a :dummy-value,
   :b {:c {:t :dummy-value}, :d :dummy-value},
   :e {:f :dummy-value},
   :g :dummy-value}
I added keywordize-keys at the end just to make the result a little more "Clojurey".

Since you're asking for a cleaner solution as opposed to a solution, and because I thought it was a neat little problem, here's another one.
(defn facebook-fields->map [coll]
  (into {}
        (keep (fn [[x y]]
                (when-not (vector? x)
                  (if (vector? y)
                    [x (facebook-fields->map y)]
                    [x "value"]))))
        (partition-all 2 1 coll)))
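The version above hard-codes "value", while the function in the question takes the default as an argument. A small sketch of the same sliding-pairs approach with the default passed through the recursion could look like this; the name fields->map is hypothetical and chosen only to avoid clashing with the definitions above.

```clojure
(defn fields->map
  "Builds a nested map from a fields vector. A vector that follows a key
  becomes that key's nested map; every other key gets default-value.
  Hypothetical name, same technique as the answer above."
  [fields default-value]
  (into {}
        ;; keep drops the nil produced for pairs that start with a vector
        (keep (fn [[k nxt]]
                (when-not (vector? k)
                  [k (if (vector? nxt)
                       (fields->map nxt default-value)
                       default-value)])))
        ;; sliding pairs: each item together with its successor (if any)
        (partition-all 2 1 fields)))

(fields->map ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"] "value")
;; => {"a" "value", "b" {"c" {"t" "value"}, "d" "value"}, "e" {"f" "value"}, "g" "value"}
```

partition-all (rather than partition) matters here because it also yields the final one-element pair, so the last key is not lost.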

clojure: Next element of an item that can fallback to first

I'd like to create a get-next fn that looks for an element in a coll and, when it finds a match, returns the next element. It should also return the first element if the last one is passed as the argument.
(def coll ["a" "b" "c" "d"])
(defn get-next [coll item] ...)
(get-next coll "a") ;;=> "b"
(get-next coll "b") ;;=> "c"
(get-next coll "c") ;;=> "d"
(get-next coll "d") ;;=> "a" ; back to the beginning
Thanks!

How about this:
Append first item at the end of the sequence (lazily),
Drop non-items,
Return what's left (nil if item not found).
Or in code:
(defn get-next [coll item]
  (->> (concat coll [(first coll)])
       (drop-while (partial not= item))
       second))

There are certainly purer lisp approaches than this one but, hey, as long as we got .indexOf, we might as well use it. The key to simplicity is that, plus cycle, so we don't have to check for the last item.
(defn get-next [coll item]
  (nth (cycle coll) (inc (.indexOf coll item))))
Some test runs:
(get-next ["A" "B" "C" "D"] "B")
=> "C"
(get-next ["A" "B" "C" "D"] "D")
=> "A"
(get-next ["A" "B" "C" "D"] "E")
=> "A"
Whoops! Well, we didn't specify what we wanted to do if the item wasn't in the collection. Idiomatically, we would return nil, so we need a new get-next:
(defn get-next-2 [coll item]
  (let [i (.indexOf coll item)]
    (if (= -1 i) nil (nth (cycle coll) (inc i)))))
And now we catch the not-there case:
(get-next-2 ["A" "B" "C" "D"] "Q")
=> nil

I would convert coll to map and use it for lookups:
(def doll (zipmap coll (rest (cycle coll))))
(doll "a") => "b"
(doll "b") => "c"
(doll "d") => "a"
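The lookup map can also be built by a small function so that it is not tied to one global coll. The sketch below uses a hypothetical name, next-lookup. It relies on zipmap stopping as soon as the shorter of its two sequences runs out, which is why pairing coll with the infinite (rest (cycle coll)) terminates.

```clojure
(defn next-lookup
  "Returns a map from each item of coll to its successor, wrapping around
  from the last item to the first. Hypothetical helper name."
  [coll]
  ;; zipmap stops at the end of coll, so the infinite cycle is safe here
  (zipmap coll (rest (cycle coll))))

(def nxt (next-lookup ["a" "b" "c" "d"]))

(nxt "a") ;; => "b"
(nxt "d") ;; => "a"
(nxt "z") ;; => nil
```

Building the map costs one pass over coll, after which each lookup is a plain map access. If coll contains duplicates, the entry for the later occurrence overwrites the earlier one.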

This is a good job for drop-while:
(defn get-next
  [coll item]
  (let [remainder (drop-while #(not= % item) coll)]
    (when (empty? remainder)
      (throw (IllegalArgumentException. (str "Item not found: " item))))
    (if (< 1 (count remainder))
      (nth remainder 1)
      (first coll))))

(dotest
  (let [coll [1 2 3 4]]
    (is= 2 (get-next coll 1))
    (is= 3 (get-next coll 2))
    (is= 4 (get-next coll 3))
    (is= 1 (get-next coll 4))
    (throws? (get-next coll 5))))

How to parse an heterogeneous tree in clojure

I'm working on some Clojure code, in which I have a tree of entities represented as a nested vector like this:
(def tree '[SYMB1 "a" [SYMB2 {:k1 [SYMB1 "b" "c"]} "x"] {:k2 ["b" "c"]}])
Here, leaves are strings and nodes can be either symbols or maps, each map having a key associated with a subtree or with a collection of leaves.
How can I render the tree above to get:
[SYMB1 "a" [SYMB2 [SYMB1 "b" "c"] "x"] "b" "c"]

It looks like you just want to throw away :k1 and :k2 whenever you encounter a map (and assume each map has only 1 key). You can do this easily using postwalk:
(ns ...
  (:require
    [clojure.walk :as walk]))

(def tree
  '[SYMB1 "a" [SYMB2 {:k1 [SYMB1 "b" "c"]} "x"] {:k2 ["b" "c"]}])
(def desired
  '[SYMB1 "a" [SYMB2 [SYMB1 "b" "c"] "x"] ["b" "c"]])

(let [result (walk/postwalk
               (fn [item]
                 (cond
                   (map? item) (do
                                 (when-not (= 1 (count item))
                                   (throw (ex-info "Must be only 1 item" {:item item})))
                                 (val (first item)))
                   :else item))
               tree)]
  (is= desired result))
result => [SYMB1 "a" [SYMB2 [SYMB1 "b" "c"] "x"] ["b" "c"]]
Note that the results for :k2 are still wrapped in a vector, unlike your original question. I'm not sure if that is what you meant or not.

Using clojure.spec:
(ns tree
  (:require [clojure.spec.alpha :as s]))

(def tree '[SYMB1 "a" [SYMB2 {:k1 [SYMB1 "b" "c"]} "x"] {:k2 ["b" "c"]}])

(s/def ::leaf string?)
(s/def ::leafs (s/coll-of ::leaf))
(s/def ::map
  (s/and
    map?
    (s/conformer
      (fn [m]
        (let [[_ v] (first m)]
          (s/conform (s/or
                       :node ::node
                       :leafs ::leafs) v))))))
(s/def ::node (s/and
                (s/or :symbol ::symbol
                      :leaf ::leaf
                      :map ::map)
                (s/conformer second)))
(s/def ::symbol
  (s/and
    (s/cat :name
           symbol?
           :children
           (s/* ::node))
    (s/conformer (fn [parsed]
                   (let [{:keys [name children]} parsed]
                     (reduce
                       (fn [acc v]
                         (case (first v)
                           :leafs (into acc (second v))
                           :node (conj acc (second v))
                           (conj acc v)))
                       [name]
                       children))))))

(s/conform ::node tree) ;; [SYMB1 "a" [SYMB2 [SYMB1 "b" "c"] "x"] "b" "c"]

I found a solution using postwalk and some helper functions:
(defn clause-coll? [item]
  (and (vector? item)
       (symbol? (first item))))

(defn render-map [amap]
  (let [[[_ v]] (vec amap)]
    (if (clause-coll? v)
      [v]
      v)))

(defn render-item [item]
  (if (map? item)
    (render-map item)
    [item]))

;; vec keeps each rendered level a vector, so clause-coll? still
;; recognizes it when the enclosing map is rendered one level up
(defn render-level [[op & etc]]
  (->> (mapcat render-item etc)
       (cons op)
       vec))

(defn parse-tree [form]
  (clojure.walk/postwalk #(if (clause-coll? %)
                            (render-level %)
                            %)
                         form))

Michiel's clojure.spec solution was clever and Alan's clojure.walk solution was concise.
Without using any libraries and walking the tree directly:
(def tree
  '[SYMB1 "a"
    [SYMB2 {:k1 [SYMB1 "b" "c"]}
     "x"]
    {:k2 ["b" "c"]}])

(defn get-new-keys
  "Determines next keys vector for tree navigation, can backtrack."
  [source-tree current-keys current-node]
  (if (and (vector? current-node) (symbol? (first current-node)))
    (conj current-keys 0)
    (let [last-index (->> current-keys count dec)]
      (let [forward-keys (update-in current-keys [last-index] inc)
            forward-node (get-in source-tree forward-keys)]
        (if forward-node
          forward-keys
          (if (= 1 (count current-keys))
            current-keys
            (recur source-tree (subvec current-keys 0 last-index) current-node)))))))

(defn convert-tree
  "Converts nested vector source tree to target tree."
  ([source-tree] (convert-tree source-tree [0] []))
  ([source-tree keys target-tree]
   (let [init-node (get-in source-tree keys)
         node (if (map? init-node)
                (first (vals init-node))
                (if (vector? init-node)
                  []
                  init-node))
         new-target-tree (update-in target-tree keys (constantly node))
         new-keys (get-new-keys source-tree keys init-node)]
     (if (= new-keys keys)
       new-target-tree
       (recur source-tree new-keys new-target-tree)))))

user=> (convert-tree tree)
[SYMB1 "a" [SYMB2 [SYMB1 "b" "c"] "x"] ["b" "c"]]

Clojure - transform nested map

I would like to create a Mermaid graph from a nested map like this:
{"a" {"b" {"c" nil
           "d" nil}}
 "e" {"c" nil
      "d" {"h" {"i" nil
                "j" nil}}}}
I think I should first convert the nested map to the form below; after that it should be easy.
[{:out-path "a" :out-name "a"
:in-path "a-b" :in-name "b"}
{:out-path "a-b" :out-name "b"
:in-path "a-b-c" :in-name "c"}
{:out-path "a-b" :out-name "b"
:in-path "a-b-d" :in-name "d"}
{:out-path "e" :out-name "e"
:in-path "e-f" :in-name "f"}
{:out-path "e" :out-name "e"
:in-path "e-c" :in-name "c"}
{:out-path "e" :out-name "e"
:in-path "e-d" :in-name "d"}
{:out-path "e-d" :out-name "d"
:in-path "e-d-h" :in-name "h"}
{:out-path "e-d-h" :out-name "h"
:in-path "e-d-h-i" :in-name "i"}
{:out-path "e-d-h" :out-name "h"
:in-path "e-d-h-j" :in-name "j"}]
EDIT:
This is what I have created so far, but I have absolutely no idea how to add the path to the result map.
(defn myfunc [m]
  (loop [in m out []]
    (let [[[k v] & ts] (seq in)]
      (if (keyword? k)
        (cond
          (map? v)
          (recur (concat v ts)
                 (reduce (fn [o k2]
                           (conj o {:out-name (name k)
                                    :in-name (name k2)}))
                         out (keys v)))
          (nil? v)
          (recur (concat v ts) out))
        out))))

As far as I can see from the Mermaid docs, to draw the graph it is enough to generate all the edges as "x-->y" pairs.
We could do that with a simple recursive function (I believe there are not so many levels in the graph that we need to worry about stack overflow):
(defn map->mermaid [items-map]
  (if (seq items-map)
    (mapcat (fn [[k v]] (concat
                          (map (partial str k "-->") (keys v))
                          (map->mermaid v)))
            items-map)))
In the REPL:
user>
(map->mermaid {"a" {"b" {"c" nil
                         "d" nil}}
               "e" {"c" nil
                    "d" {"h" {"i" nil
                              "j" nil}}}})
;; ("a-->b" "b-->c" "b-->d" "e-->c" "e-->d" "d-->h" "h-->i" "h-->j")
So now you just have to make a graph of it like this:
(defn create-graph [items-map]
  (str "graph LR"
       \newline
       (clojure.string/join \newline (map->mermaid items-map))
       \newline))
Update:
You could use the same strategy for the actual map transformation, just passing the current path to map->mermaid:
(defn make-result-node [path name child-name]
  {:out-path path
   :out-name name
   :in-path (str path "-" child-name)
   :in-name child-name})

(defn map->mermaid
  ([items-map] (map->mermaid "" items-map))
  ([path items-map]
   (if (seq items-map)
     (mapcat (fn [[k v]]
               (let [new-path (if (seq path) (str path "-" k) k)]
                 (concat (map (partial make-result-node new-path k)
                              (keys v))
                         (map->mermaid new-path v))))
             items-map))))
In the REPL:
user>
(map->mermaid {"a" {"b" {"c" nil
                         "d" nil}}
               "e" {"c" nil
                    "d" {"h" {"i" nil
                              "j" nil}}}})
;; ({:out-path "a", :out-name "a", :in-path "a-b", :in-name "b"}
;; {:out-path "a-b", :out-name "b", :in-path "a-b-c", :in-name "c"}
;; {:out-path "a-b", :out-name "b", :in-path "a-b-d", :in-name "d"}
;; {:out-path "e", :out-name "e", :in-path "e-c", :in-name "c"}
;; {:out-path "e", :out-name "e", :in-path "e-d", :in-name "d"}
;; {:out-path "e-d", :out-name "d", :in-path "e-d-h", :in-name "h"}
;; {:out-path "e-d-h", :out-name "h", :in-path "e-d-h-i", :in-name "i"}
;; {:out-path "e-d-h", :out-name "h", :in-path "e-d-h-j", :in-name "j"})
