How to extract data from a nested list/vector in Clojure

I have parsed some XML and got the following result:
(({:tag :Column,
:attrs {:Name "VENDOR_KEY", :Type "Int", :NotNull "Yes"},
:content nil}
{:tag :Column,
:attrs {:Name "RETAILER_KEY", :Type "Int", :NotNull "Yes"},
:content nil}
{:tag :Column,
:attrs {:Name "ITEM_KEY", :Type "Int", :NotNull "Yes"},
:content nil})
({:tag :Column,
:attrs {:Name "Store_Key", :Type "Int", :NotNull "Yes"},
:content nil}))
How can I convert it to the following? Basically, I want to extract the value of the key :attrs from each map in the nested list.
(
({:Name "VENDOR_KEY", :Type "Int", :NotNull "Yes"},
{:Name "RETAILER_KEY", :Type "Int", :NotNull "Yes"},
{:Name "ITEM_KEY", :Type "Int", :NotNull "Yes"}),
({:Name "Store_Key", :Type "Int", :NotNull "Yes"})
)

As hsestupin said, the solution is:
(map #(map :attrs %) result)
I am assuming result is your input data.
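A minimal sketch of that one-liner, using the sample data from the question bound to result. The outer map walks each group and the inner map pulls :attrs out of every element. The mapv variant is not part of the answer; it is an optional alternative that returns eager vectors instead of lazy sequences.

```clojure
;; Sample input copied from the question's parsed XML.
(def result
  '(({:tag :Column, :attrs {:Name "VENDOR_KEY", :Type "Int", :NotNull "Yes"}, :content nil}
     {:tag :Column, :attrs {:Name "RETAILER_KEY", :Type "Int", :NotNull "Yes"}, :content nil}
     {:tag :Column, :attrs {:Name "ITEM_KEY", :Type "Int", :NotNull "Yes"}, :content nil})
    ({:tag :Column, :attrs {:Name "Store_Key", :Type "Int", :NotNull "Yes"}, :content nil})))

;; Lazy version, as in the answer: a seq of seqs of attribute maps.
(def attrs-lazy (map #(map :attrs %) result))

;; Eager alternative (an assumption about what may be convenient): vectors of vectors.
(def attrs-vecs (mapv #(mapv :attrs %) result))
```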

Related

Clojure: Transform nested maps into custom map keeping only specific attributes

I have a vector of maps (result of xml/parse) which contains the following vector of nested maps (I already got rid of some parts I don't want to keep):
[
{:tag :SoapObject, :attrs nil, :content [
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
{:tag :FieldValue, :attrs nil, :content ["Value_1a"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
{:tag :FieldValue, :attrs nil, :content ["Value_2a"]}
]}
]}
{:tag :SoapObject, :attrs nil, :content [
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
{:tag :FieldValue, :attrs nil, :content ["Value_1b"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
{:tag :FieldValue, :attrs nil, :content ["Value_2b"]}
]}
]}
]
Now I want to extract only some specific data from this structure, producing a result which looks like this:
[
{"ID" "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1a",
"Attribute_2" "Value_2a"}
{"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1b",
"Attribute_2" "Value_2b"}
]
Which clojure tool could help me accomplish this?
I've found another question which is a bit similar, but whenever I tried some version of a map call the result I got was some kind of clojure.lang.LazySeq or clojure.core$map which I couldn't get to print properly to verify the result.
Usually you can start from the bottom and gradually work your way up.
First, parse a single ObjectData item into a key/value pair:
(def first-content (comp first :content))
(defn get-attr [{[k v] :content}]
[(first-content k)
(first-content v)])
user> (get-attr {:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
]})
;;=> ["ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]
Then turn every item into a map of attributes:
(defn parse-item [item]
(into {} (map get-attr (:content item))))
(parse-item {:tag :SoapObject, :attrs nil, :content [
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
{:tag :FieldValue, :attrs nil, :content ["Value_1b"]}
]}
{:tag :ObjectData, :attrs nil, :content [
{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
{:tag :FieldValue, :attrs nil, :content ["Value_2b"]}
]}
]})
;;=> {"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1b", "Attribute_2" "Value_2b"}
So the last thing you need to do is map over the top-level collection, producing the required result:
(mapv parse-item data)
;;=> [{"ID" "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1a", "Attribute_2" "Value_2a"}
;; {"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1", "Attribute_1" "Value_1b", "Attribute_2" "Value_2b"}]
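On the printing problem mentioned in the question: a small sketch, under the assumption that the confusing output came from turning a lazy sequence into a string with str. The names below are illustrative only. Calling str on a lazy seq typically yields the class name and a hash, whereas pr-str (and prn) render the elements, and vec or mapv give an eager vector that prints as expected.

```clojure
(def lazy-result (map inc [1 2 3]))

;; str on a lazy seq shows the object's class and hash, not its elements.
(def as-str (str lazy-result))

;; pr-str realizes the seq and renders its elements as readable text.
(def as-printed (pr-str lazy-result))

;; vec (or using mapv instead of map) produces an eager vector.
(def as-vector (vec lazy-result))
```

Here as-printed is "(2 3 4)" and as-vector is [2 3 4].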
You can easily solve tree-based problems using the Tupelo Forest library. You can see a video introduction from last year's Clojure Conj here.
For your problem, I'd approach it as follows. First, the data:
(dotest
(let [data-enlive
{:tag :root
:attrs nil
:content
[{:tag :SoapObject, :attrs nil,
:content
[{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
{:tag :FieldValue, :attrs nil, :content ["Value_1a"]}]}
{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
{:tag :FieldValue, :attrs nil, :content ["Value_2a"]}]}]}
{:tag :SoapObject, :attrs nil,
:content
[{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["ID"]}
{:tag :FieldValue, :attrs nil, :content ["90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"]}]}
{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
{:tag :FieldValue, :attrs nil, :content ["Value_1b"]}]}
{:tag :ObjectData, :attrs nil,
:content [{:tag :FieldName, :attrs nil, :content ["Attribute_2"]}
{:tag :FieldValue, :attrs nil, :content ["Value_2b"]}]}]}]}]
and then the code
(with-debug-hid
(with-forest (new-forest)
(let [root-hid (add-tree-enlive data-enlive)
soapobj-hids (find-hids root-hid [:root :SoapObject])
objdata->map (fn [objdata-hid]
(let [fieldname-node (hid->node (find-hid objdata-hid [:ObjectData :FieldName]))
fieldvalue-node (hid->node (find-hid objdata-hid [:ObjectData :FieldValue]))]
{ (grab :value fieldname-node) (grab :value fieldvalue-node) }))
soapobj->map (fn [soapobj-hid]
(apply glue
(for [objdata-hid (hid->kids soapobj-hid)]
(objdata->map objdata-hid))))
results (mapv soapobj->map soapobj-hids)]
with intermediate results:
(is= (hid->bush root-hid)
[{:tag :root}
[{:tag :SoapObject}
[{:tag :ObjectData}
[{:tag :FieldName, :value "ID"}]
[{:tag :FieldValue, :value "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1"}]]
[{:tag :ObjectData}
[{:tag :FieldName, :value "Attribute_1"}]
[{:tag :FieldValue, :value "Value_1a"}]]
[{:tag :ObjectData}
[{:tag :FieldName, :value "Attribute_2"}]
[{:tag :FieldValue, :value "Value_2a"}]]]
[{:tag :SoapObject}
[{:tag :ObjectData}
[{:tag :FieldName, :value "ID"}]
[{:tag :FieldValue, :value "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1"}]]
[{:tag :ObjectData}
[{:tag :FieldName, :value "Attribute_1"}]
[{:tag :FieldValue, :value "Value_1b"}]]
[{:tag :ObjectData}
[{:tag :FieldName, :value "Attribute_2"}]
[{:tag :FieldValue, :value "Value_2b"}]]]])
(is= soapobj-hids [:0009 :0013])
and the final results:
(is= results
[{"ID" "8d8edbb6-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1a",
"Attribute_2" "Value_2a"}
{"ID" "90e39036-cb0f-11e8-a8d5-f2801f1b9fd1",
"Attribute_1" "Value_1b",
"Attribute_2" "Value_2b"}]))))))
Further documentation is still in progress, but you can see API docs here and a live example of your problem here.
You can also compose transducers. I was recently reading something on the JUXT blog about creating XPath-like functionality with transducers.
(def children (map :content))
(defn tagp [pred]
(filter (comp pred :tag)))
(defn tag= [tag-name]
(tagp (partial = tag-name)))
(def text (comp (mapcat :content) (filter string?)))
(defn fields [obj-datas]
(sequence (comp
(tag= :ObjectData)
(mapcat :content)
text)
obj-datas))
(defn clean [xml-map]
(let [fields-list (sequence (comp
(tag= :SoapObject)
children
(map fields))
xml-map)]
(map (partial apply hash-map) fields-list)))
No need for fancy tools here. You can get away with the simplest chunk of code.
(use '[plumbing.core])
(let [A ...your-data...]
(map (fn->> :content
(mapcat :content)
(mapcat :content)
(apply hash-map))
A))
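The same pipeline can be sketched with clojure.core alone, in case adding the Plumbing dependency just for fn->> is not desired. The function name soap->maps and the abbreviated sample data are made up for illustration. Like the version above, it assumes every ObjectData holds exactly one FieldName followed by one FieldValue.

```clojure
(defn soap->maps
  "Turns each SoapObject node into a map of field name -> field value."
  [data]
  (mapv (fn [soap-object]
          (->> (:content soap-object)  ; ObjectData nodes
               (mapcat :content)       ; FieldName / FieldValue nodes
               (mapcat :content)       ; alternating name and value strings
               (apply hash-map)))
        data))

;; Abbreviated, hypothetical sample in the same shape as the question's data.
(def sample
  [{:tag :SoapObject, :attrs nil,
    :content [{:tag :ObjectData, :attrs nil,
               :content [{:tag :FieldName, :attrs nil, :content ["ID"]}
                         {:tag :FieldValue, :attrs nil, :content ["id-1"]}]}
              {:tag :ObjectData, :attrs nil,
               :content [{:tag :FieldName, :attrs nil, :content ["Attribute_1"]}
                         {:tag :FieldValue, :attrs nil, :content ["Value_1a"]}]}]}])

(def sample-result (soap->maps sample))
```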

Cleaner Way to Sort and Order a Vector of Maps in Clojure?

I have a vector of maps wherein I need to remove the maps where the value of the name key is a duplicate, keeping the one that has the highest value of age. I have a solution but I don't think it looks clean. Is there a better way to do it without breaking it up into multiple functions?
Here is my data:
(def my-maps
[{:name "jess", :age 32}
{:name "ruxpin", :age 4}
{:name "jess", :age 35}
{:name "aero", :age 33}
{:name "banner", :age 4}])
Here is my solution:
(map first (vals (group-by :name (reverse (sort-by :name my-maps)))))
Result:
({:name "ruxpin", :age 4} {:name "jess", :age 35} {:name "banner", :age 4} {:name "aero", :age 33})
Another way is to combine group-by and max-key. The advantage of this method is that you don't need to sort the collection; sorting has a performance cost, so it is worth avoiding when you can.
(for [[_ vs] (group-by :name my-maps)]
(apply max-key :age vs))
;;=> ({:name "jess", :age 35}
;; {:name "ruxpin", :age 4}
;; {:name "aero", :age 33}
;; {:name "banner", :age 4})
Short version:
(->> my-maps
     (sort-by (juxt :name :age) #(compare %2 %1)) ; sort by :name, then :age, both in reverse order
     (partition-by :name)
     (map first))
A transducer version:
(def xf (comp (partition-by :name) (map first)))
(->> my-maps
     (sort-by (juxt :name :age) #(compare %2 %1))
     (into [] xf))
For a large dataset, the transducer version may perform better, since it avoids building the intermediate lazy sequences.
Unfortunately, your original solution was actually broken. It only seemed to work because of the order the data in my-maps happened to be in. Note that you never actually sort by age, so you can never guarantee what order the ages are in.
I solved this with another call to map:
(->> my-maps
     (group-by :name)
     (vals)
     ; Sort each list that group-by returns by age
     (map #(sort-by :age %))
     (map last)) ; This could also happen in the above map
Note how I'm sorting each :name group by :age, then I take the last of each grouping.
I would do it a little differently, using the max function instead of sorting:
(def my-maps
[{:name "jess", :age 32}
{:name "ruxpin", :age 4}
{:name "jess", :age 35}
{:name "aero", :age 33}
{:name "banner", :age 4}])
(dotest
(let [grouped-data (group-by :name my-maps)
name-age-maps (for [[name map-list] grouped-data]
(let [max-age (apply max
(map :age map-list))
name-age-map {name max-age}]
name-age-map))
final-result (reduce into {} name-age-maps)]
final-result))
with results:
grouped-data =>
{"jess" [{:name "jess", :age 32} {:name "jess", :age 35}],
"ruxpin" [{:name "ruxpin", :age 4}],
"aero" [{:name "aero", :age 33}],
"banner" [{:name "banner", :age 4}]}
name-age-maps =>
({"jess" 35} {"ruxpin" 4} {"aero" 33} {"banner" 4})
final-result =>
{"jess" 35, "ruxpin" 4, "aero" 33, "banner" 4}
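If the full maps are wanted rather than a name-to-age map, a single pass with reduce is another option. This is only a sketch; the function name oldest-per-name is made up here. It keeps, for each name, the map with the highest :age seen so far, without sorting or grouping first.

```clojure
(def my-maps
  [{:name "jess", :age 32}
   {:name "ruxpin", :age 4}
   {:name "jess", :age 35}
   {:name "aero", :age 33}
   {:name "banner", :age 4}])

(defn oldest-per-name
  "Returns one map per :name, keeping the one with the highest :age."
  [maps]
  (vals
    (reduce (fn [best m]
              (let [current (get best (:name m))]
                (if (and current (>= (:age current) (:age m)))
                  best                          ; existing entry is at least as old
                  (assoc best (:name m) m))))   ; first entry for this name, or an older one
            {}
            maps)))

(def oldest (oldest-per-name my-maps))
```

The order of the returned maps follows the internal map order and should not be relied on.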
Compare by multiple fields with different weights and data types (size has more weight); size is descending, name is ascending:
(def some-vector [{:name "head" :size 3}
{:name "mouth" :size 1}
{:name "nose" :size 1}
{:name "neck" :size 2}
{:name "chest" :size 10}
{:name "back" :size 10}
{:name "abdomen" :size 6}
])
(->> some-vector ; the vector itself, not a call to it
     (sort #(compare (str (format "%3d" (:size %2)) (:name %1))
                     (str (format "%3d" (:size %1)) (:name %2)))))
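A sketch of an alternative that avoids string formatting: sort-by with juxt builds a vector key per element, and negating :size makes that field descending while :name stays ascending. Unlike the padded-string comparison, this does not depend on sizes fitting in three digits. The name sorted-parts is illustrative only.

```clojure
(def some-vector [{:name "head" :size 3}
                  {:name "mouth" :size 1}
                  {:name "nose" :size 1}
                  {:name "neck" :size 2}
                  {:name "chest" :size 10}
                  {:name "back" :size 10}
                  {:name "abdomen" :size 6}])

;; Key per element is [(- size) name]; vectors of equal length compare element by element.
(def sorted-parts
  (sort-by (juxt (comp - :size) :name) some-vector))

(map :name sorted-parts)
;;=> ("back" "chest" "abdomen" "head" "neck" "mouth" "nose")
```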

Removing nested values with Specter in Clojure

Suppose that I have a Clojure map like this:
(def mymap {:a [1 2 3] :b {:c [] :d [1 2 3]}})
I would like a function remove-empties that produces a new map in which entries from (:b mymap) that have an empty sequence as a value are removed. So (remove-empties mymap) would give the value:
{:a [1 2 3] :b {:d [1 2 3]}}
Is there a way to write a function to do this using Specter?
Here's how to do it with Specter:
(use 'com.rpl.specter)
(setval [:b MAP-VALS empty?] NONE mymap)
=> {:a [1 2 3], :b {:d [1 2 3]}}
In English, this says "Under :b, find all the map values that are empty?. Set them to NONE, i.e. remove them."
(update mymap :b (fn [b]
(apply dissoc b
(map key (filter (comp empty? val) b)))))
This is the specter solution:
(ns myns.core
(:require
[com.rpl.specter :as spc]))
(def my-map
{:a [1 2 3]
:b {:c []
:d [1 2 3]}})
(defn my-function
[path data]
(let [pred #(and (vector? %) (empty? %))]
(spc/setval [path spc/MAP-VALS pred] spc/NONE data)))
;; (my-function [:b] my-map) => {:a [1 2 3]
;; :b {:d [1 2 3]}}
I don't know specter either, but this is pretty simple to do in plain clojure.
(defn remove-empties [m]
(reduce-kv (fn [acc k v]
(cond (map? v) (let [new-v (remove-empties v)]
(if (seq new-v)
(assoc acc k new-v)
acc))
(empty? v) acc
:else (assoc acc k v)))
(empty m), m))
Caveat: For extremely nested data structures it might stack overflow.
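One more thing to watch for in the function above: it calls empty? on every non-map value, which would throw for values that are not seqable, such as numbers. A hedged variant that only tests collections might look like the sketch below. The name remove-empty-colls is made up, and treating empty strings as values to keep is an assumption about the desired behavior.

```clojure
(defn remove-empty-colls
  "Recursively drops map entries whose value is an empty collection.
   Non-collection values (numbers, strings, keywords) are always kept."
  [m]
  (reduce-kv (fn [acc k v]
               (let [v' (if (map? v) (remove-empty-colls v) v)]
                 (if (and (coll? v') (empty? v'))
                   acc
                   (assoc acc k v'))))
             (empty m)
             m))

(def cleaned
  (remove-empty-colls {:a [1 2 3] :b {:c [] :d [1 2 3]} :n 5 :s ""}))
;; cleaned is {:a [1 2 3], :b {:d [1 2 3]}, :n 5, :s ""}
```

It is still recursive, so the stack-depth caveat applies here too.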
So far I haven't found an approach with specter's filterer, because when I test filterers they seem to receive each map entry twice (once as a map entry and once as a 2-length vector) and giving different results between those seems to cause issues. However, we shouldn't be removing empty sequences anywhere they might appear, just map entries where they're the value.
I did seem to get a clojure.walk approach working that might still interest you, though.
(ns nested-remove
(:require [com.rpl.specter :as s]
[clojure.walk :refer [postwalk]]))
(defn empty-seq-entry? [entry]
(and (map-entry? entry) (sequential? (val entry)) (empty? (val entry))))
(defn remove-empties [root]
(postwalk #(if (map? %) (into (empty %) (remove empty-seq-entry? %)) %) root))
(remove-empties mymap) ;;=> {:a [1 2 3], :b {:d [1 2 3]}}
Assuming we only need to go one level deep and not search recursively like the accepted answer:
(setval [:b MAP-VALS empty?] NONE mymap)
A fully recursive solution that removes empty values in a map at any level
(def my-complex-map {:a [1] :b {:c [] :d [1 2 3] :e {:f "foo" :g []}}})
; declare recursive path that traverses map values
(declarepath DEEP-MAP-VALS)
(providepath DEEP-MAP-VALS (if-path map? [MAP-VALS DEEP-MAP-VALS] STAY))
(setval [DEEP-MAP-VALS empty?] NONE my-complex-map)
; => {:a [1], :b {:d [1 2 3], :e {:f "foo"}}}
Reference the wiki on using specter recursively.
While I am not very familiar with Specter, in addition to the postwalk solution, you can solve this using tupelo.forest from the Tupelo library. You do need to rearrange the data a bit into Hiccup or Enlive format, then it's easy to identify any nodes with no child nodes:
(ns tst.clj.core
(:use clj.core tupelo.test)
(:require
[tupelo.core :as t]
[tupelo.forest :as tf] ))
(t/refer-tupelo)
(defn hid->enlive [hid]
(tf/hiccup->enlive (tf/hid->hiccup hid)))
(defn empty-kids?
[path]
(let [hid (last path)
result (and (tf/node-hid? hid)
(empty? (grab :kids (tf/hid->tree hid))))]
result))
; delete any nodes without children
(dotest
(tf/with-forest (tf/new-forest)
(let [e0 {:tag :root
:attrs {}
:content [{:tag :a
:attrs {}
:content [1 2 3]}
{:tag :b
:attrs {}
:content [{:tag :c
:attrs {}
:content []}
{:tag :d
:attrs {}
:content [1 2 3]}
]}]}
root-hid (tf/add-tree-enlive e0)
empty-paths (tf/find-paths-with root-hid [:** :*] empty-kids?)
empty-hids (mapv last empty-paths)]
(is= (hid->enlive root-hid) ; This is the original tree structure (Enlive format)
{:tag :root,
:attrs {},
:content
[{:tag :a,
:attrs {},
:content
[{:tag :tupelo.forest/raw, :attrs {}, :content [1]}
{:tag :tupelo.forest/raw, :attrs {}, :content [2]}
{:tag :tupelo.forest/raw, :attrs {}, :content [3]}]}
{:tag :b,
:attrs {},
:content
[{:tag :c, :attrs {}, :content []}
{:tag :d,
:attrs {},
:content
[{:tag :tupelo.forest/raw, :attrs {}, :content [1]}
{:tag :tupelo.forest/raw, :attrs {}, :content [2]}
{:tag :tupelo.forest/raw, :attrs {}, :content [3]}]}]}]})
(apply tf/remove-hid empty-hids) ; remove the nodes with no child nodes
(is= (hid->enlive root-hid) ; this is the result (Enlive format)
{:tag :root,
:attrs {},
:content
[{:tag :a,
:attrs {},
:content
[{:tag :tupelo.forest/raw, :attrs {}, :content [1]}
{:tag :tupelo.forest/raw, :attrs {}, :content [2]}
{:tag :tupelo.forest/raw, :attrs {}, :content [3]}]}
{:tag :b,
:attrs {},
:content
[{:tag :d,
:attrs {},
:content
[{:tag :tupelo.forest/raw, :attrs {}, :content [1]}
{:tag :tupelo.forest/raw, :attrs {}, :content [2]}
{:tag :tupelo.forest/raw, :attrs {}, :content [3]}]}]}]})
(is= (tf/hid->hiccup root-hid) ; same result in Hiccup format
[:root
[:a
[:tupelo.forest/raw 1]
[:tupelo.forest/raw 2]
[:tupelo.forest/raw 3]]
[:b
[:d
[:tupelo.forest/raw 1]
[:tupelo.forest/raw 2]
[:tupelo.forest/raw 3]]]])
)))

Update hierarchical / tree structure in Clojure

I have an Atom, like x:
(def x (atom {:name "A"
:id 1
:children [{:name "B"
:id 2
:children []}
{:name "C"
:id 3
:children [{:name "D" :id 4 :children []}]}]}))
and need to update a submap like for example:
if :id is 2 , change :name to "Z"
resulting in an updated Atom:
{:name "A"
:id 1
:children [{:name "Z"
:id 2
:children []}
{:name "C"
:id 3
:children [{:name "D" :id 4 :children []}]}]}
how can this be done?
You could do it with postwalk or prewalk from the clojure.walk namespace.
(def x (atom {:name "A"
:id 1
:children [{:name "B"
:id 2
:children []}
{:name "C"
:id 3
:children [{:name "D" :id 4 :children []}]}]}))
(defn update-name [x]
(if (and (map? x) (= (:id x) 2))
(assoc x :name "Z")
x))
(require 'clojure.walk) ; make sure the namespace is loaded
(swap! x (partial clojure.walk/postwalk update-name))
You could also use Zippers from the clojure.zip namespace
Find a working example here: https://gist.github.com/renegr/9493967
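For reference, a minimal zipper sketch is below. It is not taken from the linked gist; the helper names tree-zip and rename-by-id are made up for illustration. The zipper treats every map as a branch whose children live under :children, walks the tree depth-first with zip/next, and edits any node whose :id matches.

```clojure
(require '[clojure.zip :as zip])

(defn tree-zip
  "Zipper over maps that keep their children in a :children vector."
  [root]
  (zip/zipper map?
              (comp seq :children) ; seq turns an empty vector into nil (no children)
              (fn [node kids] (assoc node :children (vec kids)))
              root))

(defn rename-by-id
  "Returns the tree with :name set to new-name on every node whose :id equals id."
  [root id new-name]
  (loop [loc (tree-zip root)]
    (if (zip/end? loc)
      (zip/root loc)
      (recur (zip/next
               (if (= id (:id (zip/node loc)))
                 (zip/edit loc assoc :name new-name)
                 loc))))))

(def x (atom {:name "A"
              :id 1
              :children [{:name "B" :id 2 :children []}
                         {:name "C"
                          :id 3
                          :children [{:name "D" :id 4 :children []}]}]}))

(swap! x rename-by-id 2 "Z")
```

Because swap! passes the current value as the first argument, rename-by-id can be used with the atom directly.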

How can I sort a clojure set of maps?

I have a set of maps something like this:
#{
{:name "a" :value "b" ... more stuff here}
{:name "b" :value "b" ... more stuff here}
{:name "b" :value "b" ... more stuff here}
{:name "a" :value "b" ... more stuff here}
{:name "c" :value "b" ... more stuff here}
{:name "a" :value "b" ... more stuff here}
}
and I want to get an ordered list, much like SQL's ORDER BY name:
[
{:name "a" :value "b" ... more stuff here}
{:name "a" :value "b" ... more stuff here}
{:name "a" :value "b" ... more stuff here}
{:name "b" :value "b" ... more stuff here}
{:name "b" :value "b" ... more stuff here}
{:name "c" :value "b" ... more stuff here}
]
How can I do this?
Function sort-by is what you're looking for:
(def s
#{
{:name "d" :value "b" }
{:name "b" :value "b" }
{:name "c" :value "b" }
})
(sort-by :name s)
sort-by is a great answer, and it makes the code a lot better in the simple cases where it works. Additionally, the sort function can take a comparator function, in case you need to do some processing on each item before comparing. In this example I use a comparator that extracts each name and then does a string compare on them.
(sort #(compare (:name %1) (:name %2)) data)
=> ({:name "a", :value "b"} {:name "b", :value "b"} {:name "c", :value "b"})
This is useful if your collections have different keys to be compared:
(sort #(compare (:value %1) (:name %2)) data)
=> ({:name "a", :value "b"} {:name "c", :value "b"} {:name "b", :value "b"})
The compare function is a better version of Java's .compareTo() because it properly handles nil and compares Clojure collections properly. It is basically a shortcut for using the . operator in most cases:
(sort #(. (:name %1) (compareTo (:name %2))) data)
=> ({:name "a", :value "b"} {:name "b", :value "b"} {:name "c", :value "b"})
(def set-of-maps #{{:name "d"}, {:name "b"}, {:name "a"}})
Use clojure.core/sort-by:
(sort-by :name set-of-maps)
; => ({:name "a"} {:name "b"} {:name "d"})
sort-by is what you want, but please post snippets that are actually valid code; I wasted a fair bit of time trying to figure out a problem that wound up being because #{{:name "a" :value "b"} {:name "a" :value "b"}} makes the reader barf.
I believe the snippet from The Joy of Clojure is the neatest:
(def plays [{:band "Burial", :plays 979, :loved 9}
{:band "Eno", :plays 2333, :loved 15}
{:band "Bill Evans", :plays 979, :loved 9}
{:band "Magma", :plays 2665, :loved 31}])
(def sort-by-loved-ratio (partial sort-by #(/ (:plays %) (:loved %))))
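A short usage sketch for the snippet above, repeating its definitions so it runs on its own. The key function divides :plays by :loved, so bands are ordered from the lowest plays-per-loved ratio to the highest. Note that a :loved value of 0 would make the division throw, which this sample data avoids.

```clojure
(def plays [{:band "Burial", :plays 979, :loved 9}
            {:band "Eno", :plays 2333, :loved 15}
            {:band "Bill Evans", :plays 979, :loved 9}
            {:band "Magma", :plays 2665, :loved 31}])

;; partial fixes the key function; the collection is supplied later.
(def sort-by-loved-ratio (partial sort-by #(/ (:plays %) (:loved %))))

(map :band (sort-by-loved-ratio plays))
;;=> ("Magma" "Burial" "Bill Evans" "Eno")
```

Burial and Bill Evans have the same ratio; they stay in their original relative order because the sort is stable.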