Custom equality in Clojure distinct - clojure

In a Clojure program I've an array composed by maps containing peoples' names and emails.
e.g.
[
{ :name "John" :email "john#gmail.com" }
{ :name "Batman" :email "batman#gmail.com" }
{ :name "John Doe" :email "john#gmail.com" }
]
I'd like to remove the duplicate entries considering, for comparison purposes, pairs having the same e-mail to be equals. In the example above the output would be:
[
{ :name "John" :email "john#gmail.com" }
{ :name "Batman" :email "batman#gmail.com" }
]
What's the best way to achieve this in Clojure? Is there a way to let distinct knows what equals function to use?
Thanks.

yet another way to do it, kinda more idiomatic, i guess:
(let [items [{ :name "John" :email "john#gmail.com" }
{ :name "Batman" :email "batman#gmail.com" }
{ :name "John Doe" :email "john#gmail.com" }]]
(map first (vals (group-by :email items))))
output:
({:name "John", :email "john#gmail.com"}
{:name "Batman", :email "batman#gmail.com"})
that is how it works:
(group-by :email items) makes a map, whose keys are emails, and values are groups of records with this email
{"john#gmail.com" [{:name "John", :email "john#gmail.com"}
{:name "John Doe", :email "john#gmail.com"}],
"batman#gmail.com" [{:name "Batman", :email "batman#gmail.com"}]}
then you just need to take its vals (groups of records) and select firsts from them.
And another way is to create a sorted set by email, so it will treat all the records with equal emails as equal records:
(let [items [{ :name "John" :email "john#gmail.com" }
{ :name "Batman" :email "batman#gmail.com" }
{ :name "John Doe" :email "john#gmail.com" }]]
(into (sorted-set-by #(compare (:email %1) (:email %2))) items))
output:
#{{:name "Batman", :email "batman#gmail.com"}
{:name "John", :email "john#gmail.com"}}
don't really know which of them is more idiomatic and has a better performance. But i bet on the first one.

This would do it: https://crossclj.info/fun/medley.core/distinct-by.html.
The function in the link goes through every value lazily and stores everything it's seen. If the value in the coll is already seen, it does not add it.
You could then call this as: (distinct-by #(% :email) maps), where maps is your vector of people-maps.

A distinct-by can easily be implemented as
(defn distinct-by [f coll]
(let [groups (group-by f coll)]
(map #(first (groups %)) (distinct (map f coll)))))
For the example case this can be used like
(distinct-by :email
[{:name "John" :email "john#gmail.com"}
{:name "Batman" :email "batman#gmail.com"}
{:name "John Doe" :email "john#gmail.com"}])

Related

clojure - presence of a given set of keys in a clojure map

There is a map like this
{:buyers [{:name "James" :city "Berlin"} {:name "Jane" :city "Milan"}]
:sellers [{:name "Dustin" :city "Turin" :age "21"} {:name "Mark" :city "Milan"}]}
and I need to check only for :sellers that all the keys :name, :city and :age are present and if one is missing drop
that map all together and have a new structure as below:
{:buyers [{:name "James" :city "Berlin"} {:name "Jane" :city "Milan"}]
:sellers [{:name "Dustin" :city "Turin" :age "21"}]}
I came across validateur and I am trying to use it like:
(:require [validateur.validation :as v])
(def my-map {:buyers [{:name "James" :city "Berlin"} {:name "Jane" :city "Milan"}]
:sellers [{:name "Dustin" :city "Turin" :age "21"} {:name "Dustin" :city "Milan" :age "" } {:city "Rome" :age "22"}]})
(defn valid-seller? [mp]
(let [v-set (v/validation-set
(v/presence-of #{:name :city :age}))]
(fmap vec (v-set mp))))
(map valid-seller? (:sellers my-map))
=> ({} {:age ["can't be blank"]} {:name ["can't be blank"]})
But I do not know how to update my map so missing keys or nil values be dropped
To make the code more readable, I created a new predicate, valid-seller?, and put validation there. You can use any of these versions:
Pure Clojure:
(defn valid-seller? [m]
(every? #(contains? m %) [:name :city :age]))
Spec:
[org.clojure/spec.alpha "0.3.218"], require [clojure.spec.alpha :as spec]
(defn valid-seller? [m]
(spec/valid? (spec/keys :req-un [::name ::city ::age]) m))
Malli (if you also want to test type of values):
[metosin/malli "0.8.9"], require [malli.core :as malli]
(defn valid-seller? [m]
(malli/validate [:map
[:name :string]
[:city :string]
[:age :string]] m))
Then I used this predicate:
(update {:buyers [{:name "James" :city "Berlin"} {:name "Jane" :city "Milan"}]
:sellers [{:name "Dustin" :city "Turin" :age "21"} {:name "Mark" :city "Milan"}]}
:sellers
#(filter valid-seller? %))
=>
{:buyers [{:name "James", :city "Berlin"} {:name "Jane", :city "Milan"}],
:sellers ({:name "Dustin", :city "Turin", :age "21"})}
After your answer, I think you should use Malli, as it also checks the type of values. You can use some? for any non-nil value:
(defn valid-seller? [m]
(malli/validate [:map
[:name some?]
[:city some?]
[:age some?]] m))
(let [data {:buyers [{:name "James" :city "Berlin"} {:name "Jane" :city "Milan"}]
:sellers [{:name "Dustin" :city "Turin" :age "21"} {:name "Mark" :city "Milan"}]}]
(update data :sellers #(filter (every-pred :name :city :age) %)))

Clojure Map with # literal

What is the difference between
#:user{:profile{:name "Sally Clojurian"
:address {:city "Austin" :state "TX"}}}
and
{:user {:profile {:name "Sally Clojurian"
:address {:city "Austin" :state "TX"}}}}
I know that for the last on if I want to get name I can just do :
(get-in [:profile :name] map)
How would I get name for the first map?
The syntax of printing a map beginning with #:qualifer{ ... } is an abbreviated form of printing when all of the keys of the map are keywords with the same qualifier, or namespace. You can cause it to print without that abbreviation as shown below:
$ clojure
Clojure 1.10.1
user=> (def m1 #:user{:profile{:name "Sally Clojurian"
:address {:city "Austin" :state "TX"}}})
#'user/m1
user=> (pr m1)
#:user{:profile {:name "Sally Clojurian", :address {:city "Austin", :state "TX"}}}nil
user=> (doc *print-namespace-maps*)
-------------------------
clojure.core/*print-namespace-maps*
*print-namespace-maps* controls whether the printer will print
namespace map literal syntax. It defaults to false, but the REPL binds
to true.
nil
user=> (binding [*print-namespace-maps* false] (pr m1))
{:user/profile {:name "Sally Clojurian", :address {:city "Austin", :state "TX"}}}nil
If a map is def'ed as:
(def myMap #:user{:profile{:name "Sally Clojurian"
:address {:city "Austin" :state "TX"}}})
than to get to profile name you can do something like :
(get-in map [:user/profile :name])

Generating structured maps with test.check

I'm playing around with test.check, and I'm testing a function which takes a map as an argument. These maps do have a defined structure, such as:
{:name "Bob" :age 42 :email "bob#example.com" :admin true}
Key point, there is a set of expected keys, the values of which have differing clearly defined generators.
I took a look at gen/map, but it's not obvious how to use it for more structured key/value pairs:
(gen/sample (gen/map gen/keyword gen/boolean) 5)
;; => ({} {:z false} {:k true} {:v8Z false} {:9E false, :3uww false, :2s true})
This seems like a simple scenario, but I can't find an example.
How can I generate structured maps, such as the one described here, using test.check?
Use gen/hash-map instead of gen/map.
=> (gen/sample (gen/hash-map :name gen/string
:age gen/int
:email email-gen ; from test.check examples
:admin gen/boolean))
({:email "00w#hotmail.com", :age 0, :name "", :admin true}
{:email "mi6#computer.org", :age -1, :name "Á6", :admin false}
{:email "Ux4gp#hotmail.com", :age 4, :name "z", :admin true})

Display column names in table

I have created a table "sampletable". I am trying to display the name of the columns. I tried adding the text but it still doesn't show the names.
(def sampletable
(seesaw/table :model
[:columns [{:key :name, :text "Name"} :likes]
:rows [["Bobby" "Laura Palmer"]
["Agent Cooper" "Cherry Pie"]
{:likes "Laura Palmer" :name "James"}
{:name "Big Ed" :likes "Norma Jennings"}]]))
Any help would be appreciated.
Try this
(require '[seesaw.table])
(def sampletable
(seesaw/table
:model (seesaw.table/table-model
:columns [{:key :name, :text "Name"} :likes]
:rows [["Bobby" "Laura Palmer"]
["Agent Cooper" "Cherry Pie"]
{:likes "Laura Palmer" :name "James"}
{:name "Big Ed" :likes "Norma Jennings"}])))

clojure rename-keys in nested structure

Suppose I have a nested structure, something like this:
{:data1
{:categories [
{:name "abc" :id 234 :desc "whatever"}
{:name "def" :id 456 :desc "nothing"}]
}
:data2 {...}
:data3 {...}
}
And I need to transform the key names in the maps. I can transform the top level keys like this:
(rename-keys mymap {:data1 :d1})
But I'm not sure how to rename keys nested more deeply in the data structure (say I want to rename the :desc field to :description).
I'm pretty sure that zippers are the answer but can't quite figure out how to do it, or if there's a more straightforward way.
Same as Brian Carper's solution, except the walk namespace already has a specific function for this purpose. All keys at all levels are changed, be they nested inside any sort of collection or seq.
(:use 'clojure.walk)
(def x
{:data1
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]}})
(postwalk-replace {:desc :something} x)
{:data1
{:categories
[{:something "whatever", :name "abc", :id 234}
{:something "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:something "whatever", :name "abc", :id 234}
{:something "nothing", :name "def", :id 456}]}}
postwalk is a pretty heavy sledgehammer in general, although it looks from your original question like you might need it. In many cases, you can perform updates in a nested structure with update-in:
user> (let [m {:foo {:deep {:bar 1 :baz 2}}}]
(update-in m [:foo :deep] clojure.set/rename-keys {:baz :periwinkle}))
{:foo {:deep {:periwinkle 2, :bar 1}}}
If you want to rename all :desc keys regardless of at which level of nesting they're located, this might work. If you only want to rename :desc keys at a certain level of nesting, you'll need something slightly more sophisticated.
This only works because clojure.set/rename-keys currently does nothing (returns its first argument untouched) if its first argument isn't a map.
user> (require '[clojure [set :as set] [walk :as walk]])
nil
user> (def x {:data1
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]}})
#'user/x
user> (walk/postwalk #(set/rename-keys % {:desc :description :id :ID}) x)
{:data1
{:categories
[{:name "abc", :ID 234, :description "whatever"}
{:name "def", :ID 456, :description "nothing"}]},
:data2
{:categories
[{:name "abc", :ID 234, :description "whatever"}
{:name "def", :ID 456, :description "nothing"}]}}
nil