why are `disj` and `dissoc` distinct functions in Clojure? - clojure

So far as I've seen, Clojure's core functions almost always work for different types of collection, e.g. conj, first, rest, etc. I'm a little puzzled why disj and dissoc are different though; they have the exact same signature:
(dissoc map) (dissoc map key) (dissoc map key & ks)
(disj set) (disj set key) (disj set key & ks)
and fairly similar semantics. Why aren't these both covered by the same function? The only argument I can see in favor of this is that maps have both (assoc map key val) and (conj map [key val]) to add entries, while sets only support (conj set k).
I can write a one-line function to handle this situation, but Clojure is so elegant so much of the time that it's really jarring to me whenever it isn't :)

Just to provide a counterpoise to Arthur's answer: conj is defined even earlier (the name conj appears on line 82 of core.clj vs. 1443 for disj and 1429 for dissoc) and yet works on all Clojure collection types. :-) Clearly it doesn't use protocols; instead it uses a regular Java interface, as do most Clojure functions (in fact I believe that currently the only piece of "core" functionality in Clojure that uses protocols is reduce / reduce-kv).
I'd conjecture that it's due to an aesthetic choice, and indeed probably related to the way in which maps support conj – were they to support disj, one might expect it to take the same arguments that could be passed to conj, which would be problematic:
;; hypothetical disj on map
(disj {:foo 1
       [:foo 1] 2
       {:foo 1 [:foo 1] 2} 3}
      {:foo 1 [:foo 1] 2}) ;; [:foo 1] similarly problematic
Should that return {}, {:foo 1 [:foo 1] 2} or {{:foo 1 [:foo 1] 2} 3}? conj happily accepts [:foo 1] or {:foo 1 [:foo 1] 2} as things to conj on to a map. (conj with two map arguments means merge; indeed merge is implemented in terms of conj, adding special handling of nil).
So perhaps it makes sense to have dissoc for maps so that it's clear that it removes a key and not "something that could be conj'd".
Now, theoretically dissoc could be made to work on sets, but then perhaps one might expect them to also support assoc, which arguably wouldn't really make sense. It might be worth pointing out that vectors do support assoc and not dissoc, so these don't always go together; there's certainly some aesthetic tension here.
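To make the asymmetry above concrete, here is a small REPL-style sketch. The var names (m, with-pair and so on) are only illustrative; the calls themselves are plain clojure.core.

```clojure
;; conj on a map accepts either a [key value] pair or a whole map.
(def m {:foo 1})
(def with-pair (conj m [:bar 2]))          ; a pair adds one entry
(def with-map  (conj m {:bar 2 :baz 3}))   ; a map is merged in

;; dissoc, by contrast, only ever takes keys.
(def without-keys (dissoc with-map :bar :baz))

;; Vectors support assoc (by index) ...
(def v (assoc [:a :b :c] 1 :x))

;; ... but not dissoc, which expects a map.
(def dissoc-on-vector-fails?
  (try
    (dissoc [:a :b :c] 1)
    false
    (catch ClassCastException _ true)))
```

Here with-pair is {:foo 1, :bar 2}, without-keys is back to {:foo 1}, v is [:a :x :c], and the dissoc call on a vector throws.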

It's always dubious to try to answer for the motivations of others, though I strongly suspect this is a bootstrapping issue in core.clj. Both of these functions are defined fairly early in core.clj and are nearly identical except that they each take exactly one type and call a method on it directly.
(. clojure.lang.RT (dissoc map key))
and
(. set (disjoin key))
Both of these functions are defined before protocols are defined in core.clj, so they can't use a protocol to dispatch between them based on type. Both of these were also defined in the language specification before protocols existed. They are also both called often enough that there would be a strong incentive to make them as fast as possible.
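In user code, where protocols are available, a unified removal function can be sketched as below. The names Removable and without are hypothetical and not part of clojure.core; this only illustrates the kind of type-based dispatch that core.clj could not rely on that early in its bootstrap.

```clojure
;; Hypothetical protocol: one removal function for maps and sets.
(defprotocol Removable
  (without [coll k] "Returns coll with the key or element k removed."))

(extend-protocol Removable
  clojure.lang.IPersistentMap
  (without [m k] (dissoc m k))

  clojure.lang.IPersistentSet
  (without [s k] (disj s k)))
```

With this in place, (without {:a 1 :b 2} :a) gives {:b 2} and (without #{1 2} 1) gives #{2}.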

(defn del
  "Removes elements from coll, which can be a set, vector, list, map or string"
  [coll & rest]
  (let [[w & tail] rest]
    (if w
      (apply del (cond
                   (set? coll)    (disj coll w)
                   (list? coll)   (remove #(= w %) coll)
                   (vector? coll) (into [] (remove #(= w %) coll))
                   (map? coll)    (dissoc coll w)
                   (string? coll) (.replaceAll coll (str w) "")) tail)
      coll)))
Who cares? Just use the function above and forget about the past...

Related

Loop through vector of vectors and remove element from vector in Clojure

I am new to clojure programming and would like some help with some code. I have this vector of vectors like below,
(def start-pos [[[:fox :goose :corn :you] [:boat] []]])
I would like to loop through the vector and remove an element from one of the internal vectors, e.g. remove ':goose' from start-pos.
I tried the code below, but for some reason it doesn't work as intended:
(map #(disj (set %) :goose) start-pos)
Instead the result is,
(#{[:boat] [] [:fox :goose :corn :you]})
As you can see from the result, the internal vectors are now a set and the original order is lost. Is there a way of removing the element without disturbing the original order of the vectors, maybe without converting to a set first? I chose to convert to a set first because, according to the docs, disj only works on sets.
Addendum: this question is not a duplicate of the suggested post, as my vector is nested three vectors deep.
"the internal vectors are now a set"
That's because #(disj (set %) :goose) returns a set.
"original order is distorted"
Sets don't preserve insertion order by default, similar to maps with over 8 keys.
"I would like to loop through the vector and remove an element from one of the internal vectors, e.g. remove ':goose' from start-pos."
The function you need for removing an element from a collection by predicate is called remove, but...
The value you want to remove is actually nested three vectors deep in start-pos, so you'd need an additional iteration for each inner vector, and so on if you wanted to remove the keyword :goose from every vector recursively. That's an excuse to use clojure.walk:
(require 'clojure.walk)
(clojure.walk/postwalk
  (fn [v]
    (if (coll? v)
      (into (empty v) (remove #{:goose}) v)
      v))
  start-pos)
=> [[[:fox :corn :you] [:boat] []]]
This walks every value in start-pos, removing :goose from any collections it finds.
Here is a less flexible approach, which I wrote mostly for my own benefit while learning Clojure:
(update-in
  start-pos
  [0 0]
  #(vec (concat
          (subvec % 0 1)
          (subvec % (inc 1)))))
It manually navigates to the vector that holds :goose and rebuilds it from the elements before and after index 1, so that :goose is left out.
I think some alternative approaches to this problem include Specter and Zippers
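A variant of the update-in approach that removes by value rather than by index might look like this. The helper name remove-at is hypothetical; the path [0 0] is still supplied by the caller, so this stays a sketch for the case where the location of the inner vector is known.

```clojure
(def start-pos [[[:fox :goose :corn :you] [:boat] []]])

;; Hypothetical helper: remove every occurrence of x from the vector
;; found at path inside data, keeping the remaining order intact.
(defn remove-at [data path x]
  (update-in data path (fn [v] (vec (remove #(= x %) v)))))

(def result (remove-at start-pos [0 0] :goose))
```

Here result is [[[:fox :corn :you] [:boat] []]], and the inner collection is still a vector.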
You could also employ a Clojure zipper for that:
user> (require '[clojure.zip :as z])
user> (loop [curr (z/vector-zip start-pos)]
        (cond (z/end? curr) (z/root curr)
              (= :goose (z/node curr)) (recur (z/remove curr))
              :else (recur (z/next curr))))
;; => [[[:fox :corn :you] [:boat] []]]
Also, that is quite easy to do with Clojure's core functions only:
user> (defn remv [pred data]
        (if (vector? data)
          (mapv (partial remv pred) (remove pred data))
          data))
#'user/remv
user> (remv #{:goose} start-pos)
;; => [[[:fox :corn :you] [:boat] []]]

Clojure: How to determine if a nested list contains non-numeric items?

I need to write a Clojure function which takes an unevaluated arbitrarily deep nesting of lists as input, and then determines if any item in the list (not in function position) is non-numeric. This is my first time writing anything in Clojure so I am a bit confused. Here is my first attempt at making the function:
(defn list-eval
[x]
(for [lst x]
(for [item lst]
(if(integer? item)
(println "")
(println "This list contains a non-numeric value")))))
I tried to use a nested for-loop to iterate through each item in every nested list. Trying to test the function like so:
=> (list-eval (1(2 3("a" 5(3)))))
results in this exception:
ClassCastException java.lang.Long cannot be cast to clojure.lang.IFn listeval.core/eval7976 (form-init4504441070457356195.clj:1)
Does the problem here lie in the code, or in how I call the function and pass an argument? In either case, how can I make this work as intended?
This happens because (1 ..) is treated as calling a function, and 1 is a Long, and not a function. First you should change the nested list to '(1(2 3("a" 5(3)))). Next you can change your function to run recursively:
(defn list-eval
  [x]
  (if (list? x)
    (for [lst x] (list-eval lst))
    (if (integer? x)
      (println "")
      (println "This list contains a non-numeric value"))))
=> (list-eval '(1(2 3("a" 5(3)))))
There is a cool function called tree-seq that does all the hard work for you in traversing the structure. Use it then remove any collections, remove all numbers, and check if there is anything left.
(defn any-non-numbers?
  [x]
  (->> x
       (tree-seq coll? #(if (map? %) (vals %) %))
       (remove (some-fn coll? number?))
       not-empty
       boolean))
Examples:
user=> (any-non-numbers? 1)
false
user=> (any-non-numbers? [1 2])
false
user=> (any-non-numbers? [1 2 "sd"])
true
user=> (any-non-numbers? [1 2 "sd" {:x 1}])
true
user=> (any-non-numbers? [1 2 {:x 1}])
false
user=> (any-non-numbers? [1 2 {:x 1 :y "hello"}])
true
If you want to consider map keys as well, just change (vals %) to (interleave (keys %) (vals %)).
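Spelled out, the key-aware variant described above could look like the following sketch. The function name any-non-numbers-incl-keys? is made up for illustration; everything else is the same pipeline with (vals %) swapped for the interleave form.

```clojure
(defn any-non-numbers-incl-keys?
  [x]
  (->> x
       ;; For maps, visit keys as well as values.
       (tree-seq coll? #(if (map? %) (interleave (keys %) (vals %)) %))
       (remove (some-fn coll? number?))
       not-empty
       boolean))
```

With this version [1 2 {:x 1}] is reported as containing a non-number (the keyword key :x), while [1 {2 3}] is not.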
quoting
As others have mentioned, you need to quote a list to keep it from being evaluated as
code. That's the cause of the exception you're seeing.
for and nesting
for will only descend to the nesting depth you tell it to. It is not a for loop,
as you might expect, but a sequence comprehension, like the Python list comprehension.
(for [x xs, y x] y) will presume that xs is a list of lists and flatten it.
(for [x xs, y x, z y] z) is the same but with an extra level of nesting.
To walk down to any depth, you'd usually use recursion.
(There are ways to do this iteratively, but they're more difficult to wrap your head around.)
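As a small illustration of the point about depth: a two-binding for flattens exactly one level, while a recursive function handles any depth. The names nested, one-level and flatten-all are only for this sketch (clojure.core/flatten does something similar).

```clojure
(def nested [[1 2] [3] []])

;; Each inner binding walks one more level down; this flattens one level.
(def one-level (for [x nested, y x] y))

;; Recursion descends as deep as the data goes.
(defn flatten-all [x]
  (if (sequential? x)
    (mapcat flatten-all x)
    [x]))

(def all-levels (flatten-all '(1 (2 3 ("a" 5 (3))))))
```

Here one-level is (1 2 3) and all-levels is (1 2 3 "a" 5 3).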
side effects
You're doing side effects (printing) inside a lazy sequence. This will work at the repl,
but if you're not using the result anywhere, it won't run, which causes great confusion.
It's something every new clojurian bumps into at some point.
(doseq is like for, but for side effects.)
The clojure way is to separate functions that work with values from functions that
"do stuff", like printing to the console or launching missiles, and to keep the
side effecting functions as simple as possible.
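The laziness pitfall can be observed without printing by counting calls in an atom. This is only a sketch; lazy-demo and doseq-demo are names invented for the illustration.

```clojure
;; Returns [calls-before-realizing calls-after-realizing].
(defn lazy-demo []
  (let [calls   (atom 0)
        pending (for [x [1 2 3]] (do (swap! calls inc) x))
        before  @calls          ; nothing has run yet: for is lazy
        _       (doall pending) ; force the whole sequence
        after   @calls]
    [before after]))

;; doseq runs its body eagerly, purely for side effects.
(defn doseq-demo []
  (let [calls (atom 0)]
    (doseq [x [1 2 3]] (swap! calls inc))
    @calls))
```

(lazy-demo) returns [0 3]: the body of for did not run until doall forced it. (doseq-demo) returns 3 straight away.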
putting it all together
Let's make a clear problem statement: Is there a non number anywhere inside an
arbitrarily nested list? If there is, print a message saying that to the console.
In a lot of cases where you'd use a for loop in other languages, reduce is what you want in Clojure.
(defn collect-nested-non-numbers
  ;; If called with one argument, call itself with empty accumulator
  ;; and that argument.
  ([form] (collect-nested-non-numbers [] form))
  ([acc x]
   (if (coll? x)
     ;; If x is a collection, use reduce to call itself on every element.
     (reduce collect-nested-non-numbers acc x)
     ;; Put x into the accumulator if it's a non-number
     (if (number? x)
       acc
       (conj acc x)))))
;; A function that ends in a question mark is (by convention) one that
;; returns a boolean.
(defn only-numbers? [form]
  (empty? (collect-nested-non-numbers form)))
;; Our function that does stuff becomes very simple.
;; Which is a good thing, cause it's difficult to test.
(defn warn-on-non-numbers [form]
  (when-not (only-numbers? form)
    (println "This list contains a non-numeric value")))
And that'll work. There already exists a bunch of things that'll help you walk a nested structure, though, so you don't need to do it manually.
There's the clojure.walk namespace that comes with clojure. It's for when you have
a nested thing and want to transform some parts of it. There's tree-seq which is explained
in another answer. Specter is a library which is
a very powerful mini language for expressing transformations of nested structures.
Then there's my utils library comfy which contains reduce versions of the
functions in clojure.walk, for when you've got a nested thing and want to "reduce" it to a single value.
The nice thing about that is that you can use reduced which is like the imperative break statement, but for reduce. If it finds a non-number it doesn't need to keep going through the whole thing.
(ns foo.core
  (:require
   [madstap.comfy :as comfy]))
(defn only-numbers? [form]
  (comfy/prewalk-reduce
   (fn [ret x]
     (if (or (coll? x) (number? x))
       ret
       (reduced false)))
   true
   form))
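The same early exit can be sketched with core functions only, by reducing over tree-seq instead of using comfy. This is an alternative illustration, not the library's implementation, and the name only-numbers-core? is hypothetical.

```clojure
(defn only-numbers-core? [form]
  (reduce (fn [ret x]
            (if (or (coll? x) (number? x))
              ret
              ;; Stop immediately at the first non-number.
              (reduced false)))
          true
          ;; Lazily visits form and everything nested inside it.
          (tree-seq coll? seq form)))
```

Because tree-seq is lazy and reduced stops the reduction, nothing after the first non-number is visited.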
Maybe by "any item in the list (not in function position)" you meant this?
(defn only-numbers-in-arg-position? [form]
  (comfy/prewalk-reduce
   (fn [ret x]
     (if (and (list? x) (not (every? (some-fn number? list?) (rest x))))
       (reduced false)
       ret))
   true
   form))

clojure: filtering a vector of maps by keys existence and values

I have a vector of maps like this one
(def map1
[{:name "name1"
:field "xxx"}
{:name "name2"
:requires {"element1" 1}}
{:name "name3"
:consumes {"element2" 1 "element3" 4}}])
I'm trying to define a function that takes in a map like {"element1" 1 "element3" 6} (i.e. with n fields, or {}) and filters the maps in map1, returning only the ones that either have no :requires and :consumes, or have a lower number associated with each key than the one associated with that key in the provided map (if the provided map doesn't have a matching key, the map is not returned),
but I'm failing to grasp how to approach the recursive loop over the maps and the filtering.
(defn getV [node nodes]
(defn filterType [type nodes]
(filter (fn [x] (if (contains? x type)
false ; filter for key values here
true)) nodes))
(filterType :requires (filterType :consumes nodes)))
There are two ways to look at problems like this: from the outside in or from the inside out. Naming things carefully can really help when working with nested structures. For example, calling a vector of maps map1 may be adding to the confusion.
Starting from the outside, you need a predicate function for filtering the list. This function will take a map as a parameter and will be used by a filter function.
(defn comparisons [m]
...)
(filter comparisons map1)
I'm not sure I understand the comparisons precisely, but there seems to be at least two flavors. The first is looking for maps that do not have :requires or :consumes keys.
(defn no-requires-or-consumes [m]
...)
(defn all-keys-higher-than-values [m]
...)
(defn comparisons [m]
(some #(% m) [no-requires-or-consumes all-keys-higher-than-values]))
Then it's a matter of defining the individual comparison functions
(defn no-requires-or-consumes [m]
(and (not (:requires m)) (not (:consumes m))))
The second is more complicated. It operates on one or two inner maps but the behaviour is the same in both cases so the real implementation can be pushed down another level.
(defn all-keys-higher-than-values [m]
(every? keys-higher-than-values [(:requires m) (:consumes m)]))
The crux of the comparison is looking at the number in the key part of the map vs the value. Pushing the details down a level gives:
(defn keys-higher-than-values [m]
(every? #(>= (number-from-key %) (get m %)) (keys m)))
Note: I chose >= here so that the second entry in the sample data will pass.
That leaves only pulling the number out of the key string. How to do that can be found in the question "In Clojure how can I convert a String to a number?"
(defn number-from-key [s]
(read-string (re-find #"\d+" s)))
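Since read-string invokes the full Clojure reader, a narrower alternative is to parse the digits with Java's Long/parseLong. The sketch below uses a hypothetical name, number-from-key-safe, and returns nil when the key contains no digits; callers would then need to decide how to treat nil before comparing with >=.

```clojure
(defn number-from-key-safe [s]
  ;; re-find returns nil when there is no run of digits;
  ;; some-> then skips the parse and returns nil.
  (some-> (re-find #"\d+" s) Long/parseLong))
```

For example, (number-from-key-safe "element12") is 12 and (number-from-key-safe "none") is nil.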
Stringing all these together and running against the example data returns the first and second entries.
Putting everything together:
(defn no-requires-or-consumes [m]
  (and (not (:requires m)) (not (:consumes m))))
(defn number-from-key [s]
  (read-string (re-find #"\d+" s)))
;; Defined before all-keys-higher-than-values, which refers to it.
(defn keys-higher-than-values [m]
  (every? #(>= (number-from-key %) (get m %)) (keys m)))
(defn all-keys-higher-than-values [m]
  (every? keys-higher-than-values [(:requires m) (:consumes m)]))
(defn comparisons [m]
  (some #(% m) [no-requires-or-consumes all-keys-higher-than-values]))
(filter comparisons map1)

In clojure, what is the exact behaviour of identical?

I am very surprised by the behaviour of identical? in clojure.
(def a (map identity [:a :b]))
(identical? (rest a) (rest a)); false
Any idea why identical? returns false?
identical?:
Tests if 2 arguments are the same object
Since rest creates a new seq object on each invocation, its results are not identical?. The following, however, is:
(def r (rest (map identity [:a :b])))
(identical? r r) ;; => true
Update: As @mfikes pointed out, rest does not always create a new seq. It calls ISeq.more() internally, which is implemented per seq type and might yield different results for lists, vectors, lazy seqs, etc.:
(->> [(map identity [:a :b])
(vector :a :b)
(list :a :b)]
(map #(identical? (rest %) (rest %))))
;; => (false false true)
identical? is the object equality predicate. It returns true if its arguments are the same object/primitive.
Use = over identical?.
identical? is the correct tool when semantics depend on pointer equality, such as testing for an end-of-file sentinel value.
Never use identical? to compare Clojure data structures. Even keywords don't guarantee identical? behaves correctly.
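A short sketch of both recommendations: value comparison with =, and identical? reserved for a private sentinel object. The names not-found and has-key? are hypothetical (clojure.core/contains? already covers this particular case); the point is only the sentinel pattern.

```clojure
(def a (map identity [:a :b]))

;; Value equality is what you normally want for data structures.
(def equal-by-value? (= (rest a) (rest a)))

;; A sentinel is a fresh object that cannot be equal to any real value,
;; so pointer equality is exactly the right test for it.
(def ^:private not-found (Object.))

(defn has-key? [m k]
  (not (identical? not-found (get m k not-found))))
```

equal-by-value? is true even though the two seqs are distinct objects, and has-key? can tell a key mapped to nil apart from a missing key.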

what advantage is there to use 'get' instead to access a map

Following up from this question: Idiomatic clojure map lookup by keyword
Map access using clojure can be done in many ways.
(def m {:a 1})
(get m :a) ;; => 1
(:a m) ;; => 1
(m :a) ;; => 1
I know I use mainly the second form, and sometimes the third, rarely the first. What are the advantages (speed/composability) of using each?
get is useful when the map could be nil or not-a-map, and the key could be something non-callable (i.e. not a keyword)
(def m nil)
(def k "some-key")
(m k) => NullPointerException
(k m) => ClassCastException java.lang.String cannot be cast to clojure.lang.IFn
(get m k) => nil
(get m :foo :default) => :default
From the clojure web page we see that
Maps implement IFn, for invoke() of one argument (a key) with an
optional second argument (a default value), i.e. maps are functions of
their keys. nil keys and values are ok.
Sometimes it is rewarding to take a look under the hood of Clojure. If you look up what invoke looks like in a map, you see this:
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/APersistentMap.java#L196
It apparently calls the valAt method of a map.
If you look at what the get function does when called with a map, this is a call to clojure.lang.RT.get, and this really boils down to the same call to valAt for a map (maps implement ILookup because they are Associative):
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/RT.java#L634
The same is true for a map called with a key and a not-found-value. So, what is the advantage? Since both ways boil down to pretty much the same, performance wise I would say nothing. It's just syntactic convenience.
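The equivalence of the three lookup forms, including the not-found argument, can be checked directly. This sketch uses an illustrative map named lookup-map; note that only get and the keyword-first form tolerate a nil map.

```clojure
(def lookup-map {:a 1})

;; All three forms return the value for a present key ...
(def present [(get lookup-map :a) (lookup-map :a) (:a lookup-map)])

;; ... and the supplied default for a missing one.
(def missing [(get lookup-map :b :default)
              (lookup-map :b :default)
              (:b lookup-map :default)])

;; get and keyword-first lookups also accept nil in place of the map.
(def on-nil [(get nil :a :default) (:a nil :default)])
```

present is [1 1 1], missing is [:default :default :default], and on-nil is [:default :default].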
You can pass get to partial etc. to build up HOFs for messing with your data, though it doesn't come up often.
user=> (def data {"a" 1 :b 2})
#'user/data
user=> (map (partial get data) (keys data))
(1 2)
I use the third form a lot when the data has strings as keys.
I don't think there is a speed difference, and even if that would be the case, that would be an implementation detail.
Personally I prefer the second option (:a m) because it sometimes makes code a bit easier on the eye. For example, I often have to iterate through a sequence of maps:
(def foo '({:a 1} {:a 2} {:a 3}))
If I want to extract all values of :a I can now use:
(map :a foo)
Instead of
(map #(get % :a) foo)
or
(map #(% :a) foo)
Of course this is a matter of personal taste.
To add to the list, get is also useful when using the threading macro -> and you need to access via a key that is not a keyword:
(let [m {"a" :a}]
(-> m
(get "a")))
One advantage of using the keyword first approach is it is the most concise way of accessing the value with a forgiving behavior in the case the map is nil.