Simplest way to ensure var is vector - clojure

What is the "simplest"/shortest way to ensure a var is a vector? Self-written it could look like
(defn ensure-vector [x]
  (if (vector? x)
    x
    (vector x)))
(ensure-vector {:foo "bar"})
;=> [{:foo "bar"}]
But I wonder if there is already a core function that does this? Many of them (seq, vec, vector, list) either fail on maps or always apply.
I also wonder what would be the best name for this function. box, singleton, unit, v, cast-vector, to-vector, ->vector, !vector, vector!, vec!?
I further wonder if other languages, like Haskell, have this function built-in.

I think the function you want to use when the value is a collection is vec, which turns any collection into a vector. The vector function receives the items of the resulting vector as its arguments, so you could use it when the value is neither a vector nor a collection.
This is a possible approach:
(defn as-vector [x]
  (cond
    (vector? x) x
    (sequential? x) (vec x)
    :else (vector x)))
(map as-vector [[1] #{2 3} 1 {:a 1}])
I chose the name for the function based on the ones from the Coercions protocol in clojure.java.io (as-file and as-url).
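The difference between vec and vector on a map is what makes the type check necessary. A minimal sketch using only core functions (the sample values are illustrative):

```clojure
;; vec treats its argument as a collection and copies its elements
(vec '(1 2 3))        ; => [1 2 3]
(vec {:foo "bar"})    ; => [[:foo "bar"]]  (a map is a collection of entries)

;; vector wraps each argument as a single element
(vector {:foo "bar"}) ; => [{:foo "bar"}]
(vector [1 2])        ; => [[1 2]]  (double-wraps an existing vector)
```

So vec alone splits a map into its entries, and vector alone wraps values that are already vectors, which is why neither can be used unconditionally.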

Related

Clojure flexible function design based on arguments?

I have functions that behave differently depending on which keyword arguments have values supplied. For this question, I am wondering about functions that behave slightly differently depending on the type of argument supplied.
Example function, that increments each element of a list:
(defn inc-list [& {:keys [list-str list]}]
(let [prepared-list (if-not (nil? list) list (clojure.string/split list-str #","))]
(map inc prepared-list)))
Does it make sense to make a multimethod that instead tests for the type of argument? I have not used multimethods before and am not sure when it is the right time to use them. If it is a good idea, would the below example make sense?
Example:
(defn inc-coll [col] (map inc col))
(defmulti inc-list class)
(defmethod inc-list ::collection [col] (inc-col col))
(defmethod inc-list String [list-str]
  (inc-col
    (map #(Integer/parseInt %)
         (clojure.string/split list-str #","))))
First things first: (map 'inc x) treats each item in x as an associative collection, and looks up the value indexed by the key 'inc.
user> (map 'inc '[{inc 0} {inc 1} {inc 2}])
(0 1 2)
You probably want inc instead:
user> (map inc [0 1 2])
(1 2 3)
Next, we have an attempt to inc a string, the args to string/split out of order, and some spelling errors.
If you define your multi to dispatch on class, then the methods should be parameterized by the Class, not a keyword placeholder. I changed the multi so it would work on anything Clojure knows how to treat as a seq. Also, as a bit of bikeshedding, it is better to use type, which offers some distinctions for differentiating inputs in Clojure code that class does not offer:
user> (type (with-meta {:a 0 :b 1} {:type "foo"}))
"foo"
Putting it all together:
user> (defn inc-coll [col] (map inc col))
#'user/inc-coll
user> (defmulti inc-list type)
nil
user> (defmethod inc-list String [list-str]
        (inc-coll (map #(Integer/parseInt %) (clojure.string/split list-str #","))))
#<MultiFn clojure.lang.MultiFn@6507d1de>
user> (inc-list "1,10,11")
(2 11 12)
user> (defmethod inc-list clojure.lang.Seqable [col] (inc-coll (seq col)))
#<MultiFn clojure.lang.MultiFn@6507d1de>
user> (inc-list [1 2 3])
(2 3 4)
Your first example is an obfuscated application of a technique called dispatching on type. It is obfuscated because in a message-passing style the caller must convey the type to your function.
Since in every case you only use one of the keyword args, you could as well define it as:
;; assumes (require '[clojure.edn :as edn] '[clojure.string :as str])
(defn inc-list
  [m l]
  (->> (case m ;; message dispatch
         :list l
         :list-str (map #(edn/read-string %) (str/split l #",")))
       (map inc)))
The caller could be relieved from having to pass m:
(defn inc-list
  [l]
  (->> (cond (string? l) (map ...)
             :else l)
       (map inc)))
This technique has the main disadvantage that the operation procedure code must be modified when a new type is introduced to the codebase.
In Clojure it is generally superseded by the polymorphism construct protocols, e.g.:
(defprotocol IncableList
  (inc-list [this]))
It can be implemented on any type, e.g.:
(extend-type clojure.lang.Seqable
  IncableList
  (inc-list [this] (map inc this)))
(extend-type String
  IncableList
  (inc-list [this] (map #(inc ...) this)))
Multimethods allow the same and provide additional flexibility over message-passing and dispatching on type by decoupling the dispatch mechanism from the operation procedures and providing the additivity of data-directed programming. They perform slower than protocols, though.
In your example the intention is to dispatch based on type, so you don't need multimethods and protocols are the appropriate technique.
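For completeness, here is one way the protocol approach might look once the elided bodies are filled in. This is only a sketch: the comma-separated format and the use of Long/parseLong are assumptions carried over from the question, not something the answer specifies.

```clojure
(require '[clojure.string :as str])

(defprotocol IncableList
  (inc-list [this]))

(extend-protocol IncableList
  ;; assumption: strings hold comma-separated integers, as in the question
  String
  (inc-list [this]
    (map #(inc (Long/parseLong %)) (str/split this #",")))
  ;; collections implementing clojure.lang.Seqable (vectors, lists, ...)
  clojure.lang.Seqable
  (inc-list [this] (map inc this)))

(inc-list "1,10,11") ; => (2 11 12)
(inc-list [1 2 3])   ; => (2 3 4)
```

Supporting a new input type then means adding another extend-protocol clause, without touching the existing implementations.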

In clojure, how to map a sequence and create a hash-map

In clojure, I would like to apply a function to all the elements of a sequence and return a map with the results where the keys are the elements of the sequence and the values are the elements of the mapped sequence.
I have written the following function, but I am wondering why such a function is not part of Clojure. Maybe it's not idiomatic?
(defn map-to-object [f lst]
  (zipmap lst (map f lst)))
(map-to-object #(+ 2 %) [1 2 3]) ;=> {1 3, 2 4, 3 5}
Your function is perfectly idiomatic.
For a fn to be part of core, I think it has to be useful to most people. What is part of the core language and what is not is quite debatable. Just think about the amount of StringUtils classes that you can find in Java.
My comments were going to get too long winded, so...
1. Nothing wrong with your code whatsoever.
2. You might also see (into {} (map (juxt identity f) coll))
3. One common reason for doing this is to cache the results of a function over some inputs.
4. There are other use-cases for what you have done, e.g. when a hash-map is specifically needed.
5. If and only if #3 happens to be your use case, then memoize does this for you.
If the function is f, and the resultant map is m then (f x) and (m x) have the same value in the domain. However, the values of (m x) have been precalculated, in other words, memoized.
Indeed memoize does exactly the same thing behind the scenes, it just doesn't give direct access to the map. Here's a tiny modification to the source of memoize to see this.
(defn my-memoize
  "Exactly the same as memoize but the cache memory atom must
  be supplied as an argument."
  [f mem]
  (fn [& args]
    (if-let [e (find @mem args)]
      (val e)
      (let [ret (apply f args)]
        (swap! mem assoc args ret)
        ret))))
Now, to demonstrate
(defn my-map-to-coll [f coll]
  (let [m (atom {})
        g (my-memoize f m)]
    (doseq [x coll] (g x))
    @m))
And, as in your example
(my-map-to-coll #(+ 2 %) [1 2 3])
;=> {(3) 5, (2) 4, (1) 3}
But note that the argument(s) are enclosed in a sequence as memoize handles multiple arity functions as well.
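The caching behaviour of the built-in memoize can be observed with a call counter. A small sketch, where calls, add-2 and cached-add-2 are hypothetical names chosen for illustration:

```clojure
(def calls (atom 0))

(defn add-2 [x]
  (swap! calls inc) ; count real invocations of the underlying function
  (+ 2 x))

(def cached-add-2 (memoize add-2))

(cached-add-2 1) ; => 3, computed
(cached-add-2 1) ; => 3, served from the cache
@calls           ; => 1
```

Unlike the zipmap version, the cache here fills lazily, one entry per distinct argument list actually requested.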

Why doesn't (into {} x) accept a sequence of sequences for x

=> (into {} (for [x [["1" "2"] ["3" "4"]]] (map #(Long/parseLong %) x)))
ClassCastException java.lang.Long cannot be cast to java.util.Map$Entry clojure.lang.ATransientMap.conj (ATransientMap.java:44)
=> (into {} (for [x [["1" "2"] ["3" "4"]]] (seq (map #(Long/parseLong %) x))))
ClassCastException java.lang.Long cannot be cast to java.util.Map$Entry clojure.lang.ATransientMap.conj (ATransientMap.java:44)
=> (into {} (for [x [["1" "2"] ["3" "4"]]] (vec (map #(Long/parseLong %) x))))
{1 2, 3 4}
I've got two related questions:
How come (into {}) insists on vector as the container of a (key,value) pair?
Why is it trying to use the Long, which is the constituent of the pair, as the pair itself? Shouldn't it at least complain about seeing a non-vector, regardless of what it contains?
BTW testing with Clojure 1.5.1.
To answer the second question: you're passing a sequence of sequences to into. into works by repeated conjing (or conj!ing, if possible; the result is equivalent). Here at each step you'll be taking one item from your sequence of sequences and conjing it on to the map. Each such item is a sequence of two Longs. When you conj a sequence on to a map, conj assumes that it is a sequence of map entries and casts each element to Map$Entry.
So, here it'll be attempting to cast your Longs to Map$Entry.
into is implemented on top of conj.
(into {} ...) ;; equivalent to (below)
(-> {}
    (conj (map #(Long/parseLong %) ["1" "2"])) ;; produces the same exception
    (conj (map #(Long/parseLong %) ["3" "4"])))
conj expects either a map, a map entry or a two-element vector. Consider this excerpt from http://clojure.org/data_structures#toc17:
conj expects another (possibly single entry) map as the item, and returns a new map which is the old map plus the entries from the new, which may overwrite entries of the old. conj also accepts a MapEntry or a vector of two items (key and value).
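The accepted item shapes described above can be checked directly. A short sketch with illustrative values:

```clojure
(conj {} [1 2])          ; => {1 2}       two-element vector
(conj {} (first {1 2}))  ; => {1 2}       map entry
(conj {} {1 2, 3 4})     ; => {1 2, 3 4}  another map

;; converting each pair to a vector first makes the original call work
(into {} (map vec '((1 2) (3 4)))) ; => {1 2, 3 4}
```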

Multimethods in Clojure, polymorphism, big switches

I have read the pattern (defmulti multi (fn [t] (cond (seq? t) :seq (map? t) :map (vector? t) :vec ... in lots of Clojure code here and there, which is basically a switch (if I add a type, I have to add a new clause) but more verbose. Is there not a way to say (defmethod seq ..., (defmethod vec ..., (defmethod map ..., etc.? It must be a very common thing to do. I'm aware that it's possible to manually define hierarchies, but what about common Clojure types like sequence, vector, map? Would they have to be defined for each program that dispatches on type? Please show me how I'm missing the point!
Edit: OK, I thought I could say (defmulti mymulti type) and then (defmethod mymulti clojure.lang.PersistentSomething ..., etc., but that's clumsy as it refers to Java classes, whereas I want to refer to some quality of the 'type', like whether it's sequential or associative.
Dispatching on type works well for this:
user> (import '[clojure.lang Associative Sequential])
user> (defmulti foo type)
#'user/foo
user> (defmethod foo Associative [x] :map)
#<MultiFn clojure.lang.MultiFn@7e69a380>
user> (foo {:x 1})
:map
user> (foo ())
; fails, a list is not associative
user> (defmethod foo Sequential [x] :seq)
#<MultiFn clojure.lang.MultiFn@7e69a380>
user> (foo ())
:seq
user> (foo [])
; fails, a vector is both sequential and associative
user> (prefer-method foo Sequential Associative)
#<MultiFn clojure.lang.MultiFn@7e69a380>
user> (foo [])
:seq
Note that both Sequential and Associative are interfaces and not concrete classes.
The dispatch function can be either type or class.
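A variation that sidesteps the ambiguity for vectors is to dispatch on interfaces that do not overlap and add a :default method. This is a sketch rather than the answer's approach: kind is a hypothetical name, and choosing clojure.lang.IPersistentMap in place of Associative is an assumption made so that vectors match only Sequential.

```clojure
(defmulti kind type)

(defmethod kind clojure.lang.Sequential [_] :seq)
(defmethod kind clojure.lang.IPersistentMap [_] :map)
(defmethod kind :default [_] :other) ; fallback when no interface matches

(kind [1 2])   ; => :seq
(kind '(1 2))  ; => :seq
(kind {:a 1})  ; => :map
(kind 42)      ; => :other
```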

Execute function until certain condition holds

I want to repeatedly apply some function to some state until a condition holds true.
Function f takes a state, modifies it and returns it. Apply f again to the returned state and so on.
I think this would work.
(first (filter pred (iterate f x)))
But it's a bit ugly. Plus memory consumption is not ideal since iterator would be forced to evaluate and keep intermediate states until the state on which pred holds true is returned, at which point intermediate states should be garbage collected.
I know you can write a simple recursive function:
(loop [f x p] (if (p x) x (recur f (f x) p))
But I'm looking for a core library function (or some combination of functions) that does the same thing with the same memory efficiency.
What you really want is take-while:
take-while
function
Usage: (take-while pred coll)
Returns a lazy sequence of successive items from coll while
(pred item) returns true. pred must be free of side-effects.
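Note that take-while yields the states seen while the test still holds, not the first state that ends the iteration. A small sketch with an illustrative counter:

```clojure
(take-while #(<= % 10) (iterate inc 0))
;; => (0 1 2 3 4 5 6 7 8 9 10)

;; the first state that fails the test is not part of the result
(last (take-while #(<= % 10) (iterate inc 0))) ; => 10
```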
EDIT
A way to use higher order functions to achieve the result you want might be to wrap your function into something to be consumed by trampoline, namely a function that will either return the final result or another function which will execute the next step. Here's the code:
(defn iterable [f]              ; wraps your function
  (fn step [pred x]             ; returns a new function which will accept the predicate
    (let [y (f x)]              ; calculate the current step result
      (if (pred y)              ; keep going while pred holds
        (fn [] (step pred y))   ; then: return a new fn for trampoline, operates on y
        y))))                   ; else: return a value to exit the trampoline
The iterative execution would go as follows:
(trampoline (iterable dec) pos? 10) ; => 0
Not sure what you mean by iterator - you're using it as if it were iterate, and I just want to be sure that's what you mean. At any rate, your solution looks fine to me and not at all ugly. And memory is not an issue either: iterate is free to throw away intermediate results whenever it's convenient because you aren't keeping any references to them, just calling filter on it in a "streaming" way.
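Wrapped in a function, the iterate-based version from the question might look like the sketch below. iterate-until is a hypothetical name, not a core function:

```clojure
(defn iterate-until
  "Hypothetical helper: returns the first of x, (f x), (f (f x)), ...
  for which pred is truthy."
  [pred f x]
  (first (filter pred (iterate f x))))

(iterate-until #(> % 10) inc 0) ; => 11
```

Because nothing holds on to the head of the lazy sequence, earlier states become eligible for garbage collection as the search proceeds.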
I think you should just make your loop a simple recursive function:
(defn do-until [f x p]
(if (p x) x (recur f (f x) p)))
(do-until inc 0 #(> % 10)) ; => 11
How about drop-while?
(first (drop-while (comp not pred) (iterate f x)))
I don't think there is a core function that does this exactly and efficiently. Hence I would do this with loop/recur as follows:
(loop [x initial-value]
  (if (pred x) x (recur (f x))))
Loop/recur is very efficient since it requires no additional storage and is implemented as a simple loop in the JVM.
If you're going to do this a lot, then you can always encapsulate the pattern in a macro.
Sounds like you want the while macro.
http://richhickey.github.com/clojure/clojure.core-api.html#clojure.core/while
Usage: (while test & body)
Repeatedly executes body while test expression is true. Presumes
some side-effect will cause test to become false/nil. Returns nil
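Since while relies on side effects and returns nil, the state has to live in something mutable. A minimal sketch using an atom (state is a hypothetical name):

```clojure
(def state (atom 0))

;; keep applying inc until the test expression becomes false
(while (<= @state 10)
  (swap! state inc))

@state ; => 11
```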
In a slightly different use case the for macro supports :when and :while options too.
http://richhickey.github.com/clojure/clojure.core-api.html#clojure.core/for
Usage: (for seq-exprs body-expr)
List comprehension. Takes a vector of one or more
binding-form/collection-expr pairs, each followed by zero or more
modifiers, and yields a lazy sequence of evaluations of expr.
Collections are iterated in a nested fashion, rightmost fastest,
and nested coll-exprs can refer to bindings created in prior
binding-forms. Supported modifiers are: :let [binding-form expr ...],
:while test, :when test.
(take 100 (for [x (range 100000000) y (range 1000000) :while (< y x)] [x y]))
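The two modifiers behave differently: :while stops iterating the binding it follows at the first failing item, whereas :when merely skips failing items. A short sketch with illustrative ranges:

```clojure
(for [x (range 6) :while (< x 3)] x) ; => (0 1 2)
(for [x (range 6) :when (odd? x)] x) ; => (1 3 5)
```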