I have a token scanner that simply returns nil for characters I'm not interested in. Rather than conj the nils to my token vector and then later stripping them all out, I want to simply not add them.
I'm using
;; don't conjoin if val is nil or false
(defn condj [v val]
(cond-> v, val (conj val)))
to do this. Is there a specific operator or a more concise implementation?
I believe you can use transducers for this. They are explained here. Our reducing function is conj and we construct a transducer (remove nil?) that turns this function into one that will ignore nil:
(def condj ((remove nil?) conj))
Note that remove is the opposite of filter. We can also implement condj using (filter some?), some? being a function that is true for any value except nil:
(def condj ((filter some?) conj))
It seems to work:
user=> (condj [3 4 5] 9)
[3 4 5 9]
user=> (condj [3 4 5] nil)
[3 4 5]
user=> (condj [3 4 5] false)
[3 4 5 false]
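If the scanner results arrive as a whole sequence rather than one at a time, the same idea can be handed to into directly. The following is only a sketch: scan-char and digit? are made-up stand-ins for the real scanner, and keep is the core function that applies a function and drops its nil results (false results would be kept).

```clojure
;; Hypothetical scanner: returns the character for digits, nil for anything else.
(def digit? (set "0123456789"))

(defn scan-char [c]
  (when (digit? c) c))

;; keep applies scan-char and drops the nil results in a single pass.
(def tokens (into [] (keep scan-char) "a1b2c3"))
;; tokens => [\1 \2 \3]

;; Equivalent, spelled out with the (remove nil?) transducer used above.
(def tokens-2 (into [] (comp (map scan-char) (remove nil?)) "a1b2c3"))
```

With this shape no condj helper is needed at all, because the nils never reach conj.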
I like the cond-> version and often use that to avoid repetition in the if version. Don't forget to be explicit about false values, though. I also like to use Plumatic Schema to be explicit about the data shape entering and leaving the function:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[schema.core :as s]))
(s/defn condj :- [s/Any]
"Conjoin an item onto a vector of results if the item is not nil."
[accum :- [s/Any]
item :- s/Any]
(cond-> accum
(not (nil? item)) (conj item)))
(dotest
(let [result (-> []
(condj :a)
(condj :2)
(condj false)
(condj "four")
(condj nil)
(condj "last"))]
; nil is filtered but false is retained
(is= result [:a :2 false "four" "last"])))
You may also be interested in another version using my favorite library:
(s/defn condj :- [s/Any]
"Conjoin an item onto a vector of results if the item is not nil."
[accum :- [s/Any]
item :- s/Any]
(cond-it-> accum
(not-nil? item) (append it item)))
For more complicated forms, using the placeholder symbol it makes it explicit where the value is being threaded. I also like the convenience functions not-nil? and append since they also make the intent of the code plainer.
I think the overly simplified approach is the cleanest here (it's also slightly more concise):
(defn condj [v val]
(if val (conj v val) v))
I find that this is much easier to understand quickly. The only downside is v is duplicated since it isn't being threaded, but that's not a big loss in such a simple function.
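One thing to keep in mind with a plain (if val ...) test is that it drops false as well as nil. A small sketch of the difference, where condj-truthy and condj-some are names made up for this comparison:

```clojure
;; Skips both nil and false, like the (if val ...) version.
(defn condj-truthy [v val]
  (if val (conj v val) v))

;; Skips only nil; false is treated as a real value.
(defn condj-some [v val]
  (if (some? val) (conj v val) v))

(condj-truthy [1] false) ;=> [1]
(condj-some   [1] false) ;=> [1 false]
(condj-some   [1] nil)   ;=> [1]
```

Which one is right depends on whether false can ever be a legitimate token.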
Consider using into instead of conj:
(into [1 2 3] nil) ;=> [1 2 3]
(into [1 2 3] [4]) ;=> [1 2 3 4]
Note: the downside is that you must wrap the result in a sequence, which adds a bit of overhead. On the other hand, if you forget to do this you typically get an error, it is easy to extend the logic when you want to append more than one item, and the code is easy to understand and requires no custom functions.
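A minimal sketch of how the into approach could be applied to a single, possibly-nil token. The name add-token is hypothetical and only used for illustration:

```clojure
;; Wrap the token in a vector only when it is not nil;
;; (when ...) yields nil otherwise, and (into v nil) returns v unchanged.
(defn add-token [tokens tok]
  (into tokens (when (some? tok) [tok])))

(add-token [1 2] 3)   ;=> [1 2 3]
(add-token [1 2] nil) ;=> [1 2]

;; Appending several items needs no extra code.
(into [1 2] [3 4])    ;=> [1 2 3 4]
```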
I want to do the following in Clojure as idiomatically as possible:
transduce a collection
associate each element of the input collection with the corresponding element in the output collection
return the result in a hashmap
Is there a succinct way to do this using core library functions?
If not, what improvements can you suggest to the following implementation?
(defn to-hash [coll xform]
(reduce
merge
(map
#(apply hash-map %)
(mapcat hash-map coll (into [] xform coll)))))
something like this should do the trick without intermediate collections:
(defn process [data xform]
(zipmap data (eduction xform data)))
user> (process [1 2 3] (comp (map inc) (map #(* % %))))
;;=> {1 4, 2 9, 3 16}
the docs on eduction say the following:
Returns a reducible/iterable application of the transducers
to the items in coll. Transducers are applied in order as if
combined with comp. Note that these applications will be
performed every time reduce/iterator is called.
so no additional collection is created.
This works well, of course, only as long as there is a one-to-one relationship between input and output elements. What is the desired output for (process [1 -2 3] (filter pos?)) or (process [1 1 1 2 2 2] (dedupe))?
(by the way, your to-hash implementation has the same flaw)
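To make the flaw concrete, here is what the zipmap version does with those two inputs. zipmap pairs keys and values positionally and stops when either side runs out, so the pairing silently goes wrong:

```clojure
(defn process [data xform]
  (zipmap data (eduction xform data)))

;; -2 is filtered out, so the values are (1 3) and 3 ends up under the key -2.
(process [1 -2 3] (filter pos?))
;;=> {1 1, -2 3}

;; The values are (1 2); both get assigned to the repeated key 1.
(process [1 1 1 2 2 2] (dedupe))
;;=> {1 2}
```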
A transducer is a function that takes a reducing function and returns a new reducing function. To make it work with transducers where there is not a one-to-one mapping from elements in the input collection to the output, you will have to use your transducer to create a new reducing function (step2 in the code below) that will associate elements into your hash map. Something like this.
(def ^:dynamic assoc-k nil)
(defn assoc-step [dst x]
(assoc dst assoc-k x))
(defn to-hash [coll xform]
(let [step (xform (completing assoc-step))
step2 (fn [dst x] (binding [assoc-k x] (step dst x)))]
(reduce step2 {} coll)))
This implementation is quite basic and I am not sure to what extent it will work with stateful transducers. But it will work with stateless ones, such as map and filter.
And we can test it with a transducer that keeps odd elements in the input collection and squares them:
(defn square [x] (* x x))
(to-hash (range 10) (comp (filter odd?) (map square)))
;; => {1 1, 3 9, 5 25, 7 49, 9 81}
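As a variation, the dynamic var could probably be replaced by a local volatile, which avoids a binding frame per element. This is only a sketch under the same assumptions as above (to-hash-v is a made-up name), and it carries the same caveat for transducers that buffer elements:

```clojure
(defn to-hash-v [coll xform]
  (let [k    (volatile! nil)   ; holds the input element currently in flight
        step (xform (completing (fn [dst x] (assoc dst @k x))))]
    (reduce (fn [dst x]
              (vreset! k x)    ; remember the key before the transducer runs
              (step dst x))
            {}
            coll)))

(to-hash-v (range 10) (comp (filter odd?) (map #(* % %))))
;; => {1 1, 3 9, 5 25, 7 49, 9 81}
```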
I want to know if this is the right way to loop through a collection:
(def citrus-list ["lemon" "orange" "grapefruit"])
(defn display-citrus [citruses]
(loop [[citrus & citruses] citruses]
(println citrus)
(if citrus (recur citruses))
))
(display-citrus citrus-list)
I have three questions:
the final print displays nil, is it ok or how can I avoid it?
I understand what & is doing in this example but I don't see it in other cases, maybe you could provide a few examples
Any other example to get the same result?
Thanks,
R.
First of all your implementation is wrong. It would fail if your list contains nil:
user> (display-citrus [nil "asd" "fgh"])
;;=> nil
nil
And it prints an unneeded nil if the list is empty:
user> (display-citrus [])
;;=> nil
nil
you can fix it this way:
(defn display-citrus [citruses]
(when (seq citruses)
(loop [[citrus & citruses] citruses]
(println citrus)
(if (seq citruses) (recur citruses)))))
1) It is totally OK: for a non-empty collection the last call inside the function is println, which returns nil, and for an empty collection you don't call anything, so nil is returned (a Clojure function always returns a value). To avoid nil in your case you should explicitly return some value, like this for example:
(defn display-citrus [citruses]
(when (seq citruses)
(loop [[citrus & citruses] citruses]
(println citrus)
(if (seq citruses) (recur citruses))))
citruses)
user> (display-citrus citrus-list)
;;=> lemon
;;=> orange
;;=> grapefruit
["lemon" "orange" "grapefruit"]
2) some articles about destructuring should help you
3) yes, there are some ways to do this. The simplest would be:
(run! println citrus-list)
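Regarding point 2, a few small examples of & in destructuring may help. They are illustrative only; join-names is a made-up function:

```clojure
;; & in a vector binding collects everything after the named positions as a seq.
(let [[head & tail] ["lemon" "orange" "grapefruit"]]
  [head tail])
;;=> ["lemon" ("orange" "grapefruit")]

;; With an empty collection both are nil, which is why testing head alone
;; cannot tell "end of input" apart from "a nil element".
(let [[head & tail] []]
  [head tail])
;;=> [nil nil]

;; The same & appears in parameter lists for variadic functions.
(defn join-names [sep first-name & more-names]
  (apply str first-name (map #(str sep %) more-names)))

(join-names ", " "lemon" "orange" "grapefruit")
;;=> "lemon, orange, grapefruit"
```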
Answering your last question: you should generally avoid loop in Clojure. It is a low-level form, better suited to experienced users who really know what they are doing. In your case you can use a more user-friendly form such as doseq. For example:
(doseq [item collection]
(println item))
You may also use map, but keep in mind that it returns a new sequence (of nils in your case), which is not always desirable, for example when you are only interested in printing and not in the result.
In addition, map is lazy and won't be evaluated until it has been printed or evaluated with doall.
For most purposes, you can use either map, for or loop.
=> (map count citrus-list)
(5 6 10)
=> (for [c citrus-list] (count c))
(5 6 10)
=> (loop [[c & citrus] citrus-list
counts []]
(if-not c counts
(recur citrus (conj counts (count c)))))
[5 6 10]
I tend to use map as much as possible. The syntax is more concise, and it clearly separates the control flow (sequential loop) from the transformation logic (count the values).
For instance, you can run the same operation (count) in parallel by simply replacing map with pmap:
=> (pmap count citrus-list)
(5 6 10)
In Clojure, most sequence operations are lazy. They will not take effect until your program actually needs the new values. To apply the effect immediately, you can wrap the operation in doall:
=> (doall (map count citrus-list))
(5 6 10)
You can also use doseq if you don't care about return values. For instance, you can use doseq with println since the function will always return nil
=> (doseq [c citrus-list] (println c))
lemon
orange
grapefruit
For the most part I understand what Clojure is telling me with its error messages. But I am still clueless as to how to find out where the error happened.
Here is an example of what I mean
(defn extract [m]
(keys m))
(defn multiple [xs]
(map #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiple) ; seq -> seq
(extract))) ; map -> seq ... fails
(process [1 2 3])
Statically typed languages would now tell me that I tried to pass a sequence to a function that expects a map on line X. And Clojure does this in a way:
ClassCastException java.lang.Long cannot be cast to java.util.Map$Entry
But I still have no idea where the error happened. Obviously for this instance it's easy because there are just 3 functions involved, you can easily just read through all of them but as programs grow bigger this gets old very quickly.
Is there a way to find out where the errors happened other than just proofreading the code from top to bottom? (which is my current approach)
You can use clojure.spec. It is still in alpha, and there's still a bunch of tooling support coming (hopefully), but instrumenting functions works well.
(ns foo.core
(:require
;; For clojure 1.9.0-alpha16 and higher, it is called spec.alpha
[clojure.spec.alpha :as s]
[clojure.spec.test.alpha :as stest]))
;; Extract takes a map and returns a seq
(s/fdef extract
:args (s/cat :m map?)
:ret seq?)
(defn extract [m]
(keys m))
;; multiple takes a coll of numbers and returns a coll of numbers
(s/fdef multiple
:args (s/cat :xs (s/coll-of number?))
:ret (s/coll-of number?))
(defn multiple [xs]
(map #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiple) ; seq -> seq
(extract))) ; map -> seq ... fails
;; This needs to come after the definition of the specs,
;; but before the call to process.
;; This is something I imagine can be handled automatically
;; by tooling at some point.
(stest/instrument)
;; The println is to force evaluation.
;; If not it wouldn't run because it's lazy and
;; not used for anything.
(println (process [1 2 3]))
Running this file prints (among other info):
Call to #'foo.core/extract did not conform to spec: In: [0] val: (2
4 6) fails at: [:args :m] predicate: map? :clojure.spec.alpha/spec
#object[clojure.spec.alpha$regex_spec_impl$reify__1200 0x2b935f0d
"clojure.spec.alpha$regex_spec_impl$reify__1200#2b935f0d"]
:clojure.spec.alpha/value ((2 4 6)) :clojure.spec.alpha/args ((2 4
6)) :clojure.spec.alpha/failure :instrument
:clojure.spec.test.alpha/caller {:file "core.clj", :line 29,
:var-scope foo.core/process}
Which can be read as: a call to extract failed because the value passed in, (2 4 6), failed the predicate map?. That call happened in the file "core.clj" at line 29.
A caveat that trips people up is that instrument only checks function arguments and not return values. This is a (strange if you ask me) design decision from Rich Hickey. There's a library for that, though.
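If you want a return value checked without an extra library, one option is to validate it by hand with s/valid?. The sketch below assumes Clojure 1.9 or later (where clojure.spec.alpha is available); check-ret is a hypothetical helper, not part of spec:

```clojure
(require '[clojure.spec.alpha :as s])

(defn multiple [xs]
  (map #(* 2 %) xs))

;; Hypothetical helper: returns v if it conforms to spec, otherwise throws
;; with spec's own explanation attached.
(defn check-ret [spec v]
  (if (s/valid? spec v)
    v
    (throw (ex-info "Return value does not conform to spec"
                    {:explain (s/explain-str spec v)}))))

(check-ret (s/coll-of number?) (multiple [1 2 3]))
;;=> (2 4 6)

;; (check-ret map? (multiple [1 2 3])) throws an ExceptionInfo instead.
```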
If you have a REPL session you can print a stack trace:
(clojure.stacktrace/print-stack-trace *e 30)
See http://puredanger.github.io/tech.puredanger.com/2010/02/17/clojure-stack-trace-repl/ for various different ways of printing the stack trace. You will need to have a dependency such as this in your project.clj:
[org.clojure/tools.namespace "0.2.11"]
I didn't get a stack trace using the above method, however just typing *e at the REPL will give you all the available information about the error, which to be honest didn't seem very helpful.
For the rare cases where the stack trace is not helpful I usually debug using a call to a function that returns the single argument it is given, yet has the side effect of printing that argument. I happen to call this function probe. In your case it can be put at multiple places in the threading macro.
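A minimal sketch of such a probe function, as I would write it for this example (the exact output format is just my choice):

```clojure
(defn probe
  "Prints x in readable form and returns it unchanged."
  [x]
  (println "probe:" (pr-str x))
  x)

(defn multiple [xs]
  (map #(* 2 %) xs))

(def result
  (-> [1 2 3]
      (probe)      ; prints probe: [1 2 3]
      (multiple)
      (probe)      ; prints probe: (2 4 6)
      (vec)))
;; result => [2 4 6]
```

Because probe returns its argument, it can be dropped into any step of a threading form and removed again without changing the result. Note that pr-str realizes a lazy sequence in order to print it.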
Re-typing your example I have:
(defn extract [m]
(keys m))
(defn multiply [xs]
(mapv #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiply) ; seq -> seq
(extract))) ; map -> seq ... fails ***line 21***
(println (process [1 2 3]))
;=> java.lang.ClassCastException: java.lang.Long cannot be cast
to java.util.Map$Entry, compiling:(tst/clj/core.clj:21:21)
So we get a good clue in the exception, where it says the file and line/col number tst/clj/core.clj:21:21, that the extract function is the problem.
Another indispensable tool I use is Plumatic Schema to inject "gradual" type checking into Clojure. The code becomes:
(ns tst.clj.core
(:use clj.core tupelo.test)
(:require
[tupelo.core :as t]
[tupelo.schema :as tsk]
[schema.core :as s]))
(t/refer-tupelo)
(t/print-versions)
(s/defn extract :- [s/Any]
[m :- tsk/Map]
(keys m))
(s/defn multiply :- [s/Num]
[xs :- [s/Num]]
(mapv #(* 2 %) xs))
(s/defn process :- s/Any
[xs :- [s/Num]]
(-> xs
(multiply) ; seq -> seq
(extract))) ; map -> seq ... fails
(println (process [1 2 3]))
clojure.lang.ExceptionInfo: Input to extract does not match schema:
[(named (not (map? [2 4 6])) m)] {:type :schema.core/error, :schema [#schema.core.One{:schema {Any Any},
:optional? false, :name m}],
:value [[2 4 6]], :error [(named (not (map? [2 4 6])) m)]},
compiling:(tst/clj/core.clj:23:17)
So, while the format of the error message is a bit lengthy, it tells us right away that we passed a parameter of the wrong type and/or shape into the function extract.
Note that you need a line like this:
(s/set-fn-validation! true) ; enforce fn schemas
I create a special file test/tst/clj/_bootstrap.clj so it is always in the same place.
For more information on Plumatic Schema please see:
https://github.com/plumatic/schema
https://youtu.be/o_jtwIs2Ot8
https://github.com/plumatic/schema/wiki/Basics-Examples
https://github.com/plumatic/schema/wiki/Defining-New-Schema-Types-1.0
I am constructing a list of hash maps which is then passed to another function. When I try to print each hash map from the list using map, it does not work. I am able to print the full list, get the first element, etc.
(defn m [a]
(println a)
(map #(println %) a))
The following works from the repl only.
(m (map #(hash-map :a %) [1 2 3]))
But from the program that I load using load-file it is not working. I am seeing the a but not its individual elements. What's wrong?
In Clojure, sequence transformation functions such as map return a lazy sequence. So (map #(println %) a) returns a lazy sequence. Only when it is consumed is the function applied, and only then does the printing side effect become visible.
If the purpose of the function is to have a side effect, like printing, you need to evaluate the transformation eagerly. The functions dorun and doall do this:
(def a [1 2 3])
(dorun (map #(println %) a))
; returns nil
(doall (map #(println %) a))
; returns the realized sequence of println results, (nil nil nil)
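One way to see the laziness for yourself is to count how often the mapped function actually runs. This is just an illustrative sketch; calls, noisy-inc and pending are made-up names:

```clojure
(def calls (atom 0))

(defn noisy-inc [x]
  (swap! calls inc)   ; side effect: count each call
  (inc x))

;; Building the lazy sequence does not call noisy-inc yet.
(def pending (map noisy-inc (list 1 2 3)))
(def calls-before @calls)   ; 0

;; dorun walks the sequence, forcing every element.
(dorun pending)
(def calls-after @calls)    ; 3
```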
If you actually don't want to map, but only have a side effect, you can use doseq. It is intended to 'iterate' to do side effects:
(def a [1 2 3])
(doseq [i a]
(println i))
If your goal is simply to call an existing function on every item in a collection in order, ignoring the returned values, then you should use run!:
(run! println [1 2 3])
;; 1
;; 2
;; 3
;;=> nil
In some more complicated cases it may be preferable to use doseq as @Gamlor suggests, but in this case, doseq only adds boilerplate.
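I recommend using tail recursion with recur, so that the stack does not grow on long lists:
(defn printList [a]
  (when (seq a)             ; stop on an empty collection, not on a nil element
    (println (first a))
    (recur (rest a))))      ; recur is in tail position, so no stack is consumed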
What's the difference between doseq and for in Clojure? What are some examples of when you would choose to use one over the other?
The difference is that for builds a lazy sequence and returns it while doseq is for executing side-effects and returns nil.
user=> (for [x [1 2 3]] (+ x 5))
(6 7 8)
user=> (doseq [x [1 2 3]] (+ x 5))
nil
user=> (doseq [x [1 2 3]] (println x))
1
2
3
nil
If you want to build a new sequence based on other sequences, use for. If you want to do side-effects (printing, writing to a database, launching a nuclear warhead, etc) based on elements from some sequences, use doseq.
Note also that doseq is eager while for is lazy. The example missing in Rayne's answer is
(for [x [1 2 3]] (println x))
At the REPL, this will generally do what you want, but that's basically a coincidence: the REPL forces the lazy sequence produced by for, causing the printlns to happen. In a non-interactive environment, nothing will ever be printed. You can see this in action by comparing the results of
user> (def lazy (for [x [1 2 3]] (println 'lazy x)))
#'user/lazy
user> (def eager (doseq [x [1 2 3]] (println 'eager x)))
eager 1
eager 2
eager 3
#'user/eager
Because the def form returns the new var created, and not the value which is bound to it, there's nothing for the REPL to print, and lazy will refer to an unrealized lazy-seq: none of its elements have been computed at all. eager will refer to nil, and all of its printing will have been done.
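You can also ask a lazy sequence directly whether it has been computed, using realized?. A small sketch, with a side-effect-free body so it is easy to check (lazy-doubles is a made-up name):

```clojure
(def lazy-doubles (for [x [1 2 3]] (* x 2)))

(realized? lazy-doubles) ;=> false, nothing has been computed yet

(doall lazy-doubles)     ; force the whole sequence

(realized? lazy-doubles) ;=> true
```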