I am new to Clojure and I am currently stuck with below code which throws a NullPointerException when I run it like this:
(mapset inc [1 1 2 2])
(defn mapset
[fn v]
(loop [[head & tail] v result-set (hash-set)]
(if (nil? head)
result-set)
(conj result-set (fn head))
(recur tail result-set)))
When I print the result-set in the if block it prints an empty set whereas I expect a set like this: #{2 3}
After trying to interpret the stacktrace I guess that the NullPointerException has something to do with the following line:
(conj result-set (fn head)).
The stacktrace and the fact that the resulting set is empty leads me to believe that the inc operation somehow gets called with nil as input.
I am glad about any explanation of why this error happens
The mentioned (shortend) stacktrace looks like this:
java.lang.NullPointerException
Numbers.java: 1013 clojure.lang.Numbers/ops
Numbers.java: 112 clojure.lang.Numbers/inc
core.clj: 908 clojure.core/inc
core.clj: 903 clojure.core/inc
REPL: 6 user/mapset
REPL: 1 user/mapset
I made some small changes:
(defn mapset [f v]
(loop [[head & tail] v
result-set (hash-set)]
(if (nil? head)
result-set
(let [res (f head)]
(recur tail (conj result-set res))))))
(defn x-1 []
(mapset inc [1 1 2 2]))
(x-1)
;;=> #{3 2}
So now mapset will call the function f on each of your inputs from v, then put the result of that call into the hash-set that was created at the outset.
The problem was in the control flow logic of your if statement. The flow of execution was continuing on after the if. Thus the function fn (which I renamed to f, rather than leave it as the name of a macro) was being called even when head was nil, which was not what you intended.
As originally coded the if (which would more clearly have been a when) was doing nothing useful. But once I realised you meant to have an if, but closed the paren off too early, then the answer slotted into place. Hence minor changes fixed the problem - your underlying logic was sound - and the amended function just worked.
Related
I'm learning core.async and have written a simple producer consumer code:
(ns webcrawler.parallel
(:require [clojure.core.async :as async
:refer [>! <! >!! <!! go chan buffer close! thread alts! alts!! timeout]]))
(defn consumer
[in out f]
(go (loop [request (<! in)]
(if (nil? request)
(close! out)
(do (print f)
(let [result (f request)]
(>! out result))
(recur (<! in)))))))
(defn make-consumer [in f]
(let [out (chan)]
(consumer in out f)
out))
(defn process
[f s no-of-consumers]
(let [in (chan (count s))
consumers (repeatedly no-of-consumers #(make-consumer in f))
out (async/merge consumers)]
(map #(>!! in %1) s)
(close! in)
(loop [result (<!! out)
results '()]
(if (nil? result)
results
(recur (<!! out)
(conj results result))))))
This code works fine when I step in through the process function in debugger supplied with Emacs' cider.
(process (partial + 1) '(1 2 3 4) 1)
(5 4 3 2)
However, if I run it by itself (or hit continue in the debugger) I get an empty result.
(process (partial + 1) '(1 2 3 4) 1)
()
My guess is that in the second case for some reason producer doesn't wait for consumers before exiting, but I'm not sure why. Thanks for help!
The problem is that your call to map is lazy, and will not run until something asks for the results. Nothing does this in your code.
There are 2 solutions:
(1) Use the eager function mapv:
(mapv #(>!! in %1) items)
(2) Use the doseq, which is intended for side-effecting operations (like putting values on a channel):
(doseq [item items]
(>!! in item))
Both will work and produce output:
(process (partial + 1) [1 2 3 4] 1) => (5 4 3 2)
P.S. You have a debug statement in (defn consumer ...)
(print f)
that produces a lot of noise in the output:
<#clojure.core$partial$fn__5561 #object[clojure.core$partial$fn__5561 0x31cced7
"clojure.core$partial$fn__5561#31cced7"]>
That is repeated 5 times back to back. You probably want to avoid that, as printing function "refs" is pretty useless to a human reader.
Also, debug printouts in general should normally use println so you can see where each one begins and ends.
I'm going to take a safe stab that this is being caused by the lazy behavior of map, and this line that's carrying out side effects:
(map #(>!! in %1) s)
Because you never explicitly use the results, it never runs. Change it to use mapv, which is strict, or more correctly, use doseq. Never use map to run side effects. It's meant to lazily transform a list, and abuse of it leads to behaviour like this.
So why is it working while debugging? I'm going to guess because the debugger forces evaluation as part of its operation, which is masking the problem.
As you can read from docstring map returns a lazy sequence. And I think the best way is to use dorun. Here is an example from clojuredocs:
;;map a function which makes database calls over a vector of values
user=> (map #(db/insert :person {:name %}) ["Fred" "Ethel" "Lucy" "Ricardo"])
JdbcSQLException The object is already closed [90007-170] org.h2.message.DbE
xception.getJdbcSQLException (DbException.java:329)
;;database connection was closed before we got a chance to do our transactions
;;lets wrap it in dorun
user=> (dorun (map #(db/insert :person {:name %}) ["Fred" "Ethel" "Lucy" "Ricardo"]))
DEBUG :db insert into person values name = 'Fred'
DEBUG :db insert into person values name = 'Ethel'
DEBUG :db insert into person values name = 'Lucy'
DEBUG :db insert into person values name = 'Ricardo'
nil
For the most part I understand what Clojure is telling me with it's error messages. But I am still clueless as to find out where the error happened.
Here is an example of what I mean
(defn extract [m]
(keys m))
(defn multiple [xs]
(map #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiple) ; seq -> seq
(extract))) ; map -> seq ... fails
(process [1 2 3])
Statically typed languages would now tell me that I tried to pass a sequence to a function that expects a map on line X. And Clojure does this in a way:
ClassCastException java.lang.Long cannot be cast to java.util.Map$Entry
But I still have no idea where the error happened. Obviously for this instance it's easy because there are just 3 functions involved, you can easily just read through all of them but as programs grow bigger this gets old very quickly.
Is there a way find out where the errors happened other than just proof reading the code from top to bottom? (which is my current approach)
You can use clojure.spec. It is still in alpha, and there's still a bunch of tooling support coming (hopefully), but instrumenting functions works well.
(ns foo.core
(:require
;; For clojure 1.9.0-alpha16 and higher, it is called spec.alpha
[clojure.spec.alpha :as s]
[clojure.spec.test.alpha :as stest]))
;; Extract takes a map and returns a seq
(s/fdef extract
:args (s/cat :m map?)
:ret seq?)
(defn extract [m]
(keys m))
;; multiple takes a coll of numbers and returns a coll of numbers
(s/fdef multiple
:args (s/cat :xs (s/coll-of number?))
:ret (s/coll-of number?))
(defn multiple [xs]
(map #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiple) ; seq -> seq
(extract))) ; map -> seq ... fails
;; This needs to come after the definition of the specs,
;; but before the call to process.
;; This is something I imagine can be handled automatically
;; by tooling at some point.
(stest/instrument)
;; The println is to force evaluation.
;; If not it wouldn't run because it's lazy and
;; not used for anything.
(println (process [1 2 3]))
Running this file prints (among other info):
Call to #'foo.core/extract did not conform to spec: In: [0] val: (2
4 6) fails at: [:args :m] predicate: map? :clojure.spec.alpha/spec
#object[clojure.spec.alpha$regex_spec_impl$reify__1200 0x2b935f0d
"clojure.spec.alpha$regex_spec_impl$reify__1200#2b935f0d"]
:clojure.spec.alpha/value ((2 4 6)) :clojure.spec.alpha/args ((2 4
6)) :clojure.spec.alpha/failure :instrument
:clojure.spec.test.alpha/caller {:file "core.clj", :line 29,
:var-scope foo.core/process}
Which can be read as: A call to exctract failed because the value passed in (2 4 6) failed the predicate map?. That call happened in the file "core.clj" at line 29.
A caveat that trips people up is that instrument only checks function arguments and not return values. This is a (strange if you ask me) design decision from Rich Hickey. There's a library for that, though.
If you have a REPL session you can print a stack trace:
(clojure.stacktrace/print-stack-trace *e 30)
See http://puredanger.github.io/tech.puredanger.com/2010/02/17/clojure-stack-trace-repl/ for various different ways of printing the stack trace. You will need to have a dependency such as this in your project.clj:
[org.clojure/tools.namespace "0.2.11"]
I didn't get a stack trace using the above method, however just typing *e at the REPL will give you all the available information about the error, which to be honest didn't seem very helpful.
For the rare cases where the stack trace is not helpful I usually debug using a call to a function that returns the single argument it is given, yet has the side effect of printing that argument. I happen to call this function probe. In your case it can be put at multiple places in the threading macro.
Re-typing your example I have:
(defn extract [m]
(keys m))
(defn multiply [xs]
(mapv #(* 2 %) xs))
(defn process [xs]
(-> xs
(multiply) ; seq -> seq
(extract))) ; map -> seq ... fails ***line 21***
(println (process [1 2 3]))
;=> java.lang.ClassCastException: java.lang.Long cannot be cast
to java.util.Map$Entry, compiling:(tst/clj/core.clj:21:21)
So we get a good clue in the exception where is says the file and line/col number tst.clj.core.clj:21:21 that the extract method is the problem.
Another indispensible tool I use is Plumatic Schema to inject "gradual" type checking into clojure. The code becomes:
(ns tst.clj.core
(:use clj.core tupelo.test)
(:require
[tupelo.core :as t]
[tupelo.schema :as tsk]
[schema.core :as s]))
(t/refer-tupelo)
(t/print-versions)
(s/defn extract :- [s/Any]
[m :- tsk/Map]
(keys m))
(s/defn multiply :- [s/Num]
[xs :- [s/Num]]
(mapv #(* 2 %) xs))
(s/defn process :- s/Any
[xs :- [s/Num]]
(-> xs
(multiply) ; seq -> seq
(extract))) ; map -> seq ... fails
(println (process [1 2 3]))
clojure.lang.ExceptionInfo: Input to extract does not match schema:
[(named (not (map? [2 4 6])) m)] {:type :schema.core/error, :schema [#schema.core.One{:schema {Any Any},
:optional? false, :name m}],
:value [[2 4 6]], :error [(named (not (map? [2 4 6])) m)]},
compiling:(tst/clj/core.clj:23:17)
So, while the format of the error message is a bit lengthy, it tells right away that we passed a parameter of the wrong type and/or shape into the method extract.
Note that you need a line like this:
(s/set-fn-validation! true) ; enforce fn schemas
I create a special file test/tst/clj/_bootstrap.clj so it is always in the same place.
For more information on Plumatic Schema please see:
https://github.com/plumatic/schema
https://youtu.be/o_jtwIs2Ot8
https://github.com/plumatic/schema/wiki/Basics-Examples
https://github.com/plumatic/schema/wiki/Defining-New-Schema-Types-1.0
I am constructing a list of hash maps which is then passed to another function. When I try to print each hash maps from the list using map it is not working. I am able to print the full list or get the first element etc.
(defn m [a]
(println a)
(map #(println %) a))
The following works from the repl only.
(m (map #(hash-map :a %) [1 2 3]))
But from the program that I load using load-file it is not working. I am seeing the a but not its individual elements. What's wrong?
In Clojure tranform functions return a lazy sequence. So, (map #(println %) a) return a lazy sequence. When consumed, the map action is applied and only then the print-side effect is visible.
If the purpose of the function is to have a side effect, like printing, you need to eagerly evaluate the transformation. The functions dorun and doall
(def a [1 2 3])
(dorun (map #(println %) a))
; returns nil
(doall (map #(println %) a))
; returns the collection
If you actually don't want to map, but only have a side effect, you can use doseq. It is intended to 'iterate' to do side effects:
(def a [1 2 3])
(doseq [i a]
(println i))
If your goal is simply to call an existing function on every item in a collection in order, ignoring the returned values, then you should use run!:
(run! println [1 2 3])
;; 1
;; 2
;; 3
;;=> nil
In some more complicated cases it may be preferable to use doseq as #Gamlor suggests, but in this case, doseq only adds boilerplate.
I recommend to use tail recursion:
(defn printList [a]
(let [head (first a)
tail (rest a)]
(when (not (nil? head))
(println head)
(printList tail))))
I'm trying to understand when clojure's lazy sequences are lazy, and when the work happens, and how I can influence those things.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (let [[a b] lz-seq])
fn call!
fn call!
fn call!
fn call!
nil
I was hoping to see only two "fn call!"s here. Is there a way to manage that?
Anyway, moving on to something which indisputably only requires one evaluation:
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (first lz-seq)
fn call!
fn call!
fn call!
fn call!
0
Is first not suitable for lazy sequences?
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (take 1 lz-seq)
(fn call!
fn call!
fn call!
fn call!
0)
At this point, I'm completely at a loss as to how to access the beginning of my toy lz-seq without having to realize the entire thing. What's going on?
Clojure's sequences are lazy, but for efficiency are also chunked, realizing blocks of 32 results at a time.
=>(def lz-seq (map #(do (println (str "fn call " %)) (identity %)) (range 100)))
=>(first lz-seq)
fn call 0
fn call 1
...
fn call 31
0
The same thing happens once you cross the 32 boundary first
=>(nth lz-seq 33)
fn call 0
fn call 1
...
fn call 63
33
For code where considerable work needs to be done per realisation, Fogus gives a way to work around chunking, and gives a hint an official way to control chunking might be underway.
I believe that the expression produces a chunked sequence. Try replacing 4 with 10000 in the range expression - you'll see something like 32 calls on first eval, which is the size of the chunk.
A lazy sequence is one where we evaluate the sequence as and when needed. (hence lazy). Once a result is evaluated, it is cached so that it can be re-used (and we don't have to do the work again). If you try to realize an item of the sequence that hasn't been evaluated yet, clojure evaluates it and returns the value to you. However, it also does some extra work. It anticipates that you might want to evaluate the next element(s) in the sequence and does that for you too. This is done to avoid some performance overheads, the exact nature of which is beyond my skill-level. Thus, when you say (first lz-seq), it actually calculates the first as well as the next few elements in the seq. Since your println statement is a side effect, you can see the evaluation happening. Now if you were to say (second lz-seq), you will not see the println again since the result has already been evaluated and cached.
A better way to see that your sequence is lazy is :
user=> def lz-seq (map #(do (println "fn call!") (identity %)) (range 400))
#'user/lz-seq
user=> (first lz-seq)
This will print a few "fn call!" statements, but not all 400 of them. That's because the first call will actually end up evaluating more than one element of the sequence.
Hope this explanation is clear enough.
I think its some sort of optimization made by repl.
My repl is caching 32 at a time.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 100))
#'user/lz-seq
user=> (first lz-seq)
prints 32 times
user=> (take 20 lz-seq)
does not print any "fn call!"
user=> (take 33 lz-seq)
prints 0 to 30, then prints 32 more "fn call!"s followed by 31,32
I know I can do the following in Common Lisp:
CL-USER> (let ((my-list nil))
(dotimes (i 5)
(setf my-list (cons i my-list)))
my-list)
(4 3 2 1 0)
How do I do this in Clojure? In particular, how do I do this without having a setf in Clojure?
My personal translation of what you are doing in Common Lisp would Clojurewise be:
(into (list) (range 5))
which results in:
(4 3 2 1 0)
A little explanation:
The function into conjoins all elements to a collection, here a new list, created with (list), from some other collection, here the range 0 .. 4. The behavior of conj differs per data structure. For a list, conj behaves as cons: it puts an element at the head of a list and returns that as a new list. So what is does is this:
(cons 4 (cons 3 (cons 2 (cons 1 (cons 0 (list))))))
which is similar to what you are doing in Common Lisp. The difference in Clojure is that we are returning new lists all the time, instead of altering one list. Mutation is only used when really needed in Clojure.
Of course you can also get this list right away, but this is probably not what you wanted to know:
(range 4 -1 -1)
or
(reverse (range 5))
or... the shortest version I can come up with:
'(4 3 2 1 0)
;-).
Augh the way to do this in Clojure is to not do it: Clojure hates mutable state (it's available, but using it for every little thing is discouraged). Instead, notice the pattern: you're really computing (cons 4 (cons 3 (cons 2 (cons 1 (cons 0 nil))))). That looks an awful lot like a reduce (or a fold, if you prefer). So, (reduce (fn [acc x] (cons x acc)) nil (range 5)), which yields the answer you were looking for.
Clojure bans mutation of local variables for the sake of thread safety, but it is still possible to write loops even without mutation. In each run of the loop you want to my-list to have a different value, but this can be achieved with recursion as well:
(let [step (fn [i my-list]
(if (< i 5)
my-list
(recur (inc i) (cons i my-list))))]
(step 0 nil))
Clojure also has a way to "just do the looping" without making a new function, namely loop. It looks like a let, but you can also jump to beginning of its body, update the bindings, and run the body again with recur.
(loop [i 0
my-list nil]
(if (< i 5)
my-list
(recur (inc i) (cons i my-list))))
"Updating" parameters with a recursive tail call can look very similar to mutating a variable but there is one important difference: when you type my-list in your Clojure code, its meaning will always always the value of my-list. If a nested function closes over my-list and the loop continues to the next iteration, the nested function will always see the value that my-list had when the nested function was created. A local variable can always be replaced with its value, and the variable you have after making a recursive call is in a sense a different variable.
(The Clojure compiler performs an optimization so that no extra space is needed for this "new variable": When a variable needs to be remembered its value is copied and when recur is called the old variable is reused.)
For this I would use range with the manually set step:
(range 4 (dec 0) -1) ; => (4 3 2 1 0)
dec decreases the end step with 1, so that we get value 0 out.
user=> (range 5)
(0 1 2 3 4)
user=> (take 5 (iterate inc 0))
(0 1 2 3 4)
user=> (for [x [-1 0 1 2 3]]
(inc x)) ; just to make it clear what's going on
(0 1 2 3 4)
setf is state mutation. Clojure has very specific opinions about that, and provides the tools for it if you need it. You don't in the above case.
(let [my-list (atom ())]
(dotimes [i 5]
(reset! my-list (cons i #my-list)))
#my-list)
(def ^:dynamic my-list nil);need ^:dynamic in clojure 1.3
(binding [my-list ()]
(dotimes [i 5]
(set! my-list (cons i my-list)))
my-list)
This is the pattern I was looking for:
(loop [result [] x 5]
(if (zero? x)
result
(recur (conj result x) (dec x))))
I found the answer in Programming Clojure (Second Edition) by Stuart Halloway and Aaron Bedra.