Newbie: "for" loops in functions behave in unexpected ways - clojure

I've been developing in Java and Perl for a long time but wanted to learn something new so I've begun looking into clojure. One of the first things I've tried was a solution for the Towers of Hanoi puzzle but I've been getting strange behavior on my pretty-print function. Basically, my for loop is never entered when I run it with 'lein run' but it seems to work fine when I run it from the repl. Here's a stripped down example:
(ns test-app.core
  (:gen-class))

(defn for-print
  "Print the same thing 3 times"
  [p-string]
  (println (str "Checkpoint: " p-string))
  (for [x [1 2 3]]
    (printf "FOR: %s\n" p-string)))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (for-print "Haldo wurld!"))
When I run this with 'lein run' I only see the output from the "Checkpoint" println. If I remove that line I get no output at all. But, if I run 'lein repl' and then type (-main) it prints the string 3 times, as expected:
test-app.core=> (-main)
Checkpoint: Haldo wurld!
(FOR: Haldo wurld!
FOR: Haldo wurld!
FOR: Haldo wurld!
nil nil nil)
test-app.core=>
What's going on here? I have a feeling that I'm approaching this the wrong way, trying to use my past Perl/Java mentality to write clojure. What would be an idiomatic way to run the same task a set number of times?

The for expression returns a lazy sequence that is evaluated only as needed.
When you run the program inside the REPL, the for result is realized in order to display it on screen.
But when you run it with lein run, the result is never used, so the collection is never realized.
You have a couple of alternatives:
1) Use doall around the for expression to force realization of the lazy sequence.
Ex:
(defn for-print
  "Print the same thing 3 times"
  [p-string]
  (println (str "Checkpoint: " p-string))
  (doall (for [x [1 2 3]]
           (printf "FOR: %s\n" p-string))))
2) Since you're only printing, which is a side effect, and not really building a collection, you can use doseq.
Ex:
(defn for-print
  "Print the same thing 3 times"
  [p-string]
  (println (str "Checkpoint: " p-string))
  (doseq [x [1 2 3]]
    (printf "FOR: %s\n" p-string)))

Clojure's for is not an imperative loop (you should avoid thinking about loops in Clojure at all); it's a list comprehension, which returns a lazy sequence. It's made for creating a sequence, not for printing anything. You can realize it and make it work, but that is a bad way to go.
As Guillermo said, you're looking for the doseq macro, which is designed for side effects. It's probably the most idiomatic Clojure in your particular case.
From my point of view, the construct in Clojure most similar to an imperative loop is tail recursion written with loop/recur. Still, it's a rather low-level construct and certainly should not be used in an imperative-loop manner. Take a closer look at functional programming principles, as well as the Clojure core functions. Trying to transfer Java/Perl thinking to Clojure may harm you.
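For reference, here is a minimal sketch of the original function rewritten with loop/recur, shown only for comparison; doseq (or dotimes) remains the idiomatic choice for a fixed number of side-effecting repetitions:

;; Comparison only: an imperative-style countdown with loop/recur.
(defn for-print [p-string]
  (println (str "Checkpoint: " p-string))
  (loop [n 3]
    (when (pos? n)
      (printf "FOR: %s\n" p-string)
      (recur (dec n)))))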

The other answers are correct and provide details. I want to add some higher-level clarification that might be helpful. In most languages, "for" means "under such and such conditions, perform such and such actions (which could be of arbitrary type, including side effects)." This kind of thing can be done in Clojure, and I have seen experienced Clojure programmers do it when it's useful and convenient. However, using loops with side effects usually works against the strengths of the language. So in Clojure the word "for" has a different meaning than in most languages: generating (lazy) sequences. It means something like "For these inputs, when/while/etc. they meet such and such conditions, bind them temporarily to these variables, and generate a (lazy) sequence by processing each set of values in such and such a way."
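To make that concrete, here is a small sketch (not from the question's code) of for used as a comprehension: bindings, a filtering condition, and a body that produces each element of a lazy sequence.

;; Squares of the odd numbers below 6, produced lazily.
(for [x (range 6)
      :when (odd? x)]
  (* x x))
;; => (1 9 25)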

Related

Clojure pipe collection one by one

How can I process collections in Clojure the way Java streams do: one element at a time through all the functions, instead of evaluating all the elements at each stage? I would also describe it as Unix pipes (the next program pulls chunk by chunk from the previous one).
As far as I understand your question, you may want to look into two things.
First, understand the sequence abstraction. This is a way of looking at collections that consumes them one by one and lazily. It is an important Clojure idiom, and you'll meet well-known functions like map, filter, reduce, and many more. The ->> macro, which was already mentioned in a comment, will also be important.
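As a small sketch of that abstraction (not from the question), each element flows lazily through the steps, and ->> threads the collection through them:

;; Lazy pipeline: increment, keep evens, take the first three.
(->> (range 10)
     (map inc)
     (filter even?)
     (take 3))
;; => (2 4 6)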
After that, when you want to dig deeper, you'll probably want to look into transducers and reducers. In a grossly oversimplifying summary, they allow you to combine several lazy functions into one function and then process a collection with less laziness, less memory consumption, better performance, and possibly on several threads. I consider these advanced topics, though. Maybe sequences are already what you were looking for.
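For comparison, here is a sketch of the same pipeline written with a transducer: the steps are composed into a single pass over the input, with no intermediate lazy sequences.

;; Transducer version of the pipeline above, collected into a vector.
(into []
      (comp (map inc) (filter even?) (take 3))
      (range 10))
;; => [2 4 6]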
Here is a simple example from ClojureDocs.org
;; Use of `->` (the "thread-first" macro) can help make code
;; more readable by removing nesting. It can be especially
;; useful when using host methods:
;; Arguably a bit cumbersome to read:
user=> (first (.split (.replace (.toUpperCase "a b c d") "A" "X") " "))
"X"
;; Perhaps easier to read:
user=> (-> "a b c d"
.toUpperCase
(.replace "A" "X")
(.split " ")
first)
"X"
As always, don't forget the Clojure CheatSheet or Clojure for the Brave and True.

Clojure.spec - Why is it useful and when is it used

I recently watched Rich Hickey's talk at Clojure/conj 2016, and although it was very interesting, I didn't really understand the point of clojure.spec or when you'd use it. It seemed like most of the ideas, such as conform, valid, etc., already had similar functions in Clojure.
I have only been learning clojure for around 3 months now so maybe this is due to lack of programming/Clojure experience.
Do clojure.spec and cljs.spec work similarly to Clojure and ClojureScript themselves, in that, although they are not 100% the same, they are based on the same underlying principles?
Are you tired of documenting your programs?
Does the prospect of making up yet more tests cause procrastination?
When the boss says "test coverage", do you cower with fear?
Do you forget what your data names mean?
For smooth expression of hard specifications, you need Clojure.Spec!
Clojure.spec gives you a uniform method of documenting, specifying, and automatically testing your programs, and of validating your live data.
It steals virtually every one of its ideas. And it does nothing you can't do for yourself.
But in my - barely informed - opinion, it changes the economy of specification, making it worth while doing properly. A game-changer? - quite possibly.
At the clojure/conj conference last week, probably half of the presentations featured spec in some way, and it's not even out of alpha yet. spec is a major feature of clojure; it is here to stay, and it is powerful.
As an example of its power, take static type checking, hailed as a kind of safety net by so many, and a defining characteristic of so many programming languages. It is incredibly limited in that it's only good at compile time, and it only checks types. spec, on the other hand, validates and conforms against arbitrary predicates (not just types) for the args and the return value, and can also validate relationships between the two. All of this is external to the function's code, keeping the logic of the function from being commingled with validation and documentation.
One archetypal example of the benefits of relationship-checking, versus only type-checking, is a function which computes the substring of a string. Type checking ensures that in (subs s start end) the s is a string and start and end are integers. However, additional checking must be done within the function to ensure that start and end are positive integers, that end is greater than start, and that the resulting substring is no larger than the original string. All of these things can be spec'd out, for example (forgive me if some of this is a bit redundant or maybe even inaccurate):
(s/fdef clojure.core/subs
  :args (s/and (s/cat :s string? :start nat-int? :end (s/? nat-int?))
               (fn [{:keys [s start end]}]
                 (if end
                   (<= 0 start end (count s))
                   (<= 0 start (count s)))))
  :ret string?
  :fn (fn [{{:keys [s start end]} :args, substring :ret}]
        (and (if end
               (= (- end start) (count substring))
               (= (- (count s) start) (count substring)))
             (<= (count substring) (count s)))))
Call the function with sample data meeting the above args spec:
(s/exercise-fn `subs)
Or run 1000 tests (this may fail a few times, but keep running and it will work--this is due to the built-in generator not being able to satisfy the second part of the :args predicate; a custom generator can be written if needed):
(stest/check `subs)
Or, want to see if your app makes calls to subs that are invalid while it's running in real time? Just run this, and you'll get a spec exception if the function is called and the specs are not met:
(stest/instrument `subs)
Regarding workflow: we have not integrated this into our workflow yet, and can't in production since it's still alpha, but the first goal is to write specs. I'm putting them in the same namespace but in separate files currently.
I foresee our workflow being to run the tests for spec'd functions using this (found in the Clojure spec guide):
(-> (stest/enumerate-namespace 'user) stest/check)
Then, it would be advantageous to turn on instrumenting for all functions, and run the app under load as we normally would test it, and ensure that "real world" data works.
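A sketch of what that could look like, assuming the current clojure.spec.test.alpha namespace; if I read the spec guide correctly, instrument called with no arguments instruments every loaded, instrumentable (spec'd) var:

(require '[clojure.spec.test.alpha :as stest])
(stest/instrument)    ;; instrument everything that has an :args spec
;; ... exercise the app under load ...
(stest/unstrument)    ;; turn instrumentation back off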
You can also use s/conform to destructure complex data in functions themselves, or use s/valid? in pre- and post-conditions for running functions. I'm not too keen on this, as it adds overhead in a production system, but it is a possibility.
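Here is a minimal sketch of both ideas, with a hypothetical ::point spec and magnitude function (not from the talk or the guide):

(require '[clojure.spec.alpha :as s])

(s/def ::point (s/cat :x number? :y number?))

(defn magnitude [coords]
  ;; s/valid? as a precondition
  {:pre [(s/valid? ::point coords)]}
  ;; s/conform to destructure the validated data
  (let [{:keys [x y]} (s/conform ::point coords)]
    (Math/sqrt (+ (* x x) (* y y)))))

(magnitude [3 4])
;; => 5.0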
The sky's the limit, and we've just scratched the surface! Cool things coming in the next months and years with spec!

Is it good to avoid macro in this example?

I read that data > functions > macros
Say you want to evaluate code in a postfix fashion.
Which approach would be better?
;; Macro
(defmacro reverse-fn [expression]
  (conj (butlast expression) (last expression)))

(reverse-fn ("hello world" println))
; => "hello world"

;; Function and data
(def data ["hello world" println])

(defn reverse-fn [data]
  (apply (eval (last data)) (butlast data)))

(reverse-fn ["hello world" println])
; => "hello world"
Thanks!
If you require different evaluation behavior for any kind of data in your code, macros are your best choice, because they can transform unevaluated data at compile time into the code you'd like to be evaluated instead.
Clojure has a programmatic macro system which allows the compiler to be extended by user code. Macros can be used to define syntactic constructs which would require primitives or built-in support in other languages. (http://clojure.org/macros)
The example you provide has specifically that requirement ("evaluate code in a postfix fashion") which is why a macro is the correct choice.
Your macro is better than your function: a macro is better than a function employing eval.
However, the function need not employ eval. You could write it
(defn reverse-fn [data]
  (apply (last data) (butlast data)))
Then, for example, as before,
(reverse-fn [3 inc])
=> 4
On a related topic, you might find this explanation of transducers interesting.
Edit:
Notice that functions are literals, in the sense that a function evaluates to itself:
((eval +) 1 1)
=> 2
In general:
Macros have their uses, however; macros expand at the point they are encountered, so you will have one or more code blocks being inlined into the resulting byte code. They are ideal for encapsulating higher-order DSL terminology.
Functions, on the other hand, are wonderful for reuse: you localize the purity of the function, and it can be called from multiple other functions without increasing the code footprint. They are ideal for localized or global usage.
REPL behavior consideration: a single function is easier to rework without worrying about fully re-evaluating the entire source file(s) to ensure all macro expansions at call sites get updated.
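A small sketch of that REPL consideration (hypothetical names): a function compiled against a macro keeps the old expansion until the function itself is re-evaluated.

(defmacro greet [] "hello")
(defn caller [] (greet))
(caller)                      ;; => "hello"

(defmacro greet [] "goodbye") ;; redefine the macro
(caller)                      ;; => still "hello"; the old expansion is baked in

(defn caller [] (greet))      ;; re-evaluate the caller
(caller)                      ;; => "goodbye"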
Hope this helps
Simple rules are the best: Use a function if you can, a macro if you must.

Clojure confusion - behavior of map, doseq in a multiprocess environment

In trying to replicate some websockets examples I've run into some behavior I don't understand and can't seem to find documentation for. Simplified, here's an example I'm running in lein that's supposed to run a function for every element in a shared map once per second:
(def clients (atom {"a" "b" "c" "d"}))
(def ticker-agent (agent nil))

(defn execute [a]
  (println "execute")
  (let [keys (keys @clients)]
    (println "keys= " keys)
    (doseq [x keys] (println x)))
    ;; (map (fn [k] (println k)) keys) ; replace doseq with this?
  (Thread/sleep 1000)
  (send *agent* execute))

(defn -main [& args]
  (send ticker-agent execute))
If I run this with map I get
execute
keys= (a c)
execute
keys= (a c)
...
First confusing issue: I understand that I'm likely using map incorrectly because I'm not using its return value, but does that mean the inner println is optimized away? Especially given that if I run this in a repl:
(map #(println %) '(1 2 3))
it works fine?
Second question: if I run this with doseq instead of map, I can run into conditions where the execution agent stops (which I'd append here, but I'm having difficulty isolating/recreating it). Clearly there's something I'm missing, possibly relating to locking on the map's keyset? I was able to trigger this even after moving the shared map out of an atom. Is there default synchronization on the Clojure map?
map is lazy. This means that it does not calculate any result until the result is accessed from the data structure it returns, so it will not run anything if its result is not used.
When you use map from the REPL, the print stage of the REPL accesses the data, which causes any side effects in your mapped function to be invoked. Inside a function, if the return value is not inspected, the side effects in the mapping function will not occur.
You can use doall to force full evaluation of a lazy sequence. You can use dorun if you don't need the result value but want to ensure all side effects are invoked. Also you can use mapv which is not lazy (because vectors are never lazy), and gives you an associative data structure, which is often useful (better random access performance, optimized for appending rather than prepending).
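A small sketch of the differences, with hypothetical helper names:

(defn print-all-lazy [xs]
  (map println xs))            ;; lazy: nothing prints if the result is unused

(defn print-all! [xs]
  (dorun (map println xs)))    ;; forces the side effects, returns nil

(defn print-all-doall [xs]
  (doall (map println xs)))    ;; forces the seq and returns it (a seq of nils)

(defn print-all-eager [xs]
  (mapv println xs))           ;; eager, returns a vector (of nils here)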
Edit: regarding the second part of your question (moving this here from a comment).
No, there is nothing about doseq that would hang your execution. Try checking the agent-error status of your agent to see if there is some exception; agents stop executing and stop accepting new tasks by default if they hit an error condition. You can also use set-error-mode! and set-error-handler! to customize the agent's error-handling behavior.
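For example, a sketch of inspecting and configuring the agent from the question:

(agent-error ticker-agent)                ;; nil, or the Throwable that stopped it
(restart-agent ticker-agent nil)          ;; clear the failure and resume
(set-error-mode! ticker-agent :continue)  ;; keep accepting sends after an error
(set-error-handler! ticker-agent
                    (fn [a ex] (println "agent failed:" (.getMessage ex))))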

Clojure lazy-seq over Java iterative code

I'm trying to create a Clojure seq from some iterative Java library code that I inherited. Basically, the Java code reads records from a file using a parser, sends those records to a processor, and returns an ArrayList of results. In Java this is done by calling parser.readData(), then parser.getRecord() to get a record, then passing that record into processor.processRecord(). Each call to parser.readData() returns a single record, or null if there are no more records. It's a pretty common pattern in Java.
So I created this next-record function in Clojure that will get the next record from a parser.
(defn next-record
  "Get the next record from the parser and process it."
  [parser processor]
  (let [datamap (.readData parser)
        row     (.getRecord parser datamap)]
    (if (nil? row)
      nil
      (.processRecord processor row 100))))
The idea then is to call this function and accumulate the records into a Clojure seq (preferably a lazy seq). So here is my first attempt which works great as long as there aren't too many records:
(defn datamap-seq
  "Returns a lazy seq of the records using the given parser and processor"
  [parser processor]
  (lazy-seq
    (when-let [records (next-record parser processor)]
      (cons records (datamap-seq parser processor)))))
I can create a parser and processor, and do something like (take 5 (datamap-seq parser processor)) which gives me a lazy seq. And as expected getting the (first) of that seq only realizes one element, doing count realizes all of them, etc. Just the behavior I would expect from a lazy seq.
Of course, when there are a lot of records I end up with a StackOverflowException. So my next attempt was to use loop/recur to do the same thing.
(defn datamap-seq
  "Returns a lazy seq of the records using the given parser and processor"
  [parser processor]
  (lazy-seq
    (loop [records (seq '())]
      (if-let [record (next-record parser processor)]
        (recur (cons record records))
        records))))
Now, using this the same way and def'ing it with (def results (datamap-seq parser processor)) gives me a lazy seq and doesn't realize any elements. However, as soon as I do anything else, like (first results), it forces the realization of the entire seq.
Can anyone help me understand what I'm doing wrong in the second function, using loop/recur, that causes it to realize the entire thing?
UPDATE:
I've looked a little closer at the stack trace from the exception, and the stack overflow exception is being thrown from one of the Java classes. BUT it only happens when I write the datamap-seq function like this (the one I posted above actually does work):
(defn datamap-seq
  "Returns a lazy seq of the records using the given parser and processor"
  [parser processor]
  (lazy-seq
    (when-let [records (next-record parser processor)]
      (cons records (remove empty? (datamap-seq parser processor))))))
I don't really understand why that remove causes problems, but when I take it out of this function it all works right (I'm doing the removal of empty lists somewhere else now).
loop/recur loops within the loop expression until the recursion runs out; adding a lazy-seq around it won't prevent that.
Your first attempt with lazy-seq / cons should already work as you want, without stack overflows. I can't spot right now what the problem with it is, though it might be in the Java part of the code.
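For what it's worth, here is a sketch (reusing next-record from the question) that avoids a hand-rolled lazy-seq entirely: repeatedly builds a lazy seq of calls, and take-while stops at the first nil.

;; Lazy seq of processed records, ending when next-record returns nil.
(defn datamap-seq [parser processor]
  (take-while some? (repeatedly #(next-record parser processor))))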
I'll post this here as an addition to Joost's answer. This code:
(defn integers [start]
  (lazy-seq
    (cons
      start
      (integers (inc start)))))
will not throw a StackOverflowException if I do something like this:
(take 5 (drop 1000000 (integers 0)))
EDIT:
Of course, a better way to do it would be (iterate inc 0). :)
EDIT2:
I'll try to explain a little how lazy-seq works. lazy-seq is a macro that returns a seq-like object whose body is not evaluated until the sequence is consumed. Since the rest of the cons here is just another such unrealized lazy seq, you get laziness.
Now take a look at how the LazySeq class is implemented. LazySeq.sval triggers computation of the next value, which returns another instance of a "frozen" lazy sequence. The LazySeq.seq method shows the mechanics behind the concept even better. Notice that to fully realize the sequence it uses a while loop. This in itself means that stack use is limited to short function calls, each returning another instance of LazySeq.
I hope this makes any sense. I described what I could deduce from the source code. Please let me know if I made any mistakes.