Were transducers in the Reducers library in Clojure 1.5 all along? - clojure

I heard a comment made today:
"Transducers were there all along, they came with the reducers in 1.5"
Indeed, Rich's Anatomy of a Reducer blog entry bears a remarkable resemblance to the logic used in his Strange Loop Transducers talk (replace 'transformers' with 'transducers').
My question is: Were transducers in the Reducers library in Clojure 1.5 all along?

Pointy is correct: the idea was there, though not accessible as its own thing. Specifically, map, filter, reduce, etc. were not yet capable of producing a transducer, and into, chan, and sequence were not available to consume them, so in my opinion it is safe to say that transducers were not present in Clojure 1.6.0 and earlier.
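To make the distinction concrete, here is a minimal sketch of the same pipeline written once with the reducers library and once with a transducer. The second form assumes Clojure 1.7 or later, where map and filter gained their one-argument arities; the names via-reducers, xform and via-transducer are made up for illustration.

```clojure
(require '[clojure.core.reducers :as r])

;; Reducers (Clojure 1.5+): the transformation is always wrapped
;; around a concrete collection.
(def via-reducers
  (into [] (r/filter even? (r/map inc (range 10)))))

;; Transducers (assumes Clojure 1.7+): the transformation is a value of
;; its own, with no collection involved until a consumer such as into
;; applies it.
(def xform (comp (map inc) (filter even?)))

(def via-transducer
  (into [] xform (range 10)))

via-reducers   ; => [2 4 6 8 10]
via-transducer ; => [2 4 6 8 10]
```

The logic is the same in both cases; what changed is that xform can exist, be named, and be passed around without a source collection.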

Related

What is the purpose of clojure.core.reducers/reduce?

A slightly modified version of reduce was introduced with reducers, clojure.core.reducers/reduce (short r/reduce):
(defn reduce
  ([f coll]
   (reduce f (f) coll))
  ([f init coll]
   (if (instance? java.util.Map coll)
     (clojure.core.protocols/kv-reduce coll f init)
     (clojure.core.protocols/coll-reduce coll f init))))
r/reduce differs from its core sibling only in that it uses (f) as the initial value when none is provided, and it delegates to core reduce-kv for maps.
I don’t understand what use such an odd special-purpose reduce might be and why it was worth including in the reducers library.
Curiously, r/reduce is not mentioned in the two introductory blog posts as far as I can tell (first, second). The official documentation notes
In general most users will not call r/reduce directly and instead should prefer r/fold (...) However, it may be useful to execute an eager reduce with fewer intermediate results.
I’m unsure what that last sentence hints at.
What situations can r/reduce handle that the core reduces cannot? When would I reach for r/reduce with conviction?
Two possible reasons:
It has different – better! – semantics than clojure.core/reduce in the initless sequential case. During his 2014 Conj presentation Rich Hickey asked "who knows what the semantics of reduce are when you call it with a collection and no initial value?" – follow this link for the exact spot in the presentation – and then described the said semantics as "a ridiculous, complex rule" and "one of the worst things [he] ever copied from Common Lisp" – cf. Common Lisp's reduce contract. The presentation was about transducers and the context of the remark was a discussion of transduce, which has a superior, simpler contract; r/reduce does as well.
Even without considering the above, it's sort of nice to have a version of reduce with a contract very close to that of fold. That enables simple "try one, try the other" benchmarking with the same arguments, as well as simply changing one's mind.
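A small sketch of the difference in the initless case may help. The reducing functions collect and sum-vals are hypothetical, written only to make the behaviour visible.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical reducing function: (collect) supplies the identity,
;; (collect acc x) appends x to the accumulator.
(defn collect
  ([] [])
  ([acc x] (conj acc x)))

(reduce collect [])        ; => []   empty input: core reduce calls (collect)
(reduce collect [1])       ; => 1    single item: returned as-is, collect never runs
(r/reduce collect [])      ; => []
(r/reduce collect [1])     ; => [1]  (collect) is always the initial value
(r/reduce collect [1 2 3]) ; => [1 2 3]

;; For maps, r/reduce passes key and value as separate arguments.
(defn sum-vals
  ([] 0)
  ([acc _k v] (+ acc v)))

(r/reduce sum-vals {:a 1 :b 2}) ; => 3
```

With core reduce the first element doubles as the initial accumulator, so the result type depends on how many items the collection holds; with r/reduce the accumulator always starts from (f).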

Clojure loop/recur pattern, is it bad to use?

I'm in the process of learning Clojure, and I'm using 4Clojure as a resource. I can solve many of the "easy" questions on the site, but for me thinking in a functional programming mindset still doesn't come naturally (I'm coming from Java). As a result, I use a loop/recur iterative pattern in most of my seq-building implementations because that's how I'm used to thinking.
However, when I look at the answers from more experienced Clojure users, they do things in a much more functional style. For example, in a problem about implementing the range function, my answer was the following:
(fn [start limit]
  (loop [x start y limit output '()]
    (if (< x y)
      (recur (inc x) y (conj output x))
      (reverse output))))
While this worked, other users did things like this:
(fn [x y] (take (- y x) (iterate inc x)))
My function is more verbose and I had no idea the "iterate" function even existed. But was my answer worse in an efficiency sense? Is loop/recur somehow worse to use than alternatives? I fear this sort of thing is going to happen a lot to me in the future, as there are still many functions like iterate I don't know about.
The second variant returns a lazy sequence, which may indeed be more efficient, especially if the range is big.
The other thing is that the second solution conveys the idea better. To put it differently, it describes the intent instead of implementation. It takes less time to understand it as compared to your code, where you have to read through the loop body and build a model of control flow in your head.
Regarding the discovery of the new functions: yes, you may not know in advance that some function is already defined. It is easier in, say, Haskell, where you can search for a function by its type signature, but with some experience you will learn to recognize the functional programming patterns like this. You will write the code like the second variant, and then look for something working like take and iterate in the standard library.
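As a rough illustration of the laziness point, the two approaches can be put side by side. The names eager-range and lazy-range are invented here, and the eager version accumulates into a vector (so no reverse is needed), which is a small change from the code in the question.

```clojure
;; Eager: loop/recur builds the whole result before returning anything.
(defn eager-range [start limit]
  (loop [x start, out []]
    (if (< x limit)
      (recur (inc x) (conj out x))
      out)))

;; Lazy: elements are only computed when a consumer asks for them.
(defn lazy-range [start limit]
  (take (- limit start) (iterate inc start)))

(eager-range 2 6) ; => [2 3 4 5]
(lazy-range 2 6)  ; => (2 3 4 5)

;; Only the first three elements of the huge lazy range are ever computed.
(take 3 (lazy-range 0 1000000000)) ; => (0 1 2)
```

Calling eager-range with the same billion-element bounds would have to build the entire vector first, which is where the lazy variant can win.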
Bookmark the Clojure Cheatsheet website, and always have a browser tab open to it.
Study all of the functions, and especially read the examples they link to (the http://clojuredocs.org website).
The site http://clojure-doc.org is also very useful (yes, the two names are almost identical but not quite)
The question should not be about performance (it depends!) but about communication: when using loop/recur or plain recursion or lazy-seq or sometimes even reduce, you make your code harder to understand, because the reader has to understand how you perform your iteration before getting to understand what you are computing.
loop/recur is real Clojure, and idiomatic. It's there for a reason. And often there is no better way. But many people find that once one gets used to it, it's very convenient to build many functions out of building blocks such as iterate. Clojure has a very nice collection of them. I started out writing things from scratch using truly recursive algorithms and then loop/recur. Personally, I wouldn't claim that it's better to use the functional building blocks functions, but I've come to love using them. It's one of the things that's great about Clojure.
(Yes, many of the building block functions are lazy, as are e.g. for and map, which are more general-purpose. Laziness can be good, but I'm not religious about it. Sometimes it's more efficient. Sometimes it's not. Sometimes it's beautiful. Sometimes it's a pain in the rear. Sometimes all that.)
Loop and recur are not bad. In fact, if you look at the source code for many of the built-in functions, you will find that is what they do. The provided functions are often an abstraction of common patterns, which can make your code easier to understand. How you are doing things is typical for many when they first start. How you are approaching this seems correct to me. You are not just writing your solution and moving on. You are writing your solution and then looking at how others have solved the same problem and making a comparison. This is the right road to improvement. I highly recommend that, when you find an alternative solution which seems more elegant/efficient/clear, you analyse it and look at the source code of the built-in functions it uses; things will slowly come together.
loop ... recur is an optimisation for recursive tail calls, and should always be used where it applies.
range is lazy, so your version of it should strive to be so. loop ... recur can't do this.
All the sequence functions that can sensibly be lazy (iterate, filter, map, take-while ...) are so. As you know, you can use some of these to build a lazy range. As @cgrand explains, this is the preferred approach.
If you prefer, you can build a lazy range from scratch:
(defn range [x y]
  (lazy-seq
    (when (< x y)
      (cons x (range (inc x) y)))))
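To see that a lazy-seq based range really does its work on demand, here is a sketch that counts how many elements have been computed. It is renamed my-range to avoid shadowing clojure.core/range, and the realized-count atom is an addition made purely for the demonstration.

```clojure
;; Counts how many elements have actually been computed so far.
(def realized-count (atom 0))

(defn my-range [x y]
  (lazy-seq
    (when (< x y)
      (swap! realized-count inc)   ; side effect only for the demonstration
      (cons x (my-range (inc x) y)))))

(def r (my-range 0 1000000))
@realized-count        ; => 0, nothing has been computed yet

(doall (take 3 r))     ; => (0 1 2)
@realized-count        ; => 3, only the elements that were consumed
```

A loop/recur version would have computed all one million elements before returning.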
I wondered the same thing for some days, but truly, many times I do not see any better alternative than loop/recur.
Some jobs are not purely "reduce" or "map". This is the case when you update data based on a buffer that you mutate at every iteration.
loop/recur is very convenient where "non-linear, precise work" is required. It looks more imperative, but if I remember correctly, Clojure was designed with pragmatism in mind. And pragmatism means choosing whatever is more efficient.
That is why, in complex programs, I mix Clojure and Java code. Sometimes Java is just clearer for "low-level" or iterative jobs like taking a specific value and so on, while I find Clojure functions more useful for big data processing (without so much detail: global filters, etc.).
Some people say that we must stick with Clojure as much as possible, but I do not see any reason not to use Java. I have not programmed a lot, but Clojure/Java is the best interop I have ever seen; the two approaches are very complementary.

What is the 'parallel' concept in Rich Hickey's transducers Strange Loop talk?

In the Strange Loop presentation on Transducers, Rich Hickey mentions a concept in a table called 'parallel'.
You can easily see examples of seqs and into and channels using transducers.
You can also work out that Observables refers to RxJava.
My question is: What is the 'parallel' concept in Rich Hickey's transducers Strange Loop talk? Is this a list of futures, or pmap, or something else?
There have been some thoughts about creating parallel transducible processes. This is being tracked as CLJ-1553. Currently we are not planning to address this in Clojure 1.7, but would like to do something in Clojure 1.8.
It is possible now to set up a reducer that uses a transducer as the bottom reduce phase (along with more traditional combiner fns) but ideally we would be able to leverage the "self-reducible" concept embodied by persistent vectors and maps to support transduce in parallel in a more natural way.
It is most likely right now that this would emerge as some sort of preduce function, but still much to be decided.
One problematic area is in dealing with kv forms - reducers made some choices there that are difficult or inconvenient with transducers so that needs to be worked through.
The concept is simply that of performing computation in parallel. There are multiple possible implementations:
clojure.core.reducers/fold, which is similar to reduce, except it should only be used with associative reduction functions and it's backed by a protocol which exploits the tree structure of various Clojure data structures to parallelize the computational effort. It's not actually transducer-friendly yet, but it is reducer-friendly and it seems that a transducer-enabled version is bound to arrive eventually.
Recent releases of core.async with transducer support export a function called pipeline which parallelizes channel → channel transducer-based transformations.
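As a rough sketch of the first option, here is r/fold used once with a reducer and once with a transducer as the bottom reduce phase, as described above. The second form assumes Clojure 1.7+ and a stateless transducer; the var names are illustrative only.

```clojure
(require '[clojure.core.reducers :as r])

;; A vector is foldable: r/fold can split it and reduce the parts in parallel.
(def numbers (vec (range 1000)))

;; Reducer-friendly fold: + serves as both the reducing and combining function.
(def sum-via-reducer
  (r/fold + (r/map inc numbers)))

;; Transducer as the reduce phase: ((map inc) +) is the reducing function
;; for each chunk, and plain + combines the partial sums.
(def sum-via-transducer
  (r/fold + ((map inc) +) numbers))

sum-via-reducer    ; => 500500
sum-via-transducer ; => 500500
```

This only works cleanly because + is associative and (map inc) carries no state; a stateful transducer such as (take n) would not be safe to reuse across chunks this way.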

How do Midje and Speclj compare? [closed]

Both look reasonably good. I'd like to understand what each library is particularly good at or lacking, especially for testing of web applications.
I haven't used speclj, and I was the first author of Midje. One point that hasn't been mentioned by others is that Midje tries to exploit differences between functional and object-oriented languages.
One difference is immutability. Because most functions depend only on their inputs, not on contained state, the kind of truth statements you make about them are different in feel than their object-oriented counterparts. In OO testing, you make examples of the form: "given this history and these inputs, this method produces such and so."
It would seem that examples in a functional language would just be simpler ones: "given these inputs, this function returns such and so". But I don't think that's quite right. I think the other functions in the system play a role analogous to state/history: they're one of the things you're trying to get intellectual control over. Functions and their relationships are the things you want the tests to help you think clearly about.
For that reason, Midje is written under the assumption that a sweet development process involves saying:
What do I want to be true of this function? In the world of the system, what's a good way to think of what this function does?
In the process of doing that, what other functions would be useful---would capture an important part of the domain---and what truth statements do I want to make about them?
And then, in typical mockist style, you develop roughly top-down or outside-in, allowing for the inevitable iteration as you recover from mistakes or have better ideas.
The end result is to be a big pile of functions, with their interrelationships documented by the tests or (as Midje calls them) the "facts" about functions and the functions they depend on. Various people have commented that Midje has a Prolog/logic programming feel to it, and that's not an accident. As always, tests are examples, but Midje tries to make them read more like truth statements. That's the justification for its only actually innovative feature, metaconstants. Here's an example of them:
(fact "right changes the direction, but not the position"
  (right (snapshot north ...position...)) => (snapshot west ...position...)
  (right (snapshot east ...position...))  => (snapshot north ...position...)
  (right (snapshot south ...position...)) => (snapshot east ...position...)
  (right (snapshot west ...position...))  => (snapshot south ...position...))
In this case, the actual position is irrelevant to what's true about the function right, except that it never changes. The idea of a metaconstant is that it is a value about which nothing is known except what's explicitly stated in the test. Too often in tests, it's hard to tell what's essential and what's incidental. That has a number of bad effects: understanding, maintainability, etc. Metaconstants provide clarity. If it matters that a value is a map or record that contains the value 3 for key :a, you say that explicitly:
(fact
  (full-name ..person..) => "Brian Marick"
  (provided
    ..person.. =contains=> {:given-name "Brian", :family-name "Marick"}))
This test is explicit about what matters about people---and also explicit about what doesn't matter (anything but the two names).
In math terms, Midje is trying to let you make statements like "for all x where x..." while still being a test tool rather than a theorem prover.
This approach was inspired by "London-style" mock-heavy TDD of the sort described in Growing Object-Oriented Software, which is the approach I usually use in writing Ruby code. But it's turned out to have a pretty different feel, in a way that's hard to describe. But it's a feel that needs more tool support than just with-redefs.
The upshot is that Midje is in part an attempt to find a style of functional TDD that's not just a port of OO TDD. It tries to be a general-purpose tool, too, but it's semi-opinionated software. As Abraham Lincoln said, "Those who like this sort of thing will find this the sort of thing they like."
The biggest benefit of using Midje is that it provides focused abstractions for testing things without testing all of their parts, parts that often drag in the whole rest of the world.
If you have a function that involves calling a subsidiary function to generate a timestamp, put something in a database or message queue, make an API request, cache something, log something, etc., you want to know that these world-involving function calls occurred (and sometimes how many times they occurred). Actually executing them, however, is irrelevant to the function you are testing, and the called functions will often deserve their own unit tests.
Say you have this in your code:
(defn timestamp [] (System/currentTimeMillis))
(defn important-message [x y] (log/warnf "Really important message about %s." x))
(defn contrived [x & y]
  (important-message x y)
  {:x x :timestamp (timestamp)})
Here is how you could test it with midje:
(ns foo.core-test
  (:require [midje.sweet :refer :all]
            [foo.core :as base]))
(fact
  (base/contrived 100) => {:x 100 :timestamp 1350526304739}
  (provided (base/timestamp) => 1350526304739
            (base/important-message 100 irrelevant) => anything :times 1))
This example is just a quick glimpse at what you can do with midje but demonstrates the essence of what it is good at. Here you can see there is very little extraneous complexity needed to express:
what the function should produce (despite the fact that what the timestamp function would produce would be different each time you call the function),
that the timestamp function and the logging function were called,
that the logging function was only called one time,
that the logging function received the expected first argument, and
that you don't care what the second argument it received was.
The main point I am trying to make with this example is that it's a very clean and compact way of expressing tests of complex code (and by complex I mean it has embedded parts that can be separated) in simple pieces rather than trying to test everything all at once. Testing everything all at once has its place, namely in integration testing.
I am admittedly biased because I actively use midje, whereas I have only looked at speclj, but my sense is that speclj is probably most attractive to people who have used the analogous Ruby library and find that way of thinking about tests ideal, based on that experience. That is a perfectly rspectable reason to choose a testing framework, and there are probably other nice things about it as well that hopefully others can comment on.
I'd definitely go with Speclj.
Speclj is simple to integrate and use. Its syntax is less flamboyant than Midje's. Speclj is based on RSpec to give you all the comforts that Ruby programmers are used to without losing the idiosyncrasies of Clojure.
And the auto runner in Speclj is great.
lein spec -a
Once you've used that for a while, you'll wonder how you ever got work done when you had to manually run tests.
Mocking is a non-issue since you can simply use with-redefs. @rplevy's example in Speclj would look like this.
(ns foo.core-spec
  (:require [speclj.core :refer :all]
            [foo.core :as base]))
(describe "Core"
  (it "contrives 100"
    ;; message-params records the arguments the stubbed logger receives
    (let [message-params (atom nil)]
      (with-redefs [base/timestamp (fn [] 1350526304739)
                    base/important-message #(reset! message-params [%1 %2])]
        (should= {:x 100 :timestamp 1350526304739} (base/contrived 100))
        (should= 100 (first @message-params))))))
This bare-bones approach to mocking is to-the-point; no misdirection.
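The with-redefs technique itself needs no test library at all. The following is a minimal, library-free sketch of the same idea; it assumes println in place of the original log/warnf call so that it runs without extra dependencies, and the names calls and result are made up for illustration.

```clojure
(defn timestamp [] (System/currentTimeMillis))

;; Assumption: println stands in for the logging call used in the original.
(defn important-message [x y]
  (println "Really important message about" x))

(defn contrived [x & y]
  (important-message x y)
  {:x x :timestamp (timestamp)})

;; Records every argument list the stubbed important-message receives.
(def calls (atom []))

(def result
  (with-redefs [timestamp         (fn [] 1350526304739)
                important-message (fn [x y] (swap! calls conj [x y]))]
    (contrived 100)))

result ; => {:x 100, :timestamp 1350526304739}
@calls ; => [[100 nil]]  called once; no rest args were passed, so y is nil
```

Once the with-redefs body exits, both vars are restored to their original definitions.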
As for testing web apps, Speclj works fine. In fact, Speclj support is built into Joodo.
disclaimer: I wrote Speclj
I'd say that Midje is especially good at creating a DSL for expressing stubbing and mocking. If you care about stubbing and mocking, and want to use it a lot, I'd choose Midje over Speclj, because it has abstractions for expressing those types of tests that are more concise than the approach slagyr offered in his answer.
Another option, if you want a more light-weight approach, is the Conjure stubbing/mocking library intended to be used with clojure.test.
Where Speclj shines is in being a lot like RSpec, having 'describe' and 'it' included... Midje can support nested facts actually, but not as elegantly as Speclj.
disclaimer: I'm a contributor to Midje and Conjure. :)
I would suggest Midje over Speclj
As for Speclj, I don't think it has good support for mocks, and its documentation also looks sparse compared to Midje's.
Also the syntax for Midje is better:
(foo :bar) => :result compared to (should= (foo :bar) :result)

Books/tutorials about Clojure's state related issues

I'm interested in the state management part of Clojure, but most books and tutorials seem to focus on its LISP-ness.
Can you recommend a tutorial or book that gives examples and analysis on refs, vars and friends. I know there are some pages about these at clojure.org but they are a bit too terse.
The Joy of Clojure contains a whole chapter titled Mutation, which deals with Clojure's concurrency primitives. It's pretty comprehensive.
This article is quite a comprehensive discussion of all the details around the various pieces of shared state management available in Clojure. It's quite old (refers to Clojure 1.0), but the higher level parts are still valid.
Rich Hickey's talk "Are We There Yet?" is worth a look (the relevant part starts at minute 50): http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey