How to create multiple outputs using a transducer on a pipeline? - clojure

I am trying to understand how to set up a clojure pipeline that has multiple outputs per input, but so far I have had no luck getting that to work.
The documentation for pipeline states that
[...] the transducer will be applied independently to each
element [...] and may produce zero or more outputs
per input. [...]
However, I fail to understand how to get more than 1 output per input.
I want to apply multiple transformations to the same input and put all results onto the output channel. I am sure this could also be done using mult, tap and merge, however, this introduces much more overhead compared to adding another transformation to a pipeline transducer.
I tried it with a toy example:
(def ca (chan))
(def cb (chan))
(defn f [in] in)
(defn g [in] (* 2 in))
(pipeline 1 cb (map (juxt f g)) ca)
(put! ca 1)
(<!! cb)
However, this outputs [1 2] in a single output instead of two separate outputs.
So: How can I set up a clojure pipeline between two channels such that it produces multiple (>1) outputs on the output channel per input on the input channel?

Use mapcat instead of map. The difference is: map is one to one, while mapcat is one to many.
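As a minimal sketch of that suggestion, here is the toy example from the question with map swapped for mapcat. It assumes core.async is on the classpath; the alias a and the outputs var are only for illustration, and >!! is used instead of put! so the put is known to have been accepted before the takes.

```clojure
(require '[clojure.core.async :as a])

(def ca (a/chan))
(def cb (a/chan))
(defn f [in] in)
(defn g [in] (* 2 in))

;; mapcat concatenates the vector returned by (juxt f g),
;; so each input yields two separate items on cb
(a/pipeline 1 cb (mapcat (juxt f g)) ca)

(a/>!! ca 1)

;; two takes, one per output produced for the single input
(def outputs [(a/<!! cb) (a/<!! cb)])
;; outputs => [1 2]

(a/close! ca)
```

With (map (juxt f g)) the same code would block on the second take, because only one item ([1 2]) is ever put on cb.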

Related

iterating through map and getting stack overflow error clojure

For an assignment I need to create a map from a text file in clojure, which I am new to. I'm specifically using a hash-map...but it's possible I should be using another type of map. I'm hoping someone here can answer that for me. I did try changing my hash-map to sorted-map but it gave me the same problem.
The first character in every line in the file is the key and the whole line is the value. The key is a number from 0-9999. There are 10,000 lines and each number after the first number in a line is a random number between 0 and 9999.
I've created the hashmap successfully I think. At least, it's not giving me an error when I just run that code. However, when I try to iterate through it, printing every value for keys 0-9999, it gives me a stack overflow error right at the middle of line 2764 (in the text file). I'm hoping someone can tell me why it's doing this and a better way to do it?
Here's my code:
(ns clojure-project-441.core
  (:gen-class))
(defn -main
  [& args]
  (def pages (def hash-map (file)))
  (iter 0))
(-main)
(defn file []
  (with-open [rdr (clojure.java.io/reader "pages.txt")]
    (reduce conj [] (line-seq rdr))))
(defn iter [n]
  (doseq [keyval (pages n)] (print keyval))
  (if (< n 10000)
    (iter (inc n))))
If it's relevant at all I'm using repl.it as my IDE.
Thanks.
I think the exception is thrown because iter calls itself recursively too many times before reaching the 10,000-line limit.
There are some issues in your code that are very common among people learning Clojure; I'll try to explain:
def is used to define top-level names. They correspond to the concept of constants in the global scope in other programming languages. Think of using def in the same way you would use defn to define functions. In your code, you probably want to use let to give names to intermediate results, like:
(let [uno 1
      dos 2]
  (+ uno dos)) ;; returns 3
You are using the name hash-map to bind it to some result, but that will get in the way if you want to use the function hash-map that is used to create maps. Try renaming it to my-map or similar.
To call a function recursively without blowing the stack you'll need to use recur for reasons that are a bit long to explain. See the factorial example here: https://clojuredocs.org/clojure.core/recur
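A rough sketch of what that could look like for the iteration in the question, assuming the lines are held in a vector. The name print-pages is made up for illustration.

```clojure
;; Hypothetical replacement for iter: recur jumps back to the loop head
;; without adding a stack frame, so 10,000 (or far more) steps are fine.
(defn print-pages [pages]
  (loop [n 0]
    (when (< n (count pages))
      (println (nth pages n))
      (recur (inc n)))))

(print-pages ["0 17 42" "1 8 9000"])
```

Stopping at (count pages) rather than a hard-coded 10000 also avoids indexing past the end of the vector.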
My advice would be to think of this assignment as a pipeline composed of the following small functions:
A function that reads the lines from the file (you already have this)
A function that, given a line, returns a pair: the first element of the pair is the first number of the line, the second element is the whole line (the input parameter) OR
A function that reads the first number of the line
To build the map, you have a few options; two off the top of my head:
Use a loop construct and, for each line, "update" the hash-map to include a new key-value pair (the key is the first number, the value is the whole line), then return the whole hash-map you've built
Use a reduce operation: you create a collection of key-value pairs, then tell reduce to merge, one step at a time, into the original hash-map. The result is the hash-map you want
I think the key is to get familiar with the functions that you can use and build small functions that you can test in isolation and try to group them conveniently to solve your problem. Try to get familiar with functions like hash-map, assoc, let, loop and recur. There's a great documentation site at https://clojuredocs.org/ that also includes examples that will help you understand each function.
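The reduce option could be sketched roughly like this. It assumes each line starts with its key and that the numbers are separated by whitespace; line-key and lines->map are hypothetical names, not anything from the assignment.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: the first number on the line, parsed as a long
(defn line-key [line]
  (Long/parseLong (first (str/split (str/trim line) #"\s+"))))

;; Start from an empty map and assoc one key-value pair per line
(defn lines->map [lines]
  (reduce (fn [m line] (assoc m (line-key line) line))
          {}
          lines))

(lines->map ["0 17 42" "1 8 9000"])
;; => {0 "0 17 42", 1 "1 8 9000"}
```

Each small function can be tried at the REPL on a couple of hand-written lines before being pointed at the real file.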

How to collect data from a bulk of core.async go channels?

How can I collect data from a bulk of go channels? I get assert failed: <! used not in (go ...) for the code below. I know why I get it, I'm asking what is the best way to consume from all channels.
(->> state
     :pods
     (map #(go [(pd/id %)
                (<! (f/pod-metrics fleet %))]))
     (map <!)
     (into {}))
Use merge (https://clojuredocs.org/clojure.core.async/merge) to combine your source channels into one, and then use <!! to take values from it. Note that <! can only be used inside a go block.
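A small sketch of that approach, assuming core.async is available. fetch-metric is a made-up stand-in for the question's f/pod-metrics call, and clojure.core.async/into is used here to gather everything from the merged channel into one map.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical stand-in for the real metrics call: each go block
;; returns a channel that yields one [id value] pair and then closes
(defn fetch-metric [id]
  (a/go [id (* 10 id)]))

(def result
  (a/<!! (a/into {} (a/merge (map fetch-metric [1 2 3])))))
;; result => {1 10, 2 20, 3 30}
```

The blocking take <!! happens once, outside any go block, after all source channels have closed.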

Alter vs Commute in Clojure: what am I doing wrong?

Clojure newbie here, I was going through the excellent "Clojure from the ground up" posts, and tried out the last exercise in this post.
When I replace alter with commute, the sum is inaccurate, but I don't understand why.
(def work (ref (apply list (range 1e5))))
(def sum (ref 0))
(defn trans-alter [work sum]
  (dosync
    (if-let [n (first @work)]
      (do
        (alter work rest)
        (alter sum + n)
        (count @work))
      0)))
(defn trans-commute [work sum]
  (dosync
    (if-let [n (first @work)]
      (do
        (commute work rest)
        (commute sum + n)
        (count @work))
      0)))
(I've skipped the code that sets up the futures and calls them etc)
With trans-alter here I got 4999950000 for the sum (which is the correct expected value), while with trans-commute I got a different value each time, but higher than expected (e.g. 4999998211).
What am I missing here? Thanks in advance!
Commute and alter essentially do the same thing, though commute is a little more lenient on the guarantee of correctness.
Alter instructs the STM to always ensure that this code ran all the way through without any of the refs it uses changing out from under it.
Commute is an instruction to help the STM decide when it needs to abort a transaction because the underlying data changed out from under it.
If everything in a transaction is commutative, then it's ok to let that transaction finish even if some data changed. In your case two transactions could both:
1. grab the first number
2. remove the same number from work
3. add the same number to result
4. then use commute to instruct the STM that this is OK, and it should just go ahead and commit the transaction anyway...
5. get the wrong answer.
So in short, the work you are asking it to perform is not actually a commutative operation. Specifically, removing an item from a list is not commutative. If you change any of the commutes to an alter, then step 4 would have kicked one of them out and allowed only one of them to finish. The one that got kicked out would be re-run on the fresh data and would eventually have arrived at a correct result.
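For contrast, here is a minimal sketch of a case where commute is safe, because the order in which the updates are applied does not matter. The names counter, bump! and total are invented for this illustration.

```clojure
(def counter (ref 0))

;; inc is commutative with itself: 100 increments give 100
;; no matter how the transactions interleave
(defn bump! []
  (dosync (commute counter inc)))

;; start 100 concurrent transactions, then wait for all of them
(let [fs (doall (repeatedly 100 #(future (bump!))))]
  (run! deref fs))

(def total @counter)
;; total => 100
```

In the question, the value added to sum depends on which element was first in work when the transaction read it, and that dependency is what makes commute unsafe there.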

Multiple transforms on an html page using enlive

Clojure and enlive are great. In trying to fathom the power of Enlive I'm attempting to apply two transformations to an html page.
The HTML page has 2 areas (divs) that I want to transform. The first div in question gets cloned ~16 times. The second div in question gets cloned 5 times. The original divs (from the html file) should be overwritten or just not appear at all.
Enlive has the idiomatic approach
(apply str (enlive-html/emit* ze-contant-transferm))
this works beautifully well for one transform.
however, I would like to apply two transforms to the page, so I tried something like:
(str
  (apply str (enlive-html/emit* ze-first-wan))
  (apply str (enlive-html/emit* ze-secand-wan)))
the transformations, done alone, do exactly what I wish: they eat up the original HTML and display the clones that I use for populating with infos.
However, done together in this way, the original html-page divs are preserved, so I end up having the original html file divs along with my clones, and that behavior is no bueno.
Please help.
Thanks-a-much-a.
Enlive-html provides the do-> function for this purpose.
(defn do->
  "Chains (composes) several transformations. Applies functions from left to right."
  [& fns]
  #(reduce (fn [nodes f] (flatmap f nodes)) (as-nodes %) fns))
You can use it something like this:
(apply str (enlive-html/emit* (enlive-html/do-> ze-first-wan ze-secand-wan)))

Clojure in Action, Ch 12 Data Analysis example, dependency issues

I am working through the first edition of this book and while I enjoy it, some of the examples given seem out-dated. I would give up and find another book to learn from, but I am really interested in what the author is talking about and want to make the examples work for myself, so I am trying to update them as I go along.
The following code is a map/reduce approach to analyzing text that depends on clojure.contrib. I have tried changing the .split function to re-seq with #"\w+", used line-seq instead of read-lines, and changed the .toLowerCase to string/lower-case. I tried to follow my problems to the source code and read the docs thoroughly to learn that the read-lines function closes after you consume the entire sequence and that line-seq returns a lazy sequence of strings, implementing java.io.BufferedReader. The most helpful thing for my problem was a post about how to read files after clojure 1.3. Even still, I can't get it to work.
So here's my question: What dependencies and/or functions do I need to change in the following code to make it contemporary, reliable, idiomatic Clojure?
First namespace:
(ns chapter-data.word-count-1
  (:use clojure.contrib.io
        clojure.contrib.seq-utils))
(defn parse-line [line]
  (let [tokens (.split (.toLowerCase line) " ")]
    (map #(vector % 1) tokens)))
(defn combine [mapped]
  (->> (apply concat mapped)
       (group-by first)
       (map (fn [[k v]]
              {k (map second v)}))
       (apply merge-with conj)))
(defn map-reduce [mapper reducer args-seq]
  (->> (map mapper args-seq)
       (combine)
       (reducer)))
(defn sum [[k v]]
  {k (apply + v)})
(defn reduce-parsed-lines [collected-values]
  (apply merge (map sum collected-values)))
(defn word-frequency [filename]
  (map-reduce parse-line reduce-parsed-lines (read-lines filename)))
Second namespace:
(ns chapter-data.average-line-length
  (:use rabbit-x.data-anal
        clojure.contrib.io))
(def IGNORE "_")
(defn parse-line [line]
  (let [tokens (.split (.toLowerCase line) " ")]
    [[IGNORE (count tokens)]]))
(defn average [numbers]
  (/ (apply + numbers)
     (count numbers)))
(defn reducer [combined]
  (average (val (first combined))))
(defn average-line-length [filename]
  (map-reduce parse-line reducer (read-lines filename)))
But when I compile and run it in light table I get a bevy of errors:
1) In the word-count-1 namespace I get this when I try to reload the ns function after editing:
java.lang.IllegalStateException: spit already refers to: #'clojure.contrib.io/spit in namespace: chapter-data.word-count-1
2) In the average-line-length namespace I get similar name collision errors under the same circumstances:
clojure.lang.Compiler$CompilerException: java.lang.IllegalStateException: parse-line already refers to: #'chapter-data.word-count-1/parse-line in namespace: chapter-data.average-line-length, compiling:(/Users/.../average-line-length.clj:7:1)
3) Oddly, when I quit and restart light table, copy and paste the code directly into the files (replacing what's there) and call instances of their top level functions the word-count-1 namespace runs fine, giving me the number of occurrences of certain words in the test.txt file but the average-line-length name-space gives me this:
"Warning: *default-encoding* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *default-encoding* or change the name. (clojure/contrib/io.clj:73)...
4) At this point when I call the word-frequency functions of the first namespace it returns nil instead of the number of word occurrences and when I call the average-line-length function of the second namespace it returns
java.lang.NullPointerException: null
core.clj:1502 clojure.core/val
As far as I can tell, clojure.contrib.io and clojure.contrib.seq-utils are no longer updated, and in fact they may be conflicting with clojure.core functions like spit. I would recommend taking out those dependencies and seeing if you can do this using only core functions. spit should just work -- the error that you're getting is caused by pulling in clojure.contrib.io with :use, which brings along its own spit function. That function looks to be roughly equivalent; perhaps the current version in clojure.core is a "new and improved" version of clojure.contrib.io/spit.
Your problem with the parse-line function looks to be caused by the fact that you've defined two functions with the same name, in two different namespaces. The namespaces don't depend on one another, but you can still run into a conflict if you load both namespaces in a REPL. If you only need to use one at a time, try using one of them, and then when you want to use the other one, make sure you do a (remove-ns name-of-first-ns) first to free up the vars so there is no conflict. Alternatively, you could make parse-line a private function in each namespace, by changing (defn parse-line ... to (defn- parse-line ....
EDIT: If you still need any functions that were in clojure.contrib.io or clojure.contrib.seq-utils that aren't available in core or elsewhere, you can always copy the source over into your namespace. See clojure.contrib.io and clojure.contrib.seq-utils on github.
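As a rough sketch of the "core functions only" route, the two pieces of the first namespace that touch clojure.contrib could be replaced along these lines. The read-lines below is a hand-written stand-in for the contrib function, not the original implementation, and require is used instead of an ns form so the snippet can be pasted into a REPL.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Stand-in for clojure.contrib.io/read-lines: doall forces every line
;; to be read before with-open closes the reader
(defn read-lines [filename]
  (with-open [rdr (io/reader filename)]
    (doall (line-seq rdr))))

;; defn- keeps parse-line private, so another namespace that pulls this
;; one in can define its own parse-line without a name collision
(defn- parse-line [line]
  (map #(vector % 1) (re-seq #"\w+" (str/lower-case line))))
```

The rest of the book's code (combine, map-reduce, sum and so on) only uses clojure.core and should not need the contrib namespaces at all.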