How to create a channel from another with transducers?

I want to create a clojure.core.async channel from another one, passing through only specific messages. For that I found a function called filter<.
=> (def c1 (chan))
=> (def c2 (filter< even? c1))
=> (put! c1 1)
=> (put! c1 2)
=> (<!! c2)
2
But the function and its friends are marked as deprecated:
Deprecated - this function will be removed. Use transducer instead
There are ways to use channels with transducers, such as chan with the xform parameter. How can I build a new channel from an existing one using transducers?

I did some research on this, found a couple of interesting articles (first and second), and then got something working using pipeline
(require '[clojure.core.async :as async :refer [chan <!! pipeline put!]])
(def c1 (chan))
(def c2 (chan))
(pipeline 4 c2 (filter even?) c1)
(put! c1 1)
(put! c1 2)
(<!! c2)
;;=> 2
The second article I linked makes this a bit cleaner with some helper functions around the pipeline function:
(defn ncpus []
(.availableProcessors (Runtime/getRuntime)))
(defn parallelism []
(+ (ncpus) 1))
(defn add-transducer
[in xf]
(let [out (chan (async/buffer 16))] ; buffer is not in the :refer list above, so qualify it
(pipeline (parallelism) out xf in)
out))
Then you can simply tie channels together with
(def c1 (chan))
(def c2 (add-transducer c1 (filter even?)))
To complete the answer, as you found yourself you can use pipe in a similar fashion:
(defn pipe-trans
[ci xf]
(let [co (chan 1 xf)]
(pipe ci co)
co))
(def c1 (chan))
(def c2 (pipe-trans c1 (filter even?)))
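To see pipe-trans work end to end, here is a minimal sketch that pushes a few numbers through the filtered channel and collects what comes out. The producer thread and the use of async/into are illustrative choices of mine, not part of the answer above.

```clojure
(require '[clojure.core.async :as async :refer [chan pipe close! >!! <!!]])

;; Same helper as above: the transducer lives on the output channel,
;; and pipe copies everything from ci into it.
(defn pipe-trans
  [ci xf]
  (let [co (chan 1 xf)]
    (pipe ci co)
    co))

(def c1 (chan))
(def c2 (pipe-trans c1 (filter even?)))

;; Illustrative producer: put 0..9 on c1 from another thread, then close it.
;; By default pipe closes c2 once c1 is closed.
(async/thread
  (doseq [n (range 10)]
    (>!! c1 n))
  (close! c1))

;; async/into yields a single vector once c2 has closed.
(def result (<!! (async/into [] c2)))
result
;;=> [0 2 4 6 8]
```

Closing c1 is what lets async/into finish. With a long-lived channel you would read from c2 with <! or <!! instead.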

Related

Why do I have memory leak for the following code with channel sub/unsub?

I am using [org.clojure/clojure "1.10.1"], [org.clojure/core.async "1.2.603"] and the latest Amazon Corretto 11 JVM, in case that matters.
The following code is a simplified version of code used in production, and it leaks memory. I don't know why, but I suspect the sub/unsub of channels. Can anyone point out where my code goes wrong or how I can fix the leak?
(ns test-gc.core
(:require [clojure.core.async :as a :refer [chan put! close! <! go >! go-loop timeout]])
(:import [java.util UUID]))
(def global-msg-ch (chan (a/sliding-buffer 200)))
(def global-msg-pub (a/pub global-msg-ch :id))
(defn io-promise []
(let [id (UUID/randomUUID)
ch (chan)]
(a/sub global-msg-pub id ch)
[id (go
(let [x (<! ch)]
(a/unsub global-msg-pub id ch)
(:data x)))]))
(defn -main []
(go-loop []
(<! (timeout 1))
(let [[pid pch] (io-promise)
cmd {:id pid
:data (rand-int 1E5)}]
(>! global-msg-ch cmd)
(println (<! pch)))
(recur))
(while true
(Thread/yield)))
A quick heap dump gives the following statistics for example:
Class by number of instances
java.util.LinkedList 5,157,128 (14.4%)
java.util.concurrent.atomic.AtomicReference 3,698,382 (10.3%)
clojure.lang.Atom 3,094,279 (8.6%)
...
Class by size of instances
java.lang.Object[] 210,061,752 B (13.8%)
java.util.LinkedList 206,285,120 B (13.6%)
clojure.lang.Atom 148,525,392 B (9.8%)
clojure.core.async.impl.channels.ManyToManyChannel 132,022,336 B (8.7%)
...
I finally figured out why. By looking at the source code, we get the following segment:
(defn pub
"Creates and returns a pub(lication) of the supplied channel, ..."
...
(let [mults (atom {}) ;;topic->mult
ensure-mult (fn [topic]
(or (get @mults topic)
(get (swap! mults
#(if (% topic) % (assoc % topic (mult (chan (buf-fn topic))))))
topic)))
p (reify
Mux
(muxch* [_] ch)
Pub
(sub* [p topic ch close?]
(let [m (ensure-mult topic)]
(tap m ch close?)))
(unsub* [p topic ch]
(when-let [m (get @mults topic)]
(untap m ch)))
(unsub-all* [_] (reset! mults {}))
(unsub-all* [_ topic] (swap! mults dissoc topic)))]
...
p)))
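We can see that mults stores one mult per topic, so it grows monotonically unless entries are removed. Every call to io-promise uses a fresh UUID as its topic, so the map gains an entry on each iteration, and unsub only untaps the channel without removing the topic. Adding something like (a/unsub-all* global-msg-pub pid), or the public wrapper (a/unsub-all global-msg-pub pid), after the unsub fixes that.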
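As a rough sketch of that fix, here is io-promise from the question with the extra call added. It uses the public a/unsub-all rather than the protocol method. The single round trip at the end is only there to show that the promise still resolves; the value 42 is arbitrary.

```clojure
(require '[clojure.core.async :as a :refer [chan go <! <!! >!!]])
(import '[java.util UUID])

(def global-msg-ch (chan (a/sliding-buffer 200)))
(def global-msg-pub (a/pub global-msg-ch :id))

(defn io-promise []
  (let [id (UUID/randomUUID)
        ch (chan)]
    (a/sub global-msg-pub id ch)
    [id (go
          (let [x (<! ch)]
            (a/unsub global-msg-pub id ch)
            ;; Remove the per-topic mult as well, so the pub's internal
            ;; topic->mult map does not keep one entry per UUID.
            (a/unsub-all global-msg-pub id)
            (:data x)))]))

;; One round trip: subscribe, publish a message for that id, read the result.
(def answer
  (let [[id result-ch] (io-promise)]
    (>!! global-msg-ch {:id id :data 42})
    (<!! result-ch)))
```

This assumes each topic is used exactly once, as in the question. If several subscribers shared a topic, dropping the whole topic on the first reply would cut off the others.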

Dispatching function calls on different formats of maps

I'm writing an agar.io clone. I've lately seen a lot of suggestions to limit use of records (like here), so I'm trying to do the whole project only using basic maps.*
I ended up creating constructors for different "types" of bacteria like
(defn new-bacterium [starting-position]
{:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
The "directed bacterium" has a new entry added to it. The :direction entry will be used to remember what direction it was heading in.
Here's the problem: I want to have one function take-turn that accepts the bacterium and the current state of the world, and returns a vector of [x, y] indicating the offset from the current position to move the bacterium to. I want to have a single function that's called because I can think right now of at least three kinds of bacteria that I'll want to have, and would like to have the ability to add new types later that each define their own take-turn.
A Can-Take-Turn protocol is out the window since I'm just using plain maps.
A take-turn multimethod seemed like it would work at first, but then I realized that I'd have no dispatch values to use in my current setup that would be extensible. I could have :direction be the dispatch function, and then dispatch on nil to use the "directed bacterium"'s take-turn, or default to get the base aimless behavior, but that doesn't give me a way of even having a third "player bacterium" type.
The only solution I can think of is to require that all bacteria have a :type field, and to dispatch on it, like:
(defn new-bacterium [starting-position]
{:type :aimless
:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :type :directed,
:direction starting-directions)))
(defmulti take-turn (fn [b _] (:type b)))
(defmethod take-turn :aimless [this world]
(println "Aimless turn!"))
(defmethod take-turn :directed [this world]
(println "Directed turn!"))
(take-turn (new-bacterium [0 0]) nil)
Aimless turn!
=> nil
(take-turn (new-directed-bacterium [0 0] nil) nil)
Directed turn!
=> nil
But now I'm back to basically dispatching on type, using a slower method than protocols. Is this a legitimate case to use records and protocols, or is there something about multimethods that I'm missing? I don't have a lot of practice with them.
* I also decided to try this because I was in the situation where I had a Bacterium record and wanted to create a new "directed" version of the record that had a single field direction added to it (inheritance basically). The original record implemented protocols though, and I didn't want to have to do something like nesting the original record in the new one, and routing all behavior to the nested instance. Every time I created a new type or changed a protocol, I would have to change all the routing, which was a lot of work.
You can use example-based multiple dispatch for this, as explained in this blog post. It is certainly not the most performant way to solve this problem, but it is arguably more flexible than multimethods, as it does not require you to declare a dispatch function upfront. So it is open for extension to any data representation, even things other than maps. If you need performance, then multimethods or protocols, as you suggest, are probably the way to go.
First, you need to add a dependency on [bluebell/utils "1.5.0"] and require [bluebell.utils.ebmd :as ebmd]. Then you declare constructors for your data structures (copied from your question) and functions to test those data structures:
(defn new-bacterium [starting-position]
{:mass 0
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
(defn bacterium? [x]
(and (map? x)
(contains? x :position)))
(defn directed-bacterium? [x]
(and (bacterium? x)
(contains? x :direction)))
Now we are going to register those data structures as so-called arg-specs so that we can use them for dispatch:
(ebmd/def-arg-spec ::bacterium {:pred bacterium?
:pos [(new-bacterium [9 8])]
:neg [3 4]})
(ebmd/def-arg-spec ::directed-bacterium {:pred directed-bacterium?
:pos [(new-directed-bacterium [9 8] [3 4])]
:neg [(new-bacterium [3 4])]})
For each arg-spec, we need to declare a few example values under the :pos key, and a few non-examples under the :neg key. Those values are used to resolve the fact that a directed-bacterium is more specific than just a bacterium in order for the dispatch to work properly.
Finally, we are going to define a polymorphic take-turn function. We first declare it, using declare-poly:
(ebmd/declare-poly take-turn)
And then, we can provide different implementations for specific arguments:
(ebmd/def-poly take-turn [::bacterium x
::ebmd/any-arg world]
:aimless)
(ebmd/def-poly take-turn [::directed-bacterium x
::ebmd/any-arg world]
:directed)
Here, the ::ebmd/any-arg is an arg-spec that matches any argument. The above approach is open to extension just like multi-methods, but does not require you to declare a :type field upfront and is thus more flexible. But, as I said, it is also going to be slower than both multimethods and protocols, so ultimately this is a trade-off.
Here is the full solution: https://github.com/jonasseglare/bluebell-utils/blob/archive/2018-11-16-002/test/bluebell/utils/ebmd/bacteria_test.clj
Dispatching a multimethod by a :type field is indeed polymorphic dispatch that could be done with a protocol, but using multimethods allows you to dispatch on different fields. You can add a second multimethod that dispatches on something other than :type, which might be tricky to accomplish with a protocol (or even multiple protocols).
Since a multimethod can dispatch on anything, you could use a set as the dispatch value. Here's an alternative approach. It's not fully extensible, since the keys to select are determined within the dispatch function, but it might give you an idea for a better solution:
(require 'clojure.set) ; make sure clojure.set is loaded before using intersection
(defmulti take-turn (fn [b _] (clojure.set/intersection #{:direction} (set (keys b)))))
(defmethod take-turn #{} [this world]
(println "Aimless turn!"))
(defmethod take-turn #{:direction} [this world]
(println "Directed turn!"))
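To make the point about multimethods a bit more concrete, here is a small sketch on plain maps. It is my own illustration rather than code from the answers: the names size-label and describe, the mass threshold, and the choice to derive ::directed from ::aimless are all assumptions. It shows a second multimethod dispatching on something other than :type, and a keyword hierarchy standing in for the "inheritance" mentioned in the question's footnote.

```clojure
;; Namespaced keywords can take part in a hierarchy via derive.
;; Here ::directed "is a" ::aimless, so it inherits any method it does not override.
(derive ::directed ::aimless)

(defn new-bacterium [starting-position]
  {:type ::aimless, :mass 0, :position starting-position})

(defn new-directed-bacterium [starting-position starting-directions]
  (assoc (new-bacterium starting-position)
         :type ::directed
         :direction starting-directions))

;; First multimethod: dispatches on :type, with an override for ::directed.
(defmulti take-turn (fn [b _world] (:type b)))
(defmethod take-turn ::aimless [_ _] [0 0])
(defmethod take-turn ::directed [b _] (:direction b))

;; Second multimethod on the same maps, dispatching on something else:
;; a size class computed from :mass (the threshold 10 is arbitrary).
(defmulti size-label (fn [b] (if (> (:mass b) 10) :large :small)))
(defmethod size-label :large [_] "big")
(defmethod size-label :small [_] "small")

;; Third multimethod with only a base implementation;
;; ::directed falls back to it through the hierarchy.
(defmulti describe :type)
(defmethod describe ::aimless [b] (str "bacterium at " (:position b)))

(take-turn (new-directed-bacterium [0 0] [1 0]) nil) ;=> [1 0]
(take-turn (new-bacterium [0 0]) nil)                ;=> [0 0]
(size-label (new-bacterium [0 0]))                   ;=> "small"
(describe (new-directed-bacterium [2 3] [1 0]))      ;=> "bacterium at [2 3]"
```

A new kind of bacterium then only needs a new keyword, an optional derive, and the defmethods it wants to override.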
Fast paths exist for a reason, but Clojure doesn't stop you from doing anything you want to do, per se, including ad hoc predicate dispatch. The world is definitely your oyster. Observe this super quick and dirty example below.
First, we'll start off with an atom to store all of our polymorphic functions:
(def polies (atom {}))
In usage, the internal structure of the polies would look something like this:
{foo ; <- function name
{:dispatch [[pred0 fn0 1 ()] ; <- if (pred0 args) do (fn0 args)
[pred1 fn1 1 ()]
[pred2 fn2 2 '&]]
:prefer {:this-pred #{:that-pred :other-pred}}}
bar
{:dispatch [[pred0 fn0 1 ()]
[pred1 fn1 3 ()]]
:prefer {:some-pred #{:any-pred}}}}
Now, let's make it so that we can prefer predicates (like prefer-method):
(defn- get-parent [pfn x] (->> (parents x) (filter pfn) first))
(defn- in-this-or-parent-prefs? [poly v1 v2 f1 f2]
(if-let [p (-> @polies (get-in [poly :prefer v1]))]
(or (contains? p v2) (get-parent f1 v2) (get-parent f2 v1))))
(defn- default-sort [v1 v2]
(if (= v1 :poly/default)
1
(if (= v2 :poly/default)
-1
0)))
(defn- pref [poly v1 v2]
(if (-> poly (in-this-or-parent-prefs? v1 v2 #(pref poly v1 %) #(pref poly % v2)))
-1
(default-sort v1 v2)))
(defn- sort-disp [poly]
(swap! polies update-in [poly :dispatch] #(->> % (sort-by first (partial pref poly)) vec)))
(defn prefer [poly v1 v2]
(swap! polies update-in [poly :prefer v1] #(-> % (or #{}) (conj v2)))
(sort-disp poly)
nil)
Now, let's create our dispatch lookup system:
(defn- get-disp [poly filter-fn]
(-> @polies (get-in [poly :dispatch]) (->> (filter filter-fn)) first))
(defn- pred->disp [poly pred]
(get-disp poly #(-> % first (= pred))))
(defn- pred->poly-fn [poly pred]
(-> poly (pred->disp pred) second))
(defn- check-args-length [disp args]
((if (= '& (-> disp (nth 3) first)) >= =) (count args) (nth disp 2)))
(defn- args-are? [disp args]
(or (isa? (vec args) (first disp)) (isa? (mapv class args) (first disp))))
(defn- check-dispatch-on-args [disp args]
(if (-> disp first vector?)
(-> disp (args-are? args))
(-> disp first (apply args))))
(defn- disp*args? [disp args]
(and (check-args-length disp args)
(check-dispatch-on-args disp args)))
(defn- args->poly-fn [poly args]
(-> poly (get-disp #(disp*args? % args)) second))
Next, let's prepare our define macro with some initialization and setup functions:
(defn- poly-impl [poly args]
(if-let [poly-fn (-> poly (args->poly-fn args))]
(-> poly-fn (apply args))
(if-let [default-poly-fn (-> poly (pred->poly-fn :poly/default))]
(-> default-poly-fn (apply args))
(throw (ex-info (str "No poly for " poly " with " args) {})))))
(defn- remove-disp [poly pred]
(when-let [disp (pred->disp poly pred)]
(swap! polies update-in [poly :dispatch] #(->> % (remove #{disp}) vec))))
(defn- til& [args]
(count (take-while (partial not= '&) args)))
(defn- add-disp [poly poly-fn pred params]
(swap! polies update-in [poly :dispatch]
#(-> % (or []) (conj [pred poly-fn (til& params) (filter #{'&} params)]))))
(defn- setup-poly [poly poly-fn pred params]
(remove-disp poly pred)
(add-disp poly poly-fn pred params)
(sort-disp poly))
With that, we can finally build our polies by rubbing some macro juice on there:
(defmacro defpoly [poly-name pred params body]
`(do (when-not (-> ~poly-name quote resolve bound?)
(defn ~poly-name [& args#] (poly-impl ~poly-name args#)))
(let [poly-fn# (fn ~(symbol (str poly-name "-poly")) ~params ~body)]
(setup-poly ~poly-name poly-fn# ~pred (quote ~params)))
~poly-name))
Now you can build arbitrary predicate dispatch:
;; use defpoly like defmethod, but without a defmulti declaration
;; unlike defmethods, all params are passed to defpoly's predicate function
(defpoly myinc number? [x] (inc x))
(myinc 1)
;#_=> 2
(myinc "1")
;#_=> Execution error (ExceptionInfo) at user$poly_impl/invokeStatic (REPL:6).
;No poly for user$eval187$myinc__188@5c8eee0f with ("1")
(defpoly myinc :poly/default [x] (inc x))
(myinc "1")
;#_=> Execution error (ClassCastException) at user$eval245$fn__246/invoke (REPL:1).
;java.lang.String cannot be cast to java.lang.Number
(defpoly myinc string? [x] (inc (read-string x)))
(myinc "1")
;#_=> 2
(defpoly myinc
#(and (number? %1) (number? %2) (->> %& (filter (complement number?)) empty?))
[x y & z]
(inc (apply + x y z)))
(myinc 1 2 3)
;#_=> 7
(myinc 1 2 3 "4")
;#_=> Execution error (ArityException) at user$poly_impl/invokeStatic (REPL:5).
;Wrong number of args (4) passed to: user/eval523/fn--524
; ^ took the :poly/default path
And when using your example, we can see:
(defn new-bacterium [starting-position]
{:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
(defpoly take-turn (fn [b _] (-> b keys set (contains? :direction)))
[this world]
(println "Directed turn!"))
;; or, if you'd rather use spec (assumes (require '[clojure.spec.alpha :as s]))
(defpoly take-turn (fn [b _] (->> b (s/valid? (s/keys :req-un [::direction]))))
[this world]
(println "Directed turn!"))
(take-turn (new-directed-bacterium [0 0] nil) nil)
;#_=> Directed turn!
;nil
(defpoly take-turn :poly/default [this world]
(println "Aimless turn!"))
(take-turn (new-bacterium [0 0]) nil)
;#_=> Aimless turn!
;nil
(defpoly take-turn #(-> %& first :show) [this world]
(println :this this :world world))
(take-turn (assoc (new-bacterium [0 0]) :show true) nil)
;#_=> :this {:mass 0, :position [0 0], :show true} :world nil
;nil
Now, let's try using isa? relationships, a la defmulti:
(derive java.util.Map ::collection)
(derive java.util.Collection ::collection)
;; always wrap classes in a vector to dispatch off of isa? relationships
(defpoly foo [::collection] [c] :a-collection)
(defpoly foo [String] [s] :a-string)
(foo [])
;#_=> :a-collection
(foo "bob")
;#_=> :a-string
And of course we can use prefer to disambiguate relationships:
(derive ::rect ::shape)
(defpoly bar [::rect ::shape] [x y] :rect-shape)
(defpoly bar [::shape ::rect] [x y] :shape-rect)
(bar ::rect ::rect)
;#_=> :rect-shape
(prefer bar [::shape ::rect] [::rect ::shape])
(bar ::rect ::rect)
;#_=> :shape-rect
Again, the world's your oyster! There's nothing stopping you from extending the language in any direction you want.

How are transducers executed in core.async channels?

When making a channel like so:
(chan 10 tx)
If I created 10 channels like this and then sent a message to all of them at the same time, how would the transducers be executed? Would they run concurrently or on one thread?
I think that, right now, exactly when the transducer runs is not defined. Looking at the implementation of ManyToManyChannel, though, the transducer (which is the add! field) can be called both when writing to and when reading from the channel.
A simple test suggests that if the channel is not full, the writing thread executes the transducer, but if the channel is full, the reading thread sometimes runs it.
A sample with a small buffer:
(defn thread-name []
(.getName (Thread/currentThread)))
(require '[clojure.core.async :as async :refer [chan <! >! >!! go]])
(defn p [& args]
(locking *out*
(apply println (thread-name) ":" args)))
(defn log [v]
(p "Transforming" v)
v)
(def tx (map log))
(def c (chan 1 tx))
(def c2 (chan 1 tx))
(go
(loop []
(when-let [v (<! c)]
(p "Getting from c1" v)
(<! (async/timeout 100))
(recur))))
(go
(loop []
(when-let [v (<! c2)]
(p "Getting from c2" v)
(<! (async/timeout 100))
(recur))))
(dotimes [_ 5]
(p "Putting in c1" 1)
(>!! c 1)
(p "Putting in c2" 100)
(>!! c2 100))
Produces the output:
nREPL-worker-20 : Transforming 1
nREPL-worker-20 : Putting in c2 100
async-dispatch-33 : Getting from c1 1
nREPL-worker-20 : Transforming 100
nREPL-worker-20 : Putting in c1 1
async-dispatch-31 : Getting from c2 100
nREPL-worker-20 : Transforming 1
nREPL-worker-20 : Putting in c2 100
nREPL-worker-20 : Transforming 100
nREPL-worker-20 : Putting in c1 1
async-dispatch-35 : Getting from c2 100
async-dispatch-34 : Transforming 1 <---- In this case is run in the reading side
async-dispatch-34 : Getting from c1 1
nREPL-worker-20 : Putting in c2 100
nREPL-worker-20 : Transforming 100
async-dispatch-37 : Getting from c2 100
async-dispatch-36 : Getting from c1 1
nREPL-worker-20 : Putting in c1 1
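The "buffer has room" case can also be checked without reading logs. The sketch below records the name of the thread that runs the transducer and compares it with the putting thread. It relies on the implementation behaviour observed above rather than on any documented guarantee, and the names seen-threads and record-thread are made up for this example.

```clojure
(require '[clojure.core.async :refer [chan >!! <!!]])

;; Names of the threads on which the transducer step actually ran.
(def seen-threads (atom []))

(defn record-thread [v]
  (swap! seen-threads conj (.getName (Thread/currentThread)))
  v)

;; Buffer of 10, so a single put never finds the buffer full.
(def c (chan 10 (map record-thread)))

(>!! c :hello)

;; With room in the buffer, the transducer is expected to have run
;; on the thread that performed the put.
(def ran-on-putting-thread?
  (= @seen-threads [(.getName (Thread/currentThread))]))
```

So for the original question: with ten such channels and puts coming from ten different threads, the transducers would run on those ten putting threads, as long as the buffers have room.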

Core.async <! channel deadlock

Why does Alpha stop early, when I expect it to behave like Beta? The only difference between Alpha and Beta is >! and put!, as commented below.
Alpha:
user=> (def q (chan))
#'user/q
user=> (def counter (atom 0))
#'user/counter
user=> (defn mg [event-queue]
#_=> (go-loop [event (<! event-queue)]
#_=> (swap! counter inc)
#_=> (when (< @counter 4)
#_=> (println "counter: " @counter)
#_=> (>! event-queue {:a @counter}) ;; Here's the only difference
#_=> (println "event: " event)
#_=> (recur (<! event-queue)))))
#'user/mg
user=> (mg q)
#object[clojure.core.async.impl.channels.ManyToManyChannel 0x3a1ffd56 "clojure.core.async.impl.channels.ManyToManyChannel@3a1ffd56"]
user=> (put! q "hi")
counter: true
1
user=>
Beta:
user=> (def q (chan))
#'user/q
user=> (def counter (atom 0))
#'user/counter
user=> (defn mg [event-queue]
#_=> (go-loop [event (<! event-queue)]
#_=> (swap! counter inc)
#_=> (when (< @counter 4)
#_=> (println "counter: " @counter)
#_=> (put! event-queue {:a @counter}) ;; Here's the only difference
#_=> (println "event: " event)
#_=> (recur (<! event-queue)))))
#'user/mg
user=> (mg q)
#object[clojure.core.async.impl.channels.ManyToManyChannel 0x72c9b65a "clojure.core.async.impl.channels.ManyToManyChannel@72c9b65a"]
user=> (put! q "hi")
true
counter: 1
event: hi
counter: 2
event: {:a 1}
counter: 3
event: {:a 2}
user=>
It's also interesting that, after executing Alpha, the channel #'user/q was properly enqueued:
user=> (take! q println)
event: hi
{:a 1}
nil
user=>
The same results occur in both Clojure and ClojureScript. Is this some sort of deadlock, or is this supposed to happen?
This is expected.
The channel q is created without a buffer, so a >! parks the go-loop until some other process is ready to take the value with <!. Here the only taker is the go-loop itself, and it is the one that is parked, so nothing ever completes the put.
One way to work around this is to give q a 1-slot buffer with (def q (chan 1)). The buffer allows 1 value to be placed in the channel without blocking the sender.
Beta behaves differently because put! is asynchronous with respect to the caller: it enqueues the value as a pending put and returns immediately instead of parking. The go-loop therefore keeps running, and its next <! picks the value up.
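Here is a sketch of the buffered workaround applied to Alpha. To keep it checkable, I replaced the println calls with an events atom (my addition) that records what the loop saw. Everything else follows the question.

```clojure
(require '[clojure.core.async :refer [chan go-loop <! >! put! <!!]])

(def q (chan 1))          ; 1-slot buffer: one >! can complete without a taker
(def counter (atom 0))
(def events (atom []))    ; stands in for the println output

(defn mg [event-queue]
  (go-loop [event (<! event-queue)]
    (swap! counter inc)
    (when (< @counter 4)
      ;; With the buffer, this >! no longer parks forever.
      (>! event-queue {:a @counter})
      (swap! events conj event)
      (recur (<! event-queue)))))

(def done (mg q))
(put! q "hi")
(<!! done)                ; wait for the loop to finish (returns nil)

@events
;;=> ["hi" {:a 1} {:a 2}]
```

The recorded events match Beta's output, which is what the question expected from Alpha.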

How to model Rx's `withLatestFrom` with core.async channels?

For example given a channel with operations and another channel with data, how to write a go block that will apply the operation on whatever was the last value on the data channel?
(go-loop []
(let [op (<! op-ch)
data (<! data-ch)]
(put! result-ch (op data))))
Obviously that doesn't work because it would require both channels to have the same frequency.
(see http://rxmarbles.com/#withLatestFrom)
Using alts! you could accomplish what you want.
The with-latest-from shown below implements the same behavior found in the withLatestFrom from RxJS (I think :P).
(require '[clojure.core.async :as async])
(def op-ch (async/chan))
(def data-ch (async/chan))
(defn with-latest-from [chs f]
(let [result-ch (async/chan)
latest (vec (repeat (count chs) nil))
index (into {} (map vector chs (range)))]
(async/go-loop [latest latest]
(let [[value ch] (async/alts! chs)
latest (assoc latest (index ch) value)]
(when-not (some nil? latest)
(async/put! result-ch (apply f latest)))
(when value (recur latest))))
result-ch))
(def result-ch (with-latest-from [op-ch data-ch] str))
(async/go-loop []
(prn (async/<! result-ch))
(recur))
(async/put! op-ch :+)
;= true
(async/put! data-ch 1)
;= true
; ":+1"
(async/put! data-ch 2)
;= true
; ":+2"
(async/put! op-ch :-)
;= true
; ":-2"
There's a :priority true option for alts!.
An expression which always returns the latest seen value in some channel would look something like this:
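(require '[clojure.core.async :refer [chan go alts! poll! >!]])
(def in-chan (chan))
(def mem (chan 1)) ;; 1-slot buffer so the remembered value can sit in the channel
(go (let [[value _] (alts! [in-chan mem] :priority true)] ;; alts! returns [value port]
      (poll! mem)    ;; clear mem without blocking (take! would need a callback)
      (>! mem value) ;; put the new (or old) value in the mem
      value))        ;; the go block returns a chan with the value in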
It's untested and probably not efficient (a volatile variable is probably better). The go block returns a channel with only the value, but the idea could be expanded to some "memoized" channel.
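For comparison, keeping the latest value in a mutable reference could look like the following. I used an atom rather than a volatile because the value is read from other threads. The helper name track-latest and its return shape are my own invention.

```clojure
(require '[clojure.core.async :refer [chan go-loop <! >!! <!! close!]])

(defn track-latest
  "Returns {:state atom, :done channel}. The atom always holds the most recent
  value taken from in-chan; :done yields :closed once in-chan has closed."
  [in-chan]
  (let [state (atom nil)
        done  (go-loop []
                (if-some [v (<! in-chan)]
                  (do (reset! state v)
                      (recur))
                  :closed))]
    {:state state :done done}))

(def in-chan (chan))
(def tracker (track-latest in-chan))

(>!! in-chan 1)
(>!! in-chan 2)
(close! in-chan)
(<!! (:done tracker))            ; wait until the loop has seen everything

(def latest @(:state tracker))   ; 2
```

Reading the atom never blocks, so an operation can be applied to @(:state tracker) at any time. The trade-off is that a reader cannot park until the first value arrives.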