A list of players is defined as a set:
(def players (atom #{}))
The function that removes a player should return a different HTTP status code depending on whether the element was in the set:
(defn remove-player [player-name]
  (if (contains? @players player-name)
    (do (swap! players disj player-name)
        (status (response "") 200))
    (status (response "") 404)))
The problem with this code is that it might return multiple 200 to concurrent requests, even if only one request actually removed the element.
I guess I need to execute both contains? and disj atomically. Do I need to do explicit locking or is there a better way to do it?
swap! itself is an atomic operation, so inside the function passed to swap! you can be sure that the first parameter (the atom's current value) is consistent. Personally, I would write a helper function for this:
(defn remove-existent [value set-a]
  (let [existed (atom false)]
    (swap! set-a
           #(if (contains? % value)
              (do (reset! existed true)
                  (disj % value))
              (do (reset! existed false)
                  %))))
    @existed))
As you can see, the lambda expression contains both the existence check and the removal.
user> (def players (atom #{:user1 :user2}))
#'user/players
user> (remove-existent :user100 players)
false
user> (remove-existent :user1 players)
true
user> @players
#{:user2}
Update
Inspired by @clojuremostly's excellent metadata approach, you can make it much better:
(defn remove-existent [value set-a]
(-> (swap! set-a #(with-meta (disj % value)
{:existed (contains? % value)}))
meta
:existed))
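On Clojure 1.9 or later, the same check can be sketched with swap-vals!, which returns both the old and the new value of the atom, so neither an inner atom nor metadata is needed. The name remove-existent-vals is made up for this example:

```clojure
;; Assumes Clojure 1.9+, where clojure.core/swap-vals! is available.
;; remove-existent-vals is a hypothetical name, not part of the answers above.
(defn remove-existent-vals
  "Removes value from the set held in the atom set-a.
  Returns true if value was present before the removal."
  [value set-a]
  (let [[old-set _new-set] (swap-vals! set-a disj value)]
    (contains? old-set value)))
```

Because old-set is the value the successful swap started from, only the call that actually removed the element sees it there and returns true.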
You can just add some more logic to your swapping function:
(for [el-rem [:valid-el :not-there]]
(let [a (atom #{:valid-el :another-one})
disj-res
(swap! a
(fn [a]
(with-meta (disj a el-rem)
{:before-count (count a)})))]
[disj-res
"removed:"
el-rem
(not (== (:before-count (meta disj-res))
(count disj-res)))]))
You then compare the count of the return value disj-res with the count stored in the metadata. If they differ, disj removed an element. If not, the element was not present.
Related
I'd like to create a list depending on the results of some functions. In Java (my background), I'd do something like:
List<String> messages = ...
if(condition 1)
messages.add(message 1);
if(condition 2)
messages.add(message 2);
...
if(condition N)
messages.add(message N);
In clojure, I think I'll need to create a list using let like the following (just dummy example):
(let [result
(vec
(if (= 1 1) "message1" "message2")
(if (= 1 0) "message3" "message4"))]
result)
I've also checked cond but I need to be appending the elements to the list considering all the validations (and cond breaks after one condition is satisfied).
Which way should I follow to achieve this?
If you want them to be conditionally added like in the Java example, you could use cond->, which does not short circuit:
(let [messages []]
(cond-> messages ; Conditionally thread through conj
(= 1 1) (conj "Message1")
(= 0 1) (conj "Message2")
(= 0 0) (conj "Message3")))
=> ["Message1" "Message3"]
If you want to conditionally add one or the other like your second example suggests however, you could just use plain conj with some if expressions:
(let [messages []]
(conj messages
(if (= 1 1) "Message1" "Message2")
(if (= 0 1) "Message3" "Message4")))
=> ["Message1" "Message4"]
And I'll note that your original attempt almost worked. Instead of vec, you could have used vector, or just a vector literal:
(let [messages [(if (= 1 1) "Message1" "Message2")
(if (= 1 0) "Message3" "Message4")]]
messages)
=> ["Message1" "Message4"]
However, this is only beneficial if you don't already have a populated messages collection that you want to add to. If you do, you'd have to use concat or into:
(let [old-messages ["old stuff"]
messages [(if (= 1 1) "Message1" "Message2")
(if (= 1 0) "Message3" "Message4")]]
(into old-messages messages))
=> ["old stuff" "Message1" "Message4"]
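As a quick illustration of why the original attempt with vec failed: vec converts a single collection into a vector, while vector builds one from its arguments. A small sketch (the var names are only for illustration):

```clojure
;; vec takes exactly one collection argument
(def from-coll (vec '("message1" "message4")))

;; vector takes the elements themselves
(def from-items (vector "message1" "message4"))

;; Passing two arguments to vec, as in the question, is an arity error
(def vec-with-two-args-fails?
  (try
    (apply vec ["message1" "message4"])
    false
    (catch clojure.lang.ArityException _ true)))
```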
Take a look at cond->.
For example, your Java example could be written like:
(cond-> (some-fn-returning-messages)
(= 1 1) (conj "message1")
(= 1 2) (conj "message2")
...
(= 1 n) (conj "messagen"))
I see several answers pointing to the cond-> macro which appears to match your request most closely in that it is nearest to the style outlined in your question.
Depending on the number of conditions you have, your question seems like a good candidate for simply using filter.
(def nums (range 10))
(filter #(or (even? %) (= 7 %)) nums)
If you have a bunch of conditions (functions), and "or-ing" them together would be unwieldy, you can use some-fn.
Numbers from 0-19 that are either even, divisible by 7, greater than 17, or exactly equal to 1. Stupid example I know, just wanted to show a simple use-case.
(filter (some-fn
even?
#(zero? (mod % 7))
#(> % 17)
#(= 1 %))
(range 20))
Looks like everyone had the same idea! I did mine with keywords:
(ns tst.demo.core
(:use tupelo.core demo.core tupelo.test))
(defn accum
[conds]
(cond-> [] ; append to the vector in order 1,2,3
(contains? conds :cond-1) (conj :msg-1)
(contains? conds :cond-2) (conj :msg-2)
(contains? conds :cond-3) (conj :msg-3)))
(dotest
(is= [:msg-1] (accum #{:cond-1}))
(is= [:msg-1 :msg-3] (accum #{:cond-1 :cond-3}))
(is= [:msg-1 :msg-2] (accum #{:cond-2 :cond-1}))
(is= [:msg-2 :msg-3] (accum #{:cond-2 :cond-3}))
(is= [:msg-1 :msg-2 :msg-3] (accum #{:cond-3 :cond-2 :cond-1 })) ; note sets are unsorted
)
If you want more power, you can use cond-it-> from the Tupelo library. It threads the target value through both the condition and the action forms, and uses the special symbol it to show where the threaded value is to be placed. This modified example shows a 4th condition where, "msg-3 is jealous of msg-1" and always boots it out of the result:
(ns tst.demo.core
(:use tupelo.core demo.core tupelo.test))
(defn accum
[conds]
(cond-it-> #{} ; accumulate result in a set
(contains? conds :cond-1) (conj it :msg-1)
(contains? conds :cond-2) (conj it :msg-2)
(contains? conds :cond-3) (conj it :msg-3)
(contains? it :msg-3) (disj it :msg-1) ; :msg-3 doesn't like :msg-1
))
; remember that sets are unsorted
(dotest
(is= #{:msg-1} (accum #{:cond-1}))
(is= #{:msg-3} (accum #{:cond-1 :cond-3}))
(is= #{:msg-1 :msg-2} (accum #{:cond-2 :cond-1}))
(is= #{:msg-2 :msg-3} (accum #{:cond-2 :cond-3}))
(is= #{:msg-2 :msg-3} (accum #{:cond-3 :cond-2 :cond-1 }))
)
Not necessarily relevant to your use case, and certainly not a mainstream solution, but once in a while I like cl-format's conditional expressions:
(require '[clojure.pprint :refer [cl-format]])
(require '[clojure.data.generators :as g])
(cl-format nil
"~:[He~;She~] ~:[did~;did not~] ~:[thought about it~;care~]"
(g/boolean) (g/boolean) (g/boolean))
A typical case would be validating a piece of data to produce an error list.
I would construct a table that maps condition to message:
(def error->message-table
{condition1 message1
condition2 message2
...})
Note that the conditions are functions. Since we can never properly recognise functions by value, you could make this table a sequence of pairs.
However you implement the table, all we have to do is collect the messages for the predicates that apply:
(defn messages [stuff]
(->> error->message-table
(filter (fn [pair] ((first pair) stuff)))
(map second)))
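As a concrete sketch of the idea, here is such a table written as a sequence of predicate/message pairs. The names validation-rules and collect-messages, and the shape of the data being validated, are invented for illustration:

```clojure
;; Each pair is [predicate message]; the predicate returns true
;; when the rule is violated. All names here are hypothetical.
(def validation-rules
  [[(fn [m] (not (string? (:name m))))
    "name must be a string"]
   [(fn [m] (not (and (integer? (:age m)) (pos? (:age m)))))
    "age must be a positive integer"]])

(defn collect-messages
  "Returns the messages of every rule whose predicate matches stuff."
  [stuff]
  (->> validation-rules
       (filter (fn [[pred _]] (pred stuff)))
       (map second)))

(collect-messages {:name "Ann" :age -1})
;; => ("age must be a positive integer")
```

Using a vector of pairs rather than a map keeps the messages in a predictable order.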
Without a coherent example, it's difficult to be more explicit.
First-class functions and the packaged control structures within filter and map give us the means to express the algorithm briefly and clearly, isolating the content into a data structure.
This is similar to Clojure get map key by value
However, there is one difference. How would you do the same thing if hm is like
{1 ["bar" "choco"]}
The idea is to get 1 (the key) where the first element of the value list is "bar". Please feel free to close/merge this question if some other question answers it.
I tried something like this, but it doesn't work.
(def hm {:foo ["bar", "choco"]})
(keep #(when (= ((nth val 0) %) "bar")
(key %))
hm)
You can filter the map and return the first element of the first item in the resulting sequence:
(ffirst (filter (fn [[k [v & _]]] (= "bar" v)) hm))
You can destructure the vector value to access the second and/or third elements, e.g.:
(ffirst (filter (fn [[k [f s t & _]]] (= "choco" s))
{:foo ["bar", "choco"]}))
Past the first few elements, you will probably find nth more readable.
Another way to do it using some:
(some (fn [[k [v & _]]] (when (= "bar" v) k)) hm)
Your example was pretty close to working, with some minor changes:
(keep #(when (= (nth (val %) 0) "bar")
(key %))
hm)
keep and some are similar, but some only returns one result.
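To make the difference visible, here is a small sketch with a map in which two entries match. The names sample-map and key-if-bar are made up for the example:

```clojure
;; Two entries have "bar" as the first element of their value.
(def sample-map {:foo ["bar" "choco"]
                 :baz ["bar" "mint"]
                 :qux ["tea" "milk"]})

(defn key-if-bar
  "Returns the entry's key when the first element of its value is \"bar\"."
  [[k [v & _]]]
  (when (= "bar" v) k))

;; keep returns every non-nil result, so both matching keys
(def all-matches (keep key-if-bar sample-map))

;; some stops at the first non-nil result, so a single key
(def one-match (some key-if-bar sample-map))
```

Which of the matching keys some returns depends on the iteration order of the map, so it should not be relied on when several entries can match.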
In addition to all the above (correct) answers, you might also want to reindex your map into the desired form, especially if the search operation is called frequently and the initial map is rather big. This decreases the search complexity from linear to constant:
(defn map-invert+ [kfn vfn data]
(reduce (fn [acc entry] (assoc acc (kfn entry) (vfn entry)))
{} data))
user> (def data
{1 ["bar" "choco"]
2 ["some" "thing"]})
#'user/data
user> (def inverted (map-invert+ (comp first val) key data))
#'user/inverted
user> inverted
;;=> {"bar" 1, "some" 2}
user> (inverted "bar")
;;=> 1
Today I tried to implement an "R-like" melt function. I use it for big data coming from BigQuery.
I do not have big constraints about time to compute and this function takes less than 5-10 seconds to work on millions of rows.
I start with this kind of data :
(def sample
'({:list "123,250" :group "a"} {:list "234,260" :group "b"}))
Then I defined a function to put the list into a vector :
(defn split-data-rank [datatab value]
(let [splitted (map (fn[x] (assoc x value (str/split (x value) #","))) datatab)]
(map (fn[y] (let [index (map inc (range (count (y value))))]
(assoc y value (zipmap index (y value)))))
splitted)))
Launch :
(split-data-rank sample :list)
As you can see, it returns the same sequence, but replaces the value of :list with a map giving the position of each item in the comma-separated list.
Then, I want to melt the "dataframe" by creating for each item in a group its own row with its rank in the group.
So that I created this function :
(defn split-melt [datatab value]
(let [splitted (split-data-rank datatab value)]
(map (fn [y] (dissoc y value))
(apply concat
(map
(fn[x]
(map
(fn[[k v]]
(assoc x :item v :Rank k))
(x value)))
splitted)))))
Launch :
(split-melt sample :list)
The problem is that it is heavily indented and uses a lot of map calls. I apply dissoc to drop :list (which is useless now), and I also have to use concat because otherwise I get a sequence of sequences.
Do you think there is a more efficient or shorter way to design this function?
I am quite confused by reduce and don't know whether it can be applied here, since there are two arguments in a way.
Thanks a lot !
If you don't need the split-data-rank function, I would go for:
(defn melt [datatab value]
(mapcat (fn [x]
(let [items (str/split (get x value) #",")]
(map-indexed (fn [idx item]
(-> x
(assoc :Rank (inc idx) :item item)
(dissoc value)))
items)))
datatab))
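The same result can also be sketched with a for comprehension, which flattens the nested iteration without mapcat. The name melt-for is made up here, and the sample data is the one from the question:

```clojure
(require '[clojure.string :as str])

;; melt-for is a hypothetical variant of melt using `for`.
(defn melt-for [datatab value]
  (for [row datatab
        ;; pair each split item with its zero-based position
        [idx item] (map-indexed vector (str/split (get row value) #","))]
    (-> row
        (dissoc value)
        (assoc :Rank (inc idx) :item item))))

(def sample
  '({:list "123,250" :group "a"} {:list "234,260" :group "b"}))

(melt-for sample :list)
;; => ({:group "a", :Rank 1, :item "123"}
;;     {:group "a", :Rank 2, :item "250"}
;;     {:group "b", :Rank 1, :item "234"}
;;     {:group "b", :Rank 2, :item "260"})
```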
I developed a function in Clojure to fill in an empty column from the last non-empty value. I'm assuming this works, given:
(:require [flambo.api :as f])
(defn replicate-val
[ rdd input ]
(let [{:keys [ col ]} input
result (reductions (fn [a b]
(if (empty? (nth b col))
(assoc b col (nth a col))
b)) rdd )]
(println "Result type is: "(type result))))
Got this:
;=> "Result type is: clojure.lang.LazySeq"
The question is: how do I convert this back to type JavaRDD using flambo (a Spark wrapper)?
I tried (f/map result #(.toJavaRDD %)) in the let form to attempt to convert to JavaRDD type
I got this error
"No matching method found: map for class clojure.lang.LazySeq"
which is expected because result is of type clojure.lang.LazySeq
The question is how do I make this conversion, or how can I refactor the code to accommodate this?
Here is a sample input rdd:
(type rdd) ;=> "org.apache.spark.api.java.JavaRDD"
But looks like:
[["04" "2" "3"] ["04" "" "5"] ["5" "16" ""] ["07" "" "36"] ["07" "" "34"] ["07" "25" "34"]]
Required output is:
[["04" "2" "3"] ["04" "2" "5"] ["5" "16" ""] ["07" "16" "36"] ["07" "16" "34"] ["07" "25" "34"]]
Thanks.
First of all, RDDs are not iterable (they don't implement ISeq), so you cannot use reductions. Even ignoring that, the whole idea of accessing the previous record is rather tricky. You cannot directly access values from another partition. Moreover, only transformations which don't require shuffling preserve order.
The simplest approach here would be to use DataFrames and window functions with explicit ordering, but as far as I know Flambo doesn't implement the required methods. It is always possible to use raw SQL or access the Java/Scala API, but if you want to avoid this you can try the following pipeline.
First, let's create a broadcast variable with the last values per partition:
(require '[flambo.broadcast :as bd])
(import org.apache.spark.TaskContext)
(def last-per-part (f/fn [it]
(let [context (TaskContext/get) xs (iterator-seq it)]
[[(.partitionId context) (last xs)]])))
(def last-vals-bd
(bd/broadcast sc
(into {} (-> rdd (f/map-partitions last-per-part) (f/collect)))))
Next, a helper for the actual job:
(defn fill-pair [col]
(fn [x] (let [[a b] x] (if (empty? (nth b col)) (assoc b col (nth a col)) b))))
(def fill-pairs
  (f/fn [it]
    (let [part-id (.partitionId (TaskContext/get)) ;; Get partition ID
          xs      (iterator-seq it)                ;; Convert input to seq
          prev    (if (zero? part-id)              ;; Find previous element
                    (first xs)
                    ;; last value of the preceding partition
                    ((bd/value last-vals-bd) (dec part-id)))
          ;; Create seq of pairs (prev, current)
          pairs   (partition 2 1 (cons prev xs))
          ;; Same as before (input is the options map from the question)
          {:keys [col]} input
          ;; Prepare mapping function
          mapper  (fill-pair col)]
      (map mapper pairs))))
Finally you can use fill-pairs to map-partitions:
(-> rdd (f/map-partitions fill-pairs) (f/collect))
A hidden assumption here is that the order of the partitions follows the order of the values. This may or may not hold in the general case, but without explicit ordering it is probably the best you can get.
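The pairing step can be tried on plain Clojure data without Spark. The sketch below repeats fill-pair from above and adds a hypothetical helper, fill-local, where prev stands in for the last record of the previous partition:

```clojure
(defn fill-pair [col]
  (fn [[a b]]
    (if (empty? (nth b col))
      (assoc b col (nth a col))
      b)))

;; fill-local is a made-up name for this local illustration.
(defn fill-local
  "Fills column col of each record in xs from the record before it."
  [col prev xs]
  (map (fill-pair col) (partition 2 1 (cons prev xs))))

(def records [["04" "2" "3"] ["04" "" "5"] ["5" "16" ""]])

(fill-local 1 (first records) records)
;; => (["04" "2" "3"] ["04" "2" "5"] ["5" "16" ""])
```

Note that each record is compared with the raw record before it, not with the already filled one, so in a run of several consecutive empty values only the first gap gets filled by this step.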
An alternative approach is to zipWithIndex, swap the order of values, and perform a join with an offset.
(require '[flambo.tuple :as tp])
(def rdd-idx (f/map-to-pair (.zipWithIndex rdd) #(.swap %)))
(def rdd-idx-offset
(f/map-to-pair rdd-idx
(fn [t] (let [p (f/untuple t)] (tp/tuple (dec' (first p)) (second p))))))
(f/map (f/values (.rightOuterJoin rdd-idx-offset rdd-idx)) f/untuple)
Next, you can map using a similar approach as before.
Edit
A quick note on using atoms. The problem there is a lack of referential transparency, and that you're relying on incidental properties of a given implementation rather than on a contract. Nothing in the semantics of map requires elements to be processed in a given order, so if the internal implementation changes, the approach may no longer be valid. In Clojure, compare:
(def a (atom 0))
(defn foo [x] (let [aa @a] (swap! a (fn [& args] x)) aa))
(map foo (range 1 20))
with:
(def a (atom 0))
(pmap foo (range 1 20))
For example given a channel with operations and another channel with data, how to write a go block that will apply the operation on whatever was the last value on the data channel?
(go-loop []
(let [op (<! op-ch)
data (<! data-ch)]
(put! result-ch (op data))))
Obviously that doesn't work because it would require both channels to have the same frequency.
(see http://rxmarbles.com/#withLatestFrom)
Using alts! you could accomplish what you want.
The with-latest-from shown below implements the same behavior found in the withLatestFrom from RxJS (I think :P).
(require '[clojure.core.async :as async])
(def op-ch (async/chan))
(def data-ch (async/chan))
(defn with-latest-from [chs f]
(let [result-ch (async/chan)
latest (vec (repeat (count chs) nil))
index (into {} (map vector chs (range)))]
(async/go-loop [latest latest]
(let [[value ch] (async/alts! chs)
latest (assoc latest (index ch) value)]
(when-not (some nil? latest)
(async/put! result-ch (apply f latest)))
(when value (recur latest))))
result-ch))
(def result-ch (with-latest-from [op-ch data-ch] str))
(async/go-loop []
(prn (async/<! result-ch))
(recur))
(async/put! op-ch :+)
;= true
(async/put! data-ch 1)
;= true
; ":+1"
(async/put! data-ch 2)
;= true
; ":+2"
(async/put! op-ch :-)
;= true
; ":-2"
alts! also accepts a :priority true option.
An expression which always returns the latest seen value in some channel would look something like this:
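(require '[clojure.core.async :refer [chan go alts! >! poll!]])
(def in-chan (chan))
(def mem (chan 1)) ;; buffer of 1 so the latest value can be parked here
(go (let [[value ch] (alts! [in-chan mem] :priority true)]
      (poll! mem)    ;; clear mem (poll! is non-blocking)
      (>! mem value) ;; put the new (or old) value in the mem
      value))        ;; the go block returns a chan with the value in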
It's untested, it's probably not efficient (a volatile variable is probably better). The go-block returns a channel with only the value, but the idea could be expanded to some "memoized" channel.