Clojure - Codeexamples to show off software transactional memory - clojure

I created some exercises to show the capabilties and funcionality of the clojure stm. Are they good enough, or are there better ways to show the benefits of stm?
My work:
Item transfer
Define the three characters as follows:
(def smaug (ref {:life 500, :atk 50, :def 25, :alive true, :items (set(range 100))))
(def bilbo (ref {:life 250, :atk 35, :def 12, :alive true, :items ()}))
(def gandalf (ref {:life 300, :atk 27, :def 8, :mana 50, :alive true, :items ()}))
Now implement an appropriate function to transfer one of a character's :items to one of the other characters.
Now that you can easily transfer :items from one character to another, start two threads, one for bilbo and one for gandalf, and let both threads transfer :items from smaug to bilbo and gandalf until smaug has no :items left.
Why does the following transaction throw an exception?
(dosync
#(future (dosync
(alter bilbo update-in [:items] conj (first (get #smaug :items)))
(alter smaug update-in [:items] disj (first (get #smaug :items)))))
(alter gandalf update-in [:items] conj (first (get #smaug :items)))
(alter smaug update-in [:items] disj (first (get #smaug :items))))
Battle system
Start over with three characters from the beginning. Now implement a function which simulates a battle between two characters. Therefore the :atk value of the attacker minus the :def value of the defender is subtracted from the :life value of the defender. Per battle there is only one attacker and one defender.
Now let one of the characters fight against smaug and if one defeats smaug, he gets all of his :items. Keep in mind that the :life value shall never be smaller than 0!!!
If the :life value of a character is 0, the character is dead. Therefore the :alive value is set to false.
As you might have noticed can no character defeat smaug by his own. Lucky, gandalf can heal other characters. But first you have to implement a function, which lets gandalf heal bilbo. Therefore the :life value is increased. Simultanously healing decreases gandalf's :mana. If his :mana reaches 0 he can;t heal. Now let gandalf heal bilbo after every battle. Now that both characters are involved in the battle, both shall receive :items from smaug. Keep in mind that the :mana value should never be smaller than 0. Bilbos :life value should never be smaller than 0 too!!!
Now we want to make the game more dynamic. We want to make the :atk value of bilbo and gandalf depending of the day time.
Now implement day cycle. To represent the day time define an integer value as a ref. This value should be updated every second in a seperate thread. Therefore attle and day cycle are executed by seperated threads. Every update increments the day time by 1. The maximum value is 23.
If it is day (6 – 17), bilbo und gandalf get a bonus of 1/3 of their :atk value. At night (18 - 5) they get a negative bonus of 1/3.

Related

Clojure CSV Matching Files

I have 2 csv files with around 22K records in file F1, and 50K records in file F2, both containing company name, and address information. I need to do a fuzzy-match on name, address, and phone. Each record in F1 needs to be fuzzy-matched against each record in F2. I have made a third file R3 which is a csv containing the rules for fuzzy-matching which column from F1 to which column with F2, with a fuzzy-tolerance-level. I am trying to do it with for loop, this way -
(for [j f1-rows
h f2-rows
r r3-rows
:while (match-row j h r)]
(merge j h))
(defn match-row [j h rules]
(every?
identity
(map (fn [rule]
(<= (fuzzy/jaccard
((keyword (first rule)) j)
((keyword (second rule)) h))
((nth rule 2))))
rules)))
f1-rows and f2-rows are collections of map. Rules is a collection of sequences containing column name from f1, f2, and the tolerance level. The code is running and functioning as expected. But my problem is, it is taking around 2 hours to execute. I read up on how transducers help improving performance by eliminating intermediate chunks, but I am not able to visualize how I would apply that in my case. Any thoughts on how I can make this better/faster ?
:while vs :when
Your use of :while in this case doesn't seem to agree with your stated problem. Your for-expression will keep going while match-row is true, and stop altogether at the first false result. :when will iterate through all the combinations and include only ones where match-row is true in the resulting lazy-seq. The difference is explained here.
For example:
(for [i (range 10)
j (range 10)
:while (= i j)]
[i j]) ;=> ([0 0])
(for [i (range 10)
j (range 10)
:when (= i j)]
[i j]) ;=> ([0 0] [1 1] [2 2] [3 3] [4 4] [5 5] [6 6] [7 7] [8 8] [9 9])
It's really strange that your code kept running for 2 hours, because that means that during those two hours every invocation of (match-row j h r) returned true, and only the last one returned false. I would check the result again to see if it really makes sense.
How long should it take?
Let's first do some back-of-the-napkin math. If you want to compare every one of 22k records with every one of 55k records, you're gonna be doing 22k * 55k comparisons, there's no way around that.
22k * 55k = 1,210,000,000
That's a big number!
What's the cost of a comparison?
From a half-a-minute glance at wikipedia, jaccard is something about sets.
The following will do to get a ballpark estimate of the cost, though it's probably very much on the low end.
(time (clojure.set/difference (set "foo") (set "bar")))
That takes about a tenth of a millisecond on my computer.
(/ (* 22e3 55e3) ;; Number of comparisons.
10 ; milliseconds
1000 ;seconds
60 ;minutes
60) ;hours
;=> 33.611111111111114
That's 33 and a half hours. And that's with a low-end estimate of the individual cost, and not counting the fact that you want to compare name, address and phone on each one (?). So that' 33 hours if every comparison fails at the first row, and 99 hours if they all get to the last row.
Before any micro-optimizations, you need to work on the algorithm, by finding some clever way of not needing to do more than a billion comparisons. If you want help on that, you need to at least supply some sample data.
Nitpicks
The indentation of the anon fn inside match-row is confusing. I'd use an editor that indents automatically, and stick to that 99% of the time, because lisp programmers read nesting by the indentation, like python. There are some slight differences between editors/autoindenters, but they're all consistent with the nesting.
(defn match-row [j h rules]
(every?
identity
(map (fn [rule]
(<= (fuzzy/jaccard
((keyword (first rule)) j)
((keyword (second rule)) h))
((nth rule 2))))
rules)))
Also,match-row needs to be defined before it is used (which it probably is in your actual code, seeing as it compiles).
22k x 50k is over 1 billion combinations. Multiply by 3 fuzzy rules & you're at 3 billion calculations. Most of which are wasted.
The only way to speed it up is to do some pre-sorting or other pre-trimming of all the combinations. For example, only do the fuzzy calc if zipcodes are identical. If you waste time trying to match people from N.Y. and Florida, you are throwing away 99% or more of your work

Compare values in a list of maps in clojure

I have a list of maps like
(def listofmaps
({:directory_path "/some/path/1", :directory_size "8.49 GB"} {:directory_path "/user/dod/yieldbook/yb_sec_char", :directory_size "14.1 MB"})
containing many values and size can be in gb or mb.
Also I have a limitlistofmaps like
(def limitlistofmaps
({:directory_path "/some/path/8", :directory_size "15.2 GB"} {:directory_path "some/path/3", :directory_size "2.1 GB"}
{:directory_path "/some/path/1", :directory_size "17.2 GB"})
with many values..
I need to print "limit exceeded" if any map in list of maps had the same :directory_path as in limitlistofmaps but :directory_size exceeds the value specified. The problem is that size is in string format and unit has to be considered.
Can you help me with a way to do this in clojure?
I don't get why people down voted your question. I think as a Clojure community we are much better. I'm also a Clojure newb and I have nothing but great things to say about the community. I would like that it stays this way.
Firstly, why not have all the directory sizes in the same unit ? That way it's easier to compare them, say in KB.
Here is one version of a function that would transform any string like "100.23 Gb", "12 B", "123.3443 MB" to a Double representing Kilobytes.
(defn convert-to-kb
"Converts a string 'number (B|KB|MB|GB)' to Double representing KBytes"
[str]
(let [[number-str unit] (map str/lower-case (str/split (str/trim str) #"\s+"))
number (Double/parseDouble number-str)]
(condp = unit
"b" (/ number 1000)
"kb" number
"mb" (* number 1000)
"gb" (* number 1000000))))
Secondly, I would suggest you put the directory size limits in the same map data that lives inside your listofmaps, so you don't have state duplication like you have now in limitslistofmaps. But if for some reason you need this second map, here a piece of ugly code that returns a list of maps that are the same as in your listofmaps with two added key/val entries, :max_size_kb and :directory_size_kb.
for [dir listofmaps :let [{:keys [directory_path directory_size]} dir]]
(let [limit-map (first
(get (group-by :directory_path limitlistofmaps) directory_path))
max-size-kb (convert-to-kb (:directory_size limit-map))]
(-> dir
(assoc :max_size_kb max-size-kb)
(assoc :directory_size_kb (convert-to-kb directory_size)))))

For a function that updates a world "state", I want to return a vector of strings of events that happened

I have this small game world state, something like the following:
(defn odds [percentage]
(< (rand-int 100) percentage))
(defn world []
{:entities []})
(defn make-bird []
{:pos [(rand-int 100) (rand-int 100)]
:age 0
:dir (vec/dir (rand (. Math PI)))})
(defn generate-entities [entities]
(if (odds 10)
(conj entities (make-bird))
entities))
(defn update-entity [entity]
(-> entity
(update :pos (partial vec/add (:dir entity)))
(update :age inc)))
(defn update-entities [entities]
(vec (map update-entity entities)))
(defn old? [{age :age}]
(> age 10))
(defn prune-entities [entities]
(vec (filter #(not (old? %)) entities)))
(defn update-world [world]
(-> world
(update :entities generate-entities)
(update :entities update-entities)
(update :entities prune-entities)))
So update-world goes through three steps. First there's a 1/10 chance of generating a new bird entity, which flies in a random direction. Then it updates all birds, updating their position and incrementing their age. Then it prunes all old birds.
I use this same technique for generating particles systems. You can do fun stuff like (iterate update-world (world)) to get a lazy list of world states which you can consume at whatever frame rate you want.
However, I now have a game world with autonomous entities which roam around and do stuff, kind of like the birds. But I want to get a textual representation of what happened when evaluating update-world. For example, update-world would ideally return a tuple of the new world state and a vector of strings - ["A bird was born at [12, 8].", "A bird died of old age at [1, 2]."].
But then I really can't use (iterate update-world (world)) anymore. I can't really see how to do this.
Is this something you'd use with-out-string for?
If you want to enhance only your top-level function (update-world) in your case you can just create a wrapper function that you can use in iterate. A simple example:
(defn increment [n]
(inc n))
(defn logging-increment [[_ n]]
(let [new-n (increment n)]
[(format "Old: %s New: %s" n new-n) new-n]))
(take 3 (iterate logging-increment [nil 0]))
;; => ([nil 0] ["Old: 0 New: 1" 1] ["Old: 1 New: 2" 2])
In case you want to do it while collecting data at multiple level and you don't want to modify the signatures of your existing functions (e.g. you want to use it only for debugging), then using dynamic scope seems like a reasonable option.
Alternatively you can consider using some tracing tools, like clojure/tools.trace. You could turn on and off logging of your function calls by simply changing defn to deftrace or using trace-ns or trace-vars.
There are two potential issues with using with-out-str
It returns a string, not a vector. If you need to use a vector, you'll need to use something else.
Only the string is returned. If you are using with-out-str to wrap a side-effect (e.g., swap!), this might be fine.
For debugging purposes, I usually just use println. You can use with-out if you want control over where the output goes. You could even implement a custom stream that collects the output into a vector of strings if you wanted. You could get similar results with a dynamically bound vector that you accumulate (via set!) the output string (or wrap the vector in an atom and use swap!).
If the accumulated vector is part of the computation per se, and you want to remain pure, you might consider using a monad.
What about using clojure.data/diff to generate the string representation of changes? You could do something like this:
(defn update-world [[world mutations]]
(let [new-world (-> world
(update :entities generate-entities)
(update :entities update-entities)
(update :entities prune-entities))]
[new-world (mutations (clojure.data/diff world new-world))]))
Then you could do something like (iterate update-world [(world) []]) to get the ball rolling.

"nth not supported" on PersistentHashSet when destructuring Set in Loop header

Clojure noob here.
I want to pull the front and rest out of a Set. Doing (front #{1}) and (rest #{1}) produce 1 and () respectively, which is mostly what I'd expect.
However in the code below, I use the destructuring [current-node & open-nodes] #{start} in my loop to pull something out of the set (at this point I don't really care about if it was the first or last item. I just want this form working) and it breaks.
Here's my function, half-implementing a grid search:
(defn navigate-to [grid start dest]
"provides route from start to dest, not including start"
(loop [[current-node & open-nodes] #{start} ;; << throws exception
closed-nodes #{}]
(if (= dest current-node)
[] ;; todo: return route
(let [all-current-neighbours (neighbours-of grid current-node) ;; << returns a set
open-neighbours (set/difference all-current-neighbours closed-nodes)]
(recur (set/union open-nodes open-neighbours)
(conj closed-nodes current-node))))))
When stepping through (with Cider), on the start of the first loop, it throws this exception:
UnsupportedOperationException nth not supported on this type: PersistentHashSet clojure.lang.RT.nthFrom (RT.java:933)
I could use a nested let form that does first/rest manually, but that seems wasteful. Is there a way to get destructured Sets working like this in the loop form? Is it just not supported on Sets?
Sets are unordered, so positional destructuring doesn't make much sense.
According to the documentation for Special Forms, which treats destructuring as well, sequential (vector) binding is specified to use nth and nthnext to look up the elements to bind.
Vector binding-exprs allow you to bind names to parts of sequential things (not just vectors), like vectors, lists, seqs, strings, arrays, and anything that supports nth.
Clojure hash sets (being instances of java.util.Set) do not support lookup by index.
I don't know the context of your example code, but in any case pouring the set contents into an ordered collection, for example (vec #{start}), would make the destructuring work.
As mentioned by others you cannot bind a set to a vector literal, because a set is not sequential. So even this simple let fails with nth not supported:
(let [[x] #{1}])
You could work around this by "destructuring" the set with the use of first and disj:
(loop [remaining-nodes #{start}
closed-nodes #{}]
(let [current-node (first remaining-nodes)
open-nodes (disj remaining-nodes current-node)]
;; rest of your code ...
))
Using (rest remaining-nodes) instead of (disj remaining-nodes current-node) could be possible, but as sets are unordered, rest is in theory not obliged to take out the same element as was extracted with first. Anyway disj will do the job.
NB: be sure to detect remaining-nodes being nil, which could lead to an endless loop.
Algorithm for returning the route
For implementing the missing part in the algorithm (returning the route) you could maintain
a map of paths. It would have one path for each visited node: a vector with the nodes leading from the start node to that node, keyed by that node.
You could use reduce to maintain that map of paths as you visit new nodes. With a new function used together with that reduce and an added nil test, the program could look like this:
(defn add-path [[path paths] node]
"adds a node to a given path, which is added to a map of paths, keyed by that node"
[path (assoc paths node (conj path node))])
(defn navigate-to [grid start dest]
"provides route from start to dest, including both"
(loop [remaining-nodes #{start}
closed-nodes #{}
paths (hash-map start [start])]
(let [current-node (first remaining-nodes)
current-path (get paths current-node)
all-current-neighbours (neighbours-of grid current-node)
open-neighbours (set/difference all-current-neighbours closed-nodes)]
(if (contains? #{dest nil} current-node)
current-path ;; search complete
(recur (set/union (disj remaining-nodes current-node) open-neighbours)
(conj closed-nodes current-node)
(second (reduce add-path [current-path paths] open-neighbours)))))))
The essence of the algorithm is still the same, although I merged the original let with the one needed for destructuring the nodes. This is not absolutely needed, but it probably makes the code more readable.
Test
I tested this with a poor-mans definition of grid and neighbours-of, based on this graph (digits are nodes, bars indicate linked nodes:
0--1 2
| | |
3--4--5
|
6--7--8
This graph seems a good candidate for a test as it has a loop, a dead end, and is connected.
The graph is encoded with grid being a vector, where each element represents a node. An element's index in that vector is the node's identifier. The content of each element is a set of neighbours, making the neighbours-of function a trivial thing (your implementation will be different):
(def grid [#{1 3} #{0 4} #{5}
#{0 4 6} #{1 3 5} #{2 4}
#{3 7} #{6 8} #{7} ])
(defn neighbours-of [grid node]
(get grid node))
Then the test is to find the route from node 0 to node 8:
(println (navigate-to grid 0 8))
Output is:
[0 1 4 3 6 7 8]
This outcome demonstrates that the algoritm does not guarantee a shortest route, only that a route will be found if it exists. I suppose the outcome could be different on different engines, depending on how the Conjure internals decide which element to take from a set with first.
After removing one of the necessary node links, like the one between node 7 and 8, the output is nil.
NB: I found this an interesting question, and probably went a bit too far in my answer.

How to use "Update-in" in Clojure?

I'm trying to use Clojure's update-in function but I can't seem to understand why I need to pass in a function?
update-in takes a function, so you can update a value at a given position depending on the old value more concisely. For example instead of:
(assoc-in m [list of keys] (inc (get-in m [list of keys])))
you can write:
(update-in m [list of keys] inc)
Of course if the new value does not depend on the old value, assoc-in is sufficient and you don't need to use update-in.
This isn't a direct answer to your question, but one reason why a function like update-in could exist would be for efficiency—not just convenience—if it were able to update the value in the map "in-place". That is, rather than
seeking the key in the map,
finding the corresponding key-value tuple,
extracting the value,
computing a new value based on the current value,
seeking the key in the map,
finding the corresponding key-value tuple,
and overwriting the value in the tuple or replacing the tuple with a new one
one can instead imagine an algorithm that would omit the second search for the key:
seek the key in the map,
find the corresponding key-value tuple,
extract the value,
compute a new value based on the current value,
and overwrite the value in the tuple
Unfortunately, the current implementation of update-in does not do this "in-place" update. It uses get for the extraction and assoc for the replacement. Unless assoc is using some caching of the last looked up key and the corresponding key-value tuple, the call to assoc winds up having to seek the key again.
I think the short answer is that the function passed to update-in lets you update values in a single step, rather than 3 (lookup, calculate new value, set).
Coincidentally, just today I ran across this use of update-in in a Clojure presentation by Howard Lewis Ship:
(def in-str "this is this")
(reduce
(fn [m k] (update-in m [k] #(inc (or % 0))))
{}
(seq in-str))
==> {\space 2, \s 3, \i 3, \h 2, \t 2}
Each call to update-in takes a letter as a key, looks it up in the map, and if it's found there increments the letter count (else sets it to 1). The reduce drives the process by starting with an empty map {} and repeatedly applies the update-in with successive characters from the input string. The result is a map of letter frequencies. Slick.
Note 1: clojure.core/frequencies is similar but uses assoc! rather than update-in.
Note 2: You can replace #(inc (or % 0)) with (fnil inc 0). From here: fnil
A practical example you see here.
Type this snippet (in your REPL):
(def my-map {:useless-key "key"})
;;{:useless-key "key"}
(def my-map (update-in my-map [:yourkey] #(cons 1 %)))
;;{:yourkey (1), :useless-key "key"}
Note that :yourkey is new. So the value - of :yourkey - passed to the lambda is null. cons will put 1 as the single element of your list. Now do the following:
(def my-map (update-in my-map [:yourkey] #(cons 25 %)))
;;{:yourkey (25 1), :useless-key "key"}
And that is it, in the second part, the anonymous function takes the list - the value for :yourkey - as argument and just cons 25 to it.
Since our my-map is immutable, update-in will always return a new version of your map letting you do something with the old value of the given key.
Hope it helped!