Removing items from a map based on the contents of another map - clojure

Still working through Programming Collective Intelligence and using Clojure to write the code. I've got it working, but some parts are really ugly, so I thought I'd ask some of the experts around here to help clean it up.
Let's suppose I have a map that looks like this (bound to "recs"):
{"Superman Returns" 3.902419556891574, "Lady in the Water" 2.8325499182641614,
"Snakes on a Plane" 3.7059737842895792, "The Night Listener" 3.3477895267131017,
"You, Me and Dupree" 2.651006036204627, "Just My Luck" 2.5309807037655645}
and I want to remove those items with keys that are also in the map (bound to "mymovies"):
{"Snakes on a Plane" 4.5, "You, Me and Dupree" 1.0, "Superman Returns" 4.0}
so that I get the map:
{"Lady in the Water" 2.8325499182641614, "The Night Listener" 3.3477895267131017,
"Just My Luck" 2.5309807037655645}
the code that I managed to get to do this looks like:
(apply merge (map #(hash-map (first %) (second %))
(remove #(contains? mymovies (first %))
recs)))
That seems pretty ugly to me. It doesn't seem like it should be necessary to create a map from the value I get back from "remove". Is there a cleaner way to do this?
UPDATE: Joost's answer below sparked another idea. If I turn the keys of the two maps into sets I can use select-keys like this:
(select-keys recs (difference (set (keys recs))
(set (keys mymovies))))
Joost, thanks for turning me on to select-keys. I didn't know about that function before. Now to go rewrite several other sections with this newfound knowledge!

(apply dissoc recs (keys mymovies))
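As a quick check of the one-liner above, here is a minimal sketch using the two maps from the question. The name unseen is only for illustration. dissoc accepts any number of keys, which is why apply can spread (keys mymovies) into it.

```clojure
(def recs
  {"Superman Returns" 3.902419556891574
   "Lady in the Water" 2.8325499182641614
   "Snakes on a Plane" 3.7059737842895792
   "The Night Listener" 3.3477895267131017
   "You, Me and Dupree" 2.651006036204627
   "Just My Luck" 2.5309807037655645})

(def mymovies
  {"Snakes on a Plane" 4.5, "You, Me and Dupree" 1.0, "Superman Returns" 4.0})

;; apply spreads the keys of mymovies as individual arguments to dissoc.
;; "unseen" is a hypothetical name used only in this sketch.
(def unseen (apply dissoc recs (keys mymovies)))

(println (sort (keys unseen)))
;; prints (Just My Luck Lady in the Water The Night Listener)
```

If mymovies is empty, the call reduces to (dissoc recs), which returns recs unchanged.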

The following first builds a seq of keys to keep, then extracts the "submap" for those keys from recs using select-keys. It also takes advantage of the fact that sets are predicates.
(select-keys recs (remove (apply hash-set (keys mymovies)) (keys recs)))

I think ponzao's answer is best for this case, but I wouldn't have thought to apply dissoc. Here are the two solutions I might have come up with: hopefully looking over them will help with similar future problems.
Note that the second solution will fail if your mymovies map contains nil or false values.
(into {}
(for [[k v] recs
:when (not (contains? mymovies k))]
[k v]))
(into {}
(remove (comp mymovies key) recs))
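The nil/false caveat mentioned above can be seen in a small sketch. The data here is made up for illustration: "B" is a key of mymovies, but its value is nil, so calling the map as a function returns a falsey value and the entry is not removed.

```clojure
(def recs {"A" 1.0, "B" 2.0, "C" 3.0})

;; Hypothetical data: "B" is present in mymovies but maps to nil.
(def mymovies {"A" 4.5, "B" nil})

;; (mymovies "B") returns nil, which is falsey, so "B" is wrongly kept.
(def by-value (into {} (remove (comp mymovies key) recs)))
;; by-value => {"B" 2.0, "C" 3.0}

;; contains? tests for the key itself, so "B" is removed as intended.
(def by-key (into {} (remove #(contains? mymovies (key %)) recs)))
;; by-key => {"C" 3.0}
```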

Related

For a function that updates a world "state", I want to return a vector of strings of events that happened

I have this small game world state, something like the following:
(defn odds [percentage]
(< (rand-int 100) percentage))
(defn world []
{:entities []})
(defn make-bird []
{:pos [(rand-int 100) (rand-int 100)]
:age 0
:dir (vec/dir (rand (. Math PI)))})
(defn generate-entities [entities]
(if (odds 10)
(conj entities (make-bird))
entities))
(defn update-entity [entity]
(-> entity
(update :pos (partial vec/add (:dir entity)))
(update :age inc)))
(defn update-entities [entities]
(vec (map update-entity entities)))
(defn old? [{age :age}]
(> age 10))
(defn prune-entities [entities]
(vec (filter #(not (old? %)) entities)))
(defn update-world [world]
(-> world
(update :entities generate-entities)
(update :entities update-entities)
(update :entities prune-entities)))
So update-world goes through three steps. First there's a 1/10 chance of generating a new bird entity, which flies in a random direction. Then it updates all birds, updating their position and incrementing their age. Then it prunes all old birds.
I use this same technique for generating particle systems. You can do fun stuff like (iterate update-world (world)) to get a lazy list of world states which you can consume at whatever frame rate you want.
However, I now have a game world with autonomous entities which roam around and do stuff, kind of like the birds. But I want to get a textual representation of what happened when evaluating update-world. For example, update-world would ideally return a tuple of the new world state and a vector of strings - ["A bird was born at [12, 8].", "A bird died of old age at [1, 2]."].
But then I really can't use (iterate update-world (world)) anymore. I can't really see how to do this.
Is this something you'd use with-out-str for?
If you want to enhance only your top-level function (update-world in your case), you can just create a wrapper function that you can use in iterate. A simple example:
(defn increment [n]
(inc n))
(defn logging-increment [[_ n]]
(let [new-n (increment n)]
[(format "Old: %s New: %s" n new-n) new-n]))
(take 3 (iterate logging-increment [nil 0]))
;; => ([nil 0] ["Old: 0 New: 1" 1] ["Old: 1 New: 2" 2])
In case you want to do it while collecting data at multiple levels and you don't want to modify the signatures of your existing functions (e.g. you want to use it only for debugging), then using dynamic scope seems like a reasonable option.
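A minimal sketch of the dynamic-scope idea, assuming a dynamic var that holds an atom while a wrapper is running. All the names here (*events*, log-event!, prune, prune-with-events) are hypothetical and not from the question; prune stands in for prune-entities.

```clojure
;; Hypothetical names throughout; this is one possible shape, not the only one.
(def ^:dynamic *events* nil)

(defn log-event!
  "Appends msg to the current event log, if one is bound."
  [msg]
  (when *events*
    (swap! *events* conj msg))
  nil)

(defn prune
  "Removes entities older than 10, logging each removal."
  [entities]
  ;; doseq is eager, so the logging happens while the binding is in effect.
  (doseq [e entities :when (> (:age e) 10)]
    (log-event! (str "A bird died of old age at " (:pos e) ".")))
  (vec (remove #(> (:age %) 10) entities)))

(defn prune-with-events
  "Returns [pruned-entities events] without changing prune's signature."
  [entities]
  (binding [*events* (atom [])]
    (let [result (prune entities)]
      [result @*events*])))

(prune-with-events [{:pos [1 2] :age 11} {:pos [5 5] :age 3}])
;; => [[{:pos [5 5], :age 3}] ["A bird died of old age at [1 2]."]]
```

One thing to watch for: a lazy sequence that is realized after binding has exited no longer sees the bound value, which is why the sketch uses doseq and vec inside the binding.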
Alternatively you can consider using some tracing tools, like clojure/tools.trace. You could turn on and off logging of your function calls by simply changing defn to deftrace or using trace-ns or trace-vars.
There are two potential issues with using with-out-str:
1) It returns a string, not a vector. If you need to use a vector, you'll need to use something else.
2) Only the string is returned. If you are using with-out-str to wrap a side-effect (e.g., swap!), this might be fine.
For debugging purposes, I usually just use println. You can rebind *out* (with binding) if you want control over where the output goes. You could even implement a custom stream that collects the output into a vector of strings if you wanted. You could get similar results with a dynamically bound vector onto which you accumulate the output strings (via set!), or by wrapping the vector in an atom and using swap!.
If the accumulated vector is part of the computation per se, and you want to remain pure, you might consider using a monad.
What about using clojure.data/diff to generate the string representation of changes? You could do something like this:
(defn update-world [[world mutations]]
(let [new-world (-> world
(update :entities generate-entities)
(update :entities update-entities)
(update :entities prune-entities))]
;; mutations is a vector, so append the diff with conj rather than calling it
[new-world (conj mutations (clojure.data/diff world new-world))]))
Then you could do something like (iterate update-world [(world) []]) to get the ball rolling.

Map over first element of list of vectors

How can I map a function over just the first elements of vectors in a list?
So I have
(["1" "sometexthere" ...]["2" "somemoretext" ...] ....)
I need to use read-string to convert the stringy numbers into ints (or longs).
If you want just the list of results, you can combine the function with first and map it, as @leetwinski recommended in the comments.
(map #(clojure.edn/read-string (first %)) items)
If you want to get back the structure you had, but with those particular elements mapped by the function, update and update-in are your friends:
(map #(update % 0 clojure.edn/read-string) items)
For more involved transformations you may also be interested in specter's transform.
You can use comp to compose functions:
(require '[clojure.edn :as edn])
(def items [["1" "sometexthere" ,,,] ["2" "somemoretext" ,,,] ,,,])
(map (comp edn/read-string first) items)
;=> (1 2 ,,,)
I like the comp solution by Elogent, however I think for readability I prefer the use of a threading macro:
(map #(-> % first clojure.edn/read-string) items)
To each his/her own, just my personal preference.

use 'for' inside 'let' return a list of hash-map

Sorry for the bad title, 'cause I don't know how to describe it in 10 words. Here's the detail:
I'd like to loop over a file in a format like:
a:1 b:2...
I want to loop each line, collect all 'k:v' into a hash-map.
{ a 1, b 2...}
I initialize a hash-map in a 'let' form, then loop all lines with 'for' inside let form.
In each loop step, I use 'assoc' to update the original hash-map.
(let [myhash {}]
(for [line #{"A:1 B:2" "C:3 D:4"}
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc myhash k (Float. v)))))
But in the end I got a lazy-seq of hash-map, like this:
{ {a 1, b 2...} {x 98 y 99 z 100 ...} }
I know how to 'merge' the result now, but I still don't understand why 'for' inside 'let' returns a list of results.
What confuses me is: does the 'myhash' in the inner 'for' refer to the 'myhash' declared in the 'let' form every time? If I do want a list of hash-maps like the output, is this the idiomatic way in Clojure?
Clojure's "for" is a list comprehension, so it produces a (lazy) sequence. It is NOT a for loop.
Also, you seem to be trying to modify myhash, but Clojure's data structures are immutable.
The way I would approach the problem is to create a list of pairs like (["a" 1] ["b" 2] ..) and then use (into {} the-list-of-pairs).
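A minimal sketch of the pairs-then-into approach, using the two sample lines from the question. The names lines and parsed are only for illustration, and Double/parseDouble is swapped in here for the question's (Float. v).

```clojure
(require '[clojure.string :as str])

;; Sample input taken from the question; "lines" and "parsed" are
;; hypothetical names used only in this sketch.
(def lines ["A:1 B:2" "C:3 D:4"])

(def parsed
  (into {}
        ;; for yields one [key value] pair per "k:v" token across all lines
        (for [line lines
              pair (str/split line #"\s+")
              :let [[k v] (str/split pair #":")]]
          [k (Double/parseDouble v)])))
;; parsed => {"A" 1.0, "B" 2.0, "C" 3.0, "D" 4.0}
```

Here for is used for what it is good at, producing a sequence, and into does the accumulating.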
If the file format is really as simple as you're describing, then something much more simple should suffice:
(apply hash-map (re-seq #"\w+" (slurp "your-file.txt")))
I think it's more readable if you use the ->> threading macro:
(->> "your-file.txt" slurp (re-seq #"\w+") (apply hash-map))
The slurp function reads an entire file into a string. The re-seq function will just return a sequence of all the words in your file (basically the same as splitting on spaces and colons in this case). Now you have a sequence of alternating key-value pairs, which is exactly what hash-map expects...
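To make the intermediate values concrete, here is a small sketch that uses an in-memory string in place of the file. The names text, tokens and m are only for illustration.

```clojure
;; Stand-in for (slurp "your-file.txt"); hypothetical sample content.
(def text "a:1 b:2\nc:3 d:4")

;; re-seq returns every run of word characters, in order.
(def tokens (re-seq #"\w+" text))
;; tokens => ("a" "1" "b" "2" "c" "3" "d" "4")

;; hash-map consumes the alternating keys and values.
(def m (apply hash-map tokens))
;; m => {"a" "1", "b" "2", "c" "3", "d" "4"}
```

Note that the values are still strings at this point, and that \w+ would split a decimal such as 1.5 into two tokens, so this only suits integer-looking values.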
I know this doesn't really answer your question, but you did ask about more idiomatic solutions.
I think @dAni is right, and you're confused about some fundamental concepts of Clojure (e.g. the immutable collections). I'd recommend working through some of the exercises on 4Clojure as a fun way to get more familiar with the language. Each time you solve a problem, you can compare your own solution to others' solutions and see other (possibly more idiomatic) ways to solve the problem.
Sorry, I didn't read your code very thoroughly last night when I was posting my answer. I just realized you actually convert the values to Floats. Here are a few options.
1) partition the sequence of inputs into key/val pairs so that you can map over it. Since you now have a sequence of pairs, you can use into to add them all to a map.
(->> "kvs.txt" slurp (re-seq #"\w+") (partition 2)
(map (fn [[k v]] [k (Float. v)])) (into {}))
2) Declare an auxiliary map-values function for maps and use that on the result. The map is the last parameter so that ->> can thread it in:
(defn map-values [f m]
(into {} (for [[k v] m] [k (f v)])))
(->> "your-file.txt" slurp (re-seq #"\w+")
(apply hash-map) (map-values #(Float. %)))
3) If you don't mind having symbol keys instead of strings, you can safely use the Clojure reader to convert all your keys and values.
(->> "your-file.txt" slurp (re-seq #"\w+")
(map read-string) (apply hash-map))
Note that this is a safe use of read-string because our call to re-seq would filter out any hazardous input. However, this will give you longs instead of floats, since numbers like 1 are long integers in Clojure.
Does the myhash in the inner for refer to the myhash declared in the let form every time?
Yes.
The let binds myhash to {}, and it is never rebound. myhash is always {}.
assoc returns a modified map, but does not alter myhash.
So the code can be reduced to
(for [line ["A:1 B:2" "C:3 D:4"]
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc {} k (Float. v))))
... which produces the same result:
(({"A" 1.0} {"B" 2.0}) ({"C" 3.0} {"D" 4.0}))
If I do want a list of hash-map like the output, is this the idiomatic way in Clojure?
No.
See @DaoWen's answer.
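Since assoc returns a new map rather than altering myhash, accumulating everything into one map means passing each result on to the next step. A minimal sketch of that with reduce, using the sample lines from the question (result is a hypothetical name, and Double/parseDouble is swapped in for (Float. v)):

```clojure
(require '[clojure.string :as str])

(def result
  (reduce (fn [acc pair]
            (let [[k v] (str/split pair #":")]
              ;; assoc returns a new map, which becomes acc for the next pair
              (assoc acc k (Double/parseDouble v))))
          {}
          ;; flatten the lines into a single sequence of "k:v" tokens
          (mapcat #(str/split % #"\s") ["A:1 B:2" "C:3 D:4"])))
;; result => {"A" 1.0, "B" 2.0, "C" 3.0, "D" 4.0}
```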

How can I improve this Clojure function?

I just wrote my first Clojure function based on my very limited knowledge of the language. I would love some feedback in regards to performance and use of types. For example, I'm not sure if I should be using lists or vectors.
(defn actor-ids-for-subject-id [subject-id]
(sql/with-connection (System/getenv "DATABASE_URL")
(sql/with-query-results results
["SELECT actor_id FROM entries WHERE subject_id = ?" subject-id]
(let [res (into [] results)]
(map (fn [row] (get row :actor_id)) res)))))
It passes the following test (given proper seed data):
(deftest test-actor-ids-for-subject-id
(is (= ["123" "321"] (actor-ids-for-subject-id "123"))))
If it makes a difference (and I imagine it does) my usage characteristics of the returned data will almost exclusively involve generating the union and intersection of another set returned by the same function.
It's slightly more concise to use 'vec' instead of 'into' when the initial vector is empty. It may express the intent more clearly, though that's more a matter of preference.
(vec (map :actor_id results))
results is a lazy sequence (a clojure.lang.Cons) returned by clojure.java.jdbc/resultset-seq, and each record is a map, so you can map the keyword over it directly:
(defn actor-ids-for-subject-id [subject-id]
(sql/with-connection (System/getenv "DATABASE_URL")
(sql/with-query-results results
["SELECT actor_id FROM entries WHERE subject_id = ?" subject-id]
(into [] (map :actor_id results)))))
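Given that the question says the results will mostly feed unions and intersections, one option is to build a set instead of a vector. This is a sketch with made-up rows standing in for the query results; actor-id-set, rows-a and rows-b are hypothetical names.

```clojure
(require '[clojure.set :as set])

;; Hypothetical rows, shaped like the maps a result set yields.
(def rows-a [{:actor_id "123"} {:actor_id "321"}])
(def rows-b [{:actor_id "321"} {:actor_id "999"}])

(defn actor-id-set
  "Collects the :actor_id of each row into a set."
  [rows]
  (set (map :actor_id rows)))

(set/union (actor-id-set rows-a) (actor-id-set rows-b))
;; => #{"123" "321" "999"}

(set/intersection (actor-id-set rows-a) (actor-id-set rows-b))
;; => #{"321"}
```

A set never equals a vector, so the test in the question would need to compare against #{"123" "321"} if the function returned a set.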

How to get a random access by index on a hash map in Clojure?

I'd like to perform a number (MAX_OPERATIONS) of money transfers from one account to another. The accounts are stored as refs in a hash-map called my-map (int account-id, double balance).
The money transfer takes a "random index" from the hash map and passes it as account-from to transfer. account-destination and amount should both be fixed.
Unfortunately I can't make it work.
(defn transfer [from-account to-account amount]
(dosync
(if (> amount @from-account)
(throw (Exception. "Not enough money")))
(alter from-account - amount)
(alter to-account + amount)))
(defn transfer-all []
(dotimes [MAX_OPERATIONS]
(transfer (get mymap (rand-int[MAX_ACCOUNT]) :account-id) account-destination amount)))
Maps do not implement nth, so you need to use an intermediate structure that does implement nth.
You can make a seq of either just the keys or the entire map entries, depending on what you want as output. I like using rand-nth for this kind of thing because it reads nicely.
You can get an nthable seq of the keys and then use one at random:
user> (def mymap {:a 1, :b 2, :c 3})
#'user/mymap
user> (get mymap (rand-nth (keys mymap)))
1
user> (get mymap (rand-nth (keys mymap)))
1
user> (get mymap (rand-nth (keys mymap)))
3
Or you can turn the map into an nthable vector and then grab one at random
user> (rand-nth (vec mymap))
[:a 1]
user> (rand-nth (vec mymap))
[:c 3]
A couple of issues I see immediately:
Your syntax for dotimes is wrong; you need to include a loop variable. Something like:
(dotimes [i MAX_OPERATIONS]
....)
Also, rand-int just needs an integer parameter rather than a vector, something like:
(rand-int MAX_ACCOUNT)
Also, I'm not sure that your (get ...) call is doing quite what you intend. As currently written, it will return the keyword :account-id if it doesn't find the randomly generated integer key, which is going to cause problems as the transfer function requires two refs as the from-account and to-account.
As more general advice, you should probably try coding this up bit by bit at the REPL, checking that each part works as intended. This is often the best way to develop in Clojure - if you write too much code at once without testing it then it's likely to contain several errors and you may get lost trying to track down the root of the problem.
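Putting the points above together, here is one possible sketch of a working version. The setup is assumed rather than taken from the question: accounts holds five refs with a balance of 100.0 each, account-destination is a separate ref, and transfer-all takes the operation count and amount as parameters.

```clojure
;; Hypothetical setup: account ids 0..4, each mapped to a ref holding a balance.
(def accounts
  (into {} (for [id (range 5)] [id (ref 100.0)])))

(def account-destination (ref 0.0))

(defn transfer [from-account to-account amount]
  (dosync
    ;; @ dereferences the ref to read the current balance
    (when (> amount @from-account)
      (throw (Exception. "Not enough money")))
    (alter from-account - amount)
    (alter to-account + amount)))

(defn transfer-all [max-operations amount]
  ;; dotimes needs a binding name; _ marks it as unused
  (dotimes [_ max-operations]
    ;; vals gives a seq of the refs, and rand-nth picks one at random
    (transfer (rand-nth (vals accounts)) account-destination amount)))

(transfer-all 10 1.0)
```

After this call the destination holds 10.0 and the source accounts hold 490.0 in total, whichever accounts were picked, because each transfer happens inside a single transaction.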