Writing Clojure Macro To Generate map Forms

I have a sequence of sequences, output from clojure-csv.
(def s1 [[:000-00-0000 "SMITH" "JOHN" 27][:000-00-0001 "FARMER" "FANNY" 28]])
I have a vector of column numbers [0 3] that I want to use to extract data from each sequence.
Rather than writing a function to zip together a variable number of map forms, I thought a macro might do the trick. But, I am having trouble.
This macro accepts a sequence and a column "mask"
(defmacro map-cols [seq & columns]
(for [col# columns
:let [mc# `(map #(nth % ~col# nil) ~seq)]]
mc#))
(map-cols s1 cols)
ClassCastException clojure.lang.LazySeq cannot be cast to clojure.lang.IFn bene-csv.core/eval2168
I was hoping to generate the multiple map forms that show up in
the following:
(zipmap (map #(nth % 0 nil) s1) (map #(nth % 1 nil) s1))
{:000-00-0001 "FARMER", :000-00-0000 "SMITH"}
I would appreciate some ideas about what I am doing wrong. I can, of course, just tailor a function to the number of columns I need to extract.
Thank You.
Edit:
Modified macro
(defmacro map-cols [seq & columns]
(vec (for [col columns
:let [mc `(map #(nth % ~col nil) ~seq)]]
mc)))
(apply zipmap (map-cols s1 cols))
ArityException Wrong number of args (1) passed to: core$zipmap clojure.lang.AFn.throwArity

You are mixing up code that will execute at macro expansion with code that the macro will output. Until you enter a syntax-quote, you don't need to use auto-gensym (col#, mc# in your example). As for the macro output, it must always produce exactly one form. It seems that you are expecting it to produce several forms (one for each col), but that's not how this can work. Your macro output will currently look like
((map #(nth % 0 nil) s1) (map #(nth % 1 nil) s1))
This is a form with two members. The member in the head position is expected to be a function, and the whole form is supposed to be evaluated as a function call.
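The way to salvage this is to wrap your for in the macro with a vec and then use (apply zipmap (map-cols s1 0 1)). Note that the column numbers have to be written literally in the call: a macro receives its arguments unevaluated, so (map-cols s1 cols) passes the single symbol cols and yields only one map form, which is why zipmap then complains about being called with one argument.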
That answers your immediate question, but the solution will still not make sense: zipmap wants exactly two args, not a variable number of them as you said you want, and the output is a map, not a sequence that zips together your fields. Zipping is achieved using (map vector seq1 seq2 ...).
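As a rough sketch of that last point, a plain function seems to be enough here and no macro is needed. The name select-cols is hypothetical; s1 and the column vector [0 3] are taken from the question.

```clojure
(def s1 [[:000-00-0000 "SMITH" "JOHN" 27]
         [:000-00-0001 "FARMER" "FANNY" 28]])

(defn select-cols
  "Returns, for each row, a vector with only the values at the given column indexes.
  Hypothetical helper, not part of clojure-csv."
  [rows cols]
  (map (fn [row] (mapv #(nth row % nil) cols)) rows))

(select-cols s1 [0 3])
;; => ([:000-00-0000 27] [:000-00-0001 28])

;; The same result built column by column and then zipped with (map vector ...):
(def zipped
  (apply map vector (map (fn [col] (map #(nth % col nil) s1)) [0 3])))
;; zipped => ([:000-00-0000 27] [:000-00-0001 28])
```

Because cols is an ordinary runtime value here, any number of columns works without changing the code.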

Related

Beginner in clojure: Tokenizing lists of different characters

So I know this isn't the best method of solving this issue, but I'm trying to go through a list of lines from an input file, which end up being expressions. I've got a list of expressions, and each expression has its own list thanks to the split-the-list function. My next step is to replace characters with id, ints with int, and + or - with addop. I've got the regexes to find whether or not my symbols match any of those, but when I try to replace them, I can only get the last for loop I call to leave any lasting results. I know what it comes down to is the way functional programming works, but I can't wrap my head around the trace of this program, and how to replace each separate type of input and keep the results all in one list.
(def reint #"\d++")
(def reid #"[a-zA-Z]+")
(def readdop #"\+|\-")
(def lines (into () (into () (clojure.string/split-lines (slurp "input.txt")) )))
(defn split-the-line [line] (clojure.string/split line #" " ))
(defn split-the-list [] (for [x (into [] lines)] (split-the-line x)))
(defn tokenize-the-line [line]
(for [x line] (clojure.string/replace x reid "id"))
(for [x line] (clojure.string/replace x reint "int"))
(for [x line] (clojure.string/replace x readdop "addop")))
(defn tokenize-the-list [] (for [x (into [] (split-the-list) )] (tokenize-the-line x)))
And as you can probably tell, I'm pretty new to functional programming, so any advice is welcome!
You're effectively using a do block, which evaluates several expressions (normally for side effects) and then returns the value of the last one. You can't see it because fn (and hence defn) implicitly contains one. As such, the lines
(for [x line] (clojure.string/replace x reid "id"))
(for [x line] (clojure.string/replace x reint "int"))
are evaluated (into two different lazy sequences) and then thrown away.
In order for them to affect the return value, you have to capture their return values and use them in the next round of replacements.
In this case, I think the most natural way to compose your replacements is the threading macro ->:
(for [x line]
(-> x
(clojure.string/replace reid "id")
(clojure.string/replace reint "int")
(clojure.string/replace readdop "addop")))
This creates code which does the reid replace with x as the first argument, then does the reint replace with the result of that as the first argument and so on.
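To see the nesting that -> produces, one option is to expand a shortened form at the REPL. This is only an illustration; the quoted symbols x, reid and reint are never evaluated here, and the result shown assumes a reasonably recent Clojure version, where -> expands in a single step.

```clojure
;; macroexpand-1 shows the code that -> writes for us.
(def expansion
  (macroexpand-1
    '(-> x
         (clojure.string/replace reid "id")
         (clojure.string/replace reint "int"))))

expansion
;; => (clojure.string/replace (clojure.string/replace x reid "id") reint "int")
```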
Alternatively, you could do this by using comp to compose anonymous functions like (fn [s] (clojure.string/replace s reid "id")), each of which is in effect a partial application of replace. In the imperative world we get pretty used to running several procedures that "bash the data in place"; in the functional world you more often combine several functions together to do all the operations and then run the result.
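A minimal sketch of the comp variant, reusing the regexes from the question. The name tokenize and the sample line are made up for illustration. Since comp applies its functions from right to left, the id replacement is listed last so that it runs first; running reid after reint would turn the freshly inserted "int" into "id".

```clojure
(require '[clojure.string :as str])

(def reint #"\d++")
(def reid #"[a-zA-Z]+")
(def readdop #"\+|\-")

;; comp composes right to left: reid first, then reint, then readdop.
(def tokenize
  (comp #(str/replace % readdop "addop")
        #(str/replace % reint "int")
        #(str/replace % reid "id")))

(defn tokenize-the-line [line]
  (map tokenize line))

(tokenize-the-line ["x" "+" "42" "-" "y"])
;; => ("id" "addop" "int" "addop" "id")
```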

Lazy concatenation of sequence in Clojure

Here's a beginner's question: Is there a way in Clojure to lazily concatenate an arbitrary number of sequences? I know there's lazy-cat macro, but I can't think of its correct application for an arbitrary number of sequences.
My use case is lazy loading data from an API via paginated (offseted/limited) requests. Each request executed via request-fn below retrieves 100 results:
(map request-fn (iterate (partial + 100) 0))
When there are no more results, request-fn returns an empty sequence. This is when I stop the iteration:
(take-while seq (map request-fn (iterate (partial + 100) 0)))
For example, the API might return up to 500 results and can be mocked as:
(defn request-fn [offset] (when (< offset 500) (list offset)))
If I want to concatenate the results, I can use (apply concat results) but that eagerly evaluates the results sequence:
(apply concat (take-while seq (map request-fn (iterate (partial + 100) 0))))
Is there a way to concatenate the results sequence lazily, using either lazy-cat or something else?
For the record, apply will consume only enough of the arguments sequence as it needs to determine which arity to call for the provided function. Since the maximum arity of concat is 3, apply will realize at most 3 items from the underlying sequence.
If those API calls are expensive and you really can't afford to make unnecessary ones, then you will need a function that accepts a seq-of-seqs and lazily concatenates them one at a time. I don't think there's anything built-in, but it's fairly straightforward to write your own:
(defn lazy-cat' [colls]
  (lazy-seq
    (when (seq colls)
      ;; rest, unlike next, does not realize the following collection early
      (concat (first colls) (lazy-cat' (rest colls))))))
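For illustration, here is how such a helper might be used with the mocked request-fn from the question. The helper is repeated so the snippet runs on its own; this version uses when and rest, so that asking for items of one page does not force the next page to be fetched.

```clojure
;; Mock from the question: one result per page, nothing from offset 500 on.
(defn request-fn [offset]
  (when (< offset 500) (list offset)))

(defn lazy-cat' [colls]
  (lazy-seq
    (when (seq colls)
      (concat (first colls) (lazy-cat' (rest colls))))))

(def results
  (lazy-cat' (take-while seq (map request-fn (iterate (partial + 100) 0)))))

(take 3 results) ;; => (0 100 200)
(vec results)    ;; => [0 100 200 300 400]
```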

use 'for' inside 'let' return a list of hash-map

Sorry for the bad title 'cause I don't know how to describe in 10 words. Here's the detail:
I'd like to loop a file in format like:
a:1 b:2...
I want to loop each line, collect all 'k:v' into a hash-map.
{ a 1, b 2...}
I initialize a hash-map in a 'let' form, then loop all lines with 'for' inside let form.
In each loop step, I use 'assoc' to update the original hash-map.
(let [myhash {}]
(for [line #{"A:1 B:2" "C:3 D:4"}
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc myhash k (Float. v)))))
But in the end I got a lazy-seq of hash-map, like this:
{ {a 1, b 2...} {x 98 y 99 z 100 ...} }
I know how to 'merge' the result now, but I still don't understand why 'for' inside 'let' returns a list of results.
What confuses me is this: does the 'myhash' in the inner 'for' refer to the 'myhash' declared in the 'let' form every time? If I do want a list of hash-maps like the output, is this the idiomatic way in Clojure?
Clojure's "for" is a list comprehension, so it produces a (lazy) sequence. It is NOT a for loop.
Also, you seem to be trying to modify myhash, but Clojure's data structures are immutable.
The way I would approach the problem is to create a list of pairs like (["a" 1] ["b" 2] ..) and then use (into {} the-list-of-pairs).
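A small sketch of that approach, using the two sample lines from the question in place of a file. The names lines and pairs are only for illustration, and Double/parseDouble is used here instead of the question's (Float. v), which is an assumption about what number type is acceptable.

```clojure
(require '[clojure.string :as str])

(def lines ["A:1 B:2" "C:3 D:4"])

;; One flat sequence of [key value] pairs across all lines.
(def pairs
  (for [line  lines
        token (str/split line #"\s+")
        :let  [[k v] (str/split token #":")]]
    [k (Double/parseDouble v)]))

(into {} pairs)
;; => {"A" 1.0, "B" 2.0, "C" 3.0, "D" 4.0}
```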
If the file format is really as simple as you're describing, then something much more simple should suffice:
(apply hash-map (re-seq #"\w+" (slurp "your-file.txt")))
I think it's more readable if you use the ->> threading macro:
(->> "your-file.txt" slurp (re-seq #"\w+") (apply hash-map))
The slurp function reads an entire file into a string. The re-seq function will just return a sequence of all the words in your file (basically the same as splitting on spaces and colons in this case). Now you have a sequence of alternating key-value pairs, which is exactly what hash-map expects...
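The intermediate values can be checked at the REPL with a literal string standing in for the file contents (the string below is made-up sample data):

```clojure
(def text "a:1 b:2")

(def words (re-seq #"\w+" text))
;; words => ("a" "1" "b" "2")

(apply hash-map words)
;; => {"a" "1", "b" "2"}
```

Note that the values are still strings at this point.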
I know this doesn't really answer your question, but you did ask about more idiomatic solutions.
I think @dAni is right, and you're confused about some fundamental concepts of Clojure (e.g. the immutable collections). I'd recommend working through some of the exercises on 4Clojure as a fun way to get more familiar with the language. Each time you solve a problem, you can compare your own solution to others' solutions and see other (possibly more idiomatic) ways to solve the problem.
Sorry, I didn't read your code very thoroughly last night when I was posting my answer. I just realized you actually convert the values to Floats. Here are a few options.
1) partition the sequence of inputs into key/val pairs so that you can map over it. Since you now have a sequence of pairs, you can use into to add them all to a map.
(->> "kvs.txt" slurp (re-seq #"\w+") (partition 2)
     (map (fn [[k v]] [k (Float. v)])) (into {}))
2) Declare an auxiliary map-values function for maps and use that on the result:
(defn map-values [m f]
(into {} (for [[k v] m] [k (f v)])))
(->> "your-file.txt" slurp (re-seq #"\w+")
(apply hash-map) (map-values #(Float. %)))
3) If you don't mind having symbol keys instead of strings, you can safely use the Clojure reader to convert all your keys and values.
(->> "your-file.txt" slurp (re-seq #"\w+")
(map read-string) (apply hash-map))
Note that this is a safe use of read-string because our call to re-seq filters out any hazardous input. However, this will give you longs instead of floats, since numbers like 1 are read as long integers in Clojure.
Does the myhash in the inner for refer to the myhash declared in the let form every time?
Yes.
The let binds myhash to {}, and it is never rebound. myhash is always {}.
assoc returns a modified map, but does not alter myhash.
So the code can be reduced to
(for [line ["A:1 B:2" "C:3 D:4"]
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc {} k (Float. v))))
... which produces the same result:
(({"A" 1.0} {"B" 2.0}) ({"C" 3.0} {"D" 4.0}))
If I do want a list of hash-map like the output, is this the idiomatic way in Clojure?
No.
See @DaoWen's answer.

Clojure- Amending code for different functionality

This line of code:
dates (distinct (map (keyword :cobdate) data))
had to be amended to this line of code
dates (distinct (map #(get % "cobdate") data))
in order to be used in the way I required.
Could anyone tell me how to convert this line of code:
grouped-by-token (group-by :severity data)
in order to make the same conversion?
The first argument to group-by is a function. :severity, in your third sample, is being used as a function, because keywords can be treated as functions: (:severity {:severity 1}) ;; => 1.
Because strings cannot be treated as functions, you must use the alternate syntax to extract the value.
grouped-by-token (group-by #(% "severity") data)
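A quick sketch with made-up data (the maps below are hypothetical, only their string keys come from the question). Using get rather than calling the map directly gives the same result and also tolerates nil entries.

```clojure
(def data [{"severity" "high" "cobdate" "2012-01-01"}
           {"severity" "low"  "cobdate" "2012-01-01"}
           {"severity" "high" "cobdate" "2012-01-02"}])

(def grouped-by-token (group-by #(get % "severity") data))
;; (keys grouped-by-token)           => ("high" "low")
;; (count (grouped-by-token "high")) => 2

(def dates (distinct (map #(get % "cobdate") data)))
;; dates => ("2012-01-01" "2012-01-02")
```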

Clojure apply vs map

I have a sequence (foundApps) returned from a function and I want to map a function over all its elements. For some reason, apply and count work for the sequence but map doesn't:
(apply println foundApps)
(map println rest foundApps)
(map (fn [app] (println app)) foundApps)
(println (str "Found " (count foundApps) " apps to delete"))))
Prints:
{:description another descr, :title apptwo, :owner jim, :appstoreid 1235, :kind App, :key #<Key App(2)>} {:description another descr, :title apptwo, :owner jim, :appstoreid 1235, :kind App, :key #<Key App(4)>}
Found 2 apps to delete for id 1235
So apply seems to happily work for the sequence, but map doesn't. Where am I being stupid?
I have a simple explanation which this post is lacking. Let's imagine an abstract function F and a vector. So,
(apply F [1 2 3 4 5])
translates to
(F 1 2 3 4 5)
which means that F must accept that many arguments; ideally it is variadic.
While
(map F [1 2 3 4 5])
translates to
[(F 1) (F 2) (F 3) (F 4) (F 5)]
which means that F has to accept a single argument, or at least behave that way.
There are some nuances about types, since map actually returns a lazy sequence instead of vector. But for the sake of simplicity, I hope it's pardonable.
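With concrete core functions standing in for the abstract F, the difference looks like this:

```clojure
;; apply: one call, the elements become the arguments.
(apply + [1 2 3 4 5])     ;; same as (+ 1 2 3 4 5) => 15
(apply str ["a" "b" "c"]) ;; same as (str "a" "b" "c") => "abc"

;; map: one call per element, results collected in a lazy sequence.
(map inc [1 2 3 4 5])     ;; => (2 3 4 5 6)
(map str ["a" "b" "c"])   ;; => ("a" "b" "c")
```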
Most likely you're being hit by map's laziness. (map produces a lazy sequence which is only realised when some code actually uses its elements. And even then the realisation happens in chunks, so that you have to walk the whole sequence to make sure it all got realised.) Try wrapping the map expression in a dorun:
(dorun (map println foundApps))
Also, since you're doing it just for the side effects, it might be cleaner to use doseq instead:
(doseq [fa foundApps]
(println fa))
Note that (map println foundApps) should work just fine at the REPL; I'm assuming you've extracted it from somewhere in your code where it's not being forced. There's no such difference with doseq which is strict (i.e. not lazy) and will walk its argument sequences for you under any circumstances. Also note that doseq returns nil as its value; it's only good for side-effects. Finally I've skipped the rest from your code; you might have meant (rest foundApps) (unless it's just a typo).
Also note that (apply println foundApps) will print all the foundApps on one line, whereas (dorun (map println foundApps)) will print each member of foundApps on its own line.
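The laziness effect can be reproduced outside the REPL by capturing the printed output. This is a sketch with a hypothetical found-apps vector standing in for foundApps; with-out-str collects whatever is printed while its body runs.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for the foundApps sequence from the question.
(def found-apps ["app-one" "app-two"])

;; The lazy sequence is created but never consumed, so println never runs.
(def lazy-output
  (with-out-str
    (map println found-apps)))

;; dorun walks the sequence and discards the results.
(def dorun-output
  (with-out-str
    (dorun (map println found-apps))))

;; doseq is strict as well.
(def doseq-output
  (with-out-str
    (doseq [app found-apps]
      (println app))))

lazy-output                    ;; => ""
(str/split-lines dorun-output) ;; => ["app-one" "app-two"]
(str/split-lines doseq-output) ;; => ["app-one" "app-two"]
```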
A little explanation might help. In general you use apply to splat a sequence of elements into a set of arguments to a function. So applying a function to some arguments just means passing them in as arguments to the function, in a single function call.
The map function will do what you want, create a new seq by plugging each element of the input into a function and then storing the output. It does it lazily though, so the values will only be computed when you actually iterate over the list. To force this you can use the (doall my-seq) function, but most of the time you won't need to do that.
If you need to perform an operation immediately because it has side effects, like printing or saving to a database or something, then you typically use doseq.
So to append "foo" to all of your apps (assuming they are strings):
(map (fn [app] (str app "foo")) found-apps)
or using the shorthand for an anonymous function:
(map #(str % "foo") found-apps)
Doing the same but printing immediately can be done with either of these:
(doall (map #(println %) found-apps))
(doseq [app found-apps] (println app))