I have a function developed in Clojure using the flambo Spark API functions:
(:require [flambo.api :as f]
[clojure.string :as s])
(defn get-distinct-column-val
"input = {:col val}"
[ xctx input ]
(let [{:keys [ col ]} input
column-values []
result (f/map (:rdd xctx) (f/fn [row]
(if (s/blank? (get row col))
(assoc column-values (get row col)))))]
(println "Col values: " column-values)
(distinct column-values)))
And I try to print out the values of column-values and I'm getting
Col values: []
Is there any reason as to why this is so?
I tried replacing the println in the above function with this:
(println "Result: " result)
and got the following:
#<JavaRDD MapPartitionsRDD[16] at map at>

Nothing in your code alters the column-values binding. I'm not sure how flambo works in the specifics here, but you should be looking at result, not column-values.
assoc takes three arguments: an associative collection, a key (an index, for a vector), and a value. I suspect that here you actually want conj. Neither assoc nor conj alters the collection it is given; these are immutable data types, so each call returns a new collection.
I expect that accessing result won't yet have the answer you expect, because you expect assoc to build up a value across each call (differently from the result from f/map). In this case, you probably want reduce.
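The two points above (immutability, and accumulating with reduce) can be sketched in plain Clojure, with no Spark or flambo involved. The rows and the :city column are made-up sample data, distinct-column-vals is a hypothetical name, and the sketch assumes the intent is to keep the non-blank values:

```clojure
(require '[clojure.string :as s])

;; conj returns a new vector; the original binding never changes.
(def column-values [])
(conj column-values "a")   ; a new vector containing "a"
column-values              ; still the empty vector

;; Hypothetical sample rows standing in for the RDD contents.
(def rows [{:city "Oslo"} {:city ""} {:city "Rome"} {:city "Oslo"}])

(defn distinct-column-vals
  "Collects the distinct non-blank values of col by threading an
  accumulator (a set) through reduce."
  [rows col]
  (reduce (fn [acc row]
            (let [v (get row col)]
              (if (s/blank? v)
                acc
                (conj acc v))))
          #{}
          rows))

(distinct-column-vals rows :city)
```

On a real RDD the same shape would need the Spark equivalents of these steps, and the value only materializes once an action is run on the RDD.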


clojure-more readable way to write this function?

I have written this function to convert a vector of maps into string. There is a second map called field-name-to-columns which contains a mapping between the field-name and the actual name of columns in my database.
My goal is to get a string like the one in the example, where any field-name that is not a key in field-name-to-columns is ignored. I also want “ DESC” as the default if the :sorting key is empty or missing, or if none of the field-names matches a key in field-name-to-columns.
(def field-name-to-columns {"name" ""
"birthday" "client.birthday"
"last-name" "client.last-name"
"city" ""})
(def request {:sorting [{:field-name "city" :desc true}
{:field-name "country" :desc true}
{:field-name "birthday" :desc false}]})
(defn request-to-string
"this function creates the sorting part of query"
[sorting]
(if (empty? sorting)
(str "" "DESC")
(->> (filter some? (for [{:keys [field-name desc]} sorting]
(when (some? (field-name-to-columns field-name)) (str (field-name-to-columns field-name) (when desc " DESC")))))
(st/join ", "))))
(request-to-string (request :sorting))
=>" DESC, client.birthday"
Any comments on how to make this function more readable would be highly appreciated.
What you've written is very reasonable in my opinion. I'd just add some whitespace for a visual break and tidy up your null handling a bit: it's silly to put nulls into the result sequence and then filter them out, rather than producing only non-nil values.
(defn request-to-string [sorting]
(str/join ", "
(or (seq (for [{:keys [field-name desc]} sorting
:let [column (field-name-to-columns field-name)]
:when column]
(str column (when desc " DESC"))))
[" DESC"])))
I've also moved the str/join up front; this is a stylistic choice most people disagree with me about, but you asked for opinions. I just think it's nice to emphasize that part by putting it up front, since it's an important part of the process, rather than hiding it at the back and making a reader remember the ->> as they read through the body of the function.
I also prefer using or rather than if to choose defaults, but it's not especially beautiful here. I also considered (or (not-empty (join ...)) " DESC"). You might prefer either of these options, or your own choice, but I thought you'd like to see alternate approaches.
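For comparison, here is a sketch of the not-empty variant mentioned above. It relies on clojure.core/not-empty returning nil for an empty string, so or falls through to the default. The field-name-to-columns map is copied from the question, and the str alias is an assumption:

```clojure
(require '[clojure.string :as str])

(def field-name-to-columns {"name" ""
                            "birthday" "client.birthday"
                            "last-name" "client.last-name"
                            "city" ""})

(defn request-to-string [sorting]
  ;; join first, then fall back to " DESC" when the joined string is empty
  (or (not-empty
        (str/join ", "
                  (for [{:keys [field-name desc]} sorting
                        :let [column (field-name-to-columns field-name)]
                        :when column]
                    (str column (when desc " DESC")))))
      " DESC"))

(request-to-string [{:field-name "city" :desc true}
                    {:field-name "country" :desc true}
                    {:field-name "birthday" :desc false}])
```

One difference from the seq-based version: here the default also kicks in when every matched column name is empty and no entry is descending, because the joined string is then empty too.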
Here is one idea, based on my favorite template project.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require [tupelo.string :as str]))
(def field-name->columns {"name" ""
"birthday" "client.birthday"
"last-name" "client.last-name"
"city" ""})
(defn field->string
[{:keys [field-name desc]}]
; give names to intermediate and/or temp values
(let [col-name (field-name->columns field-name)]
(when (some? col-name)
(str col-name
(when desc " DESC")))))
(defn request->string
"this function creates the sorting part of query"
[sorting]
; accept only valid input
(when-not sorting ; WAS: (str "" "DESC")
(throw (ex-info "sorting array required, value=" {:sorting sorting})))
; give names to intermediate values
(let [field-strs (filter some?
(for [entry sorting]
(field->string entry)))
result (str/join ", " field-strs)]
result))
and unit tests
(dotest
(is= (field->string {:field-name "city", :desc true}) " DESC")
(is= (field->string {:field-name "country", :desc true}) nil)
(is= (field->string {:field-name "birthday", :desc false}) "client.birthday")
(let [sorting [{:field-name "city" :desc true}
{:field-name "country" :desc true}
{:field-name "birthday" :desc false}]]
(is= (spyx-pretty (request->string sorting))
" DESC, client.birthday")))
I prefer the (->> (map ...) (filter ...)) pattern over the for macro:
(defn request-to-string [sorting]
(or (->> sorting
(map (fn [{:keys [field-name desc]}]
[(field-name-to-columns field-name)
(when desc " DESC")]))
(filter first)
(map #(apply str %))
(clojure.string/join ", ")
not-empty) ; an empty join result becomes nil, so the default applies
" DESC"))

clojure: filtering a vector of maps by keys existence and values

I have a vector of maps like this one
(def map1
[{:name "name1"
:field "xxx"}
{:name "name2"
:requires {"element1" 1}}
{:name "name3"
:consumes {"element2" 1 "element3" 4}}])
I'm trying to define a function that takes in a map like {"element1" 1 "element3" 6} (i.e. with n fields, or {}) and filters the maps in map1. It should return only the maps that either have no :requires and :consumes, or whose numbers under those keys are lower than the number associated with the same key in the provided map (if the provided map doesn't have that key, the map is not returned),
but I'm failing to grasp how to approach the maps recursive loop and filtering
(defn getV [node nodes]
(defn filterType [type nodes]
(filter (fn [x] (if (contains? x type)
false ; filter for key values here
true)) nodes))
(filterType :requires (filterType :consumes nodes)))
There are two ways to look at problems like this: from the outside in or from the inside out. Naming things carefully can really help when working with nested structures. For example, calling a vector of maps map1 may be adding to the confusion.
Starting from the outside, you need a predicate function for filtering the list. This function will take a map as a parameter and will be used by a filter function.
(defn comparisons [m] ...)
(filter comparisons map1)
I'm not sure I understand the comparisons precisely, but there seem to be at least two flavors. The first is looking for maps that do not have :requires or :consumes keys.
(defn no-requires-or-consumes [m] ...)
(defn all-keys-higher-than-values [m] ...)
(defn comparisons [m]
(some #(% m) [no-requires-or-consumes all-keys-higher-than-values]))
Then it's a matter of defining the individual comparison functions
(defn no-requires-or-consumes [m]
(and (not (:requires m)) (not (:consumes m))))
The second is more complicated. It operates on one or two inner maps but the behaviour is the same in both cases so the real implementation can be pushed down another level.
(defn all-keys-higher-than-values [m]
(every? keys-higher-than-values [(:requires m) (:consumes m)]))
The crux of the comparison is looking at the number in the key part of the map vs the value. Pushing the details down a level gives:
(defn keys-higher-than-values [m]
(every? #(>= (number-from-key %) (get m %)) (keys m)))
Note: I chose >= here so that the second entry in the sample data will pass.
That leaves only pulling the number out of the key string. How to do that is covered in the question "In Clojure how can I convert a String to a number?"
(defn number-from-key [s]
(read-string (re-find #"\d+" s)))
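As an aside, read-string invokes the full Clojure reader, which is more machinery than digit parsing needs. A narrower sketch, assuming the keys always look like "element1", uses Java's Long/parseLong instead; returning nil when no digits are found is a choice made here, not something the original answer specifies:

```clojure
(defn number-from-key
  "Extracts the first run of digits from s as a long, or nil if there is none."
  [s]
  (some-> (re-find #"\d+" s)
          Long/parseLong))

(number-from-key "element12")
(number-from-key "no-digits")
```

Callers that compare the result with >= would need to handle the nil case themselves.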
Stringing all these together and running against the example data returns the first and second entries.
Putting everything together:
(defn no-requires-or-consumes [m]
(and (not (:requires m)) (not (:consumes m))))
(defn number-from-key [s]
(read-string (re-find #"\d+" s)))
; keys-higher-than-values must be defined before the function that calls it
(defn keys-higher-than-values [m]
(every? #(>= (number-from-key %) (get m %)) (keys m)))
(defn all-keys-higher-than-values [m]
(every? keys-higher-than-values [(:requires m) (:consumes m)]))
(defn comparisons [m]
(some #(% m) [no-requires-or-consumes all-keys-higher-than-values]))
(filter comparisons map1)

destructuring in clojure - embedded maps

I encountered the below destructuring in a ring handler function -
[{{:keys [params remote]} :params :as request}]
It's strange, as this is the first time I have seen two levels of braces. Does Clojure support n levels in destructuring? I am assuming that in the above, the :params map is being destructured into [params remote]?
Yes, Clojure supports destructuring nested data structures, and binding forms can be nested as deeply as the data is. Here's a simple example of destructuring a map, where one of the two keys has a vector for its corresponding value:
(let [{[x y] :pos c :color}
{:color "blue" :pos [1 2]}]
[x y c])
Your example does more than that, though, since it also uses the :keys directive, which binds local names that match the listed keys of the map. The following are equivalent:
(let [{{:keys [params remotes]} :params}
{:params {:params "PARAMS" :remotes "REMOTES"}}]
[remotes params])
(let [{{params :params remotes :remotes} :params}
{:params {:params "PARAMS" :remotes "REMOTES"}}]
[remotes params])
Both evaluate to ["REMOTES" "PARAMS"].
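To tie this back to the handler signature in the question, here is a small sketch. The request map is made-up sample data rather than a real Ring request, and handler is a hypothetical function; :as request keeps the whole map available next to the destructured parts. The second form shows three levels of nesting:

```clojure
(defn handler
  "Pulls params and remote out of the nested :params map, and keeps
  the full request map bound as request."
  [{{:keys [params remote]} :params :as request}]
  {:params params
   :remote remote
   :uri    (:uri request)})

(handler {:uri "/search"
          :params {:params "q=clojure" :remote "10.0.0.1"}})

;; Three levels deep: map -> map -> map.
(let [{{{:keys [city]} :address} :user}
      {:user {:address {:city "Oslo"}}}]
  city)
```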

Looping If And Until Result Found and Then Exiting

I do not think the key-pres? function below is working the way I expect it to. First, here is the input data.
Data from which cmp-val derived:
["2" "000-00-0000" "TOAST" "FRENCH" "" "M" "26-Aug-99" "" "ALL CARE PLAN" "MEDICAL"]
Data that is missing the key (ssn).
["000-00-0000" "TOAST " "FRENCH " "RE-PART B - INSURED "]
The problem is, if I make one of the input's 000-00-0000 something else, I should
see that conj'd onto a log vector. I don't, and I don't see it printed with the if-not empty?.
(defn is-a-in-b
"This is a helper function that takes a value, a column index, and a
returned clojure-csv row (vector), and checks to see if that value
is present. Returns value or nil if not present."
[cmp-val col-idx csv-row]
(let [csv-row-val (nth csv-row col-idx nil)]
(if (= cmp-val csv-row-val)
cmp-val
nil))))
(defn key-pres?
"Accepts a value, like an index, and output from clojure-csv, and looks
to see if the value is in the sequence at the index. Given clojure-csv
returns a vector of vectors, will loop around until and if the value
is found."
[cmp-val cmp-idx csv-data]
(let [ret-rc
(for [csv-row csv-data
:let [rc (is-a-in-b cmp-val cmp-idx csv-row)]
:when (true? (is-a-in-b cmp-val cmp-idx csv-row))]
rc)]
(vec ret-rc)))
(defn test-key-inclusion
"Accepts csv-data file and an index, a second csv-data param and an index,
and searches the second csv-data instances' rows (at index) to see if
the first file's data is located in the second csv-data instance."
[csv-data1 pkey-idx1 csv-data2 pkey-idx2]
(fn [out-log csv-row1]
(let [cmp-val (nth csv-row1 pkey-idx1 nil)]
(doseq [csv-row2 csv-data2]
(let [temp-rc (key-pres? cmp-val pkey-idx2 csv-row2)]
(if-not (empty? temp-rc)
(println cmp-val, " ", (nth csv-row2 pkey-idx2 nil), " ", temp-rc))
(if (nil? temp-rc)
(conj out-log cmp-val))))))
What I want the function to do is traverse data returned by clojure-csv (a vector of vectors). If cmp-val can be found at the cmp-idx location in csv-row, I'd like that
assigned to rc, and the loop to terminate.
How can I fix the for loop, and if not, what looping mechanism can I use to accomplish this?
Thank you.
you don't need true?, it specifically checks for the boolean true value;
don't repeat the call to is-a-in-b;
it would be more idiomatic (and readable) to use a-in-b? as the fn name;
I suggest simplifying the code, you don't really need that let.
(vec (for [csv-row csv-data
:let [rc (a-in-b? cmp-val cmp-idx csv-row)]
:when rc]
rc))
But these are just some general comments on code style... what you're implementing here is just a simple filter:
(vec (filter #(a-in-b? cmp-val cmp-idx %) csv-data))
Furthermore, this will return not only the first, but all matches. If I read your question right, you just need to find the first match? Then use some:
(some #(a-in-b? cmp-val cmp-idx %) csv-data)
Rereading your question I get the feeling that you consider for to be a loop construct. It's not -- it's a list comprehension, producing a lazy seq. To write a loop where you control when to iterate, you must use loop-recur. But in Clojure you'll almost never need to write your own loops, except for performance. In all other cases you compose higher-order functions from clojure.core.
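A short sketch of the difference, using made-up rows and an a-in-b? written to match the docstring in the question (it returns the value or nil). first-match-row and first-match-row-loop are hypothetical names. The first stops at the first hit via some; the second does the same with an explicit loop-recur:

```clojure
(defn a-in-b?
  "Returns cmp-val if it sits at col-idx in csv-row, else nil."
  [cmp-val col-idx csv-row]
  (when (= cmp-val (nth csv-row col-idx nil))
    cmp-val))

;; Hypothetical sample data in the shape clojure-csv returns.
(def csv-data [["1" "111-11-1111" "EGGS"]
               ["2" "000-00-0000" "TOAST"]
               ["3" "000-00-0000" "FRENCH"]])

(defn first-match-row
  "Returns the first row containing cmp-val at col-idx, or nil."
  [cmp-val col-idx rows]
  (some #(when (a-in-b? cmp-val col-idx %) %) rows))

(defn first-match-row-loop
  "Same result as first-match-row, written with loop-recur."
  [cmp-val col-idx rows]
  (loop [remaining (seq rows)]
    (when remaining
      (let [row (first remaining)]
        (if (a-in-b? cmp-val col-idx row)
          row
          (recur (next remaining)))))))

(first-match-row "000-00-0000" 1 csv-data)
(first-match-row-loop "999-99-9999" 1 csv-data)
```

In both versions a nil result means "key not found", which is the case the question wants to log.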

Accessing a map inside a list in Clojure

Here's the code :
(def entry {:name tempName :num tempNum})
(def tempList '(entry))
(println (get (nth tempList 0) (:name)))
Exception in thread "main" java.lang.IllegalArgumentException: Wrong number of args passed to keyword: :name
In this bit of code, I define a map called entry containing a :name and a :num, then I put it in a list, then I try to print the :name field of the first (and only) element of the list. (or at least this is what I think my code does :o)
I can access name from the entry map before I put it in the list, but once it's in the list I get this error. What args am I supposed to give ?
There are two problems.
First, a quoted list holds the symbol entry itself, not the map that entry names. To get the value into the list, either build the list with list, or use syntax-quote (backtick) together with unquote (~); so this line:
(def tempList '(entry))
should be:
(def tempList `(~entry))
or just (using a vector, which is more idiomatic and easier to use in Clojure):
(def tempList [entry]) ; no quoting needed for vectors
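A quick way to see the difference at the REPL; the "Ann" and 42 values are placeholders standing in for tempName and tempNum, and quoted, built, and as-vector are hypothetical names:

```clojure
(def entry {:name "Ann" :num 42})   ; placeholder values

(def quoted '(entry))               ; a list holding the symbol entry
(symbol? (first quoted))            ; the element is a symbol, not a map
(:name (first quoted))              ; keyword lookup on a symbol gives nil

(def built (list entry))            ; a list holding the map itself
(:name (first built))

(def as-vector [entry])             ; vector literal, elements are evaluated
(get-in as-vector [0 :name])
```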
Then, change this line
(println (get (nth tempList 0) (:name)))
to either this:
(println (get (nth tempList 0) :name))
or this:
(println (:name (nth tempList 0)))
Using nth on a list is a bad idea because it has to do a linear search to retrieve your element, every time. Vectors are the right collection type to use here.
Vectors are "maps" of indices to values. If you use a vector instead of a list you can do this:
(:name (tempList 0))
(get (get tempList 0) :name)
(get-in tempList [0 :name])
Take the ( ) off (:name) on the 3rd line.
:keywords are functions that take a map as an argument and "look themselves up", which is quite handy, though it makes the error slightly more confusing in this case.
(get (nth '({:name "asdf"}) 0) :name)
I would write your code like this:
(def entry {:name tempName :num tempNum})
(def tempList (list entry))
(println (:name (first tempList)))
Note that first is much neater than using nth, and keywords can act as functions to look themselves up in the map. Another equivalent approach is to compose the functions and apply them to the list:
((comp println :name first) tempList)