clojure iteratively destructure array of strings - clojure

I'm learning Clojure, and I want to take a vector of names for multiple people, in last name -> first name order, and convert it to a vector of maps...
["Pan" "Peter" "Mouse" "Mickey"]
Should become...
[{:firstName Peter, :lastName Pan} {:firstName Mickey, :lastName Mouse}]
I've tried this, which doesn't work...
(for [[lastName firstName]
(list ["Pan" "Peter" "Mouse" "Mickey"])]
{:firstName firstName, :lastName lastName}
)
If I remove the list it turns the first/last name into individual characters.
I'm at a complete loss as to how to go about doing this.

You can convert your input vector into a sequence of pairs with partition:
(def names ["Pan" "Peter" "Mouse" "Mickey"])
(partition 2 names)
You can convert a pair of lastName/firstName into a map using zipmap e.g.
(zipmap [:lastName :firstName] ["Pan" "Peter"])
You can convert the sequences of pairs into a sequence of maps using map:
(map #(zipmap [:lastName :firstName] %) (partition 2 names))
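Putting the three steps together, a minimal sketch might look like the following. The helper name names->maps is made up for illustration, and mapv is used only so that the result is a vector, as in the question.

```clojure
(def names ["Pan" "Peter" "Mouse" "Mickey"])

(defn names->maps
  "Hypothetical helper: turns [last first last first ...] into a vector of maps."
  [names]
  (mapv #(zipmap [:lastName :firstName] %)
        (partition 2 names)))

(names->maps names)
;; => [{:lastName "Pan", :firstName "Peter"} {:lastName "Mouse", :firstName "Mickey"}]
```

Note that partition silently drops a trailing unpaired element; partition-all would keep it, giving a final map with only :lastName.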

Your first attempt was also pretty close. An example:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(let [pairs (partition 2 ["Pan" "Peter" "Mouse" "Mickey"])
result (for [[lastName firstName] pairs]
{:firstName firstName, :lastName lastName})]
(is= result [{:firstName "Peter", :lastName "Pan"}
{:firstName "Mickey", :lastName "Mouse"}])))
I usually prefer using for instead of map or mapv since it allows you to spread out the code a bit more. I think this is a bit more explicit to the reader.
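For context, here is a small sketch of why the two attempts described in the question behave the way they do. It only uses the forms from the question; the shortened second input is an assumption made to keep the output small.

```clojure
;; Wrapping the vector in `list` gives a sequence with ONE element (the whole
;; vector), so the destructuring only ever sees its first two strings.
(for [[lastName firstName] (list ["Pan" "Peter" "Mouse" "Mickey"])]
  {:firstName firstName, :lastName lastName})
;; => ({:firstName "Peter", :lastName "Pan"})

;; Without `list`, each element is a single string, and destructuring a
;; string binds its individual characters.
(for [[lastName firstName] ["Pan" "Peter"]]
  {:firstName firstName, :lastName lastName})
;; => ({:firstName \a, :lastName \P} {:firstName \e, :lastName \P})
```

Partitioning first, as above, is what turns the flat vector into a sequence of two-element pairs that the destructuring form expects.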

Clojure Spec is perfect for these sorts of things. In your case, you would use the sequence operators to define a spec for your array, and then use conform:
(ns playground.arrayofstrings
(:require [clojure.spec.alpha :as spec]))
(spec/def ::name-pairs (spec/* (spec/cat :lastName string?
:firstName string?)))
(spec/conform ::name-pairs ["Pan" "Peter" "Mouse" "Mickey"])
;; => [{:lastName "Pan", :firstName "Peter"} {:lastName "Mouse", :firstName "Mickey"}]
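One side benefit of the spec approach is that malformed input can be detected rather than silently truncated. A minimal sketch, assuming Clojure 1.9 or later so that clojure.spec.alpha is available:

```clojure
(require '[clojure.spec.alpha :as spec])

(spec/def ::name-pairs (spec/* (spec/cat :lastName string?
                                         :firstName string?)))

;; A complete set of pairs is valid (and so is the empty vector).
(spec/valid? ::name-pairs ["Pan" "Peter" "Mouse" "Mickey"]) ;; => true
(spec/valid? ::name-pairs [])                               ;; => true

;; An odd number of names, or a non-string, is rejected.
(spec/valid? ::name-pairs ["Pan" "Peter" "Mouse"])          ;; => false
(spec/valid? ::name-pairs ["Pan" 42])                       ;; => false
```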

Related

clojure-more readable way to write this function?

I have written this function to convert a vector of maps into string. There is a second map called field-name-to-columns which contains a mapping between the field-name and the actual name of columns in my database.
My goal is to get a string like in the example where if the key is not present in the field-name-to-columns be ignored. Plus I want to have “client.name DESC” as a default if the :sorting key is empty or missing or none of the field-names matches any key in field-name-to-columns.
(def field-name-to-columns {"name" "client.name"
"birthday" "client.birthday"
"last-name" "client.last-name"
"city" "client.city"})
(def request {:sorting [{:field-name "city" :desc true}
{:field-name "country" :desc true}
{:field-name "birthday" :desc false}]})
(defn request-to-string
"this function creates the sorting part of query"
[sorting]
(if (empty? sorting)
(str "client.name" "DESC")
(->> (filter some? (for [{:keys [field-name desc]} sorting]
(when (some? (field-name-to-columns field-name)) (str (field-name-to-columns field-name) (when desc " DESC")))))
(st/join ", "))))
(request-to-string (request :sorting))
=>"client.city DESC, client.birthday"
Any comments on how to make this function more readable would be highly appreciated.
What you've written is very reasonable in my opinion. I'd just add some whitespace for a visual break and tidy up your null handling a bit: it's silly to put nulls into the result sequence and then filter them out, rather than producing only non-nil values.
(defn request-to-string [sorting]
(str/join ", "
(or (seq (for [{:keys [field-name desc]} sorting
:let [column (field-name-to-columns field-name)]
:when column]
(str column (when desc " DESC"))))
["client.name DESC"])))
I've also moved the str/join up front; this is a stylistic choice most people disagree with me about, but you asked for opinions. I just think it's nice to emphasize that part by putting it up front, since it's an important part of the process, rather than hiding it at the back and making a reader remember the ->> as they read through the body of the function.
I also prefer using or rather than if to choose defaults, but it's not especially beautiful here. I also considered (or (not-empty (str/join ...)) "client.name DESC"). You might prefer either of these options, or your own choice, but I thought you'd like to see alternate approaches.
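For comparison, the not-empty alternative mentioned above could be sketched like this. It assumes the field-name-to-columns map from the question and clojure.string aliased as str; not-empty returns nil for an empty string, which lets or fall through to the default.

```clojure
(require '[clojure.string :as str])

(def field-name-to-columns {"name"      "client.name"
                            "birthday"  "client.birthday"
                            "last-name" "client.last-name"
                            "city"      "client.city"})

(defn request-to-string [sorting]
  (or (not-empty
       (str/join ", "
                 (for [{:keys [field-name desc]} sorting
                       :let [column (field-name-to-columns field-name)]
                       :when column]                 ; skip unknown field names
                   (str column (when desc " DESC")))))
      "client.name DESC"))                           ; default when nothing matched

(request-to-string [{:field-name "city" :desc true}
                    {:field-name "country" :desc true}
                    {:field-name "birthday" :desc false}])
;; => "client.city DESC, client.birthday"
(request-to-string [])
;; => "client.name DESC"
```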
Here is one idea, based on my favorite template project.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[tupelo.string :as str]))
(def field-name->columns {"name" "client.name"
"birthday" "client.birthday"
"last-name" "client.last-name"
"city" "client.city"})
(defn field->string
[{:keys [field-name desc]}]
; give names to intermediate and/or temp values
(let [col-name (field-name->columns field-name)]
(when (some? col-name)
(str col-name
(when desc " DESC")))))
(defn request->string
"this function creates the sorting part of query"
[sorting]
; accept only valid input
(when-not sorting ; WAS: (str "client.name" "DESC")
(throw (ex-info "sorting array required, value=" {:sorting sorting})))
; give names to intermediate values
(let [field-strs (filter some?
(for [entry sorting]
(field->string entry)))
result (str/join ", " field-strs)]
result))
and unit tests
(verify
(is= (field->string {:field-name "city", :desc true}) "client.city DESC")
(is= (field->string {:field-name "country", :desc true}) nil)
(is= (field->string {:field-name "birthday", :desc false}) "client.birthday")
(let [sorting [{:field-name "city" :desc true}
{:field-name "country" :desc true}
{:field-name "birthday" :desc false}]]
(is= (spyx-pretty (request->string sorting))
"client.city DESC, client.birthday")))
I prefer the (->> (map ...) (filter ...)) pattern over the for macro:
(defn request-to-string [sorting]
(or (->> sorting
(map (fn [{:keys [field-name desc]}]
[(field-name-to-columns field-name)
(when desc " DESC")]))
(filter first)
(map #(apply str %))
(clojure.string/join ", ")
not-empty)
"client.name DESC"))

How can I filter JSON data by a given date in Clojure?

I have many JSON objects, and I am trying to filter those objects by the date. These objects are being parsed from several JSON files using Cheshire.core, meaning that the JSON objects are in a collection. The date is being passed in in the following format "YYYY-MM-DD" (eg. 2015-01-10). I have tried using the filter and contains? functions to do this, but I am having no luck so far. How can I filter these JSON objects by my chosen date?
Current Clojure code:
(def filter-by-date?
(fn [orders-data date-chosen]
(contains? (get (get orders-data :date) :date) date-chosen)))
(prn (filter (filter-by-date? orders-data "2017-12-25")))
Example JSON object:
{
"id":"05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
"date":{
"date":"2015-01-10T19:11:41.000Z"
},
"total":{
"GBP":57.45
}
}
JSON after parsing with Cheshire:
[({:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}}) ({:id "325bd04-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-02-23T10:15:14.000Z"},
:total {:GBP 32.90}})]
First, I'm going to assume you've parsed the JSON into something like this:
(def parsed-JSON {:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f",
:date {:date "2015-01-10T19:11:41.000Z"},
:total {:GBP 57.45}})
The main problem is the fact that the date as stored in the JSON contains time information, so you aren't going to be able to check it directly using equality.
You can get around this by using clojure.string/starts-with? to check for prefixes. I'm using s/ here as an alias for clojure.string:
(defn filter-by-date [date jsons]
(filter #(s/starts-with? (get-in % [:date :date]) date)
jsons))
You were close, but I made a few changes:
You can't use contains? like that. From the docs of contains?: Returns true if key is present in the given collection, otherwise returns false. It can't be used to check for substrings; it's used to test for the presence of a key in a collection.
Use -in postfix versions to access nested structures instead of using multiple calls. I'm using (get-in ...) here instead of (get (get ...)).
You're using (def ... (fn [])) which makes things more complicated than they need to be. This is essentially what defn does, although defn also adds some more stuff as well.
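To make the first two points concrete, here is a short REPL-style sketch. The order value is a trimmed copy of the parsed example, and s is assumed to be an alias for clojure.string.

```clojure
(require '[clojure.string :as s])

(def order {:id   "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f"
            :date {:date "2015-01-10T19:11:41.000Z"}})

;; contains? asks "is this KEY present?", not "is this value or substring in there?"
(contains? order :date)          ;; => true
(contains? [10 20 30] 1)         ;; => true  (index 1 exists)
(contains? [10 20 30] 10)        ;; => false (there is no index 10)

;; get-in walks the nested maps in one call
(get-in order [:date :date])     ;; => "2015-01-10T19:11:41.000Z"

;; a prefix test is what matches a "YYYY-MM-DD" day against the timestamp
(s/starts-with? (get-in order [:date :date]) "2015-01-10") ;; => true
(s/starts-with? (get-in order [:date :date]) "2017-12-25") ;; => false
```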
To address the new information, you can just flatten the nested sequences containing the JSONs first:
(->> nested-json-colls ; The data at the bottom of the question
(flatten)
(filter-by-date "2015-01-10"))
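As a self-contained sketch of that pipeline, the following uses data shaped like the parsed output in the question. The name nested-json-colls is a placeholder, and the :total entries are omitted for brevity.

```clojure
(require '[clojure.string :as s])

(defn filter-by-date [date jsons]
  (filter #(s/starts-with? (get-in % [:date :date]) date)
          jsons))

;; A vector of one-element lists, as in the question
(def nested-json-colls
  [(list {:id "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f"
          :date {:date "2015-01-10T19:11:41.000Z"}})
   (list {:id "325bd04-b3f6-46d1-a0f9-dbdab7e0261f"
          :date {:date "2015-02-23T10:15:14.000Z"}})])

(->> nested-json-colls
     flatten                      ; maps are not sequential, so they survive intact
     (filter-by-date "2015-01-10")
     (map :id))
;; => ("05d8d404-b3f6-46d1-a0f9-dbdab7e0261f")
```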
#!/usr/bin/env boot
(defn deps [new-deps]
(merge-env! :dependencies new-deps))
(deps '[[org.clojure/clojure "1.9.0"]
[cheshire "5.8.0"]])
(require '[cheshire.core :as json]
'[clojure.string :as str])
(def orders-data-str
"[{
\"id\":\"987654\",
\"date\":{
\"date\":\"2015-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":57.45
}
},
{
\"id\":\"123456\",
\"date\":{
\"date\":\"2016-01-10T19:11:41.000Z\"
},
\"total\":{
\"GBP\":23.15
}
}]")
(def orders (json/parse-string orders-data-str true))
(def ret (filter #(clojure.string/includes? (get-in % [:date :date]) "2015-01-") orders))
(println ret) ; ({:id 987654, :date {:date 2015-01-10T19:11:41.000Z}, :total {:GBP 57.45}})
You can convert the date string to Date object using any DateTime library like joda-time and then do a proper filter if required.
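As one possible illustration of that idea, the sketch below uses the JDK's built-in java.time classes instead of Joda-Time (an assumption: it requires Java 8 or later). The names order-day and filter-by-day are hypothetical, and the comparison is done on the calendar day in UTC.

```clojure
(import '(java.time Instant LocalDate ZoneOffset))

(defn order-day
  "Hypothetical helper: the UTC calendar day of an order's timestamp."
  [order]
  (-> (get-in order [:date :date])
      Instant/parse                 ; parses ISO-8601 instants such as 2015-01-10T19:11:41.000Z
      (.atZone ZoneOffset/UTC)
      .toLocalDate))

(defn filter-by-day
  "Keeps the orders whose timestamp falls on `day` (a YYYY-MM-DD string)."
  [orders day]
  (let [target (LocalDate/parse day)]
    (filter #(= target (order-day %)) orders)))

(def orders [{:id "987654" :date {:date "2015-01-10T19:11:41.000Z"}}
             {:id "123456" :date {:date "2016-01-10T19:11:41.000Z"}}])

(map :id (filter-by-day orders "2015-01-10"))
;; => ("987654")
```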
clj-time has functions for parsing strings and comparing date-time objects. So you could do something like:
(ns filter-by-time-example
(:require [clj-time.coerce :as tc]
[clj-time.core :as t]))
(def objs [{"id" nil
"date" {"date" "2015-01-12T19:11:41.000Z"}
"total" nil}
{"id" "05d8d404-b3f6-46d1-a0f9-dbdab7e0261f"
"date" {"date" "2015-01-10T19:11:41.000Z"}
"total" {"GBP" 57.45}}
{"id" nil
"date" {"date" "2015-01-11T19:11:41.000Z"}
"total" nil}])
(defn filter-by-day
[objs y m d]
(let [start (t/date-time y m d)
end (t/plus start (t/days 1))]
(filter #(->> (get-in % ["date" "date"])
tc/from-string
(t/within? start end)) objs)))
(clojure.pprint/pprint (filter-by-day objs 2015 1 10)) ;; Returns second obj
If you're going to repeatedly do this (e.g. for multiple days) you could parse all dates in your collection into date-time objects with
(map #(update-in % ["date" "date"] tc/from-string) objs)
and then just work with that collection to avoid repeating the parsing step.
(ns filter-by-time-example
  (:require [clj-time.format :as f]
            [clj-time.core :as t]
            [cheshire.core :as cheshire]))
(->> json-coll
     (map (fn [json] (cheshire/parse-string json true)))
     ;; parse the ISO-8601 string into a DateTime under a temporary key
     (map (fn [record] (assoc record :dt-date (f/parse (get-in record [:date :date])))))
     ;; keep only the records dated before the chosen day
     (filter (fn [record] (t/after? (f/parse "2017-12-25") (:dt-date record))))
     (map (fn [record] (dissoc record :dt-date))))
Maybe something like this? You might need to change the filter for your use case, but since :dt-date is now a Joda-Time DateTime, you can use all the clj-time predicates.

How do I iterate through a nested dict/hash-map in Clojure to custom-flatten/transform my data structure?

I have something that looks like this:
{:person-123 {:xxx [1 5]
:zzz [2 3 4]}
:person-456 {:yyy [6 7]}}
And I want to transform it so it looks like this:
[{:person "123" :item "xxx"}
{:person "123" :item "zzz"}
{:person "456" :item "yyy"}]
This is a flatten-like problem, and I know I can convert the keywords into strings by calling name on them, but I couldn't come across a convenient way to do this.
This is how I did it, but it seems inelegant (nested for loops, I'm looking at you):
(require '[clojure.string :refer [split]])
(into []
(apply concat
(for [[person nested-data] input-data]
(for [[item _] nested-data]
{:person (last (split (name person) #"person-"))
:item (name item)}))))
Your solution is not too bad. As for the nested for loops, for actually supports nested iteration in a single form, so you could write it as:
(vec
(for [[person nested-data] input-data
[item _] nested-data]
{:person (last (clojure.string/split (name person) #"person-"))
:item (name item)}))
Personally, I tend to use for exclusively for that purpose (nested loops); otherwise I am usually more comfortable with map et al. But that's just a personal preference.
I also very much agree with @amalloy's comment on the question: I would put some effort into having a better-looking map structure to begin with.
(let [x {:person-123 {:xxx [1 5]
:zzz [2 3 4]}
:person-456 {:yyy [6 7]}}]
(clojure.pprint/pprint
(mapcat
(fn [[k v]]
(map (fn [[k1 v1]]
{:person (clojure.string/replace (name k) #"person-" "") :item (name k1)}) v))
x))
)
I am not sure if there is a single higher-order function, at least in the core, that does what you want in one go.
On the other hand, similar methods exist in the GNU R reshape library, which, by the way, has been recreated for Clojure:
https://crossclj.info/ns/grafter/0.8.6/grafter.tabular.melt.html#_melt-column-groups which might interest you.
This is how it works in GNU R: http://www.statmethods.net/management/reshape.html
Lots of good solutions so far. All I would add is a simplification with keys:
(vec
(for [[person nested-data] input-data
item (map name (keys nested-data))]
{:person (clojure.string/replace-first
(name person)
#"person-" "")
:item item}))
Note btw the near universal preference for replace over last/split. Guessing the spirit of the transformation is "lose the leading person- prefix", replace says that better. If OTOH the spirit is "find the number and use that", a bit of regex to isolate the digits would be truer.
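If the intent really is "find the number and use that", a sketch with re-find could look like the following. The regex #"\d+" is an assumption about what the person keys look like; re-find returns nil when a key contains no digits.

```clojure
(def input-data {:person-123 {:xxx [1 5]
                              :zzz [2 3 4]}
                 :person-456 {:yyy [6 7]}})

(vec
 (for [[person nested-data] input-data
       item (keys nested-data)]
   {:person (re-find #"\d+" (name person)) ; first run of digits in the key name
    :item   (name item)}))
;; => [{:person "123", :item "xxx"}
;;     {:person "123", :item "zzz"}
;;     {:person "456", :item "yyy"}]
```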
(reduce-kv (fn [ret k v]
(into ret (map (fn [v-k]
{:person (last (str/split (name k) #"-"))
:item (name v-k)})
(keys v))))
[]
{:person-123 {:xxx [1 5] :zzz [2 3 4]}
:person-456 {:yyy [6 7]}})
=> [{:person "123", :item "xxx"}
{:person "123", :item "zzz"}
{:person "456", :item "yyy"}]
Here are three solutions.
The first solution uses Python-style lazy generator functions via lazy-gen and yield functions from the Tupelo library. I think this method is the simplest since the inner loop produces maps and the outer loop produces a sequence. Also, the inner loop can run zero, one, or multiple times for each outer loop. With yield you don't need to think about that part.
(ns tst.clj.core
(:use clj.core clojure.test tupelo.test)
(:require
[clojure.string :as str]
[clojure.walk :as walk]
[clojure.pprint :refer [pprint]]
[tupelo.core :as t]
[tupelo.string :as ts]
))
(t/refer-tupelo)
(def data
{:person-123 {:xxx [1 5]
:zzz [2 3 4]}
:person-456 {:yyy [6 7]}})
(defn reformat-gen [data]
(t/lazy-gen
(doseq [[outer-key outer-val] data]
(let [int-str (str/replace (name outer-key) "person-" "")]
(doseq [[inner-key inner-val] outer-val]
(let [inner-key-str (name inner-key)]
(t/yield {:person int-str :item inner-key-str})))))))
If you really want to be "pure", the following is another solution. However, with this solution I made a couple of errors and required many, many debug printouts to fix. This version uses tupelo.core/glue instead of concat since it is "safer" and verifies that the collections are all maps, all vectors/list, etc.
(defn reformat-glue [data]
(apply t/glue
(forv [[outer-key outer-val] data]
(let [int-str (str/replace (name outer-key) "person-" "")]
(forv [[inner-key inner-val] outer-val]
(let [inner-key-str (name inner-key)]
{:person int-str :item inner-key-str}))))))
Both methods give the same answer:
(newline) (println "reformat-gen:")
(pprint (reformat-gen data))
(newline) (println "reformat-glue:")
(pprint (reformat-glue data))
reformat-gen:
({:person "123", :item "xxx"}
{:person "123", :item "zzz"}
{:person "456", :item "yyy"})
reformat-glue:
[{:person "123", :item "xxx"}
{:person "123", :item "zzz"}
{:person "456", :item "yyy"}]
If you wanted to be "super-pure", here is a third solution (although I think this one is trying too hard!). Here we use the ability of the for macro to have nested elements in a single expression. for can also embed let expressions inside itself, although here that leads to duplicate evaluation of int-str.
(defn reformat-overboard [data]
(for [[outer-key outer-val] data
[inner-key inner-val] outer-val
:let [int-str (str/replace (name outer-key) "person-" "") ; duplicate evaluation
inner-key-str (name inner-key)]]
{:person int-str :item inner-key-str}))
(newline)
(println "reformat-overboard:")
(pprint (reformat-overboard data))
reformat-overboard:
({:person "123", :item "xxx"}
{:person "123", :item "zzz"}
{:person "456", :item "yyy"})
I would probably stick with the first one since it is (at least to me) much simpler and more bulletproof. YMMV.
Update:
Notice that the 3rd method yields a single sequence of maps, even though there are 2 nested for iterations happening. This is different than having two nested for expressions, which would yield a sequence of a sequence of maps.
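The difference is easy to see on a tiny input. This is only an illustration with made-up values, not part of the solutions above:

```clojure
;; Two bindings in ONE for: a single flat sequence
(for [x [1 2]
      y [:a :b]]
  [x y])
;; => ([1 :a] [1 :b] [2 :a] [2 :b])

;; Two NESTED for expressions: a sequence of sequences
(for [x [1 2]]
  (for [y [:a :b]]
    [x y]))
;; => (([1 :a] [1 :b]) ([2 :a] [2 :b]))
```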

How I can apply map in this code? (Clojure)

I'm starting to learn Clojure and need help with a task.
I have to write this function:
(data-table student-tbl)
;; => ({:surname "Ivanov", :year "1996", :id "1"}
;; {:surname "Petrov", :year "1996", :id "2"}
;; {:surname "Sidorov", :year "1997", :id "3"})
I must use let, map, next, table-keys and data-record functions.
In this case:
student-tbl => (["id" "surname" "year" "group_id"] ["1" "Ivanov" "1998"] ["2" "Petrov" "1997"] ["3" "Sidorov" "1996"])
(table-keys student-tbl) => [:id :surname :year :group_id]
(data-record [:id :surname :year :group_id] ["1" "Ivanov" "1996"]) => {:surname "Ivanov", :year "1996", :id "1"}
I wrote this:
(defn data-table [tbl]
(let [[x] (next tbl)]
(data-record (table-keys tbl) x)
))
(data-table student-tbl) => {:surname "Ivanov", :year "1998", :id "1"}
How can I use map to get the right result?
First, here is how you should probably write this in practice. Then I'll show you your mistake so you can learn for your homework.
One way:
(defn data-table
[[headers & data]]
(let [headers (map keyword headers)
data-record (partial zipmap headers)]
(map data-record data)))
The key takeaways here are:
destructure the input to go ahead and separate headers from data
build the headers once, using the core keyword function
compose a function which always takes the same set of headers, and then map that function over our data
note that there are no external functions, which is always a nice thing when we can get away with it
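Applied to the student-tbl value from the question, the function above can be tried out as follows. Because zipmap stops at the shorter of its two arguments, the :group_id header simply gets no entry when a row has only three values.

```clojure
(def student-tbl
  (list ["id" "surname" "year" "group_id"]
        ["1" "Ivanov" "1998"]
        ["2" "Petrov" "1997"]
        ["3" "Sidorov" "1996"]))

(defn data-table
  [[headers & data]]
  (let [headers     (map keyword headers)
        data-record (partial zipmap headers)] ; one row of values -> one map
    (map data-record data)))

(data-table student-tbl)
;; => ({:id "1", :surname "Ivanov", :year "1998"}
;;     {:id "2", :surname "Petrov", :year "1997"}
;;     {:id "3", :surname "Sidorov", :year "1996"})
```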
Now, to make your way work, what you need to do is map the data-record function over x. First, the let binding should bind (next tbl) to x, not [x] (the way you're doing it, you only get the first element of the data set: Ivanov, 1998, 1).
In this example, ignore the data-record zipmap and table-keys binding in the let. They're there to make this example work, and you can remove them safely.
(defn data-table-newb
[tbl]
(let [table-keys #(map keyword (first %))
headers (table-keys tbl)
data-record zipmap
x (next tbl)]
(map #(data-record headers %) x)))
Essentially, you compute your table headers at the beginning, then create a new anonymous function that calls data-record and gives it your computed headers and an individual vector of data. You apply that function over every element of your data list, which you have bound to x.
Removing the unnecessary functions which are defined elsewhere, you get:
(defn data-table-newb
[tbl]
(let [headers (table-keys tbl)
x (next tbl)]
(map #(data-record headers %) x)))

NOT - EXISTS / NOT - IN type query in Clojure

I have 2 data structures like the ones below
(ns test)
(def l
[{:name "Sean" :age 27}
{:name "Ross" :age 27}
{:name "Brian" :age 22}])
(def r
[{:owner "Sean" :item "Beer" }
{:owner "Sean" :item "Pizza"}
{:owner "Ross" :item "Computer"}
{:owner "Matt" :item "Bike"}])
I want to get the persons who don't own any item (Brian in this case, so [{:name "Brian" :age 22}]).
If this were SQL I would do a left outer join or NOT EXISTS, but I'm not sure how to do this in Clojure in a performant way.
While Chuck's solution is certainly the most sensible one, I find it interesting that it is possible to write a solution in terms of relational algebraic operators using clojure.set:
(require '[clojure.set :as set])
(set/difference (set l)
(set/project (set/join r l {:owner :name})
#{:name :age}))
; => #{{:name "Brian", :age 22}}
You basically want to do a filter on l, but negative. We could just not the condition, but the remove function already does this for us. So something like:
(let [owner-names (set (map :owner r))]
(remove #(owner-names (% :name)) l))
(I think it reads more nicely with the set, but if you want to avoid allocating the set, you can just do (remove (fn [person] (some #(= (% :owner) (person :name)) r)) l).)
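Here are both variants as a runnable sketch, using the l and r definitions from the question. The function names without-items and without-items-no-set are made up for illustration.

```clojure
(def l [{:name "Sean" :age 27}
        {:name "Ross" :age 27}
        {:name "Brian" :age 22}])

(def r [{:owner "Sean" :item "Beer"}
        {:owner "Sean" :item "Pizza"}
        {:owner "Ross" :item "Computer"}
        {:owner "Matt" :item "Bike"}])

(defn without-items
  "Set-based version: builds the owner set once, then does a lookup per person."
  [people items]
  (let [owner-names (set (map :owner items))]
    (remove #(owner-names (:name %)) people)))

(defn without-items-no-set
  "Allocation-free version: scans `items` again for every person."
  [people items]
  (remove (fn [person] (some #(= (:owner %) (:name person)) items))
          people))

(without-items l r)        ;; => ({:name "Brian", :age 22})
(without-items-no-set l r) ;; => ({:name "Brian", :age 22})
```

The set-based version is roughly linear in the combined size of the two collections, while the second rescans r for each person, which is worth keeping in mind if both collections are large.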