Converting string to nested map in Clojure - clojure

I have a file containing some text like:
1|apple|sweet
2|coffee|bitter
3|gitpush|relief
I want to work with this input using a map. In Java or Python, I would have made a nested map like:
{1: {thing: apple, taste: sweet},
2: {thing: coffee, taste: bitter},
3: {thing: gitpush, taste: relief}}
Or even a list inside the map like:
{1: [apple, sweet],
2: [coffee, bitter],
3: [grape, sour]}
The end goal is to access the last two column's data efficiently using the first column as the key.
I want to do this in Clojure and I am new to it. So far, I have succeeded in creating a list of map using the following code:
(def cust_map (map (fn [[id name taste]]
(hash-map :id (Integer/parseInt id)
:name name
:taste taste ))
(map #(str/split % #"\|") (line-seq (clojure.java.io/reader path)))))
And I get this, but it's not what I want.
({1, apple, sweet},
{2, coffee, bitter},
{3, gitpush, relief})
It would be nice if you can show me how to do the most efficient of, or both nested map and list inside map in Clojure. Thanks!

When you build a map with hash-map, the arguments are alternative keys and values. For example:
(hash-map :a 0 :b 1)
=> {:b 1, :a 0}
From what I understand, you want to have a unique key, the integer, which maps to a compound object, a map:
(hash-map 0 {:thing "apple" :taste "sweet"})
Also, you do not want to call map, which would result in a sequence of maps. You want to have a single hash-map being built.
Try using reduce:
(reduce (fn [map [id name taste]]
(merge map
(hash-map (Integer/parseInt id)
{:name name :taste taste})))
{}
'(("1" "b" "c")
("2" "d" "e")))
--- edit
Here is the full test program:
(import '(java.io BufferedReader StringReader))
(def test-input (line-seq
(BufferedReader.
(StringReader.
"1|John Smith|123 Here Street|456-4567
2|Sue Jones|43 Rose Court Street|345-7867
3|Fan Yuhong|165 Happy Lane|345-4533"))))
(def a-map
(reduce
(fn [map [id name address phone]]
(merge map
(hash-map (Integer/parseInt id)
{:name name :address address :phone phone})))
{}
(map #(clojure.string/split % #"\|") test-input)))
a-map
=> {1 {:name "John Smith", :address "123 Here Street", :phone "456-4567"}, 2 {:name "Sue Jones", :address "43 Rose Court Street", :phone "345-7867"}, 3 {:name "Fan Yuhong", :address "165 Happy Lane", :phone "345-4533"}}

I agree with #coredump that this is not concise, yet a quick solution to your code is using a list (or any other collection) and a nested map:
(def cust_map (map (fn [[id name taste]]
(list (Integer/parseInt id)
(hash-map :name name
:taste taste)))
(map #(clojure.string/split % #"\|") (line-seq (clojure.java.io/reader path)))))

This may be a somewhat naive view on my part, as I'm not all that experienced with Clojure, but any time I want to make a map from a collection I immediately think of zipmap:
(require '[clojure.java.io :as io :refer [reader]])
(defn lines-from [fname]
(line-seq (io/reader fname)))
(defn nested-map [fname re keys]
"fname : full path and filename to the input file
re : regular expression used to split file lines into columns
keys : sequence of keys for the trailing columns in each line. The first column
of each line is assumed to be the line ID"
(let [lines (lines-from fname)
line-cols (map #(clojure.string/split % re) lines) ; (["1" "apple" "sweet"] ["2" "coffee" "bitter"] ["3" "gitpush" "relief"])
ids (map #(Integer/parseInt (first %)) line-cols) ; (1 2 3)
rest-cols (map rest line-cols) ; (("apple" "sweet") ("coffee" "bitter") ("gitpush" "relief"))
rest-maps (map #(zipmap keys %) rest-cols)] ; ({:thing "apple", :taste "sweet"} {:thing "coffee", :taste "bitter"} {:thing "gitpush", :taste "relief"})
(zipmap ids rest-maps)))
(nested-map "C:/Users/whatever/q50663848.txt" #"\|" [:thing :taste])
produces
{1 {:thing "apple", :taste "sweet"}, 2 {:thing "coffee", :taste "bitter"}, 3 {:thing "gitpush", :taste "relief"}}
I've shown the intermediate results of each step in the let block as a comment so you can see what's going on. I've also tossed in lines-from, which is just my thin wrapper around line-seq to keep myself from having to type in BufferedReader. and StringReader. all the time. :-)

Related

How to split string in clojure on number and convert it to map

I have a string school_name_1_class_2_city_name_3 want to split it to {school_name: 1, class:2, city_name: 3} in clojure I tried this code which didn't work
(def s "key_name_1_key_name_2")
(->> s
(re-seq #"(\w+)_(\d+)_")
(map (fn [[_ k v]] [(keyword k) (Integer/parseInt v)]))
(into {}))
You are looking for the ungreedy version of regex.
Try using #"(\w+?)_(\d+)_?" instead.
user=> (->> s (re-seq #"(\w+?)_(\d+)_?"))
(["key_name_1_" "key_name" "1"] ["key_name_2" "key_name" "2"])
When faced with a problem, just break it down and solve one small step at a time. Using let-spy-pretty from the Tupelo library allows us to see each step of the transformation:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require [clojure.string :as str]))
(defn str->keymap
[s]
(let-spy-pretty
[str1 (re-seq #"([a-zA-Z_]+|[0-9]+)" s)
seq1 (mapv first str1)
seq2 (mapv #(str/replace % #"^_+" "") seq1)
seq3 (mapv #(str/replace % #"_+$" "") seq2)
map1 (apply hash-map seq3)
map2 (tupelo.core/map-keys map1 #(keyword %) )
map3 (tupelo.core/map-vals map2 #(Long/parseLong %) )]
map3))
(dotest
(is= (str->keymap "school_name_1_class_2_city_name_3")
{:city_name 3, :class 2, :school_name 1}))
with result
------------------------------------
Clojure 1.10.3 Java 11.0.11
------------------------------------
Testing tst.demo.core
str1 =>
(["school_name_" "school_name_"]
["1" "1"]
["_class_" "_class_"]
["2" "2"]
["_city_name_" "_city_name_"]
["3" "3"])
seq1 =>
["school_name_" "1" "_class_" "2" "_city_name_" "3"]
seq2 =>
["school_name_" "1" "class_" "2" "city_name_" "3"]
seq3 =>
["school_name" "1" "class" "2" "city_name" "3"]
map1 =>
{"city_name" "3", "class" "2", "school_name" "1"}
map2 =>
{:city_name "3", :class "2", :school_name "1"}
map3 =>
{:city_name 3, :class 2, :school_name 1}
Ran 2 tests containing 1 assertions.
0 failures, 0 errors.
Passed all tests
Once you understand the steps and everything is working, just replace let-spy-pretty with let and continue on!
This was build using my favorite template project.
Given
(require '[clojure.string :as str])
(def s "school_name_1_class_2_city_name_3")
Following the accepted answer:
(->> s (re-seq #"(.*?)_(\d+)_?")
(map rest) ;; take only the rest of each element
(map (fn [[k v]] [k (Integer. v)])) ;; transform second as integer
(into {})) ;; make a hash-map out of all this
Or:
(apply hash-map ;; the entire thing as a hash-map
(interleave (str/split s #"_(\d+)(_|$)") ;; capture the names
(map #(Integer. (second %)) ;; convert to int
(re-seq #"(?<=_)(\d+)(?=(_|$))" s)))) ;; capture the integers
or:
(zipmap
(str/split s #"_(\d+)(_|$)") ;; extract names
(->> (re-seq #"_(\d+)(_|$)" s) ;; extract values
(map second) ;; by taking only second matched groups
(map #(Integer. %)))) ;; and converting them to integers
str/split leaves out the matched parts
re-seq returns only the matched parts
(_|$) ensures that the number is followed by a _ or is at an end position
The least verbose (where (_|$) could be replaced by _?:
(->> (re-seq #"(.*?)_(\d+)(_|$)" s) ;; capture key vals
(map (fn [[_ k v]] [k (Integer. v)])) ;; reorder coercing values to int
(into {})) ;; to hash-map

Clojure: set value as a key

May be, it is a stupid question, but it may help many of newbies. How do I add a key-value pair to the map?
I mean something like:
(defn init-item [v item]
(let [{:keys [id value]} item]
(-> v
(assoc :{ID_AS_A_KEY} value))))
And I get:
(init-item {} {:id "123456789" :value [:name "King" :surname "Leonid"]})
user=> {:123456789 [:name "King" :surname "Leonid"]}
Just don't do it. Use the string itself as your map key. There's no reason to make it a keyword. It's much easier to work with if you leave it alone.
(defn init-item [v item]
(assoc v (:id item) (:value item)))
I think this is what you meant to do:
(defn init-item
[dest-map item]
(let [item-id-str (:id item)
item-val (:value item)
item-id-kw (keyword item-id-str)]
(assoc dest-map item-id-kw item-val)))
(let [all-items {:a 1 :b 2 :c 3}
item-1 {:id "123456789"
:value [:name "King" :surname "Leonid"]}]
(init-item all-items item-1)
;=> {:a 1, :b 2, :c 3, :123456789 [:name "King" :surname "Leonid"]}
Clojure has functions name, symbol, and keyword to convert between strings and symbols/keywords. Since you already have the ID as a string, you just need to call keyword to convert it.
Be sure to always keep a browser tab open to The Clojure CheatSheet.

Retreiving data in clojure

I have three text files as http://paste.debian.net/plain/1027720. As the third file is in the following format
Third File
salesID | custID | prodID | itemCount
1|1|1|3
2|2|2|3
I want to display the table such that custID should be replaced by the customer name and the prodID by the product description,
as follows:
1: ["John" "shoes" "3"]
What I did till now is :
(def data (slurp "cust.txt"))
(->> (for [line (clojure.string/split data #"[ ]*[\r\n]+[ ]*")]
(-> line (clojure.string/split #"\|") rest vec))
(map vector (rest (range))))
How I can retreive and map the values accordingly?
EDIT
"demo_1.txt"
content id|name|address|phone-number
1|John|123 Street|456-4567
2|Smith|123 Here Street|456-4567
"demo_2.txt"
prodID | item | Cost
1|shoes|14.96
2|milk|1.98
The processing of this data is similar to how I process CSV files. I like to split the problem into functions that do line to vector and vector to map, using the first row as the header for each.
(defn line->vec [s]
(s/split s #"\|"))
(defn vec->map [desc row]
(into {}
(map vector desc row))) ; Map accepts multiple collections
(defn file->maps [filename]
; Destructuring here, for easy capturing of header row
(let [[desc & lines] (->> (slurp filename)
(s/split-lines)
(map line->vec))
desc-keys (map keyword desc)]
(for [line lines]
(vec->map desc-keys line))))
For your demo files, you can use group-by to generate a map, sort of like an index (I manually fixed the header formatting, but you'd want to do it with a utility fn):
For (group-by :content-id (file->maps "demo_1.txt"))
{"1" [{:address "123 Street",
:phone-number "456-4567",
:name "John",
:content-id "1"}],
"2" [{:address "123 Here Street",
:phone-number "456-4567",
:name "Smith",
:content-id "2"}]}
For (group-by :prodID (file->maps "demo_2.txt"))
{"1" [{:item "shoes", :prodID "1", :cost "14.96"}],
"2" [{:item "milk", :prodID "2", :cost "1.98"}]}
And then replace each column with its index value:
(defn replace-value [index idx-key m k]
(update m k #(get-in index [% 0 idx-key])))
(defn -main [& args]
(let [customers (group-by :content-id (file->maps "demo1.txt"))
products (group-by :prodID (file->maps "demo2.txt"))]
; Use customers and products to replace some data
(->> (file->maps "demo_3.txt")
(map #(replace-value customers :name % :content-id))
(map #(replace-value products :item % :prodID)))))
And the result:
({:prodID "shoes", :content-id "John", :salesID "1", :itemCount "3"}
{:prodID "milk", :content-id "Smith", :salesID "2", :itemCount "3"})
Then it should be straightforward to convert those maps back into the format you want.

clojure way to update a map inside a vector

What is the clojure way to update a map inside a vector e.g. if I have something like this, assuming each map has unique :name
(def some-vec
[{:name "foo"
....}
{:name "bar"
....}
{:name "baz"
....}])
I want to update the map in someway if it has :name equal to foo. Currently I'm using map, like this
(map (fn [{:keys [name] :as value}]
(if-not (= name "foo")
value
(do-something .....))) some-vec)
But this will loop through the entire vector even though I only update one item.
Keep the data as a map instead of a vector of map-records, keyed by :name.
(def some-data
{"foo" {:name "foo" :other :stuff}
"bar" {:name "bar" :other :stuff}
"baz" {:name "baz" :other :stuff}})
Then
(assoc-in some-data ["bar" :other] :things)
produces
{"foo" {:other :stuff, :name "foo"},
"bar" {:other :things, :name "bar"},
"baz" {:other :stuff, :name "baz"}}
in one go.
You can capture the basic manipulation in
(defn assoc-by-fn [data keyfn datum]
(assoc data (keyfn datum) datum))
When, for example,
(assoc-by-fn some-data :name {:name "zip" :other :fassner})
produces
{"zip" {:other :fassner, :name "zip"},
"foo" {:other :stuff, :name "foo"},
"bar" {:other :stuff, :name "bar"},
"baz" {:other :stuff, :name "baz"}}
Given that you have a vector of maps, your code looks fine to me. Your concern about "looping through the entire vector" is a natural consequence of the fact that you're doing a linear search for the :name and the fact that vectors are immutable.
I wonder whether what you really want is a vector of maps? Why not a map of maps?
(def some-map
{"foo" {...}
"bar" (...}
"baz" {...}}
Which you could then update with update-in?
Given this shape of the input data and unless you have an index that can tell you which indices the maps with a given value of :name reside at, you will have to loop over the entire vector. You can, however, minimize the amount of work involved in producing the updated vector by only "updating" the matching maps, rather than rebuilding the entire vector:
(defn update-values-if
"Assumes xs is a vector. Will update the values for which
pred returns true."
[xs pred f]
(let [lim (count xs)]
(loop [xs xs i 0]
(if (< i lim)
(let [x (nth xs i)]
(recur (if (pred x)
(assoc xs i (f x))
xs)
(inc i)))
xs))))
This will perform as many assoc operations as there are values in xs for which pred returns a truthy value.
Example:
(def some-vec [{:name "foo" :x 0} {:name "bar" :x 0} {:name "baz" :x 0}])
(update-values-if some-vec #(= "foo" (:name %)) #(update-in % [:x] inc))
;= [{:name "foo", :x 1} {:name "bar", :x 0} {:name "baz", :x 0}]
Of course if you're planning to transform the vector in this way with some regularity, then Thumbnail's and Paul's suggestion to use a map of maps will be a much more significant improvement. That remains the case if :name doesn't uniquely identify the maps – in that case, you could simply transform your original vector using frequencies and deal with a map of vectors (of maps with a given :name).
If you're working with vector, you should know index of element that you want to change, otherwise you have to traverse it in some way.
I can propose this solution:
(defn my-update [coll val fnc & args]
(let [index (->> (map-indexed vector coll)
(filter (fn [[_ {x :name}]] (= x val)))
ffirst)]
(when index
(apply update-in coll [index] fnc args))))
Where:
coll - given collection of maps;
val - value of field :name;
fnc - updating function;
args - arguments of the updating function.
Let's try it:
user> (def some-vec
[{:name "foo"}
{:name "bar"}
{:name "baz"}])
;; => #'user/some-vec
user> (my-update some-vec "foo" assoc :boo 12)
;; => [{:name "foo", :boo 12} {:name "bar"} {:name "baz"}]
user> (my-update some-vec "bar" assoc :wow "wow!")
;; => [{:name "foo"} {:name "bar", :wow "wow!"} {:name "baz"}]
I think that Thumbnail's answer may be quite useful for you. If you can keep your data as a map, these manipulations become much easier. Here is how you can transform your vector into a map:
user> (apply hash-map (interleave (map :name some-vec) some-vec))
;; => {"foo" {:name "foo"}, "bar" {:name "bar"}, "baz" {:name "baz"}}

Filter for Lists in Clojure

I am having a bit of difficulty with Lists in Clojure
I have a quick question concerning the filter function
Let's say I have a List composed of Maps
My code is:
(def Person {:name Bob } )
(def Person2 {:name Eric } )
(def Person3 {:name Tim } )
(def mylist (list Person Person2 Person3))
How would i go about filtering my list so that , for example: I want the list minus Person2 (meaning minus any map that has :name Eric)
Thank you very much to everybody helping me out. This is my last question I promise
For this purpose, it's better to use the 'remove' function. It takes a sequence, and removes elements on which it's predicate returns 'true'. It's basically the opposite of filter. Here is an example of it, and filter's usage for the same purposes, that I worked up via the REPL.
user> (def m1 {:name "eric" :age 32})
#'user/m1
user> (def m2 {:name "Rayne" :age 15})
#'user/m2
user> (def m3 {:name "connie" :age 44})
#'user/m3
user> (def mylist (list m1 m2 m3))
#'user/mylist
user> (filter #(not= (:name %) "eric") mylist)
({:name "eric", :age 32})
user> (remove #(= (:name %) "eric") mylist)
({:name "Rayne", :age 15} {:name "connie", :age 44})
As you can see, remove is a little bit cleaner, because you don't have to use not=. Also, when working with maps, you don't have to use the 'get' function unless you want it to return something special if a key isn't in the map. If you know the key you're looking for will be in the map, there is no reason to use 'get'. Good luck!
Suppose you have something like this:
(def Person {:name "Bob" } )
(def Person2 {:name "Eric" } )
(def Person3 {:name "Tim" } )
(def mylist (list Person Person2 Person3))
This would work:
(filter #(not= "Eric" (get % :name)) mylist)
user=> (filter (fn [person] (not= (person :name) "Eric")) mylist)
({:name "Bob"} {:name "Tim"})
or using a more compact syntax:
user=> (filter #(not= (% :name) "Eric") mylist)
({:name "Bob"} {:name "Tim"})