Clojure - splitting time string to find distinct hours - clojure

I am working with a data set that has a time column formatted like so:
{:time "00:00:00"}
{:time "03:05:12"}
{:time "09:45:14"}
{:time "13:01:22"}
{:time "19:29:31"}
I ran a SQL query to fetch these times, but I only need the distinct hours, potentially by splitting the string. I have tried:
(get (split (:time data) #":") 0)
But it returns all of the data in the db except what I want. I am trying to split the time at the first : and keep only the first element of the split, without duplicates.
Ideally I just want the hours returned, such as '00' '03' '09' '13' '19', but I am unsure how to do this. Thanks in advance for any help!

Unless there's a reason why the hours would ever be anything other than the first two characters, you could just do
(subs (:time data) 0 2)
If you really need all the characters before the first : and that's not always in the same position, you can still use subs, but combined with .indexOf:
(let [s (:time data)]
  (subs s 0 (.indexOf s ":")))
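To go from one row to the distinct hours of the whole result set, the same extraction can be mapped over every row and passed through distinct. This is only a sketch: it assumes the query result is a sequence of maps like the ones in the question, and the names sample-rows, hour-of and distinct-hours are made up for illustration.

```clojure
;; Hypothetical sample rows, shaped like the maps shown in the question.
(def sample-rows
  [{:time "00:00:00"}
   {:time "03:05:12"}
   {:time "03:59:59"}
   {:time "13:01:22"}])

(defn hour-of
  "Returns the text before the first colon of a time string."
  [^String s]
  (subs s 0 (.indexOf s ":")))

;; Extract the hour of every row, then drop repeated hours.
(def distinct-hours
  (distinct (map (comp hour-of :time) sample-rows)))
```

If data in the question is the whole result set rather than a single row, (:time data) would be nil, which may explain why the original attempt did not return the hours.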

Related

Why doesn't this keyword function lookup work in a hashmap?

I guess I need some eyeballs on this to make some sense of this
(println record)
(println (keys record) " - " (class record) " : " (:TradeId record) (:Stock record))
(doall (map #(println "Key " % "Value " (% record)) (keys record)))
Output:
{:Stock ATT, :AccountId 1, :TradeId 37, :Qty 100, :Price 117, :Date 2011-02-24T18:30:00.000Z, :Notes SPLIT 1:10, :Type B, :Brokerage 81.12}
(:Stock :AccountId :TradeId :Qty :Price :Date :Notes :Type :Brokerage) - clojure.lang.PersistentHashMap : nil ATT
Key :Stock Value ATT
Key :AccountId Value 1
Key :TradeId Value 37
...
The issue is (:TradeId record) doesn't work even though it exists as a key. Iterating through all keys and values - Line 3 - yields the right value.
Tried renaming the column in the csv but no change in behavior. I see no difference from the other columns (which work) except that this is the first column in the csv.
The hashmap is created from code like this - reading records from a CSV. Standard code from the clojure.data.csv package.
(->> (csv/read-csv reader)
     csv-data->maps
     (map #(my-function some-args %))
     doall))
(defn csv-data->maps
  "Return the csv records as a vector of maps"
  [csv-data]
  (map zipmap
       (->> (first csv-data) ;; First row is the header
            (map keyword) ;; Drop if you want string keys instead
            repeat)
       (rest csv-data)))
The "first column" thing is definitely suspicious and points to some invisible characters, such as a BOM, quietly attaching themselves to your first keyword.
To debug try printing out the hex of the names of the keywords. And/or maybe you'll see something if you do a hex dump, e.g., with head -n 2 file.csv | od -x, of the first few lines of the input file.
I would try two things. First, print the type of each key. They should all be clojure.lang.Keyword, if the creation code you included is accurate and my-function preserves their type; but if you created it in some other way and misremembered, you might discover that the key is a symbol, or a string or something like that. In general, don't use println on anything but strings, because it's pretty low-fidelity. prn is better at conveying an accurate picture of your data - it's not perfect, but at least you can tell a string from a keyword with it.
Second, look at the printed values more carefully, e.g. with od -t x1 - or you could do it in process with something like:
(let [k (key (first m)), s (name k)]
  (clojure.string/join " "
                       (for [c s]
                         (format "%02x" (int c)))))
If the result isn't "53 74 6f 63 6b", then you have some weird characters in your file - maybe nonprinting characters, maybe something that looks like a capital S but isn't, whatever.
Once I reached the point of trying anything, I copied the keyword from the REPL and pasted it into VSCode and sure enough - there was this weird looking character :?Id within the keyword. Using the weird keyword, the lookup worked.
Workaround: Added a dummy column as the first column.
Then things started to click into place: I remembered reading something about the BOM in the csv reader project docs. https://github.com/clojure/data.csv#byte-order-mark
Downloaded a hexdump file viewer which confirmed the problem bytes at the start of the file.
o;?Id,AccountId,...
Final solution: Before passing the reader to the data.csv read function, skip over the unwanted bytes.
(.skip reader 1)
The world makes sense again.
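An unconditional (.skip reader 1) also drops the first real character when a file happens to have no BOM. As a hedged alternative, the sketch below strips a leading U+FEFF from the header names only when it is present. The helper strip-bom and the sample header are hypothetical, and it assumes the reader decodes the file as UTF-8, so the BOM arrives as the single character U+FEFF.

```clojure
(require '[clojure.string :as str])

(defn strip-bom
  "Removes a leading U+FEFF character from s, if present."
  [s]
  (if (str/starts-with? s "\uFEFF")
    (subs s 1)
    s))

;; Hypothetical first CSV row, with a BOM stuck to the first column name.
(def header ["\uFEFFTradeId" "AccountId"])

;; Without cleaning, the first keyword only looks like :TradeId.
(def dirty-keys (map keyword header))

;; With cleaning, the lookup key is the one you would type by hand.
(def clean-keys (map (comp keyword strip-bom) header))
```

Here (first dirty-keys) is not equal to :TradeId, while (first clean-keys) is, which matches the behaviour described in the question.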

building a hashmap from an array in clojure

First off, I am a student in week 5 of 12 at The Iron Yard studying Java backend engineering. The course is composed of roughly 60% Java, 25% JavaScript and 15% Clojure.
I have been given the following problem (outlined in the comment):
;; Given an ArrayList of words, return a HashMap containing a key for every
;; word's first letter. The value for the key will be an ArrayList of all
;; words in the list that start with that letter. An empty string has no first
;; letter so don't add a key for it.
(defn index-words [word-list]
  (loop [word (first word-list)
         index {}]
    (if (contains? index (subs word 0 1))
      (assoc index (subs word 0 1) (let [words (index (subs word 0 1))
                                         word word]
                                     (conj words word)))
      (assoc index (subs word 0 1) (conj nil word)))
    (if (empty? word-list)
      index
      (recur (rest word-list) index))))
I was able to get a similar problem working using zipmap but I am positive that I am missing something with this one. The code compiles but fails to run.
Specifically, I am failing to update my hashmap index in the false clause of the 'if'.
I have tested all of the components of this function in the REPL, and they work in isolation, but I am struggling to put them all together.
For your reference, here is the code that calls index-words.
(let [word-list ["aardvark" "apple" "zamboni" "phone"]]
  (printf "index-words(%s) -> %s\n" word-list (index-words word-list)))
Rather than getting a working solution from the community, my hope is for a few pointers to get my brain moving in the right direction.
The function assoc does not modify index. You need to work with the new value that assoc returns. The same is true for conj: it does not modify the collection you pass it.
I hope this answer is of the kind you expected to get: just a pointer to where your problem is.
BTW: If you can make do with a PersistentList, this becomes a one-liner when using reduce instead of loop and recur. An interesting function for you could be update-in.
Have fun with Clojure.
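The point about assoc can be checked directly, and the reduce idea can be sketched as follows. This is one possible shape rather than the answerer's own code: it uses update with (fnil conj []) so the values are vectors instead of lists, and the names empty-index, with-apple and index-words-sketch are made up for illustration.

```clojure
;; assoc returns a new map; the original value is left untouched.
(def empty-index {})
(def with-apple (assoc empty-index "a" ["apple"]))
;; empty-index is still {}, with-apple is {"a" ["apple"]}

(defn index-words-sketch
  "Groups words by their first letter, skipping empty strings."
  [words]
  (reduce (fn [index word]
            ;; The accumulator passed to the next step is the map update returns.
            (update index (subs word 0 1) (fnil conj []) word))
          {}
          (remove empty? words)))
```

The key difference from the loop in the question is that each step hands the map returned by update to the next step, instead of discarding it.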
The group-by function does what you require.
You can use first as its discriminating function argument. It returns the first character of a string, or nil if there isn't one. (first word) is simpler than (subs word 0 1).
Use dissoc to remove the entry for key nil.
You seldom need to use explicit loops in clojure. Most common control patterns have been captured in functions like group-by. Such functions have function and possibly collection arguments. The commonest examples are map and reduce. The Clojure cheat sheet is a most useful guide to them.
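Put together, the group-by suggestion might look like the sketch below. The sample word list is hypothetical (the question's list plus an empty string), and note that the keys come out as characters rather than one-letter strings.

```clojure
;; Hypothetical input: the word list from the question plus an empty string.
(def words ["aardvark" "apple" "zamboni" "phone" ""])

;; Group by first character; the empty string lands under the key nil,
;; which dissoc then removes.
(def by-first-letter
  (dissoc (group-by first words) nil))
```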

Compare values in a list of maps in clojure

I have a list of maps like
(def listofmaps
  '({:directory_path "/some/path/1", :directory_size "8.49 GB"}
    {:directory_path "/user/dod/yieldbook/yb_sec_char", :directory_size "14.1 MB"}))
containing many values, where the size can be in GB or MB.
I also have a limitlistofmaps like
(def limitlistofmaps
  '({:directory_path "/some/path/8", :directory_size "15.2 GB"}
    {:directory_path "some/path/3", :directory_size "2.1 GB"}
    {:directory_path "/some/path/1", :directory_size "17.2 GB"}))
with many values.
I need to print "limit exceeded" if any map in listofmaps has the same :directory_path as a map in limitlistofmaps but its :directory_size exceeds the value specified there. The problem is that the size is in string format and the unit has to be considered.
Can you help me with a way to do this in clojure?
I don't get why people down voted your question. I think as a Clojure community we are much better. I'm also a Clojure newb and I have nothing but great things to say about the community. I would like that it stays this way.
Firstly, why not have all the directory sizes in the same unit? That way it's easier to compare them, say in KB.
Here is one version of a function that would transform any string like "100.23 GB", "12 B", "123.3443 MB" to a Double representing kilobytes.
(require '[clojure.string :as str])

(defn convert-to-kb
  "Converts a string 'number (B|KB|MB|GB)' to a Double representing KBytes"
  [size-str] ; renamed from str to avoid shadowing clojure.core/str
  (let [[number-str unit] (map str/lower-case (str/split (str/trim size-str) #"\s+"))
        number (Double/parseDouble number-str)]
    (condp = unit
      "b"  (/ number 1000)
      "kb" number
      "mb" (* number 1000)
      "gb" (* number 1000000))))
Secondly, I would suggest you put the directory size limits in the same map data that lives inside your listofmaps, so you don't have state duplication like you have now in limitlistofmaps. But if for some reason you need this second list, here is a piece of ugly code that returns a list of maps that are the same as in your listofmaps, with two added key/val entries, :max_size_kb and :directory_size_kb.
(for [dir listofmaps :let [{:keys [directory_path directory_size]} dir]]
  (let [limit-map (first
                   (get (group-by :directory_path limitlistofmaps) directory_path))
        max-size-kb (convert-to-kb (:directory_size limit-map))]
    (-> dir
        (assoc :max_size_kb max-size-kb)
        (assoc :directory_size_kb (convert-to-kb directory_size)))))
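To get from converted sizes to the "limit exceeded" message the question asks for, something along these lines might work. It is a self-contained sketch, not the answer's code: size->kb is a compact stand-in for convert-to-kb, exceeded-paths and the sample data are hypothetical, and directories with no matching limit are simply skipped.

```clojure
(require '[clojure.string :as str])

;; Kilobytes per unit, using the same decimal factors as convert-to-kb.
(def unit->kb {"b" 1/1000, "kb" 1, "mb" 1000, "gb" 1000000})

(defn size->kb
  "Parses a string such as \"8.49 GB\" into a number of kilobytes."
  [s]
  (let [[n unit] (str/split (str/trim s) #"\s+")]
    (* (Double/parseDouble n) (unit->kb (str/lower-case unit)))))

(defn exceeded-paths
  "Returns the paths in dirs whose size is above the limit set for the same path."
  [dirs limits]
  (let [limit-kb (into {} (map (juxt :directory_path
                                     (comp size->kb :directory_size))
                               limits))]
    (for [{:keys [directory_path directory_size]} dirs
          :let [limit (limit-kb directory_path)]
          :when (and limit (> (size->kb directory_size) limit))]
      directory_path)))

;; Hypothetical data: one directory over its limit, one without any limit.
(def sample-dirs   [{:directory_path "/some/path/1", :directory_size "18 GB"}
                    {:directory_path "/some/path/2", :directory_size "1 MB"}])
(def sample-limits [{:directory_path "/some/path/1", :directory_size "17.2 GB"}])

(doseq [path (exceeded-paths sample-dirs sample-limits)]
  (println "limit exceeded:" path))
```

Building the path-to-limit map once avoids calling group-by on the limit list for every directory.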

Why does Incanter lose column title when only querying one column?

When selecting two columns from a dataset, the result has the two given column titles, as expected. But when only one column is specified, the resulting column loses its title and is instead titled "0".
This makes it hard to use $order or whatever in later steps that take column names.
That is, this will work
(with-data data
  (->> ($ [:foo :bar])
       ($order [:foo] :asc)
       (view)))
and this will fail
(with-data data
  (->> ($ [:foo])
       ($order [:foo] :asc)
       (view)))
Any ideas what is going wrong or what to do?
Which version of Incanter are you using? This behavior was changed in recent versions, and at least 1.5.4 works correctly. But take into account that the behavior of $ is different when you pass the column name as a single element and as a vector:
incanter.main=> (def data (dataset [:foo :bar] [[:a :b] [:c :d]]))
#'incanter.main/data
incanter.main=> ($ :foo data)
(:a :c)
incanter.main=> ($ [:foo] data)
| :foo |
|------|
| :a |
| :c |
It sounds like you hit on the correct answer when you point out that, in the single-key case, Incanter simply returns a sequence. One way to get around this, though it is a little less elegant, is to request a second column and ignore it, or to put the result into a sequence of maps afterwards. Something only a little hackish like:
(map hash-map (repeat :key) result-seq)
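As a plain-Clojure illustration of what that expression produces (no Incanter calls involved), here it is applied to the values that ($ :foo data) returned in the REPL session above. The names result-seq and keyed-rows are only for the example.

```clojure
;; The bare sequence returned for a single column in the session above.
(def result-seq [:a :c])

;; Pair every value with the column key again, giving one map per row.
(def keyed-rows (map hash-map (repeat :foo) result-seq))
;; keyed-rows is ({:foo :a} {:foo :c})
```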

clojure: how to get values from lazy seq?

I am new to Clojure and need some help getting a value out of a lazy sequence.
You can have a look at my full data structure here: http://pastebin.com/ynLJaLaP
What I need is the content of the title:
{: _content AlbumTitel2}
I managed to get a list of all _content values:
(def albumtitle (map #(str (get % :title)) photosets))
(println albumtitle)
and the result is:
({:_content AlbumTitel2} {:_content test} {:_content AlbumTitel} {:_content album123} {:_content speciale} {:_content neues B5 Album} {:_content Album Nr 2})
But how can I get the value of every :_content?
Any help would be appreciated!
Thanks!
You could simply do this
(map (comp :_content :title) photosets)
Keywords work as functions, so the composition with comp will first retrieve the :title value of each photoset and then further retrieve the :_content value of that value.
Alternatively this could be written as
(map #(get-in % [:title :_content]) photosets)
A semi-alternative solution is to do
(->> data
     (map :title)
     (map :_content))
This takes advantage of the fact that keywords are functions, and of the so-called thread-last macro, which injects the result of the first expression as the last argument of the second, and so on.
The above code gets converted to
(map :_content (map :title data))
Clearly not as readable, and not easy to expand later either.
PS: I assume something went wrong when the data was pasted to the web, because:
{: _content AlbumTitel2}
Is not Clojure syntax, this however is:
{:_content "AlbumTitel2"}
Note that there is no whitespace after :, and there are "" around the text. Just in case you want to paste some Clojure some other time.
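The three approaches suggested in the answers above can be compared on a small sample. The data below is hypothetical, modelled on the printed output in the question, with the titles written as proper strings.

```clojure
;; Hypothetical sample, shaped like the photosets in the question.
(def photosets
  [{:title {:_content "AlbumTitel2"}}
   {:title {:_content "test"}}])

;; Composition of keyword lookups.
(def titles-comp (map (comp :_content :title) photosets))

;; Nested lookup with get-in.
(def titles-get-in (map #(get-in % [:title :_content]) photosets))

;; Two passes with the thread-last macro.
(def titles-threaded (->> photosets (map :title) (map :_content)))
```

All three produce the same sequence of title strings, so the choice is mostly a matter of readability.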