Clojure - Working with list and date-time


I am quite stuck in this scenario.
I have a list of atoms representing bank transactions.
(#<Ref#29a71299: {:desc "DESC1", :amount 150, :date #<LocalDate 2017-01-10>}>)
(#<Ref#5a4ebf03: {:desc "DESC2", :amount 250, :date #<LocalDate 2017-01-10>}>)
(#<Ref#5a4ebf03: {:desc "DESC3", :amount -250, :date #<LocalDate 2017-01-11>}>)
(#<Ref#5a4ebf03: {:desc "DESC4", :amount 50, :date #<LocalDate 2017-01-12>}>)
I need to calculate the account balance at the end of each day, so I have to group the transactions by day to know the balance at the end of that day.
Has anyone done this before? What is the best way to filter by date and do this math? I am still a beginner in Clojure.
Note: I am using the Joda-Time library to work with dates.

A great way to approach problems in Clojure is to think:
How can I break this problem down (this is usually the hard part)
How can I solve each problem alone
How do I compose these solutions (this is usually the easy part)
Applying this to your problem I see these problems:
segmenting a list of maps by a property of one of the keys
(partition-by ... something ...)
summing all the values of one of the keys in each of a sequence of maps
(map (reduce ...))
making an output format with the data and the sum from each segment
(map ... something)
And the composing part is likely just nesting each of these as function calls. Nested function calls can be written using the thread-last macro and will look something like this:
(->> data
     (... problem one solution ...)
     (problem two solution)
     (some output formatting for problem three))

You may want to break it down this way:
(defn per-day-balance [txns]
  (->> txns
       (partition-by :date)
       (map (fn [[{date :date} :as txns]]
              {:date date :balance (reduce + (map :amt txns))}))))
This finds the daily balance, assuming every day starts at 0. Sample run:
(def txns [{:date 1 :amt 10}
           {:date 1 :amt 3}
           {:date 2 :amt 9}
           {:date 2 :amt -11}
           {:date 3 :amt 13}])
user> (per-day-balance txns)
=> ({:date 1, :balance 13} {:date 2, :balance -2} {:date 3, :balance 13})
Now add a reduction function to get the running total. The reduction function simply updates the cumulative balance:
(defn running-balance [bals]
  (let [[day-1 & others] bals]
    (reductions
      (fn [{running :balance} e] (update e :balance + running))
      day-1
      others)))
Sample run:
user> (->> txns
           per-day-balance
           running-balance)
=> ({:date 1, :balance 13} {:date 2, :balance 11} {:date 3, :balance 24})
Note: You can use whatever data type you like for the :date field. Also, I deliberately avoided atoms to keep the functions pure.
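To tie the two functions above back to the question's data, here is a minimal sketch, assuming the transactions are refs wrapping maps with :amount and :date keys. It dereferences each ref, sorts by date (partition-by only groups adjacent items, so unsorted input would split one day into several groups), and carries the balance forward. The name end-of-day-balances and the sample data are hypothetical, and java.time.LocalDate stands in for Joda-Time's LocalDate so the snippet needs no extra dependency.

```clojure
(import 'java.time.LocalDate)

;; Hypothetical sample data shaped like the question's: refs wrapping maps.
;; java.time.LocalDate is an assumption standing in for Joda-Time's LocalDate.
(def txn-refs
  [(ref {:desc "DESC1" :amount 150  :date (LocalDate/parse "2017-01-10")})
   (ref {:desc "DESC2" :amount 250  :date (LocalDate/parse "2017-01-10")})
   (ref {:desc "DESC3" :amount -250 :date (LocalDate/parse "2017-01-11")})
   (ref {:desc "DESC4" :amount 50   :date (LocalDate/parse "2017-01-12")})])

(defn end-of-day-balances
  "Returns a seq of {:date d :balance b}, where b is the cumulative
  balance after all transactions up to and including day d."
  [refs]
  (let [daily (->> refs
                   (map deref)          ; read the current value of each ref
                   (sort-by :date)      ; make equal dates adjacent
                   (partition-by :date) ; one group per day
                   (map (fn [day-txns]
                          {:date   (:date (first day-txns))
                           :amount (reduce + (map :amount day-txns))})))]
    ;; Pair each day with the running sum of the daily totals.
    (map (fn [{:keys [date]} balance] {:date date :balance balance})
         daily
         (reductions + (map :amount daily)))))

(map :balance (end-of-day-balances txn-refs))
;; => (400 150 200)
```

Sorting up front costs an extra pass, but it removes the hidden requirement that the caller supplies transactions in date order.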

This ended up getting more complicated than I thought it would. I looked at partition-by, and you should almost certainly use that instead; it is perfectly suited to this problem. This is just an example of how it could be done with a loop:
(defn split-dates [rows]
  (loop [[row & rest-rows] rows ; Split on the head
         last-date nil
         acc [[]]]
    (if row
      (let [current-date (last row)]
        (recur rest-rows current-date
               ; If the day is the same as the previous row
               (if (or (nil? last-date) (= current-date last-date))
                 ; Update the current day list with the new row
                 (update acc (dec (count acc))
                         #(conj % row))
                 ; Else, create a new list of rows for the new date
                 (conj acc [row]))))
      acc)))
(clojure.pprint/pprint
  (split-dates
    [[0 1 "16.12.25"]
     [2 3 "16.12.25"]
     [4 5 "16.12.26"]
     [6 7 "16.12.26"]
     [8 9 "16.12.26"]
     [1 9 "16.12.27"]]))
[[[0 1 "16.12.25"] [2 3 "16.12.25"]]
 [[4 5 "16.12.26"] [6 7 "16.12.26"] [8 9 "16.12.26"]]
 [[1 9 "16.12.27"]]]
Notes:
This assumes the dates are in the last column, and that the rows are sorted by date.
It returns [[]] when given an empty list. This may or may not be what you want.
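For comparison, a minimal sketch of the partition-by approach recommended above. The name split-dates-pb is hypothetical; it makes the same assumptions as the loop (date in the last column, rows sorted by date) but returns an empty vector for empty input.

```clojure
;; Hypothetical partition-by version of split-dates.
;; Assumes the date is the last element of each row and that rows are
;; already sorted by date.
(defn split-dates-pb [rows]
  (mapv vec (partition-by last rows)))

(split-dates-pb
  [[0 1 "16.12.25"]
   [2 3 "16.12.25"]
   [4 5 "16.12.26"]
   [1 9 "16.12.27"]])
;; => [[[0 1 "16.12.25"] [2 3 "16.12.25"]]
;;     [[4 5 "16.12.26"]]
;;     [[1 9 "16.12.27"]]]

(split-dates-pb [])
;; => []
```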

Related

Aggregating transducers with intermediate values

I am still trying to understand better how to work with transducers in clojure. Here, I am interested in applying aggregating transducers, such as the ones in https://github.com/cgrand/xforms, but reporting at each step the intermediate values of the computation.
For instance, the following expression
(sequence (x/into #{}) [1 2 3])
yields (#{1 2 3}), which is only the final value of the reduction. Now, I would be interested in a transducer xf-incremental that given something like
(sequence (comp xf-incremental (x/into #{})) [1 2 3])
yields (#{1} #{1 2} #{1 2 3}).
The reason why I am interested in this is that I want to report intermediate values of a metric that aggregates over the history of processed values.
Any idea how I can do something of the sort in a generic way?
EDIT: Think of (x/into #{}) as an arbitrary transducer that aggregates results. Better examples could be x/avg or (x/reduce +) where I would expect
(sequence (comp xf-incremental x/avg) [1 2 3])
(sequence (comp xf-incremental (x/reduce +)) [1 2 3])
to return (1 3/2 2) and (1 3 6) respectively.
EDIT 2: another way of phrasing this is that I want a transducer that performs a reducing function and returns the accumulator at each step, which also can reuse all the available transducers so I do not need to rewrite basic functionalities.

Solution using clojure.core/reductions
You don't need a transducer to perform the computation that you are asking for. The function you are looking for to see all the intermediate results of reduce is called reductions and you provide it with conj and an empty set as arguments:
(rest (reductions conj #{} [1 2 3]))
;; => (#{1} #{1 2} #{1 3 2})
rest removes the first empty set, because that was the output you requested in the original question.
The function that builds up the result here is conj; let's refer to it as a step function. A transducer is a function that takes a step function as input and returns a new step function as output. So if we want to combine reductions with a transducer, we can just apply the transducer to conj:
(def my-transducer (comp (filter odd?)
                         (take 4)))
(dedupe (reductions (my-transducer conj) #{} (range)))
;; => (#{} #{1} #{1 3} #{1 3 5} #{7 1 3 5})
dedupe is there just to remove elements that are equal to preceding elements. You can remove it if you don't want to do that. In that case you get the following, because that is how the filtering transducer works:
(reductions (my-transducer conj) #{} (range))
;; => (#{} #{} #{1} #{1} #{1 3} #{1 3} #{1 3 5} #{1 3 5} #{7 1 3 5})
Transducer-based solution using net.cgrand.xforms/reductions
Apparently, there is also a transducer version of reductions in the xforms library, which is closer to your initial code:
(require '[net.cgrand.xforms :as xforms])
(rest (sequence (xforms/reductions conj #{}) [1 2 3]))
;; => (#{1} #{1 2} #{1 3 2})
This xforms/reductions transducer can be composed with other transducers using comp, for example to filter odd numbers and take the first four of them:
(sequence (comp (filter odd?)
                (take 4)
                (xforms/reductions conj #{}))
          (range))
;; => (#{} #{1} #{1 3} #{1 3 5} #{7 1 3 5})
In this case, you don't need dedupe. It is also possible to use other step functions with xforms/reductions, e.g. +:
(sequence (comp (filter odd?)
                (take 10)
                (xforms/reductions + 0)
                (filter #(< 7 %)))
          (range))
;; => (9 16 25 36 49 64 81 100)
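If adding a dependency is not an option, the same idea can be sketched as a small stateful transducer in plain Clojure. This is a hypothetical helper, not the xforms implementation: reductions-xf and running-avg are names invented here, the initial value is not emitted, and early termination via reduced inside f is not handled.

```clojure
(defn reductions-xf
  "Hypothetical helper: a transducer that emits the running accumulation
  of f over its inputs, starting from init (init itself is not emitted)."
  [f init]
  (fn [rf]
    ;; Fresh state is created each time the transducer is applied to rf.
    (let [acc (volatile! init)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result x] (rf result (vswap! acc f x)))))))

(sequence (reductions-xf + 0) [1 2 3])
;; => (1 3 6)

;; A running average built from the same helper: accumulate [sum count]
;; pairs, then divide.
(def running-avg
  (comp (reductions-xf (fn [[sum n] x] [(+ sum x) (inc n)]) [0 0])
        (map (fn [[sum n]] (/ sum n)))))

(sequence running-avg [1 2 3])
;; => (1 3/2 2)
```

Because the accumulator lives in a volatile created per application, the same transducer value can be reused across several sequence or transduce calls without sharing state.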

How to filter a map comparing it with another collection

I have a collection of maps like {:id 2489 ,values :.......} {:id 5647 ,values : .....} and so on, up to 10000 of them, and I want to filter it based on another collection that holds ids from the first one, like (2489 ,......)
I am new to Clojure and I have tried:
(into {} (filter #(some (fn [u] (= u (:id %))) [2489 3456 4567 5689]) record-sets))
But it gives me only the last one, the map with id 5689, as output {:id 5689 ,:values ....}, while I want all of them. Can you suggest what I can do?

One problem is that you start out with a sequence of N maps, then you try to stuff them into a single map. Each map overwrites the keys of the ones before it, so only the last one survives.
Instead, you need to have the output be a sequence of M maps (M <= N).
Something like this is what you want:
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])
(let [ids-wanted  #{1 3}
      data-wanted (filterv
                    (fn [item]
                      (contains? ids-wanted (:id item)))
                    data)]
  (println data-wanted))
with result:
[{:id 1, :stuff :fred}
 {:id 3, :stuff :wilma}]
Be sure to use the Clojure CheatSheet: http://jafingerhut.github.io/cheatsheet/clojuredocs/cheatsheet-tiptip-cdocs-summary.html
I like filterv over plain filter since it always gives a non-lazy result (i.e. a Clojure vector).
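To see why the original (into {} ...) call returned a single record, here is a small sketch with hypothetical sample records. Pouring maps into a map merges them key by key, so every record writes to the same :id and :stuff keys.

```clojure
;; Hypothetical sample records with the same keys in every map.
(def records
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}])

;; conj-ing a map onto a map merges it, so later records replace earlier ones.
(into {} records)
;; => {:id 2, :stuff :barney}

;; Pouring into a vector keeps each record as its own element.
(into [] records)
;; => [{:id 1, :stuff :fred} {:id 2, :stuff :barney}]
```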

You are squashing all your maps into one. First, for the sake of performance, change your list of IDs into a set, then simply filter.
(let [ids (into #{} [2489 3456 4567 5689])]
  (filter (comp ids :id) record-sets))
This will give you the sequence of correct maps. If you want to convert this sequence of maps into a map keyed by ID, you can do this:
(let [ids (into #{} [2489 3456 4567 5689])]
  (->> record-sets
       (filter (comp ids :id))
       (into {} (map (juxt :id identity)))))

Another way to do this is with the select-keys function.
select-keys returns a map containing only the keys given to the function.
Given that your data is a list of maps, we can convert it into a hash-map keyed by id using group-by and then call select-keys on it:
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])
(select-keys (group-by :id data) [1 4])
; => {1 [{:id 1, :stuff :fred}], 4 [{:id 4, :stuff :betty}]}
However, the result is now a map from id to a vector of records. To get the original structure back, we need to get all the values in the map and then flatten the vectors:
; get all the values in the map
(vals (select-keys (group-by :id data) [1 4]))
; => ([{:id 1, :stuff :fred}] [{:id 4, :stuff :betty}])
; flatten the nested vectors
(flatten (vals (select-keys (group-by :id data) [1 4])))
; => ({:id 1, :stuff :fred} {:id 4, :stuff :betty})
Extracting the values and flattening might seem a bit inefficient, but I think it is less complex than the nested loop that the filter-based methods need.
You can use the threading macro to compose all the steps together:
(-> (group-by :id data)
    (select-keys [1 4])
    vals
    flatten)
Another thing you can do is store the data as a map keyed by id from the beginning; that way select-keys won't require group-by and the result won't require flattening.
Here update-values is a helper that applies a function to every value of a map (see "Update all keys in a map"). It is not part of clojure.core, though Clojure 1.11 added update-vals, which takes the same arguments:
(update-values (group-by :id data) first)
; => {1 {:id 1, :stuff :fred}, 2 {:id 2, :stuff :barney}, 3 {:id 3, :stuff :wilma}, 4 {:id 4, :stuff :betty}}
This would probably be the most efficient for this problem, but this structure might not work for every case.
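A minimal sketch of the keyed-by-id layout, using only clojure.core. The name data-by-id is hypothetical, and the sketch assumes ids are unique; with duplicate ids, the later record would replace the earlier one.

```clojure
(def data
  [{:id 1 :stuff :fred}
   {:id 2 :stuff :barney}
   {:id 3 :stuff :wilma}
   {:id 4 :stuff :betty}])

;; Hypothetical index: id -> record, built once up front.
(def data-by-id
  (into {} (map (juxt :id identity)) data))

;; Lookups by id no longer need group-by or flatten.
(select-keys data-by-id [1 4])
;; => {1 {:id 1, :stuff :fred}, 4 {:id 4, :stuff :betty}}

(vals (select-keys data-by-id [1 4]))
;; => ({:id 1, :stuff :fred} {:id 4, :stuff :betty})
```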

Clojure: Find missing records in a collection based on another collection

I have 2 vectors: employ and emp-income. I want to loop through emp-income based on employ to find all the missing records. In this case, the record with id = 2 is missing. I want to create the missing record in emp-income and set its income to the previous record's income value. What is the best way to do this in Clojure?
(def employ
  [{:id 1 :name "Aaron"}
   {:id 2 :name "Ben"}
   {:id 3 :name "Carry"}])
from:
(def emp-income
  [{:emp-id 1 :income 1000}
   {:emp-id 3 :income 2000}])
to:
(def emp-income
  [{:emp-id 1 :income 1000}
   {:emp-id 2 :income 1000}
   {:emp-id 3 :income 2000}])

You could use:
(let [emp-id->income (into {} (map (fn [rec] [(:emp-id rec) rec]) emp-income))]
  (reduce (fn [acc {:keys [id]}]
            (let [{:keys [income]} (or (get emp-id->income id) (peek acc))]
              (conj acc {:emp-id id :income income})))
          []
          employ))
Note that this will create a record of {:emp-id id :income nil} if the first employee is not found in emp-income. If emp-income contains duplicate :emp-id values, the last record encountered for that :emp-id wins.
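The expression above can be wrapped in a function to make both the normal case and the nil edge case easy to try. This is a sketch; the name fill-missing-incomes is hypothetical, and the logic is the same reduce as above.

```clojure
(def employ
  [{:id 1 :name "Aaron"}
   {:id 2 :name "Ben"}
   {:id 3 :name "Carry"}])

(defn fill-missing-incomes
  "Hypothetical wrapper: returns one income record per employee, in the
  order of `employees`, copying the previous record's income when an
  employee has no record of their own."
  [employees incomes]
  (let [emp-id->income (into {} (map (juxt :emp-id identity)) incomes)]
    (reduce (fn [acc {:keys [id]}]
              (let [{:keys [income]} (or (get emp-id->income id) (peek acc))]
                (conj acc {:emp-id id :income income})))
            []
            employees)))

(fill-missing-incomes employ [{:emp-id 1 :income 1000}
                              {:emp-id 3 :income 2000}])
;; => [{:emp-id 1, :income 1000} {:emp-id 2, :income 1000} {:emp-id 3, :income 2000}]

;; Edge case: nothing precedes the first employee, so the income is nil.
(fill-missing-incomes employ [{:emp-id 3 :income 2000}])
;; => [{:emp-id 1, :income nil} {:emp-id 2, :income nil} {:emp-id 3, :income 2000}]
```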

Clojure apply list of functions to list of arguments

I have a list of functions which adjust price, and a list of products.
As an output, I expect to have the same list of products, but with adjusted prices.
Technically, there is a list of functions and a list of maps.
What I'm trying to achieve is to apply each function sequentially to each map in a list, while preserving initial structure
(def products
  [{:id 1 :price 100}
   {:id 2 :price 200}])
(defn change-price
  "increase by 10 => (change-price product 10 +)"
  [product amount price-func]
  (update product :price #(int (price-func amount %))))
(defn adjust-price
  [products]
  ;; fn-list is a list of price adjuster functions
  (let [fn-list [#(change-price % 10 +)
                 #(change-price % 50 -)]]
    ;; so I map a function to each product
    ;; which applies all adjuster functions to that product
    (merge (map (fn [prod]
                  (map #(% prod) fn-list)) products))))
It seems I don't understand how to reduce the result properly, because what I'm getting is a nested list like
(adjust-price products)
=> (({:id 1, :price 110} {:id 1, :price -50})
    ({:id 2, :price 210} {:id 2, :price -150}))
But I expect
({:id 1, :price 60} {:id 2, :price 160})
Thank you in advance.

It seems that you want to apply a composition of your functions:
(defn adjust-price
  [products]
  (let [fn-list [#(change-price % 10 +)
                 #(change-price % 50 -)]
        f (apply comp fn-list)]
    (map f products)))
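Two caveats apply here: comp applies its functions right to left, and the question's change-price calls (price-func amount %), so (change-price p 50 -) computes 50 minus the price. A minimal sketch that produces the expected output, assuming change-price is rewritten to pass the current price first and the list is reversed so the adjusters run in the order they are listed:

```clojure
(def products
  [{:id 1 :price 100}
   {:id 2 :price 200}])

(defn change-price
  "Variant of the question's function with the current price passed first,
  so (change-price p 50 -) subtracts 50 from the price."
  [product amount price-func]
  (update product :price #(int (price-func % amount))))

(def fn-list
  [#(change-price % 10 +)
   #(change-price % 50 -)])

(defn adjust-price
  [products]
  ;; comp runs its last argument first, so reverse to keep the listed order.
  (map (apply comp (reverse fn-list)) products))

(adjust-price products)
;; => ({:id 1, :price 60} {:id 2, :price 160})
```

For these two adjusters the order does not change the result, since adding 10 and subtracting 50 commute; the reverse matters once an adjuster such as a percentage discount is added to the list.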

The thing is, map doesn't 'squash' results: it just maps a list of n items to a list of n items.
What you need is reduce, something like this:
user> (let [fn-list [#(change-price % 10 +)
                     #(change-price % 50 -)]]
        (map (fn [p] (reduce #(%2 %1) p fn-list))
             products))
;;=> ({:id 1, :price -60} {:id 2, :price -160})
Also, you would have to rewrite your change-price function, since it passes the arguments in the wrong order here: (price-func amount %) => (price-func % amount)

You aren't mutating the hash-map passed in, so when you call two different functions on an item the way you are, you will get two separate results.
In the adjust-price function, since you are using map to go through the change-price functions, you are effectively saying: run the first change-price function once and return a value, then run the second change-price function once and return a separate value, resulting in:
({:id 1, :price 110} {:id 1, :price -50})
The above answer is good; I just thought I'd add another way to do it using a threading macro, so that you don't have to worry about order.
(defn adjust-price
  [products]
  (let [f #(-> %
               (change-price 10 +)
               (change-price 50 -))]
    (map f products)))
Remember, the thread-first macro -> passes the result of each form down to the next form, where it is used as the first argument.
(P.S. I know this is an old post, but hopefully this helps someone else in the future.)

sort-by on a lazy sequence of hash-maps in clojure

I need to take 20 results from a lazy sequence of millions of hash-maps but for the 20 to be based on sorting on various values within the hashmaps.
For example:
(def population [{:id 85187153851 :name "anna" :created #inst "2012-10-23T20:36:25.626-00:00" :rank 77336}
                 {:id 12595145186 :name "bob" :created #inst "2011-02-03T20:36:25.626-00:00" :rank 983666}
                 {:id 98751563911 :name "cartmen" :created #inst "2007-01-13T20:36:25.626-00:00" :rank 112311}
                 ...
                 {:id 91514417715 :name "zaphod" :created #inst "2015-02-03T20:36:25.626-00:00" :rank 9866}])
In normal circumstances a simple sort-by would get the job done:
(sort-by :id population)
(sort-by :name population)
(sort-by :created population)
(sort-by :rank population)
But I need to do this across millions of records as fast as possible and want to do it lazily rather than having to realize the entire data set.
I looked around a lot and found a number of implementations of algorithms that work really well for sorting a sequence of values (mostly numeric) but none for a lazy sequence of hash-maps in the way I need.
Speed & efficiency being of prime importance, the best I have found has been the quicksort example from the Joy Of Clojure book (Chapter 6.4) which does just enough work to return the required result.
(ns joy.q)
(defn sort-parts
  "Lazy, tail-recursive, incremental quicksort. Works against
  and creates partitions based on the pivot, defined as 'work'."
  [work]
  (lazy-seq
    (loop [[part & parts] work]
      (if-let [[pivot & xs] (seq part)]
        (let [smaller? #(< % pivot)]
          (recur (list*
                   (filter smaller? xs)
                   pivot
                   (remove smaller? xs)
                   parts)))
        (when-let [[x & parts] parts]
          (cons x (sort-parts parts)))))))
(defn qsort [xs]
  (sort-parts (list xs)))
Works really well...
(time (take 10 (qsort (shuffle (range 10000000)))))
"Elapsed time: 551.714003 msecs"
(0 1 2 3 4 5 6 7 8 9)
Great! But...
However much I try I can't seem to work out how to apply this to a sequence of hashmaps.
I need something like:
(take 20 (qsort-by :created population))

If you only need the top N elements, a full sort is too expensive (even a lazy sort like the one in JoC: it needs to keep nearly the whole data set in memory).
You only need to scan (reduce) the dataset and keep the best N items so far.
=> (defn top-by [n k coll]
     (reduce
       (fn [top x]
         (let [top (conj top x)]
           (if (> (count top) n)
             (disj top (first top))
             top)))
       (sorted-set-by #(< (k %1) (k %2))) coll))
#'user/top-by
=> (top-by 3 first [[1 2] [10 2] [9 3] [4 2] [5 6]])
#{[5 6] [9 3] [10 2]}
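One thing to watch with a sorted set: items whose keys compare as equal are treated as duplicates, so two records with the same :rank would collapse into one. A minimal sketch of a variant that breaks ties on a second key, assuming each record has a unique :id. The name top-by-with-ties and the sample data are hypothetical.

```clojure
(defn top-by-with-ties
  "Hypothetical variant of top-by: keeps the n items with the largest (k item),
  using (tie item) to distinguish items whose keys are equal."
  [n k tie coll]
  (reduce
    (fn [top x]
      (let [top (conj top x)]
        (if (> (count top) n)
          (disj top (first top)) ; drop the current smallest
          top)))
    ;; Compare [key tie-breaker] pairs so equal keys do not collide.
    (sorted-set-by (fn [a b] (compare [(k a) (tie a)] [(k b) (tie b)])))
    coll))

;; Hypothetical sample data with a repeated :rank.
(def people
  [{:id 1 :rank 5}
   {:id 2 :rank 9}
   {:id 3 :rank 5}
   {:id 4 :rank 1}])

(map :id (top-by-with-ties 3 :rank :id people))
;; => (1 3 2)
```

Like the original, this holds at most n + 1 items at a time, so memory stays bounded no matter how long the input sequence is.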
