How do I undo or reverse a transaction in datomic?

I committed a transaction to datomic accidentally and I want to "undo" the whole transaction. I know exactly which transaction it is and I can see its datoms, but I don't know how to get from there to a rolled-back transaction.

The basic procedure:
1. Retrieve the datoms created in the transaction you want to undo, using the transaction log to find them.
2. Remove the datoms related to the transaction entity itself: we don't want to retract the transaction metadata.
3. Invert the "added" state of all remaining datoms, i.e., if a datom was added, retract it, and if it was retracted, add it.
4. Reverse the order of the inverted datoms, so the bad new value is retracted before the old good value is re-asserted.
5. Commit a new transaction.
In Clojure, your code would look like this:
(defn rollback
  "Reassert retracted datoms and retract asserted datoms in a transaction,
  effectively \"undoing\" the transaction.
  WARNING: *very* naive function!"
  [conn tx]
  ;; assumes (require '[datomic.api :as d])
  (let [tx-log  (-> conn d/log (d/tx-range tx nil) first)     ; find the transaction
        txid    (-> tx-log :t d/t->tx)                        ; get the transaction entity id
        newdata (->> (:data tx-log)                           ; get the datoms from the transaction
                     (remove #(= (:e %) txid))                ; remove transaction-metadata datoms
                     ;; invert the add/retract state of each datom
                     (map #(do [(if (:added %) :db/retract :db/add) (:e %) (:a %) (:v %)]))
                     reverse)]                                ; reverse the order of the inverted datoms
    @(d/transact conn newdata)))                              ; commit the new datoms
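A usage sketch (the connection URI and the transaction t value below are hypothetical; `rollback` derefs and runs the compensating transaction itself):
;; hypothetical connection and basis-t of the bad transaction
(def conn (d/connect "datomic:dev://localhost:4334/my-db"))
(rollback conn 1234)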

This is not meant as an answer to the original question, but for those coming here from Google looking for inspiration on how to roll back a DataScript transaction: I didn't find documentation about it, so I wrote my own:
(defn rollback
  "Takes a transaction result and reasserts retracted
  datoms and retracts asserted datoms, effectively
  \"undoing\" the transaction."
  [{:keys [tx-data]}]
  ;; The passed transaction result looks something like this:
  ;;
  ;; {:db-before
  ;;  {1 :post/body,
  ;;   1 :post/created-at,
  ;;   1 :post/foo,
  ;;   1 :post/id,
  ;;   1 :post/title},
  ;;  :db-after {},
  ;;  :tx-data
  ;;  [#datascript/Datom [1 :post/body "sdffdsdsf" 536870914 false]
  ;;   #datascript/Datom [1 :post/created-at 1576538572631 536870914 false]
  ;;   #datascript/Datom [1 :post/foo "foo" 536870914 false]
  ;;   #datascript/Datom [1 :post/id #uuid "a21ad816-c509-42fe-a1b7-32ad9d3931ef" 536870914 false]
  ;;   #datascript/Datom [1 :post/title "123" 536870914 false]],
  ;;  :tempids {:db/current-tx 536870914},
  ;;  :tx-meta nil}
  ;;
  ;; We want to transform each datom into a new piece of
  ;; a transaction. The last field in each datom indicates
  ;; whether it was added (true) or retracted (false). To
  ;; roll back the datom, this boolean needs to be inverted.
  (let [t (map
            (fn [[entity-id attribute value _ added?]]
              (if added?
                [:db/retract entity-id attribute value]
                [:db/add entity-id attribute value]))
            tx-data)]
    (transact t)))
You use it by first capturing a transaction's return value, then passing that return value to the rollback fn:
(let [tx (transact [...])]
  (rollback tx))
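For instance, with DataScript's own API the flow could look like the sketch below; the conn, the schema-less create-conn, and the transact wrapper are assumptions, not part of the answer above:
(require '[datascript.core :as d])
(def conn (d/create-conn {}))                          ; hypothetical, schema-less connection
(defn transact [tx-data] (d/transact! conn tx-data))   ; the wrapper assumed above; in a real
                                                       ; namespace, define it before `rollback`
(let [tx (transact [{:db/id -1 :post/title "123"}])]
  (rollback tx))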
Be careful though, I'm new to the datascript/Datomic world, so there might be something I am missing.

Related

How to execute parallel transactions in Clojure

I have a sequence of customers that needs to be processed in parallel. I tried to use pmap for that. The result is painfully slow, much slower than a sequential implementation. The inner function process-customer has a transaction. Obviously, pmap launches all the transactions at once and they end up retrying, killing performance. What is the best way to parallelize this?
(defn process-customers [customers]
  (doall
    (pmap
      (fn [sub-customers]
        (doseq [customer sub-customers]
          (process-customer customer)))
      (partition-all 10 customers))))
EDIT:
The process-customer function involves the steps below; I list only the steps for brevity. All the steps are inside a transaction to ensure another parallel transaction does not cause inconsistencies like negative stock.
(defn- process-customer
  "Process `customer`. Consists of three steps:
   1. Finding all stores in which the requested products are still available.
   2. Sorting the found stores to find the cheapest (for the sum of all products).
   3. Buying the products by updating the `stock`."
  [customer]
  )
EDIT 2: The version of process-customers below has the same performance as the parallel process-customers above, even though it is obviously sequential.
(defn process-customers
  "Process `customers` one by one. In this code, this happens sequentially."
  [customers]
  (doseq [customer customers]
    (process-customer customer)))
I assume your transaction locks the inventory for the full life cycle of process-customer. This will be slow, as all customers are racing for the same universe of stores. If you can split the process into two phases, 1) quoting and 2) fulfilling, and apply a transaction only in (2), then performance should be much better. Or, if you buy into agent programming, the transaction boundary is automatically defined for you at the message level. Here is one sample you can consider:
(defn get-best-deal
  "Returns the best deal for a given order with given stores (agents)"
  [stores order]
  ;;
  ;; request a quotation from 1000 stores (in parallel)
  ;;
  (doseq [store stores]
    (send store get-quote order))
  ;;
  ;; wait for replies, up to 0.5s
  ;;
  (apply await-for 500 stores)
  ;;
  ;; sort and find the best store
  ;;
  (when-let [best-store (->> stores
                             (filter (fn [store] (get-in @store [:quotes order])))
                             (sort-by (fn [store] (->> (get-in @store [:quotes order])
                                                       vals
                                                       (reduce +))))
                             first)]
    {:best-store best-store
     :invoice-id (do
                   ;; execute the order
                   (send best-store fulfill order)
                   ;; wait for the transaction to complete
                   (await best-store)
                   ;; get an invoice id
                   (get-in @best-store [:invoices order]))}))
and to find the best deals from 1,000 stores for 100 orders (289 line items in total) over 100 products:
(->> orders
     (pmap (partial get-best-deal stores))
     (filter :invoice-id)
     count
     time)
;; => 57
;; "Elapsed time: 312.002328 msecs"
Sample business logic:
(defn get-quote
  "Issue a quote by checking inventory"
  [store {:keys [order-items] :as order}]
  (if-let [quote (->> order-items
                      (reduce reduce-inventory
                              {:store store
                               :quote nil})
                      :quote)]
    ;; has inventory to generate a quote
    (assoc-in store [:quotes order] quote)
    ;; no inventory
    (update store :quotes dissoc order)))

(defn fulfill
  "Fulfill an order if previously quoted"
  [store order]
  (if-let [quote (get-in store [:quotes order])]
    ;; check the inventory again and generate an invoice
    (let [[invoice inventory'] (check-inventory-and-generate-invoice store order)]
      (cond-> store
        invoice (->
                  ;; register the invoice
                  (assoc-in [:invoices order] invoice)
                  ;; invalidate the quote
                  (update :quotes dissoc order)
                  ;; update the inventory
                  (assoc :inventory inventory'))))
    ;; not quoted before
    store))
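For completeness, a hypothetical shape for the `stores` agents assumed by the code above; only the keys that get-quote and fulfill read and write are shown, and the reduce-inventory and check-inventory-and-generate-invoice helpers are left out, as in the original answer:
;; hypothetical store agents; the inventory layout is just a placeholder
(def stores
  (mapv (fn [id]
          (agent {:id        id
                  :inventory {}    ; e.g. product -> {:price ... :qty ...}
                  :quotes    {}    ; order -> quote, written by get-quote
                  :invoices  {}})) ; order -> invoice, written by fulfill
        (range 1000)))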

Clojure join multiple tables

I am trying to write this query in Clojure. It should return only one row, but it is returning multiple rows. The function picks the correct row from the customer table, but along with that it fetches all 3 rows from the queue table and all 4 rows from the service table, so in total I am getting 12 (1x3x4) rows instead of only one. Any suggestion would be helpful. Thanks.
SELECT a.first_name, b.queue_name, c.service_name
from customer a
JOIN queue b
on a.queue_id = b.queue_id
JOIN service c
on a.service_id = c.service_id
where a.empl_id = 'BA123';
(defn query-using-map
  "Generates a query using a honey sql map and returns the results"
  [query-map]
  (jdbc/with-db-transaction [conn {:datasource config/datasource}]
    (let [sql-query    (sql/format query-map)
          query-result (jdbc/query conn sql-query)]
      query-result)))

(defn get-servicename-queuename [emplid]
  (jdbc/with-db-transaction [conn {:datasource config/datasource}]
    (let [result (query-using-map
                   {:select [:a.first_name :b.queue_name :c.service_name]
                    :from   [[:customer :a] [:queue :b] [:service :c]]
                    :join   [":a.queue_id=:b.queue_id and :a.service_id=:c.service_id"]
                    ;; OR :join [[:= :a.queue_id :b.queue_id] [:= :a.service_id :c.service_id]]
                    :where  [:= :a.empl_id emplid]})]
      (println result))))
;(get-servicename-queuename "BA123")
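For comparison, here is a sketch of the same query map using HoneySQL's alternating table/condition :join vector, with only customer in :from so no cross join is produced; the table and column names are taken from the SQL above:
;; sketch: explicit join conditions instead of a three-table :from
{:select [:a.first_name :b.queue_name :c.service_name]
 :from   [[:customer :a]]
 :join   [[:queue :b]   [:= :a.queue_id :b.queue_id]
          [:service :c] [:= :a.service_id :c.service_id]]
 :where  [:= :a.empl_id "BA123"]}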

Extraneous groupBy in spark DAG

According to the Spark DAG visualization, there is a groupBy being performed in Stage 1 after the groupBy performed in Stage 0. I only have one groupBy in my code and wouldn't expect any of the other transformations I'm doing to result in one.
Here's the code (clojure / flambo):
;; stage 0
(-> (.textFile sc path 8192)
    (f/map (f/fn [msg] (json/parse-string msg true)))
    (f/group-by (f/fn [msg] (:mmsi msg)) 8192)
    ;; stage 1
    (f/map-values (f/fn [values] (sort-by :timestamp (vec values))))
    (f/flat-map (ft/key-val-fn (f/fn [mmsi messages]
                                 (let [state-map (atom {}) draught-map (atom {})]
                                   (map #(mk-line % state-map draught-map) (vec messages))))))
    (f/map (f/fn [line] (json/generate-string line)))
    (f/save-as-text-file path))
It's clear to me how Stage 0 is the sequence textFile, map, groupBy and Stage 1 is map-values, map-values, flat-map, map, saveAsTextFile, but where does the groupBy in stage 1 come from?
Since groupBy causes a shuffle, which is computationally expensive and time-consuming, I don't want an extraneous one if it can be helped.
There is no extraneous groupBy here. groupBy is a two-step process. The first step is a local map which transforms each x into (f(x), x). This is the part represented as a groupBy block in Stage 0.
The second step is a non-local groupByKey, which is marked as a groupBy block in Stage 1. Only this part requires shuffling.
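To make the two phases concrete, here is a plain-Clojure sketch of the same idea on a local collection; this is only an illustration, not flambo or Spark API:
;; conceptual sketch: why a grouped shuffle shows up as two blocks
(defn conceptual-group-by [f xs]
  (let [keyed   (map (fn [x] [(f x) x]) xs)   ; "Stage 0" part: local keying, x -> [(f x) x]
        grouped (reduce (fn [acc [k v]]       ; "Stage 1" part: grouping by key
                          (update acc k (fnil conj []) v))
                        {}                    ; (in Spark this step needs the shuffle)
                        keyed)]
    grouped))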

How do I update a record in a vector, matching certain criteria?

I'm trying to update records in a vector, which match certain criteria.
(defrecord Item [id name description])
(def items
  (ref [(->Item "1" "Cookies" "Good tasting!")
        (->Item "2" "Blueberries" "Healthy!")]))
How can I do, e.g., "set the name of the item to 'foo' where the id equals 1"?
Maybe I need something like
(dosync (commute items ???? ))
but I can't figure out the ????.
I found the update-in function in the docs, but 1) I can't find examples with records, and 2) I'm not sure whether I can use it to update a different field than the one I'm querying by; in the examples the fields seem to be the same.
The complete use case: I have a webservice with an update operation, where I get a map with the id of the item and the optional fields that have to be updated.
I'm new to Clojure. I implemented a remove function, by id, and it works:
(commute items #(remove (fn [x] (= (:id x) id)) %))
Also a find by id, which might be a step toward the update:
(nth (filter #(= (:id %) id) @items) 0)
But don't know how to update the record in my vector...
You can use assoc to make a copy of a record with some keys replaced.
(dosync
  (commute items
           #(mapv (fn [i]
                    (if (= (:id i) "1")
                      (assoc i :name "foo")
                      i))
                  %)))
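For the webservice use case in the question (an id plus a map of optional fields to change), the same pattern generalizes with merge; a sketch, where update-item is a hypothetical helper name:
;; hypothetical helper: merge the changed fields into every item whose :id matches
(defn update-item [items-ref id changes]
  (dosync
    (commute items-ref
             (fn [is]
               (mapv #(if (= (:id %) id) (merge % changes) %) is)))))
;; usage sketch:
;; (update-item items "1" {:name "foo" :description "Renamed!"})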

Getting the id of an inserted entity in datomic?

After I run a transaction in datomic to insert a value, how I can use the return value of the transaction to get the ids of any entities that were created?
Here is a sample of the return value I get after an insert:
#<promise$settable_future$reify__4841#7c92b2e9: {:db-before datomic.db.Db#62d0401f, :db-after datomic.db.Db#bba61dfc,
:tx-data [#Datum{:e 13194139534331 :a 50
:v #inst "2013-06-19T11:38:08.025-00:00"
:tx 13194139534331 :added true} #Datum{:e 17592186045436 .....
I can see the underlying datums...how can I extract their values?
Use d/resolve-tempid. If you were to transact a single entity, looking at :tx-data would work but if your transaction contained more than one entity, then you wouldn't know the order in which they appear in :tx-data.
What you should do is give temporary ids to your entities (before transacting them) using either (d/tempid) or its literal representation #db/id[:db.part/user _negativeId_] and then use d/resolve-tempid to go from your temporary id to the real id given by the database. The code would look something like:
(d/resolve-tempid (d/db conn) (:tempids tx) (d/tempid :db.part/user _negativeId_))
For a full code sample, see this gist.
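Put together, the flow looks roughly like the sketch below, assuming an open connection conn; the :thing/name attribute and the explicit -1 tempid are placeholders:
;; sketch: explicit tempid in, real entity id out
(let [temp-id (d/tempid :db.part/user -1)
      tx      @(d/transact conn [{:db/id temp-id :thing/name "Bob"}])
      real-id (d/resolve-tempid (:db-after tx) (:tempids tx) temp-id)]
  real-id)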
Ah, figured it out.
I had to deref the Clojure promise, and then I was able to yank out the values I wanted:
(:e (second (:tx-data @(transact! conn query))))
Wrote a quick function based on a2ndrade's answer. The naming isn't ideal and I may be committing idiomatic faux pas; suggestions are very much welcome.
(ns my.datomic.util
  (:require [datomic.api :as d]))

(defn transact-and-get-id
  "Transact tx and return entity id."
  [conn tx]
  (let [tempid  (:db/id tx)
        post-tx @(d/transact conn [tx])
        db      (:db-after post-tx)
        entid   (d/resolve-tempid db (:tempids post-tx) tempid)]
    entid))
Example usage:
(def my-conn
  (d/connect (str "datomic:sql://datomic?jdbc:postgresql://"
                  "127.0.1:5432/datomic?user=datomic&password=somepw")))
(defn thing-tx
  "Create transaction for new thing."
  [name]
  {:db/id      (d/tempid :db.part/user)
   :thing/name name})

(transact-and-get-id my-conn (thing-tx "Bob")) ;; => 17592186045502
The Tupelo Datomic library has a function (td/eids tx-result) to easily extract the EIDs created in a transaction. For example:
; Create Honey Rider and add her to the :people partition
(let [tx-result   @(td/transact *conn*
                     (td/new-entity :people ; <- partition is first arg (optional) to td/new-entity
                       {:person/name "Honey Rider" :location "Caribbean" :weapon/type #{:weapon/knife}}))
      [honey-eid] (td/eids tx-result) ; retrieve Honey Rider's EID from the seq (destructuring)
      ]