Clojure doing multiple DB queries at once

What is the best way to run multiple DB queries, or any synchronous operations, on separate threads?
Let's take the following code as an example:
(let [res1 (mysql/query ...)
      res2 (mysql/query ...)
      res3 (mysql/query ...)
      final (do-something res1 res2 res3)]
  (http/ok final))
In this example res1, res2 and res3 do not depend on each other, meaning you could execute all of them at the same time and then wait for all the results. I would like the three queries to run concurrently and then wait for all of them together.

Probably the simplest way is to use Clojure's built-in thread pool with future objects.
(let [res1 (future (mysql/query ...))
      res2 (future (mysql/query ...))
      res3 (future (mysql/query ...))
      final (do-something @res1 @res2 @res3)]
  (http/ok final))
However, you will need to use separate Connection objects in the new threads, because a single JDBC connection is not meant to be shared across threads.
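For illustration, here is a minimal sketch assuming clojure.java.jdbc and a db-spec backed by a connection pool; the table names, handler, and db-spec are placeholders, and do-something / http/ok come from the original snippet. Each jdbc/query call borrows its own connection from the pool, so the futures never share one.
(require '[clojure.java.jdbc :as jdbc])

;; db-spec is assumed to wrap a pooled datasource, e.g. {:datasource pool};
;; each jdbc/query call then checks out a fresh connection for its own thread
(defn handler [db-spec]
  (let [res1  (future (jdbc/query db-spec ["SELECT * FROM t1"]))
        res2  (future (jdbc/query db-spec ["SELECT * FROM t2"]))
        res3  (future (jdbc/query db-spec ["SELECT * FROM t3"]))
        final (do-something @res1 @res2 @res3)]
    (http/ok final)))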

Related

Streaming data to the caller in JVM

I have a function that fetches data periodically and then stops. This function has to return the data it is fetching to the caller either:
1. as and when it gets it, or
2. all at once.
The second option is the easy implementation: you block the caller, fetch all the data, and then send it in one shot.
But I want to implement the first option (and I want to avoid callbacks). Are streams the thing to use here? If so, how? If not, how do I return something the caller can query for data and that signals when there is no more data?
Note: I am on the JVM, Clojure to be specific. I have looked at the core.async library, which solves this kind of problem with channels, but I was wondering whether there is another way that looks more like the following (assuming streams can be used).
Java snippet
// Function which will periodically fetch MyData until there is no data
public Stream<MyData> myFunction() {
    ...
}

myFunction().filter(myData -> myData.text.equals("foo"))
Maybe you can just use a lazy seq, which (like Stream) is lazy by default, so the caller can decide when to pull the data in. When there is no more data, my-function can simply end the sequence. While doing this, you can also encapsulate some optimisations within my-function, e.g. getting data in batches to minimise round trips, or fetching data periodically per your original requirement.
Here is one naive implementation:
(defn my-function []
  (let [batch 100]
    (->> (range)
         (map #(let [from (* batch %)
                     to   (+ from batch)]
                 (db-get from to)))
         ;; take while we have data from db-get
         (take-while identity)
         ;; return as one single seq/Stream
         (apply concat))))
;; use it as a normal seq/Stream
(->> (my-function)
     (filter odd?))
where db-get would be something like:
(defn db-get [from to]
  ;; return the first 1000 records only, i.e. return nil to signal completion
  (when (< from 1000)
    ;; returns a range of records
    (range from to)))
You might also want to check https://github.com/ReactiveX/RxJava and https://github.com/ReactiveX/RxClojure (though the latter seems no longer maintained?).

Is it possible to replicate a transaction deadlock in Clojure?

In SQL it is relatively easy to replicate a transaction deadlock.
==SESSION1==
begin tran
update table1 set ... where ...
[hold off further action - begin on next session]
==SESSION2==
begin
update table1 set ... where ...
[hold off further action - begin on next session]
==SESSION3==
<list blocked transactions - see session2>
Now with Clojure transactions you can't just open them and leave them open; the s-expression structure doesn't let you do that.
So I'm curious with respect to the scenario above.
My question is: Is it possible to replicate a transaction deadlock in Clojure?
The STM in Clojure is designed to provide atomic, consistent and isolated actions on refs without locking. Several features are implemented to achieve this, as described in the refs documentation, but one of the main points is the "optimistic" strategy: each transaction works with its own version of the data and compares it with the ref's version at write time.
This kind of optimistic strategy can be implemented in databases as well, e.g. in Oracle.
Anyway, in Clojure, if you really want to create a deadlock, you will have to use a lower-level mechanism, for instance the locking macro, which explicitly takes a lock on an object (like synchronized in Java) and manages access to the shared resource explicitly.
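For example, here is a minimal sketch (the lock objects, sleep, and timeouts are illustrative) of the classic lock-ordering deadlock built with the locking macro: each future takes the two locks in the opposite order, so each ends up waiting for the lock the other holds.
(def lock-a (Object.))
(def lock-b (Object.))

(defn deadlock-demo []
  (let [f1 (future
             (locking lock-a
               (Thread/sleep 100)          ; give the other future time to take lock-b
               (locking lock-b :done-1)))
        f2 (future
             (locking lock-b
               (Thread/sleep 100)
               (locking lock-a :done-2)))]
    ;; deref with a timeout so the REPL is not blocked forever by the deadlock
    [(deref f1 2000 :timed-out) (deref f2 2000 :timed-out)]))
;; => [:timed-out :timed-out]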
EDIT: Example of a livelock
This example comes from Clojure Programming, by @cgrand et al.
(let [retry-count (agent 0)
      x (ref 0)]
  (try
    (dosync ;; transaction A
      @(future (dosync ;; transaction B
                 (send-off retry-count inc)
                 (ref-set x 1)))
      (ref-set x 2))
    (catch Exception e (println (str "caught exception: " (.getMessage e))))
    (finally
      (await retry-count)))
  [@x @retry-count])
caught exception: Transaction failed after reaching retry limit
[1 10000]
user>
Transaction A executes in the REPL thread. Transaction B executes in a separate thread, but since the future is dereferenced inside A, A blocks until B finishes. When A tries (ref-set x 2), x has already been modified by B, which triggers a retry of A, which spawns a new thread and a new B transaction... until the maximum number of retries is reached and an exception is raised.
According to Rich Hickey:
Clojure's STM and agent mechanisms are deadlock-free. They are not message-passing systems with blocking selective receive. The STM uses locks internally, but does lock-conflict detection and resolution automatically.
More details can be found in this group.

Clojure Producer Consumer

I am learning Clojure and tried out its concurrency and effectiveness via a producer/consumer example.
I did that, and it felt pretty awkward, having to use ref and deref and also add-watch and remove-watch.
I looked at other code snippets, but is there a better way of refactoring this, other than using Java's Condition await() and signal() methods along with a Java Lock? I did not want to use anything from Java.
Here is the code; I guess I have made many mistakes here with my usage...
;; a simple producer namespace
(ns my.clojure.producer
  (:use my.clojure.consumer)
  (:gen-class))

(def tasklist (ref (list))) ; declared as a global var; to make it mutable we need to use ref

(defn gettasklist []
  (deref tasklist)) ; we need deref to return the current task list

(def testagent (agent 0)) ; create an agent

(defn emptytasklist [akey aref old-val new-val]
  (doseq [item (gettasklist)]
    (println (str "item is") item)
    (send testagent consume item)
    (send testagent increment item))
  (. java.lang.Thread sleep 1000)
  (dosync ; a transaction is needed for the reset
    (remove-watch tasklist "key123") ; remove the watch on the tasklist so that it does not
                                     ; trigger a recursive call
    (ref-set tasklist (list)) ; we need ref-set to reassign the ref
    (println (str "The number of tasks now remaining is=") (count (gettasklist))))
  (add-watch tasklist "key123" emptytasklist))

(add-watch tasklist "key123" emptytasklist)

(defn addtask [task]
  (dosync ; a transaction is needed for ref-set
    ;(println (str "The number of tasks before") (count (gettasklist)))
    (println (str "Adding a task") task)
    (ref-set tasklist (conj (gettasklist) task)) ; we need ref-set to reassign the ref
    ;(println (str "The number of tasks after") (count (gettasklist)))
    ))
Here is the consumer code:
(ns my.clojure.consumer)

(defn consume [c item]
  (println "In the consume method: item is" c item)
  item)

(defn increment [c n]
  (println "parameters are" c n)
  (+ c n))
And here is the test code (I have used Maven to run the Clojure code and NetBeans to edit it, as that is more familiar to me coming from Java; the folder structure and pom are at https://github.com/alexcpn/clojure-evolve):
(ns my.clojure.Testproducer
  (:use my.clojure.producer)
  (:use clojure.test)
  (:gen-class))

(deftest test-addandcheck
  (addtask 1)
  (addtask 2)
  (is (= 0 (count (gettasklist))))
  (println (str "The number of tasks are") (count (gettasklist))))
If anybody can lightly refactor this so that I can read and understand the code, that would be great; else I guess I will have to learn more.
Edit 1
I guess using a global task list, making it available to other functions by dereferencing it (deref) and then mutating it again through the ref, is not the Clojure way;
so I changed the addtask function to send the incoming tasks directly to an agent:
(defn addtask [task]
  (dosync ; a transaction is needed for ref-set
    (println (str "Adding a task") task)
    ;(ref-set tasklist (conj (gettasklist) task)) ; we need ref-set to reassign the ref
    (def testagent (agent 0)) ; create an agent
    (send testagent consume task)
    (send testagent increment task)))
However, when I tested it:
(deftest test-addandcheck
  (loop [task 0]
    (when (< task 100)
      (addtask task)
      (recur (inc task))))
  (is (= 0 (count (gettasklist))))
  (println (str "The number of tasks are") (count (gettasklist))))
after some time I get a Java RejectedExecutionException. This would be fine with plain Java threads, because there you take full control, but from Clojure it looks odd, especially since you are not selecting the thread-pool strategy yourself:
Adding a task 85
Exception in thread "pool-1-thread-4" java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1759)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
        at clojure.lang.Agent$Action.execute(Agent.java:56)
        at clojure.lang.Agent$Action.doRun(Agent.java:95)
        at clojure.lang.Agent$Action.run(Agent.java:106)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
Adding a task 86
Adding a task 87
I think modeling producer and consumer in Clojure is done most easily (and most efficiently) using lamina channels.
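As a rough sketch of that shape (assuming lamina.core's channel, enqueue and receive-all functions; the task values and println are placeholders): the producer enqueues work onto a channel and the consumer registers a callback that runs for every message.
(require '[lamina.core :as l])

(def tasks (l/channel))

;; consumer: the callback is invoked for every value that arrives on the channel
(l/receive-all tasks
  (fn [task] (println "consumed" task)))

;; producer: push work onto the channel
(doseq [t (range 10)]
  (l/enqueue tasks t))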
I made a similar Clojure program with a producer/consumer pattern in Computing Folder Sizes Asynchronously.
I don't think you really need to use refs.
You have a single tasklist, a piece of mutable state that you want to alter synchronously. The changes are relatively quick and do not depend on any other external state, only on the value that is passed to the consumer.
As for atoms, there is the swap! function, which can make the changes the way I understood you need.
You can look at my folder-size snippet; I think it can at least show you proper use of atoms and agents. I played with it a lot, so it should be correct.
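A minimal sketch of that idea, assuming the tasklist can be a plain atom holding a vector (the process-all! helper and the vector representation are illustrative, not from the original code): addtask just swap!s the new task in, with no dosync needed.
(def tasklist (atom []))

(defn addtask [task]
  (swap! tasklist conj task))   ; conj is applied atomically by swap!

(defn process-all! [consume-fn]
  ;; grab a snapshot of the queued tasks, process them, then drop only
  ;; the ones we processed, in case more were added in the meantime
  (let [tasks @tasklist]
    (doseq [t tasks]
      (consume-fn t))
    (swap! tasklist #(vec (drop (count tasks) %)))))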
Hope it helps!
I was looking at the Clojure examples for Twitter's Storm; it just uses a LinkedBlockingQueue. It's easy to use, concurrent, and performs well. Sure, it lacks the sex appeal of an immutable Clojure solution, but it works well.
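For illustration, a minimal sketch of that approach (the capacity, sentinel value and task range are placeholders): the bounded queue makes .put block when it is full and .take block when it is empty, which gives you back-pressure for free.
(import 'java.util.concurrent.LinkedBlockingQueue)

(def work-queue (LinkedBlockingQueue. 10))   ; bounded capacity of 10

(def producer
  (future
    (doseq [task (range 100)]
      (.put work-queue task))       ; blocks when the queue is full
    (.put work-queue ::done)))      ; sentinel signals end of work

(def consumer
  (future
    (loop []
      (let [task (.take work-queue)]  ; blocks when the queue is empty
        (when-not (= task ::done)
          (println "consumed" task)
          (recur))))))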
I've come across several use cases where I need the ability to:
strictly control the number of worker threads on both the producer and consumer side
control the maximum size of the "work queue" in order to limit memory consumption
detect when all work has been completed so that I can shut down the workers
I've found that the clojure built-in concurrency features (while amazingly simple and useful in their own right) make the first two bullet points difficult. lamina looks great, but I didn't see a way that it would address my particular use cases w/o the same sort of extra plumbing that I'd need to do around an implementation based on BlockingQueue.
So, I ended up hacking together a simple clojure library to try to solve my problems. It is basically just a wrapper around BlockingQueue that attempts to conceal some of the Java constructs and provide a higher-level producer-consumer API. I'm not entirely satisfied with the API yet; it'll likely evolve a bit further... but it's operational:
https://github.com/cprice-puppet/freemarket
Example usage:
(def myproducer (producer producer-work-fn num-workers max-work))
(def myconsumer (consumer myproducer consumer-work-fn num-workers max-results))
(doseq [result (work-queue->seq (:result-queue myconsumer))]
  (println result))
Feedback / suggestions / contributions would be welcomed!

idiomatic file locking in clojure?

I have a group of futures processing jobs from a queue that involve writing to files. What's the idiomatic way to make sure only one future accesses a particular file at a time?
How about using agents instead of locks to ensure this?
I think using agents to safeguard shared mutable state, whether it lives in memory or on disk, is more idiomatic in Clojure than using locks.
If you create one agent per file and send the write attempts to that agent, you ensure that only one thread at a time accesses a given file.
For example, like this:
(use 'clojure.contrib.duck-streams)

(defn file-agent [file-name]
  (add-watch (agent nil) :file-writer
             (fn [key agent old new]
               (append-spit file-name new))))

(defn async-append [file-agent content]
  (send file-agent (constantly content)))
then append to your file through the agent:
(async-append (file-agent "temp-file-name") "content written to file")
If you need synchronous writes, that can be achieved with await, like this:
(defn sync-append [file-agent content]
  (await (send file-agent (constantly content))))
I would use the core Clojure macro locking, which is used as follows:
(locking some-object
  (do-whatever-you-like))
Here some-object could either be the file itself, or alternatively any arbitrary object that you want to synchronise on (which might make sense if you wanted a single lock to protect multiple files).
Under the hood this uses standard JVM object locking, so it's basically equivalent to a synchronized block of code in Java.
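For instance, a minimal sketch assuming the jobs carry the target path (the lock-for helper, the file-locks atom and the job shape are all illustrative): keeping one lock object per path means futures writing to different files never block each other.
(def file-locks (atom {}))

(defn lock-for [path]
  ;; return a stable lock object for this path, creating one on first use
  (get (swap! file-locks
              (fn [m] (if (contains? m path) m (assoc m path (Object.)))))
       path))

(defn write-job! [{:keys [file text]}]
  (locking (lock-for file)
    (spit file text :append true)))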
I don't think there is a specific built-in function for this in Clojure, but you can use the standard Java file-locking API to do it. It would look something like this:
(import '(java.io File RandomAccessFile))
(def f (File. "/tmp/lock.file"))
(def channel (.getChannel (RandomAccessFile. f "rw")))
(def lock (.lock channel))
(.release lock)
(.close channel)
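If you go that route, it may be worth wrapping the same calls in try/finally so the lock and channel are always released, even when the writing code throws; a sketch (with-file-lock is a hypothetical helper, not part of any library):
(import '(java.io File RandomAccessFile))

(defn with-file-lock [path f]
  (let [channel (.getChannel (RandomAccessFile. (File. path) "rw"))
        lock    (.lock channel)]
    (try
      (f channel)                 ; run the caller's work while the lock is held
      (finally
        (.release lock)
        (.close channel)))))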

How can I create a constantly running background process in Clojure?

How can I create a constantly running background process in Clojure? Is using "future" with a loop that never ends the right way?
You could just start a Thread with a function that runs forever.
(defn forever []
  ;; do stuff in a loop forever
  )

(.start (Thread. forever))
If you don't want the background thread to block process exit, make sure to make it a daemon thread:
(doto (Thread. forever)
  (.setDaemon true)
  (.start))
If you want some more finesse you can use the java.util.concurrent.Executors factory to create an ExecutorService. This makes it easy to create pools of threads, use custom thread factories, custom incoming queues, etc.
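A minimal sketch of that, using the standard Executors factory and an atom as a stop flag (the pool size and sleep interval are arbitrary); the ^Runnable hint just picks the Runnable overload of submit:
(import 'java.util.concurrent.Executors)

(def pool (Executors/newFixedThreadPool 1))
(def running? (atom true))

(def background-task
  (.submit pool
           ^Runnable (fn []
                       (while @running?
                         ;; do the periodic work here
                         (Thread/sleep 1000)))))

;; later, to stop it:
;; (reset! running? false)
;; (.shutdown pool)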
The claypoole lib wraps some of the work execution stuff up into a more clojure-friendly api if that's what you're angling towards.
My simple higher-order infinite loop function (using futures):
(def counter (atom 1))

(defn infinite-loop [function]
  (function)
  (future (infinite-loop function))
  nil)
;; note: the nil above is necessary to avoid overflowing the stack with futures...

(infinite-loop
  #(do
     (Thread/sleep 1000)
     (swap! counter inc)))

;; wait half a minute....
@counter
=> 31
I strongly recommend using an atom or one of Clojure's other reference types to store results (as with the counter in the example above).
With a bit of tweaking you could also use this approach to start/stop/pause the process in a thread-safe manner (e.g. test a flag to see if (function) should be executed in each iteration of the loop).
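A hedged sketch of that tweak (the keep-running flag and controlled-loop name are illustrative): the loop only reschedules itself while the flag is true, giving a thread-safe stop switch.
(def keep-running (atom true))

(defn controlled-loop [function]
  (when @keep-running
    (function)
    (future (controlled-loop function)))
  nil)

(controlled-loop
  #(do (Thread/sleep 1000)
       (swap! counter inc)))

;; to stop the loop:
;; (reset! keep-running false)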
Maybe, or perhaps Lein-daemon? https://github.com/arohner/lein-daemon