Future in closure not firing (Clojure)

I have a closure in which a future wraps a do block. Each function called inside the do block is provided via the arguments of the enclosing function:
(defn accept-order
  [persist record track notify log]
  (fn [sponsor order]
    (let [datetime (to-timestamp (local-now))
          order (merge order {:network_reviewed_at datetime
                              :workflow_state "unconfirmed"
                              :sponsor_id (:id sponsor)})]
      (future
        (do
          (persist order
                   (select-keys order [:network_reviewed_at
                                       :workflow_state
                                       :sponsor_id]))
          (record sponsor order true)
          (track)
          (notify sponsor order)
          (log sponsor order)))
      order)))
None of the functions in the do block fire. If I deref the future, it works. If I remove the future, it works. If I run it from a REPL, it works. But if I run lein test, it doesn't work.
Any ideas? Thank you!

Adding a (Thread/sleep 2000) to a test invoking your function causes the future to run, so I'd venture a guess that Leiningen is killing the VM before your future gets to run (or at least before it manages to cause its side effects). Leiningen does kill the VM immediately after running tests.
As a side note, you don't need the do. future takes a body, not a single expression.
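For illustration, a minimal test sketch along those lines might look like this (the stub functions and test name are mine; it assumes accept-order and clojure.test are in scope):

(deftest accept-order-fires-side-effects
  (let [calls   (atom [])
        handler (accept-order
                  (fn [order ks]         (swap! calls conj :persist))
                  (fn [sponsor order ok] (swap! calls conj :record))
                  (fn []                 (swap! calls conj :track))
                  (fn [sponsor order]    (swap! calls conj :notify))
                  (fn [sponsor order]    (swap! calls conj :log)))]
    (handler {:id 1} {})
    ;; without a pause (or a deref), lein test may kill the VM before the future runs
    (Thread/sleep 2000)
    (is (= [:persist :record :track :notify :log] @calls))))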

Related

NullPointerException in Storm when running topology

I am getting NullPointerExceptions in the backtype.storm.utils.DisruptorQueue.consumeBatchToCursor method when running my topology, specifically in a bolt. The spouts execute fine.
Storm's troubleshooting page says this might be due to multiple threads calling methods on the OutputCollector. However, I cannot see how that relates to my case.
Here's the code for the spout:
(defspout stub-spout ["stub-spout"]
  [conf context collector]
  (spout
    (nextTuple []
      (let [channel-value (<!! storm-async-channel)]
        (emit-spout! collector [channel-value])))
    (ack [id])))
and for the bolt:
(defbolt stub-bolt ["stub-bolt"] [tuple collector]
  (println "Invocation!")
  (let [obj (get tuple "object")
        do-some-calculations (resolve 'calclib/do-some-calculations)
        new-obj (do-some-calculations obj)]
    (emit-bolt! collector new-obj)))
After some investigation it turned out that the call to resolve returns nil (I need to resolve at runtime because some of the calculation happens in a macro located in calclib).
The code runs properly in a local cluster, though. Why is this happening?
I'd be grateful for any suggestions.
Thanks!
I think I've found a solution: change the bolt definition to a prepared bolt:
(defbolt stub-bolt ["stub-bolt"]
  {:prepare true}
  [conf context collector]
  (let [f (load "/calclib/core")
        do-some-calculations (resolve 'calclib/do-some-calculations)]
    (bolt
      (execute [tuple]
        (let [obj (get tuple "object")
              new-obj (do-some-calculations obj)]
          (emit-bolt! collector new-obj))))))
The key is the call to load. I wonder if there's a more elegant approach, though.
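For what it's worth, one possibly more elegant variant (an untested sketch; it assumes the calculation namespace can simply be required on the worker, and that its name is calclib.core, matching the file loaded above) is to require it in the prepare phase instead of calling load:

(defbolt stub-bolt ["stub-bolt"]
  {:prepare true}
  [conf context collector]
  (let [_ (require 'calclib.core) ; assumed namespace name
        do-some-calculations (resolve 'calclib.core/do-some-calculations)]
    (bolt
      (execute [tuple]
        (let [obj (get tuple "object")
              new-obj (do-some-calculations obj)]
          (emit-bolt! collector new-obj))))))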

Defining an agent lazily

I am creating an agent for writing changes back to a database (as discussed in the agent-based write-behind log in Clojure Programming).
This is working fine, but I am struggling to create the agent lazily. I don't want to create it as a def, as I don't want it created while my tests are running (I see the pool starting up when the tests load the forms, even though I use with-redefs to set a test value).
The code I started with is (using c3p0 pooling):
(def dba (agent (pool/make-datasource-spec (u/load-config "db.edn"))))
I tried making the agent nil and investigated how I could set it in the main of my application, when it's really needed. But there doesn't seem to be an equivalent of reset! as there is for an atom. The following code also failed, saying the agent wasn't in error and so didn't need restarting:
(when (not @dba)
  (restart-agent dba (create-db-pool)))
So at the moment, I have an atom containing the agent, where I then do:
(def dba (atom nil))

;; ...

(defn init-db! []
  (when (not @dba)
    (log/info "Creating agent for pooled connection")
    (reset! dba (agent (create-db-pool)))))
But the very fact I'm having to do @@dba to reference the contents of the agent (i.e. the pool) makes me think this is insane.
Is there a more obvious way of creating the pool agent lazily?
delay is useful for cases like this. It causes the item to be created the first time it is dereferenced, so if your tests never read it, it will never be created.
user=> (def my-agent (delay (do (println "im making the agent now") (agent 0))))
#'user/my-agent
user=>
user=> #my-agent
im making the agent now
#object[clojure.lang.Agent 0x2cd73ca1 {:status :ready, :val 0}]
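Applied to the original definition, that could look roughly like this (a sketch; call sites then need one extra deref to force the delay, and the send below uses placeholder names):

(def dba (delay (agent (pool/make-datasource-spec (u/load-config "db.edn")))))

;; the first @dba forces the delay (creating the pool); the agent is then used as before
(send @dba write-to-db record) ; write-to-db and record are placeholder names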

Update clojure code running in a thread

I need some kind of code reloading in a mixed Clojure/Java system.
I know the tools.namespace library can be used to refresh clj code, but when I try to reload code that is used in a thread, it is not updated.
This is the example Clojure code I need to update (it just switches between upper case and lower case):
(ns com.test.test-util)

(defn get-word [sentence]
  (let [a-word (rand-nth (clojure.string/split sentence #"\s"))
        ;;a-word (clojure.string/upper-case a-word)
        ]
    (println a-word)
    a-word))
I used a future to start a thread and call the code above.
In the main method, my first test just creates a future and refreshes the code after a while, but the code is not updated. I also tried creating another thread after the refresh, and the code is still not updated.
(defn start-job []
  (future
    (println "start worker job.")
    (while @job-cond
      (get-word (rand-nth sentences))
      (Thread/sleep 2000))
    (println "return!")))
(defn -main []
  (let [work-job (start-job)]
    (println "modify clj file...")
    (Thread/sleep 10000)
    (swap! job-cond not)
    (future-cancel work-job)
    (swap! job-cond not)
    (refresh)
    (Thread/sleep 3000)
    (let [new-work-job (start-job)]
      (Thread/sleep 10000)
      (swap! job-cond not)
      (future-cancel new-work-job))))
Thanks.
In general you don't want code reloading to affect the runtime. Rather, you should look for clean ways to boot and shut down the running application code using, e.g., Stuart Sierra's Component library or juxt/jig by Malcolm Sparks. Then reload namespaces only when your application is not up and running (or at least not the parts of the running code that depend on the namespaces being reloaded).
That being said, here is why your example does not work:
start-job cannot automatically refer to a reloaded version of get-word during its lifetime, because the version of get-word it uses is resolved when it is compiled. So it will still use the old version.
Explicitly resolve the var of get-word using
(resolve `get-word)
and invoke the returned var as a function instead of using get-word directly in the future.
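Concretely, start-job could be rewritten roughly as follows (a sketch based on the code above; job-cond and sentences are assumed to be defined as in the question):

(defn start-job []
  (future
    (println "start worker job.")
    (while @job-cond
      ;; re-resolve the var on every iteration and invoke it, so a refreshed
      ;; definition of get-word is picked up
      ((resolve `get-word) (rand-nth sentences))
      (Thread/sleep 2000))
    (println "return!")))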
What will happen then?
After you make the changes and refresh has been invoked, -main will still refer to the old version of start-job (so if you change that, it will not take effect at runtime). Within it, however, get-word will be resolved as described above, and the new version of get-word will be used.
Short explanation: you can't expect a running, compiled function to replace itself with reloaded code on the fly (whether or not it runs on a separate thread). Yet that is exactly what would be required for it to invoke reloaded functions, unless it uses resolve.

Clojure / Jetty: Force URL to only be Hit Once at a Time

I am working on a Clojure / Jetty web service. I have a special URL that I want to service only one request at a time. If the URL is requested again before the first request returns, I want to return immediately. So in my core.clj, where I define my routes, I have something like this:
(def work-in-progress (ref false))
Then sometime later
(compojure.core/GET "/myapp/internal/do-work" []
  (if @work-in-progress
    "Work in Progress please try again later"
    (do
      (dosync
        (ref-set work-in-progress true))
      (do-the-work)
      (dosync
        (ref-set work-in-progress false))
      "Job completed Successfully")))
I have tried this on a local Jetty server, but I seem to be able to hit the URL twice and double the work. What is a good pattern / way to implement this in Clojure in a threaded web server environment?
Imagine the following race condition with the solution proposed in the question.
Thread A starts to execute the handler's body. @work-in-progress is false, so it enters the do expression. However, before it manages to set the value of work-in-progress to true...
Thread B starts to execute the handler's body. @work-in-progress is false, so it also enters the do expression.
Now two threads are executing (do-the-work) concurrently. That's not what we want.
To prevent this problem, check and set the value of the ref in a single dosync transaction:
(compojure.core/GET "/myapp/internal/do-work" []
  (if (dosync
        (when-not @work-in-progress
          (ref-set work-in-progress true)))
    (try
      (do-the-work)
      "Job completed Successfully"
      (finally
        (dosync
          (ref-set work-in-progress false))))
    "Work in Progress please try again later"))
Another abstraction which you might find useful in this scenario is an atom and compare-and-set!.
(def work-in-progress (atom false))

(compojure.core/GET "/myapp/internal/do-work" []
  (if (compare-and-set! work-in-progress false true)
    (try
      (do-the-work)
      "Job completed Successfully"
      (finally
        (reset! work-in-progress false)))
    "Work in Progress please try again later"))
Actually this is the natural use case for a lock; in particular, a java.util.concurrent.locks.ReentrantLock.
The same pattern came up in my answer to an earlier SO question, Canonical Way to Ensure Only One Instance of a Service Is Running / Starting / Stopping in Clojure?; I'll repeat the relevant piece of code here:
(import java.util.concurrent.locks.ReentrantLock)

(def lock (ReentrantLock.))

(defn start []
  (if (.tryLock lock)
    (try
      (do-stuff)
      (finally (.unlock lock)))
    (do-other-stuff)))
The tryLock method attempts to acquire the lock, returning true if it succeeds in doing so and false otherwise, not blocking in either case.
Consider queueing access to the resource as well: in addition to giving you functionality equivalent to locks/flags, queues let you observe resource contention, among other advantages.
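One way to read that suggestion (a sketch only, and it changes the semantics slightly, since the request returns as soon as the job is accepted rather than when it completes) is a bounded queue drained by a single worker thread:

(import '(java.util.concurrent LinkedBlockingQueue))

;; capacity 1: at most one extra job can wait; the single worker below guarantees
;; only one job runs at a time, and (.size job-queue) exposes the contention
(def job-queue (LinkedBlockingQueue. 1))

(defonce worker
  (doto (Thread. (fn [] (while true ((.take job-queue)))))
    (.setDaemon true)
    (.start)))

(compojure.core/GET "/myapp/internal/do-work" []
  (if (.offer job-queue do-the-work)
    "Job accepted"
    "Work in Progress please try again later"))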

Clojure Producer Consumer

I am learning Clojure and trying out its concurrency and effectiveness via a producer-consumer example.
I did that, and it felt pretty awkward having to use ref and deref and also add-watch and remove-watch.
I looked at other code snippets, but is there a better way of refactoring this other than using Java's Condition await() and signal() methods along with a Java Lock? I did not want to use anything from Java.
Here is the code; I guess I have made many mistakes with my usage...
;; a simple producer namespace
(ns my.clojure.producer
  (:use my.clojure.consumer)
  (:gen-class))

;; declared as a global variable; to make this mutable we need to use ref
(def tasklist (ref (list)))

(defn gettasklist []
  (deref tasklist)) ; we need deref to return the task list

(def testagent (agent 0)) ; create an agent

(defn emptytasklist [akey aref old-val new-val]
  (doseq [item (gettasklist)]
    (println (str "item is ") item)
    (send testagent consume item)
    (send testagent increment item))
  (. java.lang.Thread sleep 1000)
  (dosync ; a transaction is needed to reset the ref
    (remove-watch tasklist "key123") ; remove the watch on the tasklist so that it
                                     ; does not go into a recursive call
    (ref-set tasklist (list)) ; we need ref-set to reassign the ref
    (println (str "The number of tasks now remaining is=") (count (gettasklist))))
  (add-watch tasklist "key123" emptytasklist))

(add-watch tasklist "key123" emptytasklist)

(defn addtask [task]
  (dosync ; a transaction is needed for ref-set
    ;(println (str "The number of tasks before") (count (gettasklist)))
    (println (str "Adding a task ") task)
    (ref-set tasklist (conj (gettasklist) task)) ; we need ref-set to reassign the ref
    ;(println (str "The number of tasks after") (count (gettasklist)))
    ))
Here is the consumer code
(ns my.clojure.consumer)

(defn consume [c item]
  (println "In the consume method: item is " c item)
  item)

(defn increment [c n]
  (println "parameters are" c n)
  (+ c n))
And here is the test code (I have used Maven to run the Clojure code and NetBeans to edit, as that is more familiar to me coming from Java; the folder structure and pom are at https://github.com/alexcpn/clojure-evolve):
(ns my.clojure.Testproducer
  (:use my.clojure.producer)
  (:use clojure.test)
  (:gen-class))

(deftest test-addandcheck
  (addtask 1)
  (addtask 2)
  (is (= 0 (count (gettasklist))))
  (println (str "The number of tasks are ") (count (gettasklist))))
If anybody can refactor this lightly, so that I can still read and understand the code, that would be great; otherwise I guess I will have to learn more.
Edit 1
I guess using a global task list, making it available to other functions by dereferencing it (deref), and then making it mutable again with ref is not the Clojure way.
So I changed the addtask function to send incoming tasks directly to an agent:
(defn addtask [task]
  (dosync ; adding a transaction for this is needed for ref-set
    (println (str "Adding a task ") task)
    ;(ref-set tasklist (conj (gettasklist) task)) ; we need ref-set to reassign
    (def testagent (agent 0)) ; create an agent
    (send testagent consume task)
    (send testagent increment task)))
However, when I tested it:
(deftest test-addandcheck
  (loop [task 0]
    (when (< task 100)
      (addtask task)
      (recur (inc task))))
  (is (= 0 (count (gettasklist))))
  (println (str "The number of tasks are ") (count (gettasklist))))
after some time I am getting a Java RejectedExecutionException. This would be fine with plain Java threads, because you take full control, but from Clojure it looks odd, especially since you are not selecting the thread-pool strategy yourself:
Adding a task 85
Exception in thread "pool-1-thread-4" java.util.concurrent.RejectedExecutionException
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1759)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
    at clojure.lang.Agent$Action.execute(Agent.java:56)
    at clojure.lang.Agent$Action.doRun(Agent.java:95)
    at clojure.lang.Agent$Action.run(Agent.java:106)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
Adding a task 86
Adding a task 87
I think modeling producer and consumer in Clojure is done most easily (and most efficiently) using lamina channels.
I have made a similar Clojure program with a producer/consumer pattern in Computing Folder Sizes Asynchronously.
I don't think that you really need to use refs.
You have a single tasklist, which is mutable state that you want to alter synchronously. The changes are relatively quick and do not depend on any other external state, only on the value that comes in to the consumer.
As for atoms, there is the swap! function, which can make the changes the way I understood you need them; a rough sketch follows below.
You can look at my snippet Computing Folder Sizes Asynchronously; I think it can at least show you proper use of atoms and agents. I played with it a lot, so it should be correct.
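As a rough sketch of that suggestion (my own naming; it keeps the consume/increment agent sends from the question):

(def tasklist (atom []))

(defn addtask [task]
  (swap! tasklist conj task)) ; no transaction needed with an atom

(defn drain-tasks!
  "Atomically take all queued tasks, leaving an empty list behind."
  []
  (loop []
    (let [tasks @tasklist]
      (if (compare-and-set! tasklist tasks [])
        tasks
        (recur)))))

(defn process-all! []
  (doseq [item (drain-tasks!)]
    (send testagent consume item)
    (send testagent increment item)))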
Hope it helps!
Regards,
Jan
I was looking at the Clojure examples for Twitter Storm; they just used a LinkedBlockingQueue. It's easy to use, concurrent, and performs well. Sure, it lacks the sex appeal of an immutable Clojure solution, but it will work well.
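For reference, a bare-bones version of that approach might look like this (names and sizes are illustrative):

(import '(java.util.concurrent LinkedBlockingQueue TimeUnit))

(def queue (LinkedBlockingQueue. 100)) ; bounded, so producers block when it is full

(defn start-producer []
  (future
    (dotimes [n 10]
      (.put queue n)))) ; blocks if the queue is full

(defn start-consumer []
  (future
    (loop []
      ;; poll with a timeout so the consumer can stop when production dries up
      (when-let [task (.poll queue 1 TimeUnit/SECONDS)]
        (println "consumed" task)
        (recur)))))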
I've come across several use cases where I need the ability to:
strictly control the number of worker threads on both the producer and consumer side
control the maximum size of the "work queue" in order to limit memory consumption
detect when all work has been completed so that I can shut down the workers
I've found that the built-in Clojure concurrency features (while amazingly simple and useful in their own right) make the first two points difficult. lamina looks great, but I didn't see a way that it would address my particular use cases without the same sort of extra plumbing I'd need around an implementation based on BlockingQueue.
So, I ended up hacking together a simple Clojure library to try to solve my problems. It is basically just a wrapper around BlockingQueue that attempts to conceal some of the Java constructs and provide a higher-level producer/consumer API. I'm not entirely satisfied with the API yet; it'll likely evolve a bit further, but it's operational:
https://github.com/cprice-puppet/freemarket
Example usage:
(def myproducer (producer producer-work-fn num-workers max-work))
(def myconsumer (consumer myproducer consumer-work-fn num-workers max-results))

(doseq [result (work-queue->seq (:result-queue myconsumer))]
  (println result))
Feedback / suggestions / contributions would be welcomed!