Transform this Clojure call into a lazy sequence - clojure

I'm working with a messaging toolkit (it happens to be Spread but I don't know that the details matter). Receiving messages from this toolkit requires some boilerplate:
1. Create a connection to the daemon.
2. Join a group.
3. Receive one or more messages.
4. Leave the group.
5. Disconnect from the daemon.
Following some idioms that I've seen used elsewhere, I was able to cook up some working functions using Spread's Java API and Clojure's interop forms:
(defn connect-to-daemon
  "Open a connection"
  [daemon-spec]
  (let [connection (SpreadConnection.)
        {:keys [host port user]} daemon-spec]
    (doto connection
      (.connect (InetAddress/getByName host) port user false false))))
(defn join-group
  "Join a group on a connection"
  [cxn group-name]
  (doto (SpreadGroup.)
    (.join cxn group-name)))
(defn with-daemon*
  "Execute a function with a connection to the specified daemon"
  [daemon-spec func]
  (let [daemon (merge *spread-daemon* daemon-spec)
        cxn (connect-to-daemon daemon-spec)]
    (try
      (binding [*spread-daemon* (assoc daemon :connection cxn)]
        (func))
      (finally
        (.disconnect cxn)))))
(defn with-group*
  "Execute a function while joined to a group"
  [group-name func]
  (let [cxn (:connection *spread-daemon*)
        grp (join-group cxn group-name)]
    (try
      (binding [*spread-group* grp]
        (func))
      (finally
        (.leave grp)))))
(defn receive-message
  "Receive a single message. If none are available, this will block indefinitely."
  []
  (let [cxn (:connection *spread-daemon*)]
    (.receive cxn)))
(Basically the same idiom as with-open, just that the SpreadConnection class uses disconnect instead of close. Grr. Also, I left out some macros that aren't relevant to the structural question here.)
This works well enough. I can call receive-message from inside of a structure like:
(with-daemon {:host "localhost" :port 4803}
(with-group "aGroup"
(... looping ...
(let [msg (receive-message)]
...))))
It occurs to me that receive-message would be cleaner to use if it were an infinite lazy sequence that produces messages. So, if I wanted to join a group and get messages, the calling code should look something like:
(def message-seq (messages-from {:host "localhost" :port 4803} "aGroup"))
(take 5 message-seq)
I've seen plenty of examples of lazy sequences without cleanup; that part is not too hard. The catch is steps 4 and 5 from above: leaving the group and disconnecting from the daemon. How can I bind the state of the connection and group into the sequence and run the necessary cleanup code when the sequence is no longer needed?

This article describes how to do exactly that using clojure-contrib's fill-queue. Regarding cleanup: the neat thing about fill-queue is that you can supply a blocking function that cleans up after itself when an error occurs or some condition is reached. You can also hold a reference to the resource and control it externally; the sequence will simply terminate. Which strategy fits depends on the semantics you need.

Try this:
(ns your-namespace
(:use clojure.contrib.seq-utils))
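;; Flag checked before each receive; (reset! done? true) ends the sequence.
(def done? (atom false))
(defn messages-from [daemon-spec group-name]
  (let [cnx (connect-to-daemon daemon-spec)
        group (join-group cnx group-name)]
    (fill-queue (fn [fill]
                  (if @done?
                    (do
                      ;; clean up, then throw to stop the filler
                      (.leave group)
                      (.disconnect cnx)
                      (throw (RuntimeException. "Finished messages")))
                    (fill (.receive cnx)))))))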
Set done? to true when you want to terminate the sequence. Any exception thrown by (.receive cnx) will also terminate it.
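The monolithic clojure-contrib library is no longer maintained, so here is a minimal sketch of a related idea that uses only lazy-seq from clojure.core. This is a swapped-in technique, not fill-queue: the helper name closeable-seq and its three function arguments are hypothetical, and a counter stands in for (.receive cnx).

```clojure
(defn closeable-seq
  "Lazy seq of values produced by receive-fn. Before each receive it calls
  stop?; once that returns true, cleanup-fn runs and the seq ends.
  Hypothetical helper, not part of Spread or clojure.core."
  [receive-fn cleanup-fn stop?]
  (lazy-seq
    (if (stop?)
      (do (cleanup-fn) nil)
      (cons (receive-fn)
            (closeable-seq receive-fn cleanup-fn stop?)))))

;; Stand-ins for the real connection: a counter plays the role of
;; (.receive cnx) and an atom records that cleanup happened.
(def counter (atom 0))
(def cleaned (atom false))

(def msgs
  (closeable-seq #(swap! counter inc)       ; "receive" the next message
                 #(reset! cleaned true)     ; leave group, disconnect
                 #(>= @counter 3)))         ; stop after three messages

(vec msgs) ; => [1 2 3]
@cleaned   ; => true
```

Note the limitation: cleanup runs only when a consumer walks the sequence up to the stop condition. Something like (take 5 message-seq) on an endless sequence never reaches it, so the connection still needs an explicit way to be closed in that case.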

Related

NullPointerException in Storm when running topology

I am getting NullPointerExceptions in backtype.storm.utils.DisruptorQueue.consumeBatchToCursor method when running my topology, specifically in a bolt. Spouts are duly executed.
Storm's troubleshooting page says that it might be due to multiple threads calling methods on the OutputCollector. However, I cannot see how that relates to my case.
Here's the code for the spout:
(defspout stub-spout ["stub-spout"]
  [conf context collector]
  (spout
    (nextTuple []
      (let [channel-value (<!! storm-async-channel)]
        (emit-spout! collector [channel-value])))
    (ack [id])))
and for the bolt:
(defbolt stub-bolt ["stub-bolt"] [tuple collector]
(println "Invocation!")
(let [obj (get tuple "object")
do-some-calculations (resolve 'calclib/do-some-calculations)
new-obj (do-some-calculations obj)]
(emit-bolt! collector new-obj)))
After some investigation it turned out that the call to resolve returns nil (I need to resolve at runtime because some of the calculation happens in a macro located in calclib).
The code runs properly in local cluster though. Why is this happening?
Will be grateful for any suggestions.
Thanks!
I think I've found a solution. The bolt definition is changed to a prepared bolt:
(defbolt stub-bolt ["stub-bolt"]
{:prepare true}
[conf context collector]
(let [f (load "/calclib/core")
do-some-calculations (resolve 'calclib/do-some-calculations)]
(bolt
(execute [tuple]
(let [obj (get tuple "object")
new-obj (do-some-calculations obj)]
(emit-bolt! collector new-obj))))))
The key is the call to load. I wonder if there's a more elegant approach though.
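One possibly tidier variant, offered as a sketch: resolve returns nil when the namespace that owns the var has not been loaded in the current process, which is consistent with load fixing the problem on the cluster workers. Requiring the namespace before resolving makes that dependency explicit. The helper name resolve-var is hypothetical, and clojure.set stands in for calclib so the example is self-contained.

```clojure
(defn resolve-var
  "Load the namespace named in a qualified symbol, then resolve the var.
  Throws instead of silently returning nil. Hypothetical helper."
  [sym]
  (require (symbol (namespace sym)))
  (or (resolve sym)
      (throw (ex-info "Could not resolve var" {:sym sym}))))

;; clojure.set stands in for calclib here.
(def union* (resolve-var 'clojure.set/union))

(union* #{1} #{2}) ; => #{1 2}
```

On Clojure 1.10 or later, clojure.core/requiring-resolve performs the same require-then-resolve step in one call.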

Defining an agent lazily

I am creating an agent for writing changes back to a database (as discussed in the agent-based write-behind log in Clojure Programming).
This is working fine, but I am struggling to create the agent late. I don't want to create it as a def, as I don't want it created when my tests are running (I see the pool starting up when the tests load the forms even though I use with-redefs to set a test value).
The code I started with is (using c3p0 pooling):
(def dba (agent (pool/make-datasource-spec (u/load-config "db.edn"))))
I tried making the agent nil, and investigated how I could set it in the main of my application, when it's really needed. But there doesn't seem to be an equivalent reset! function as there is with an atom. And the following code also failed saying the agent wasn't in error so didn't need restarting:
(when (not @dba)
  (restart-agent dba (create-db-pool)))
So at the moment, I have an atom containing the agent, where I then do:
(def dba (atom nil))
;; ...
(defn init-db! []
  (when (not @dba)
    (log/info "Creating agent for pooled connection")
    (reset! dba (agent (create-db-pool)))))
But the very fact I'm having to do @@dba to reference the contents of the agent (i.e. the pool) makes me think this is insane.
Is there a more obvious way of creating the pool agent lazily?
delay is useful for cases like this. It causes the item to be created the first time it is read, so if your tests don't read it, it will not be created.
user=> (def my-agent (delay (do (println "im making the agent now") (agent 0))))
#'user/my-agent
user=>
user=> @my-agent
im making the agent now
#object[clojure.lang.Agent 0x2cd73ca1 {:status :ready, :val 0}]
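A minimal sketch of how this could look for the write-behind agent, assuming a placeholder map in place of the real connection pool. The names db-agent and record-write! are hypothetical. Wrapping the deref in a small accessor function keeps callers from having to write a double deref themselves.

```clojure
;; The map stands in for the real pool (an assumption for illustration).
(def dba (delay (agent {:pool :placeholder :writes 0})))

(defn db-agent
  "Returns the agent, creating it on first call."
  []
  @dba)

(defn record-write!
  "Queue a write on the agent."
  []
  (send (db-agent) update :writes inc))

(realized? dba)        ; => false, nothing has been created yet
(record-write!)        ; first use creates the agent
(await (db-agent))     ; wait for the queued action to finish
(:writes @(db-agent))  ; => 1
(realized? dba)        ; => true
```

Tests that never call db-agent never realize the delay, so the pool is never started for them.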

Clojure / Jetty: Force URL to only be Hit Once at a Time

I am working on a Clojure / Jetty web service. I have a special URL that should service only one request at a time. If the URL is requested again before the first request returns, I want the second request to return immediately. So in my core.clj, where I define my routes, I have something like this:
(def work-in-progress (ref false))
Then sometime later
(compojure.core/GET "/myapp/internal/do-work" []
  (if @work-in-progress
    "Work in Progress please try again later"
    (do
      (dosync
        (ref-set work-in-progress true))
      (do-the-work)
      (dosync
        (ref-set work-in-progress false))
      "Job completed Successfully")))
I have tried this on local Jetty server but I seem to be able to hit the url twice and double the work. What is a good pattern / way to implement this in Clojure in a threaded web server environment?
Imagine the following race condition for the solution proposed in the question.
Thread A starts to execute the handler's body. @work-in-progress is false, so it enters the do expression. However, before it manages to set the value of work-in-progress to true...
Thread B starts to execute the handler's body. @work-in-progress is still false, so it enters the do expression as well.
Now two threads are executing (do-the-work) concurrently. That's not what we want.
To prevent this problem, check and set the value of the ref in a single dosync transaction.
(compojure.core/GET "/myapp/internal/do-work" []
  (if (dosync
        (when-not @work-in-progress
          (ref-set work-in-progress true)))
    (try
      (do-the-work)
      "Job completed Successfully"
      (finally
        (dosync
          (ref-set work-in-progress false))))
    "Work in Progress please try again later"))
Another abstraction which you might find useful in this scenario is an atom and compare-and-set!.
(def work-in-progress (atom false))
(compojure.core/GET "/myapp/internal/do-work" []
(if (compare-and-set! work-in-progress false true)
(try
(do-the-work)
"Job completed Successfully"
(finally
(reset! work-in-progress false)))
"Work in Progress please try again later"))
Actually this is the natural use case for a lock; in particular, a java.util.concurrent.locks.ReentrantLock.
The same pattern came up in my answer to an earlier SO question, Canonical Way to Ensure Only One Instance of a Service Is Running / Starting / Stopping in Clojure?; I'll repeat the relevant piece of code here:
(import java.util.concurrent.locks.ReentrantLock)
(def lock (ReentrantLock.))
(defn start []
(if (.tryLock lock)
(try
(do-stuff)
(finally (.unlock lock)))
(do-other-stuff)))
The tryLock method attempts to acquire the lock, returning true if it succeeds in doing so and false otherwise, not blocking in either case.
Consider queueing the access to the resource as well - in addition to getting an equivalent functionality to that of locks/flags, queues let you observe the resource contention, among other advantages.
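For the flag-based variants above, the check-and-set logic can be pulled out of the route into a small helper so it can be exercised without a web server. This is only a sketch: the name try-exclusive and the shape of the returned maps are hypothetical, and a nested call stands in for a second request arriving while the first is still running.

```clojure
(def work-in-progress (atom false))

(defn try-exclusive
  "Run f only if no other call is in progress; otherwise return at once.
  Hypothetical helper built on compare-and-set!."
  [f]
  (if (compare-and-set! work-in-progress false true)
    (try
      {:status :done, :result (f)}
      (finally
        ;; always release the flag, even if f throws
        (reset! work-in-progress false)))
    {:status :busy}))

;; A call made while another is still running is turned away immediately.
(try-exclusive (fn [] (try-exclusive (constantly :inner))))
;; => {:status :done, :result {:status :busy}}

@work-in-progress ; => false, the flag is released afterwards
```

The same nested test would not work with ReentrantLock, because a reentrant lock lets the thread that already holds it acquire it again.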

How to Manage Multiple Connections?

I put together a simple socket server (see below). Currently, it is not capable of handling multiple/concurrent requests. How can I make the socket server more efficient -- i.e. capable of handling concurrent requests? Are there any clojure constructs I can leverage? Thus far, I've thought of using either java's NIO (instead of IO) or netty (as pointed out here).
(ns server.policy
(:import
(java.net ServerSocket SocketException)
java.io.PrintWriter))
(defn create-socket
"Creates a socket on given port."
[port]
(ServerSocket. port))
(defn get-writer
"Create a socket file writer."
[client]
(PrintWriter. (.getOutputStream client)))
(defn listen-and-respond
"Accepts connection and responds."
[server-socket service]
(let [client (.accept server-socket)
socket-writer (get-writer client)]
(service socket-writer)))
(defn policy-provider
"Returns domain policy content."
[socket-writer]
(.print socket-writer "<content>This is a test</content>")
(.flush socket-writer)
(.close socket-writer))
(defn run-server
[port]
(let [server-socket (create-socket port)]
(while (not (.isClosed server-socket))
(listen-and-respond server-socket policy-provider))))
I have had success using Netty directly. However, if you want something that feels a little more like idiomatic Clojure code, take a look at the aleph library. It uses Netty internally, but results in much simpler code:
(use 'lamina.core 'aleph.tcp)
(defn echo-handler [channel client-info]
(siphon channel channel))
(start-tcp-server echo-handler {:port 1234})
Also, keep in mind that sometimes you need to reference the lamina documentation in addition to the aleph documentation.
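If pulling in a library is not an option, a thread-per-connection variant of the original server can be sketched with plain java.net and Clojure's future. This is not aleph's or Netty's API and it does not use NIO; the helper name serve-concurrently is hypothetical, and each client is simply handed to the thread pool that backs future.

```clojure
(import '(java.net ServerSocket Socket SocketException)
        '(java.io PrintWriter BufferedReader InputStreamReader))

(defn serve-concurrently
  "Start an accept loop on port and hand every client to its own future,
  so one slow client does not hold up the next accept. Returns the
  ServerSocket; closing it stops the loop. Hypothetical helper."
  [port service]
  (let [server (ServerSocket. port)]
    (future
      (try
        (while (not (.isClosed server))
          (let [client (.accept server)]
            (future
              ;; with-open closes the writer, then the socket, even on error
              (with-open [socket client
                          writer (PrintWriter. (.getOutputStream socket))]
                (service writer)))))
        (catch SocketException _
          nil))) ; accept throws once the server socket is closed
    server))

(defn policy-provider
  "Writes the policy content; flushing and closing are left to the caller."
  [^PrintWriter writer]
  (.print writer "<content>This is a test</content>"))

;; Port 0 asks the operating system for any free port.
(def ^ServerSocket server (serve-concurrently 0 policy-provider))
```

Call (.close server) to stop accepting connections. For a large number of simultaneous connections, a non-blocking approach such as the Netty-based one above is likely to scale better than one thread per client.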

Executing multiple statements in if-else without nullpointer exception

I'm trying to dig a little deeper into clojure and functional programming.
At some point of my code I have a (def server (spawn-server)). Now I want a short function for the REPL to check the state of this socket.
This is what I have at the moment:
(defn status []
(if server
(
(println "server is up and running")
(println "connections:" (connection-count server))
)
(println "server is down")))
If server is nil, everything works fine, but this is the output on the REPL if the server is running:
=> (status)
server is up and running
connections: 0
#<CompilerException java.lang.NullPointerException (NO_SOURCE_FILE:0)>
I'm not really sure if I see the problem, but I can't figure out how this should work :-)
What I have here is this:
((println "foo")(println "foo"))
which will be evaluated to (nil nil), which results in the NullPointerException?
Usually I wouldn't use the outer parentheses, but how can I create some kind of "block" statement for the if-condition? If I don't use them, the second println will be used as the else branch.
What would work is the usage of let as some kind of "block"-statement:
(let []
(println "server is up and running"),
(println "connections:" (connection-count server)) )
But I'm not really sure if this is the "right" solution?
Use do:
(defn status []
(if server
(do
(println "server is up and running")
(println "connections:" (connection-count server)))
(println "server is down")))
In Lisps, generally, you can't just add parens for grouping.
((println "foo") (println "foo"))
Here, Clojure tries to call the return value of the first (println "foo") as a function, with the return value of the second as its argument. Since println returns nil, that means calling nil, hence the NullPointerException. Those are very basic evaluation rules, so I suggest that you hit some introductory books or documentation about Clojure or Lisps in general.
From the evaluation section of the Clojure homepage:
Non-empty Lists are considered calls
to either special forms, macros, or
functions. A call has the form
(operator operands*).
Macros or special forms may "break" this rule.
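As a small illustration of the rule, here is a sketch showing that do evaluates its forms in order and returns the value of the last one, and that when carries an implicit do for the case where no else branch is needed. The function names status-of and announce are hypothetical, and the server argument is any truthy or falsy value.

```clojure
(defn status-of
  "Prints a line and returns :up when server is truthy, otherwise :down."
  [server]
  (if server
    (do
      (println "server is up and running")
      :up)      ; value of the last form in do
    :down))

(defn announce
  "Same check without an else branch; when wraps its body in a do."
  [server]
  (when server
    (println "server is up and running")
    :up))

(status-of {})  ; prints the message, => :up
(status-of nil) ; => :down
(announce nil)  ; => nil
```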