Clojure: How to Serialize a Function and Reuse it Later

Clojure: How to Serialize a Function and Reuse it Later - clojure

(defn my-func [opts]
(assoc opts :something :else))
What i want to be able to do, is serialize a reference to the function (maybe via #'my-func ?) to a string in such a way that i can upon deserializing it, invoke it with args.
How does this work?
Edit-- Why This is Not a Duplicate
The other question asked how to serialize a function body-- the entire function code. I am not asking how to do that. I am asking how to serialize a reference.
Imagine a cluster of servers all running the same jar, attached to a MQ. The MQ pubs in fn-reference and fn-args for functions in the jar, and the server in the cluster runs it and acks it. That's what i'm trying to do-- not pass function bodies around.
In some ways, this is like building a "serverless" engine in clojure.

Weirdly, a commit for serializing var identity was just added to Clojure yesterday: https://github.com/clojure/clojure/commit/a26dfc1390c53ca10dba750b8d5e6b93e846c067
So as of the latest master snapshot version, you can serialize a Var (like #'clojure.core/conj) and deserialize it on another JVM with access to the same loaded code, and invoke it.
(import [java.io File FileOutputStream FileInputStream ObjectOutputStream ObjectInputStream])
(defn write-obj [o f]
(let [oos (ObjectOutputStream. (FileOutputStream. (File. f)))]
(.writeObject oos o)
(.close oos)))
(defn read-obj [f]
(let [ois (ObjectInputStream. (FileInputStream. (File. f)))
o (.readObject ois)]
(.close ois)
o))
;; in one JVM
(write-obj #'clojure.core/conj "var.ser")
;; in another JVM
(read-obj "var.ser")

As suggested on the comments, if you can just serialize a keyword label for the function and store/retrieve that, you are finished.
If you need to transmit the function from one place to another, you essentially need to send the function source code as a string and then have it compiled via eval on the other end. This is what Datomic does when a Database Function is stored in the DB and automatically run by Datomic for any new additions/changes to the DB (these can perform automatic data validation, for example). See:
http://docs.datomic.com/database-functions.html
http://docs.datomic.com/clojure/index.html#datomic.api/function
As similar technique is used in the book Clojure in Action (1st Edition) for the distributed compute engine example using RabbitMQ.

Related

Streaming data to the caller in JVM

I have a function which gets data periodically and then stops getting the data. This function has to return the data that it is fetching periodically to the caller of the function either
As and when it gets
At one shot
The 2nd one is an easy implementation i.e you block the caller, fetch all the data and then send it in one shot.
But I want to implement the 1st one (I want to avoid having callbacks). Is streams the things to be used here? If so, how? If not, how do I return something on which the caller can query for data and also stop when it returns a signal that there is no more data?
Note: I am on the JVM ecosystem, clojure to be specific. I have had a look at the clojure library core.async which kind of solves this kind of a problem with the use of channels. But I was thinking if there is any other way which probably looks like this (assuming streams is something that can be used).
Java snippet
//Function which will periodically fetch MyData until there is no data
public Stream<MyData> myFunction() {
...
}
myFunction().filter(myData -> myData.text.equals("foo"))

Maybe you can just use seq - which is lazy by default (like Stream) so caller can decide when to pull the data in. And when there are no more data myFunction can simply end the sequence. While doing this, you would also encapsulate some optimisation within myFunction - e.g. to get data in batch to minimise roundtrips. Or fetch data periodically per your original requirement.
Here is one naive implementation:
(defn my-function []
(let [batch 100]
(->> (range)
(map #(let [from (* batch %)
to (+ from batch)]
(db-get from to)))
;; take while we have data from db-get
(take-while identity)
;; returns as one single seq/Stream
(apply concat))))
;; use it as a normal seq/Stream
(->> (my-function)
(filter odd?))
where db-get would be something like:
(defn db-get [from to]
;; return first 1000 records only, i.e. returns nil to signal completion
(when (< from 1000)
;; returns a range of records
(range from to)))

You might want to check https://github.com/ReactiveX/RxJava and https://github.com/ReactiveX/RxClojure (seems no longer maintained?)

YeSQL with connection pool?

I'm checking if YeSQL may help in my Clojure project, but I cannot find any example of YeSQL using a connection pool.
does this mean that YeSQL creates a new connection to every statement?
I also found an example on how to use transactions using clojure.java.jdbc/with-db-transaction, but I feel it is outdated (I haven't tried yet).
does this mean that YeSQL depends on clojure.java.jdbc to commit/rollback control? in this case, shouldn't I just use clojure.java.jdbc alone, since YeSQL doesn't offer too much more (other than naming my queries and externalizing them)?
thanks in advance

YeSQL doesn't handle connections or connection pooling. You need to handle it externally and provide a connection instance to the query function.
As you can see from the YeSQL example from README:
; Define a database connection spec. (This is standard clojure.java.jdbc.)
(def db-spec {:classname "org.postgresql.Driver"
:subprotocol "postgresql"
:subname "//localhost:5432/demo"
:user "me"})
; Use it standalone. Note that the first argument is the db-spec.
(users-by-country db-spec "GB")
;=> ({:count 58})
; Use it in a clojure.java.jdbc transaction.
(require '[clojure.java.jdbc :as jdbc])
(jdbc/with-db-transaction [connection db-spec]
{:limeys (users-by-country connection "GB")
:yanks (users-by-country connection "US")})
If you ask how to add connection pool handling you might check an example from Clojure Cookbook.
As for the transaction handling, YeSQL documentation is none, but source is quite easy to understand:
(defn- emit-query-fn
"Emit function to run a query.
- If the query name ends in `!` it will call `clojure.java.jdbc/execute!`,
- If the query name ends in `<!` it will call `clojure.java.jdbc/insert!`,
- otherwise `clojure.java.jdbc/query` will be used."
[{:keys [name docstring statement]}]
(let [split-query (split-at-parameters statement)
{:keys [query-args display-args function-args]} (split-query->args split-query)
jdbc-fn (cond
(= [\< \!] (take-last 2 name)) `insert-handler
(= \! (last name)) `execute-handler
:else `jdbc/query)]
`(def ~(fn-symbol (symbol name) docstring statement display-args)
(fn [db# ~#function-args]
(~jdbc-fn db#
(reassemble-query '~split-query
~query-args))))))
So it will just generate a function that will either call clojure.java.jdbc/execute! or clojure.java.jdbc/insert! with the generated query. You might need to refer to those functions' documentation for further details.

When doing transactions using YesQL, I wrap the YesQL calls in a clojure.java.jdbc/with-db-transation invocation, and pass the generated connection details in to the YesQL function call, which they'll use instead of the standard non-transaction connection if supplied. For example:
;; supply a name for the transaction connection, t-con, based on the existing db connection
(jdbc/with-db-transaction [t-con db-spec]
;; this is the yesql call with a second map specifying the transaction connection
(create-client-order<! {...} {:connection t-con}))
All wrapped YesQL calls using the {:connection t-con} map will be part of the same transaction.

Clojure - connections at compile time

I have a rabbitMQ connection that seems to be started at compile time (when I type lein compile) and then blocks the building of my project. Here are more details on the problem. Let us say this is the clojure file bla_test.clj
(import (com.rabbitmq.client ConnectionFactory Connection Channel QueueingConsumer))
;; And then we have to translate the equivalent java hello world program using
;; Clojure's excellent interop.
;; It feels very strange writing this sort of ceremony-oriented imperative code
;; in Clojure:
;; Make a connection factory on the local host
(def connection-factory
(doto (ConnectionFactory.)
(.setHost "localhost")))
;; and get it to make you a connection
(def connection (.newConnection connection-factory))
;; get that to make you a channel
(def channel (. connection createChannel))
;;HERE I WOULD LIKE TO USE THE SAME CONNECTION AND THE SAME CHANNEL INSTANCE AS OFTEN AS
;; I LIKE
(dotimes [ i 10 ]
(. channel basicPublish "" "hello" nil (. (format "Hello World! (%d)" i) getBytes)))
The clojure file above is part of a bigger clojure program that I build using lein. My problem is that when I compile with "lein compile", a connection is done because of the line (def connection (.newConnection connection-factory)) and then the compilation is stopped! How can I avoid this? Is there a way to compile without building connection? How can I manage to use the same instance of channel over several calls coming from external components?
Any help would be appreciated.
Regards,
Horace

The Clojure compiler must evaluate all top-level forms, because it can be required to run arbitrary code when expanding calls to macros.
The usual solution to issues like the one you describe is to define a top-level Var holding an object of a dereferenceable type, for example an atom or a promise, and have an initialization function provide the value at runtime. (You could also use a delay and specify the value inline; this is less flexible, since it makes it more difficult to use a different value for testing etc.)

How should carmine's wcar macro be used?

I'm confused by how calls with carmine should be done. I found the wcar macro described in carmine's docs:
(defmacro wcar [& body] `(car/with-conn pool spec-server1 ~#body))
Do I really have to call wcar every time I want to talk to redis in addition to the redis command? Or can I just call it once at the beginning? If so how?
This is what some code with tavisrudd's redis library looked like (from my toy url shortener project's testsuite):
(deftest test_shorten_doesnt_exist_create_new_next
(redis/with-server test-server
(redis/set "url_counter" 51)
(shorten test-url)
(is (= "1g" (redis/get (str "urls|" test-url))))
(is (= test-url (redis/get "shorts|1g")))))
And now I can only get it working with carmine by writing it like this:
(deftest test_shorten_doesnt_exist_create_new_next
(wcar (car/set "url_counter" 51))
(shorten test-url)
(is (= "1g" (wcar (car/get (str "urls|" test-url)))))
(is (= test-url (wcar (car/get "shorts|1g")))))
So what's the right way of using it and what underlying concept am I not getting?

Dan's explanation is correct.
Carmine uses response pipelining by default, whereas redis-clojure requires you to ask for pipelining when you want it (using the pipeline macro).
The main reason you'd want pipelining is for performance. Redis is so fast that the bottleneck in using it is often the time it takes for the request+response to travel over the network.
Clojure destructuring provides a convenient way of dealing with the pipelined response, but it does require writing your code differently to redis-clojure. The way I'd write your example is something like this (I'm assuming your shorten fn has side effects and needs to be called before the GETs):
(deftest test_shorten_doesnt_exist_create_new_next
(wcar (car/set "url_counter" 51))
(shorten test-url)
(let [[response1 response2] (wcar (car/get (str "urls|" test-url))
(car/get "shorts|1g"))]
(is (= "1g" response1))
(is (= test-url response2))))
So we're sending the first (SET) request to Redis and waiting for the reply (I'm not certain if that's actually necessary here). We then send the next two (GET) requests at once, allow Redis to queue the responses, then receive them all back at once as a vector that we'll destructure.
At first this may seem like unnecessary extra effort because it requires you to be explicit about when to receive queued responses, but it brings a lot of benefits including performance, clarity, and composable commands.
I'd check out Touchstone on GitHub if you're looking for an example of what I'd consider idiomatic Carmine use (just search for the wcar calls). (Sorry, SO is preventing me from including another link).
Otherwise just pop me an email (or file a GitHub issue) if you have any other questions.

Don't worry, you're using it the correct way already.
The Redis request functions (such as the get and set that you're using above) are all routed through another function send-request! that relies on a dynamically bound *context* to provide the connection. Attempting to call any of these Redis commands without that context will fail with a "no context" error. The with-conn macro (used in wcar) sets that context and provides the connection.
The wcar macro is then just a thin wrapper around with-conn making the assumption that you will be using the same connection details for all Redis requests.
So far this is all very similar to how Tavis Rudd's redis-clojure works.
So, the question now is why does Carmine need multiple wcar's when Redis-Clojure only required a single with-server?
And the answer is, it doesn't. Apart from sometimes, when it does. Carmine's with-conn uses Redis's "Pipelining" to send multiple requests with the same connection and then package the responses together in a vector. The example from the README shows this in action.
(wcar (car/ping)
(car/set "foo" "bar")
(car/get "foo"))
=> ["PONG" "OK" "bar"]
Here you will see that ping, set and get are only concerned with sending the request, leaving the receiving of response up to wcar. This precludes asserts (or any result access) from inside of wcar and leads to the separation of requests and multiple wcar calls that you have.

idiomatic file locking in clojure?

I have a group of futures processing jobs from a queue that involve writing to files. What's the idiomatic way to make sure only one future accesses a particular file at a time?

How about using agents instead of locks to ensure this?
I think using agents to safe guard shared mutable state, regardless if it's in memory or on disk is more idiomatic in clojure than using locks.
If you create one agent at a time and send the access tries to the agents, you can ensure that only on thread at time accesses a given file.
For example like this:
(use 'clojure.contrib.duck-streams)
(defn file-agent [file-name]
(add-watch (agent nil) :file-writer
(fn [key agent old new]
(append-spit file-name new))))
(defn async-append [file-agent content]
(send file-agent (constantly content)))
then append your file through the agent:
(async-append "content written to file" (file-agent "temp-file-name"))
If you need synchronous usage of the file it could be achieved with await. Like this:
(defn sync-append [file-agent content]
(await (send file-agent (constantly content))))

I would use the core Clojure function locking which is used as follows:
(locking some-object
(do-whatever-you-like))
Here some-object could either be the file itself, or alternatively any arbitrary object that you want to synchronise on (which might make sense if you wanted a single lock to protect multiple files).
Under the hood this uses standard JVM object locking, so it's basically equivalent to a synchronized block of code in Java.

I don't think there is specific built-in function for this in Clojure but you can use standard java IO functions to do this. This would look something like this:
(import '(java.io File RandomAccessFile))
(def f (File. "/tmp/lock.file"))
(def channel (.getChannel (RandomAccessFile. f "rw")))
(def lock (.lock channel))
(.release lock)
(.close channel)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Clojure: How to Serialize a Function and Reuse it Later - clojure

Related

Streaming data to the caller in JVM

YeSQL with connection pool?

Clojure - connections at compile time

How should carmine's wcar macro be used?

idiomatic file locking in clojure?

Categories

Resources