clojure behavior of def - clojure

I have a lein project (using cascalog--but that's not particularly important). I'm trying to externalize properties like paths to files, so I end up with code that looks something like this:
(defn output-tap [path] (hfs-textline (str (get-prop :output-path-prefix) path) :sinkmode :replace))
(def some-cascalog-query
(<- [?f1 ?f2 ?f3]
((output-tap (get-prop :output-path)) ?line)
(tab-split ?line :> ?f1 ?f2 ?f3)))
In the example above, assume the function get-prop exists; it's just using standard java to read a property value (based off this example: loading configuration file in clojure as data structure).
Now I have a main method that loads the property values, e.g. something like:
(defn -main [& args] (do (load-props (first args)) (do-cascalog-stuff)))
But when I lein uberjar I get a compile time error saying:
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.<init>(Path.java:90)
at cascading.tap.hadoop.Hfs.getPath(Hfs.java:343)
Are defs always compile time evaluated (rather than runtime evaluated)? Or am I misunderstanding this error?

So, you want the property lookup to occur at run-time? Then yes, you'll need to define some-cascalog-query as a function or macro. A bare def causes the expression to be evaluated when the code is loaded, not when the var is dereferenced.
This can be illustrated pretty simply in the REPL:
user=> (def foo (do (println "Hello, world!") 1))
Hello, world!
#'user/foo
user=> foo
1
From the documentation (emphasis mine):
(def symbol init?)
Creates and interns or locates a global var with the name of symbol and a namespace of the value of the current namespace (ns). If init is supplied, it is evaluated, and the root binding of the var is set to the resulting value.

that error looks like (get-prop :output-path) (get-prop :output-path-prefix) are is returning nothing which is getting wrapped into an empty string by str. perhaps the property is not being found?
does get-prop work as expected?
your understanding of defs is correct, they are are compile time, not (usually) runtime.

Related

How to set a dynamic var before aot compile

I want to have a *flag* variable in a given namespace that is set to true only when :aot compiling.
Is there a way to do that?
Your issue is kind of complicated because the definition of Clojure's own dynamic nature, so there's no rough equivalent of C's #ifdef or some other mechanism that happens at compile time, but here's a workaround:
I created a Leiningen project with lein new app flagdemo. This trick detects when AOT is performed, as #Biped Phill mentioned above, using the dynamic var *compile-files*, and saves a resource in the classpath of the compiled code:
The code looks like this:
(ns flagdemo.core
(:gen-class))
(def flag-path "target/uberjar/classes/flag.edn")
(defn write-flag [val]
(try (spit flag-path (str val)) (catch Exception _)))
(defn read-flag []
(some-> (clojure.java.io/resource "flag.edn") slurp clojure.edn/read-string))
(write-flag false)
(when clojure.core/*compile-files*
(write-flag true))
(defn -main
[& args]
(println "Flag is" (read-flag)))
So, when the file loads using, say, lein run or when you load it in the REPL, it will try to write an EDN file with the value false.
When you compile the package using lein uberjar, it loads the namespace and finds that *compile-files* is defined, thus it saves an EDN file that is packaged with the JAR as a resource.
The function read-flag just tries to load the EDN file from the classpath.
It works like this:
$ lein clean
$ lein run
Flag is nil
$ lein uberjar
Compiling flagdemo.core
Created /tmp/flagdemo/target/uberjar/flagdemo-0.1.0-SNAPSHOT.jar
Created /tmp/flagdemo/target/uberjar/flagdemo-0.1.0-SNAPSHOT-standalone.jar
$ java -jar target/uberjar/flagdemo-0.1.0-SNAPSHOT-standalone.jar
Flag is true
Credit to #Biped Phill and #Denis Fuenzalida for explaining it to me.
I think this works as well:
(def ^:dynamic *static* false)
(when (and *compile-files* boot/*static-flag*)
(alter-var-root #'*static* true))
then in profiles.clj:
:injections [(require 'boot)
(intern 'boot '*static-flag* true)]
First you need to indicate that a variable will change over time:
(def ^:dynamic *the-answer* nil)
*the-answer*
;=> nil
The ^:dynamic bit will allow that variable to be rebound later:
(binding [*the-answer* 42] *the-answer*)
;=> 42
ADDENDUM 1
Here's what would happen if you didn't use ^:dynamic:
(def *the-answer* nil)
(binding [*the-answer* 42] *the-answer*)
;=> [...] Can't dynamically bind non-dynamic var [...]
ADDENDUM 2
These kind of variables are commonly referred to as "earmuffs". The naming convention is unenforced but strongly encouraged.
Here's what happen in the REPL when you use this naming without declaring a dynamic variable:
(def *the-answer* nil)
; Warning: *the-answer* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *the-answer* or change the name. (/tmp/form-init7760459636905875407.clj:1)

What does the - (minus) symbol in front of function name within Clojure mean?

I can't get my head around the following. When defining the main function in Clojure (based on code generated by Leinigen), there's an - symbol in front of the function name main.
I went to the original documentation on clojure.com and found defn and defn- among other things, see https://clojuredocs.org/search?q=defn. I also searched on Google and found a source that said that the - in front of main indicated that the function was static (http://ben.vandgrift.com/2013/03/13/clojure-hello-world.html).
Does the - truely mean that the function is static? I couldn't find any other sources that confirmed that. Also I can use both (main) and (-main) when calling the main method without any problem.
Given the following code...
(defn -main
"I don't do a whole lot ... yet."
[& args]
(println "Hello, World!"))
(defn main
"I don't do a whole lot ... yet."
[& args]
(println "Hello, World!"))
I get the following output...
(main)
Hello, World!
=> nil
(-main)
Hello, World!
=> nil
Loading src/clojure_example/core.clj... done
(main)
Hello, World!
=> nil
(-main)
Hello, World!
=> nil
I noticed no difference. Output is the same for both functions. Any help is appreciated!
First, an aside.
Regarding the macros defn vs defn-, the 2nd form is just a shorthand for "private" functions. The long form looks like:
(defn ^:private foo [args] ...)
However, this is just a hint to the user that one shouldn't use these functions. It is easy for testing, etc to work around this weak "private" restriction. Due to the hassle I never use so-called "private" functions (I do sometimes use metadata ^:no-doc and names like foo-impl to indicate a fn is not a part of the public-facing API and should be ignored by library users).
The "main" function in a Clojure Program
In Java, a program is always started by calling the "main" function in a selected class
class Foo
public static void main( String[] args ) {
...
}
}
and then
> javac Foo.java ; compile class Foo
> java Foo ; run at entrypoint Foo.main()
Clojure chooses to name the initial function -main. The hyphen in the function name -main is not really special, except it makes the name unusual so it is less likely to conflict with any other function in your codebase. You can see this in the definition of the function clojure.main/main-opt.
You can see part of the origin of the hyphen convention in the docs for gen-class (scroll down to see the part about :prefix). Note that using the hyphen is changeable if using gen-class for java interop.
Using the Clojure Deps & CLI tools, the name -main is assumed as the starting point of the program.
If you are using Leiningen, it is more flexible and allows one to override the -main entrypoint of a program.
In Leiningen projects, an entry like the following indicates where to start when you type lein run:
; assumes a `-main` function exists in the namespace `demo.core`
:main ^:skip-aot demo.core
so in a program like this:
(ns demo.core )
(defn foo [& args]
(newline)
(println "*** Running in foo program ***")
(newline))
(defn -main [& args]
(newline)
(println "*** Running in main program ***")
(newline))
we get the normal behavior:
~/expr/demo > lein run
*** Running in main program ***
However, we could invoke the program another way:
> lein run -m demo.core/foo
*** Running in foo program ***
to make the foo function the "entry point". We could also change the :main setting like this:
:main ^:skip-aot demo.core/foo
and get behavior:
~/expr/demo > lein run
*** Running in foo program ***
So, having the initial function of a Clojure program named -main is the default, and is required for most tools. You can override the default if using Leiningen, although this is probably only useful in testing & development.
Please keep in mind that each namespace can have its own -main function, so you can easily change the initial function simply by changing the initial namespace that is invoked.
And finally, the hyphen in -main us unrelated to the hyphen used for pseudo-private functions defined via defn-.
All clojure functions are "static" (digression below). Putting a - as the first character in the name of a function does not have any effect on the behaviour of the function. Using the defn- macro instead of defn makes a function private. -main is ~~by convention~~ the name of the main entry point for a clojure program, if you specify an ns in your project definition as the "main" namespace, the clojure runtime will look for a function named -main and invoke it.
It's not really "by convention" now I think about it. As it's not configurable for the standard tools. It is the one and only name that clojure.main will look for.
https://clojure.org/reference/repl_and_main
All clojure functions are actually an instance of a java object AFunction, with an instance method invoke. So they're not static from a java perspective, but while in clojure land I'd say that they are, as they have no instance that you see. There is also the special case of gen-class, where you define a java class using clojure. In this case you can mark clojure functions as ^:static for the generated java class. This creates a static method in the generated java class that refers to the instance of AFunction.

How do you make a var set!-able in the Leiningen user repl?

I've got a dynamic var in a namespace defined in a source file, like this:
(ns mystuff.log ...)
(def ^:dynamic *logging* #{})
I'd like to be able to set! this var from the REPL, so that code in that same source file can look at it. In this example, the mystuff.log/log macro looks at *logging* to decide whether to print a given expression. At the REPL, it would be convenient to (set! *logging* #{:whatever}), changing its value multiple times during the session.
How can I get Leiningen's REPL to allow this? By default, set!ing such a var produces an IllegalStateException because set! can't change the root binding of a var. The var must be thread-local to be changeable by set!.
Is there a way to tell Leiningen to wrap its REPL something like this, to create a thread-local binding for a var?
(binding [mystuff.log/*logging* mystuff.log/*logging*]
(the-leiningen-repl ...))
The :init option of :repl-options, briefly explained here, seems like it offers something close. Apparently, though, the REPL calls :init, which would make it too late to establish a thread-local binding for the expressions typed into the REPL.
You probably want alter-var-root, not set!. With no special set-up or modifications to the REPL, here's what you can do:
user> (def logging #{})
#'user/logging
user> (alter-var-root #'logging conj :my-new-logger)
#{:my-new-logger}
user> (alter-var-root #'logging conj :another-new-logger)
#{:my-new-logger :another-new-logger}
user> logging
#{:my-new-logger :another-new-logger}
#{:my-new-logger :another-new-logger}
set! modifies only a var's current thread binding. alter-var-root modifies the var's root binding: the value that's shared across all threads where it's not overridden by a per-thread binding.*
*By the way, that's why alter-var-root doesn't have an exclamation point. It follows the same convention as other forms that modify root bindings, like def.

Clojure: How to initialize a def from a file at startup?

I have the following clojure code to initialize my config structure Config.
I noticed that the file is actually read when compiling the file, not at runtime.
For me, the config structure Config should be immutable, however, I do not want to have the configuration inside the JAR file.
How should I do this? Do I have to use an atom? It is okay if the application crashes if my.config is missing.
(def Config
(read-string (slurp "my.config")))
When you don't want it at compile time you have to wrap it in a function.
(defn def-my-conf []
(def Conf (blub)))
But you the cleaner way would be:
(declare Config)
(defn load-Config []
(alter-var-root (var Config) (blub)))
This function should be called inside your main.
EDIT: Of course an atom is also a solution!
Write a function for reading your config:
(defn read-config
[]
(read-string
(slurp "my.config")))
Then you can call this function from -main, and either 1) pass the config on to any functions that will need it, or 2) store it in a dynamic variable and let them read it directly:
(def ^:dynamic *config* nil)
(defn some-function-using-config
[]
(println *config*))
(defn -main
[]
(binding [*config* (read-config)]
(some-function-using-config)))
Which of the two to choose is a matter of taste and situation. With direct passing you make it explicit that a function is receiving config, with the dynamic variable you avoid having to include config as an argument to every single function you write, most of whom will just pass it on.
Both of these solutions work well for unit tests, since you can just rebind the dynamic variable to whatever config you want to use for each test.
TheQuickBrownFox had the answer, the file is read both at run-time, and a compile-time.
Fine for me! That is actually really cool!

Setting Clojure "constants" at runtime

I have a Clojure program that I build as a JAR file using Maven. Embedded in the JAR Manifest is a build-version number, including the build timestamp.
I can easily read this at runtime from the JAR Manifest using the following code:
(defn set-version
"Set the version variable to the build number."
[]
(def version
(-> (str "jar:" (-> my.ns.name (.getProtectionDomain)
(.getCodeSource)
(.getLocation))
"!/META-INF/MANIFEST.MF")
(URL.)
(.openStream)
(Manifest.)
(.. getMainAttributes)
(.getValue "Build-number"))))
but I've been told that it is bad karma to use def inside defn.
What is the Clojure-idiomatic way to set a constant at runtime? I obviously do not have the build-version information to embed in my code as a def, but I would like it set once (and for all) from the main function when the program starts. It should then be available as a def to the rest of the running code.
UPDATE: BTW, Clojure has to be one of the coolest languages I have come across in quite a while. Kudos to Rich Hickey!
I still think the cleanest way is to use alter-var-root in the main method of your application.
(declare version)
(defn -main
[& args]
(alter-var-root #'version (constantly (-> ...)))
(do-stuff))
It declares the Var at compile time, sets its root value at runtime once, doesn't require deref and is not bound to the main thread. You didn't respond to this suggestion in your previous question. Did you try this approach?
You could use dynamic binding.
(declare *version*)
(defn start-my-program []
(binding [*version* (read-version-from-file)]
(main))
Now main and every function it calls will see the value of *version*.
While kotarak's solution works very well, here is an alternative approach: turn your code into a memoized function that returns the version. Like so:
(def get-version
(memoize
(fn []
(-> (str "jar:" (-> my.ns.name (.getProtectionDomain)
(.getCodeSource)
(.getLocation))
"!/META-INF/MANIFEST.MF")
(URL.)
(.openStream)
(Manifest.)
(.. getMainAttributes)
(.getValue "Build-number")))))
I hope i dont miss something this time.
If version is a constant, it's going to be defined one time and is not going to be changed you can simple remove the defn and keep the (def version ... ) alone. I suppose you dont want this for some reason.
If you want to change global variables in a fn i think the more idiomatic way is to use some of concurrency constructions to store the data and access and change it in a secure way
For example:
(def *version* (atom ""))
(defn set-version! [] (swap! *version* ...))