This question is off the back of a previous question I asked here a few days ago. One of the comments was that I should dispense with the Ring middleware for extracting query parameters and write my own. One alternative that I thought I'd play with was harnessing the existing one to get what I want and I've been doing some digging into the Ring source code. It does almost exactly what I want. If I write out how I understand it works:
A middleware has the function wrap-params which calls params-request
params-request adds a params map to the request map, calls assoc-query-params
assoc-query-params eventually calls ring.util.codec/form-decode on the incoming query string to turn it into a map
form-decode uses assoc-conj to merge values into an existing map via reduce
assoc-conj's docstring says
Associate a key with a value in a map. If the key already exists in
the map, a vector of values is associated with the key.
This last function is the one that is problematic in my previous question (TL;DR: I want the map's values to be consistent in class of either a string or a vector). With my object orientated hat on I would have easily solved this by subclassing and overriding the method that I need the behaviour changed. However for Clojure I cannot see how to just replace the one function without having to alter everything up the stack. Is this possible and is it easy, or should I be doing this another way? If it comes to it I could copy the entire middleware library and the codec one, but it seems a bit heavyweight to me.
While a custom middleware is probably the clearest way to go for this problem, don't forget that you can always override any function using with-redefs. For example:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(with-redefs [clojure.core/range (constantly "Bogus!")]
(is= "Bogus!" (range 9))))
While this is primarily used during unit tests, it is a wide-open escape hatch that can be used to override any function.
To Clojure, there is no difference between a Var in your source code versus one in a library (or even clojure.core itself, as the example shows).
I disagree with the advice to not use Ring's param middleware. It gives you perfect information about the incoming parameters, so if you don't like the default string-or-vector behavior, you can reshape the parameters however you want.
There are numerous ways to do this, but one obvious approach would be to write your own middleware, and insert it in between Ring's param middleware and your handlers.
(defn wrap-take-last-param []
(fn [handler]
(fn [req]
(handler
(update req :params
(fn [params]
(into {}
(for [[k v] params]
[k (if (string? v) v (last v))]))))))))
You could write something fancier, like adding some arguments to the function to let you specify which parameters you want to receive only the last specified, and which you would like to always receive as a list. In that case you might not want to wrap it around your entire handler, but around each of your routes separately to specify their expected parameters.
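A minimal sketch of that fancier version, assuming a hypothetical wrap-normalize-params that takes a set of parameter names to always deliver as vectors and reduces every other parameter to its last value. The function names are made up for illustration and are not part of Ring; the sketch also assumes it runs inside Ring's wrap-params, so that :params is already populated with string keys.

```clojure
(defn normalize-param
  "Coerce v to a vector when multi? is true, otherwise to its last value."
  [multi? v]
  (cond
    (and multi? (vector? v)) v
    multi?                   [v]
    (vector? v)              (peek v)
    :else                    v))

(defn wrap-normalize-params
  "Hypothetical middleware: keys in multi-keys always arrive as vectors,
  all other keys as a single value."
  [handler multi-keys]
  (fn [req]
    (handler
      (update req :params
              (fn [params]
                (into {}
                      (for [[k v] params]
                        [k (normalize-param (contains? multi-keys k) v)])))))))

;; Stand-in handler that simply returns the params it receives.
(def demo-handler (wrap-normalize-params :params #{"tag"}))

(demo-handler {:params {"tag" "a", "id" ["1" "2"]}})
;; => {"tag" ["a"], "id" "2"}
```

Because the set of multi-valued keys is an argument, the same middleware can be wrapped around individual routes with different expectations.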
I'm starting to learn clojure and I've stumbled upon the following, when I found myself declaring a "sum" function (for learning purposes) I wrote the following code
(def sum (fn [& args] (apply + args)))
I have understood that I defined the symbol sum as containing that fn, but why do I have to enclose the Fn in parenthesis, isn't the compiler calling that function upon definition instead of when someone is actually invoking it? Maybe it's just my imperative brain talking.
Also, what are the use cases of let? Sometimes I stumble on code that use it and other code that don't, for example on the Clojure site there's an exercise to use the OpenStream function from the Java Interop, I wrote the following code:
(defn http-get
[url]
(let [url-obj (java.net.URL. url)]
(slurp (.openStream url-obj))))
(http-get "https://www.google.com")
whilst they wrote the following on the clojure site as an answer
(defn http-get [url]
(slurp
(.openStream
(java.net.URL. url))))
Again maybe it's just my imperative brain talking, the need of having a "variable" or an "object" to store something before using it, but I quite don't understand when I should use let or when I shouldn't.
To answer both of your questions:
1.
(def sum (fn [& args] (apply + args)))
Using def here is unorthodox. When you define a function you usually want to use defn. But since you used def, you should know that def binds a name to a value, and the value returned by fn is a function. The parentheses around fn are what invoke it: (fn ...) is evaluated once, at definition time, and produces the function object; the body of that function does not run until someone calls sum.
You could have used the more traditional (defn sum [& args] (apply + args))
2.
While using let sometimes makes sense for readability (separating steps outside their nested use) it is sometimes required when you want to do something once and use it multiple times. It binds the result to a name within a specified context.
We can look at the following example and see that without let it becomes harder to write (function is for demonstration purposes):
(let [db-results (query "select * from table")] ;; note: query is not a pure function
;; do stuff with db-results
(f db-results)
;; return db-results
db-results)
This simply reuses, in multiple places, a return value (db-results) from a function that you usually only want to run once. So let can be used for style, as in the example you've given, but it's also very useful for reusing a value within some context.
Both def and defn define a global symbol, sort of like a global variable in Java, etc. Also, (defn xxx ...) is a (very common) shortcut for (def xxx (fn ...)). So, both versions will work exactly the same way when you run the program. Since the defn version is shorter and more explicit, that is what you will do 99% of the time.
Typing (let [xxx ...] ...) defines a local symbol, which cannot be seen by code outside of the let form, just like a local variable (block-scope) in Java, etc.
Just like Java, it is optional when to have a local variable like url-obj. It will make no difference to the running program. You must answer the question, "Which version makes my code easier to read and understand?" This part is no different than Java.
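A quick sketch of both points, using made-up names (sum-a, sum-b, mean): the def and defn forms behave identically, and a let binding is handy when a value is needed more than once.

```clojure
;; These two definitions are equivalent.
(def sum-a (fn [& args] (apply + args)))
(defn sum-b [& args] (apply + args))

(sum-a 1 2 3) ;; => 6
(sum-b 1 2 3) ;; => 6

;; n is local to the let form and is used twice without recomputing it.
(defn mean [xs]
  (let [n (count xs)]
    (if (zero? n)
      0
      (/ (apply + xs) n))))

(mean [1 2 3]) ;; => 2
```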
As an example that, hopefully, states things far better than I could in words:
(let [{:keys [a b c] :or {a 1 b 2 c 3} :as m} {}]
(println a b c) ; => works as expected, output is: 1 2 3
(println m) ; => this doesn't work, output is: {}
)
I expected the output of the second println to be the map containing the default values as though shoved in there by merge (that is {:a 1 :b 2 :c 3}).
Instead it looks like vars are being conjured and or'd after m is bound. Why does :as not get affected by :or like :keys does?
What's wrong with my mental model? How should I be looking at this?
EDIT:
I figured out how it works as I thought I'd shown above (although thanks for the links nonetheless). I've also since read through the source of clojure.core/destructure and now know exactly what it is doing. My question really is 'Why?'
In Clojure there always seems to be a reason things work the way they do. What are they here?
I apologize that the question came across as 'how does destructuring work with :as and :or'.
I'm not Rich, so obviously I didn't choose how this works, but I can think of a couple reasons the current behavior is better than the behavior you expected.
It's faster. A lot of Clojure's low-level core features get used all the time in your program, and they are optimized more for speed than elegance, in order to get acceptable performance. Of course if this were a matter of correctness it'd be a different story, but here there are two reasonable-sounding ways for :as to behave, so picking the faster one seems like a good plan. As for why it's faster, I presume this is obvious, but: we already have a pointer to the original map, which we can just reuse. To use the "modified" map, we have to build it with a bunch of assoc calls.
If :as doesn't give you back the original object, how can you possibly get the original object? You can't, really, right? Whereas if :as gives you back the original object, you can easily construct a modified version if you want. So one behavior leaves more options open to you.
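For illustration, one way to build the "modified" map yourself is to merge the defaults before destructuring. This is only a sketch; defaults and with-defaults are hypothetical names.

```clojure
(def defaults {:a 1 :b 2 :c 3})

(defn with-defaults
  "Merge defaults into m first, so :as sees the defaulted map."
  [m]
  (let [{:keys [a b c] :as merged} (merge defaults m)]
    {:bound [a b c] :merged merged}))

(with-defaults {})
;; => {:bound [1 2 3], :merged {:a 1, :b 2, :c 3}}

(with-defaults {:b 20})
;; => {:bound [1 20 3], :merged {:a 1, :b 20, :c 3}}
```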
According to Special Forms, :as and :or are both on their own in regards to the init-expr:
In addition, and optionally, an :as key in the binding form followed by a symbol will cause that symbol to be bound to the entire init-expr. Also optionally, an :or key in the binding form followed by another map may be used to supply default values for some or all of the keys if they are not found in the init-expr
As you have discovered, the :or key in the destructuring does not influence :as. :as will capture the original input, regardless of the application of defaults or encapsulation of remaining elements via & etc.
To quote the docs on clojure.org
Also optionally, an :or key in the binding form followed by another
map may be used to supply default values for some or all of the keys
if they are not found in the init-expr
...
Finally, also optional, :as followed by a symbol will cause that
symbol to be bound to the entire init-expr
How have you used metadata in your Clojure program?
I saw one example from Programming Clojure:
(defn shout [#^{:tag String} message] (.toUpperCase message))
;; Clojure casts message to String and then calls the method.
What are some uses? This form of programming is completely new to me.
Docstrings are stored as metadata under the :doc key. This is probably the number 1 most apparent use of metadata.
Return and parameter types can be optionally tagged with metadata to improve performance by avoiding the overhead of reflecting on the types at runtime. These are also known as "type hints." #^String is a type hint.
Storing things "under the hood" for use by the compiler, such as the arglist of a function, the line number where a var has been defined, or whether a var holds a reference to a macro. These are usually automatically added by the compiler and normally don't need to be manipulated directly by the user.
Creating simple testcases as part of a function definition:
(defn #^{:test (fn [] (assert true))} something [] nil)
(test #'something)
If you are reading Programming Clojure, then Chapter 2 provides a good intro to metadata. Figure 2.3 provides a good summary of common metadata.
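As a small illustration of the first and third points, the docstring and argument list of a function can be read back from the metadata on its var. The function greet is just a made-up example.

```clojure
(defn greet
  "Return a greeting for name."
  [name]
  (str "Hello, " name))

;; Metadata lives on the var, hence the #' reader macro.
(:doc (meta #'greet))      ;; => "Return a greeting for name."
(:arglists (meta #'greet)) ;; => ([name])
```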
For the sake of variety, here is an answer that does not concentrate on interaction with the language itself:
You can also, for example, track the source of some data. Unchecked input is marked as :tainted. A validator might check things and then set the status to :clean. Code doing security-relevant things might then barf on :tainted input and accept only :clean input.
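A rough sketch of that idea, with hypothetical names (taint, validate, run-sensitive) and a deliberately trivial validation rule. Note that metadata can only be attached to values that support it, such as maps and vectors, not to plain strings.

```clojure
(defn taint
  "Mark data as unchecked."
  [data]
  (with-meta data {:status :tainted}))

(defn validate
  "Hypothetical validator: marks data :clean when every value is a string."
  [data]
  (if (every? string? (vals data))
    (vary-meta data assoc :status :clean)
    data))

(defn run-sensitive
  "Refuses anything that has not been marked :clean."
  [data]
  (when-not (= :clean (:status (meta data)))
    (throw (ex-info "Refusing tainted input"
                    {:status (:status (meta data))})))
  (count data))

(def input (taint {:user "alice"}))

(run-sensitive (validate input)) ;; => 1
;; (run-sensitive input) would throw, because input is still :tainted.
```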
Meta Data was extremely useful for me for purposes of typing. I'm talking not just about type hints, but about complete custom type system. Simplest example - overloading of print-method for structs (or any other var):
(defstruct my-struct :foo :bar :baz)
(defn make-my-struct [foo bar baz]
(with-meta (struct-map my-struct :foo foo :bar bar :baz baz)
{:type ::my-struct}))
(defmethod print-method ::my-struct ; dispatch value matches the :type metadata
[my-struct writer]
(print-method ...))
In general, together with Clojure validation capabilities it may increase safety and, at the same time, flexibility of your code very very much (though it will take some more time to do actual coding).
For more ideas on typing see types-api.
Metadata is used extensively by the compiler for things like storing the type of an object.
You use this when you give type hints:
(defn foo [ #^String stringy] ....
I have used it for things like storing the amount of padding that was added to a number. It's intended for information that is 'orthogonal' to the data and should not be considered when deciding whether your values are the same.
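A tiny sketch of the "not considered for equality" point, using a made-up :padding key on a vector:

```clojure
(def padded (with-meta [1 2 3] {:padding 2}))

(= padded [1 2 3])       ;; => true, metadata is ignored by =
(:padding (meta padded)) ;; => 2
```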
I have a defrecord called a bag. It behaves like a list of item to count. This is sometimes called a frequency or a census. I want to be able to do the following
(def b (bag/create [:k 1 :k2 3]))
(keys b)
=> (:k :k2)
I tried the following:
(defrecord MapBag [state]
Bag
(put-n [self item n]
(let [new-n (+ n (count self item))]
(MapBag. (assoc state item new-n))))
;... some stuff
java.util.Map
(getKeys [self] (keys state)) ;TODO TEST
Object
(toString [self]
(str ("Bag: " (:state self)))))
When I try to require it in a repl I get:
java.lang.ClassFormatError: Duplicate interface name in class file compile__stub/techne/bag/MapBag (bag.clj:12)
What is going on? How do I get a keys function on my bag? Also am I going about this the correct way by assuming clojure's keys function eventually calls getKeys on the map that is its argument?
defrecord automatically makes sure that any record it defines implements the IPersistentMap interface, so you can call keys on it without doing anything.
So you can define a record, and instantiate and call keys like this:
user> (defrecord rec [k1 k2])
user.rec
user> (def a-rec (rec. 1 2))
#'user/a-rec
user> (keys a-rec)
(:k1 :k2)
Your error message indicates that one of your declarations is duplicating an interface that defrecord gives you for free. I think it might actually be both.
Is there some reason why you can't just use a plain vanilla map for your purposes? With Clojure, you often want to use plain vanilla data structures when you can.
Edit: if for whatever reason you don't want IPersistentMap included, look into deftype.
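If a plain map turns out to be enough, a bag can be sketched as a map from item to count. Here put-n mirrors the name in the question, but this version is only an illustrative assumption, not the asker's protocol.

```clojure
(defn put-n
  "Add n occurrences of item to the bag (a plain map of item -> count)."
  [bag item n]
  (update bag item (fnil + 0) n))

(def b (-> {}
           (put-n :k 1)
           (put-n :k2 3)
           (put-n :k 2)))

b            ;; => {:k 3, :k2 3}
(set (keys b)) ;; => #{:k :k2}

;; clojure.core/frequencies builds the same kind of map from a sequence.
(frequencies [:k :k2 :k2 :k2]) ;; => {:k 1, :k2 3}
```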
Rob's answer is of course correct; I'm posting this one in response to the OP's comment on it -- perhaps it might be helpful in implementing the required functionality with deftype.
I have once written an implementation of a "default map" for Clojure, which acts just like a regular map except it returns a fixed default value when asked about a key not present inside it. The code is in this Gist.
I'm not sure if it will suit your use case directly, although you can use it to do things like
user> (:earth (assoc (DefaultMap. 0 {}) :earth 8000000000))
8000000000
user> (:mars (assoc (DefaultMap. 0 {}) :earth 8000000000))
0
More importantly, it should give you an idea of what's involved in writing this sort of thing with deftype.
Then again, it's based on clojure.core/emit-defrecord, so you might look at that part of Clojure's sources instead... It's doing a lot of things which you won't have to (because it's a function for preparing macro expansions -- there's lots of syntax-quoting and the like inside it which you have to strip away from it to use the code directly), but it is certainly the highest quality source of information possible. Here's a direct link to that point in the source for the 1.2.0 release of Clojure.
Update:
One more thing I realised might be important. If you rely on a special map-like type for implementing this sort of thing, the client might merge it into a regular map and lose the "defaulting" functionality (and indeed any other special functionality) in the process. As long as the "map-likeness" illusion maintained by your type is complete enough for it to be used as a regular map, passed to Clojure's standard function etc., I think there might not be a way around that.
So, at some level the client will probably have to know that there's some "magic" involved; if they get correct answers to queries like (:mars {...}) (with no :mars in the {...}), they'll have to remember not to merge this into a regular map (merge-ing the other way around would work fine).
I've found myself using the following idiom lately in clojure code.
(def *some-global-var* (ref {}))
(defn get-global-var []
@*some-global-var*)
(defn update-global-var [val]
(dosync (ref-set *some-global-var* val)))
Most of the time this isn't even multi-threaded code that might need the transactional semantics that refs give you. It just feels like refs are for more than threaded code but basically for any global that requires immutability. Is there a better practice for this? I could try to refactor the code to just use binding or let but that can get particularly tricky for some applications.
I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.
e.g. for a mutable map of key/value pairs I would use:
(def state (atom {}))
(defn get-state [key]
(@state key))
(defn update-state [key val]
(swap! state assoc key val))
Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.
People calling your functions may not even know that your functions are depending on the value of the global var, without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overheads every time you access data from a ref.
If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.
It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.
Here's a highly contrived example:
(def *address-book* (ref {}))
(defn add [name addr]
(dosync (alter *address-book* assoc name addr)))
(defn report []
(doseq [[name addr] @*address-book*]
(println name ":" addr)))
(defn do-some-stuff []
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))
Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:
(defn make-address-book [] {})
(defn add [addr-book name addr]
(assoc addr-book name addr))
(defn report [addr-book]
(doseq [[name addr] addr-book]
(println name ":" addr)))
(defn do-some-stuff []
(let [addr-book (make-address-book)]
(-> addr-book
(add "Brian" "123 Bovine University Blvd.")
(add "Roger" "456 Main St.")
(report))))
Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.