How can you validate function arguments in an efficient and DRY manner? - clojure

Let’s say I have three functions that operate on matrices:
(defn flip [matrix] (...))
(defn rotate [matrix] (...))
(defn inc-all [matrix] (...))
Imagine each function requires a vector of vectors of ints (where each inner vector is the same length) in order to function correctly.
I could provide an assert-matrix function that validates that the matrix data is in the correct format:
(defn assert-matrix [matrix] (...) )
However, the flip function (for example) has no way of knowing whether the data passed to it has been validated; it is entirely up to the caller whether they bother validating it first. Therefore, to guarantee correctness, flip would need to be defined as:
(defn flip [matrix]
  (assert-matrix matrix)
  (...))
There are two main problems here:
It’s inefficient to have to keep calling assert-matrix every time a matrix function is called.
Whenever I create a matrix function I have to remember to call assert-matrix. Chances are I will forget as it is tedious repeating this.
In an Object Oriented language, I’d create a Class named Matrix with a constructor that checks the validity of the constructor args when the instance is created. There’s no need for methods to re-check the validity as they can be confident the data was validated when the class was initialised.
How would this be achieved in Clojure?

There are several ways to validate a data structure only once. You could, for instance, write a threading macro along the lines of the following:
(defmacro -m> [matrix & forms]
  `(do
     (assert-matrix ~matrix)
     (-> ~matrix
         ~@forms)))
which would allow you to do:
(-m> matrix flip rotate)
The above extends the threading macro to better cope with your use case.
There can be infinite variations of the same approach, but the idea should still be the same: the macro will make sure that a piece of code is executed only if the validation succeeds, with functions operating on matrices without any embedded validation code. Instead of once per method execution, the validation will be executed once per code block.
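To make the once-per-block idea concrete, here is a minimal runnable sketch. The assert-matrix body and the flip and inc-all stand-ins are my own assumptions, since the question leaves them elided, and this variant binds the matrix with let so the argument expression is evaluated only once:

```clojure
;; Hypothetical validator: the question does not show assert-matrix's body.
(defn assert-matrix
  "Throws unless m is a vector of equal-length vectors of integers."
  [m]
  (when-not (and (vector? m)
                 (every? vector? m)
                 (every? #(every? integer? %) m)
                 (or (empty? m) (apply = (map count m))))
    (throw (IllegalArgumentException. "not a valid matrix")))
  m)

;; Stand-ins for the question's matrix functions; no validation inside them.
(defn flip [m] (mapv #(vec (reverse %)) m))
(defn inc-all [m] (mapv #(mapv inc %) m))

;; Validate once, then thread the matrix through the given forms.
(defmacro -m> [matrix & forms]
  `(let [m# ~matrix]
     (assert-matrix m#)
     (-> m# ~@forms)))

(-m> [[1 2] [3 4]] flip inc-all)
;; => [[3 2] [5 4]]
```

Here assert-matrix runs once for the whole pipeline, however many functions are threaded after it.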
Another way could be to make sure all the code paths to matrix functions have a validation boundary somewhere.
You may also want to check out trammel.

You could use a protocol to represent all the operations on a matrix and then create a function that acts like the "constructor" for a matrix:
(defprotocol IMatrix
  (m-flip [_])
  (m-rotate [_])
  (m-vals [_]))
(defn create-matrix [& rows]
  ;; reject input whose rows do not all have the same length
  (if (and (seq rows) (apply not= (map count rows)))
    (throw (Exception. "Voila, what are you doing man"))
    (reify
      IMatrix
      ;; placeholders: a real implementation would transform rows first
      (m-flip [_] (apply create-matrix rows))
      (m-rotate [_] (apply create-matrix rows))
      (m-vals [_] (vec rows)))))
(def m (create-matrix [1 2 3] [4 5 6]))
(m-flip m)

Related

How to iterate over a clojure eduction - without creating a seq?

For the sake of this question, let's assume I created the following eduction.
(def xform (map inc))
(def input [1 2 3])
(def educt (eduction xform input))
Now I want to pass educt to some function that can then do some kind of reduction. The reason I want to pass educt, rather than xform and input is that I don't want to expose xform and input to that function. If I did, that function could simply do a (transduce xform f init input). But as I don't, that function is left with an eduction that cannot be used with transduce.
I know I can e.g. use doseq on eductions, but I believe this will create a seq - with all its overhead in terms of object instantiation and usage for caching.
So how can I efficiently and idiomatically iterate over an eduction?
As eductions implement java.lang.Iterable, this question probably generalizes to:
How to iterate over a java.lang.Iterable without creating a seq?
reduce can be used to do that.
It works on instances of IReduceInit, which eduction implements.
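A small sketch of what that looks like, reusing the eduction from the question. The sum-all consumer is a hypothetical name for the function that only receives the eduction:

```clojure
(def educt (eduction (map inc) [1 2 3]))

;; reduce drives the eduction through its own reduce implementation
(reduce + 0 educt)                    ;; => 9
(reduce conj [] educt)                ;; => [2 3 4]

;; transduce also accepts a reducible source, so further
;; transformations can be stacked without exposing xform or input
(transduce (filter even?) + 0 educt)  ;; => 6

;; hypothetical consumer that only ever sees the eduction
(defn sum-all [coll] (reduce + 0 coll))
(sum-all educt)                       ;; => 9
```

Note that an eduction re-runs its transformation each time it is reduced, so each call above processes the input again.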

Modify vector so it can be invoked with two arguments

I'm playing with a matrix implementation in Clojure which I'm doing for the fun of doing it and learning more about Clojure, rather than because I want to create the bestest fastest most coolest matrix implementation in the world.
One of the primary operations needed in code like this is the ability to return the value at a given row and column in a matrix, which of course I've written as a function
(mat-getrc m 2 3)
says "Give me the value at row 2, column 3 in matrix m". Perfectly good Clojure, but verbose and ugly. I'd rather write
(m 2 3)
but of course A) vectors (in my package matrices are just vectors) only respond to a single argument, and B) vectors don't know how to use the row and column number to figure out where the correct value is stored.
From looking at the docs for IFn (which vectors are supposed to implement) it appears that a two-argument version of invoke exists - but how do I get my "matrix" vectors to implement and respond to it?
Any suggestions and pointing-in-the-right-direction appreciated.
You can't modify how vectors are invoked as that's built into the implementation of vector, but you can define your own type that wraps a vector, acts as a vector, and is invokable however you like with deftype. You would need to extend many of the same interfaces that vectors implement (this is however a large list):
user=> (ancestors clojure.lang.PersistentVector)
#{clojure.lang.IEditableCollection clojure.lang.ILookup
java.util.concurrent.Callable java.lang.Runnable clojure.lang.IMeta
java.lang.Comparable clojure.lang.IReduceInit
clojure.lang.IPersistentCollection clojure.lang.IHashEq java.lang.Iterable
clojure.lang.IReduce java.util.List clojure.lang.AFn clojure.lang.Indexed
clojure.lang.Sequential clojure.lang.IPersistentStack java.io.Serializable
clojure.lang.Reversible clojure.lang.Counted java.util.Collection
java.util.RandomAccess java.lang.Object clojure.lang.Seqable
clojure.lang.Associative clojure.lang.APersistentVector
clojure.lang.IKVReduce clojure.lang.IPersistentVector clojure.lang.IObj
clojure.lang.IFn}
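As a rough illustration of the deftype approach, here is a minimal sketch that implements only a handful of those interfaces. The Matrix type name and its rows field are assumptions for the example; a wrapper that should behave fully like a vector would need many more of the interfaces listed above:

```clojure
;; Hypothetical wrapper type; only the IFn arities shown are implemented.
(deftype Matrix [rows]
  clojure.lang.IFn
  (invoke [_ r] (nth rows r))            ; (m 2)   => whole row
  (invoke [_ r c] (get-in rows [r c]))   ; (m 2 3) => single element
  clojure.lang.Counted
  (count [_] (count rows))
  clojure.lang.Seqable
  (seq [_] (seq rows)))

(def m (->Matrix [[1 2 3 4] [5 6 7 8] [9 10 11 12]]))

(m 2 3)    ;; => 12
(m 1)      ;; => [5 6 7 8]
(count m)  ;; => 3
```

Because applyTo is not implemented in this sketch, (apply m [2 3]) would not work; that is one of the pieces you would have to add yourself.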
(def matrix [[1 2 3 4][5 6 7 8][9 10 11 12]])
As you say in your question this is possible:
(matrix 2)
But this is not:
(matrix 2 3)
This would be the standard way to look up a value by row and column index:
(get-in matrix [2 3])
You can already nearly get what you want, just with a few more parens:
((matrix 2) 3)
You could define a higher order function:
(defn matrix-hof [matrix]
  (fn [x y]
    (get-in matrix [x y])))
Then put the function rather than the matrix in function position:
(let [m (matrix-hof matrix)]
  (m 2 3))
I don't believe that exactly what you are asking is possible using either a function or a macro.

Is there any macro (except declare) for writing functions without thinking about the order of function declarations in Clojure?

I want to write functions without thinking about the order of function declarations. I don't want to use declare, because then I need to declare all the function names, which I don't want to do.
I want some macro or function that does the magic for me. Long story short, I need to write functions like in Java (where method declaration order does not matter).
One of the best things I like most about functional programming is, that its flow of writing is the same as the flow of thinking. It's like peeling an onion, and at every moment, I only need to concentrate on working on this single layer, and take the inner part for granted. And don't worry about function names, foo and bar would be fine at first. In this style of writing, functions are defined and implemented from the end of the source file back to the top. In cases when one function calls multiple other functions, this linear structure becomes a tree-like structure, but there is always a single point in file to insert new functions. No choices and worries.
Yes, there are times when we need to work on some code snippets with no other dependencies. They can be put at the top of the source file of course.
To answer the question upfront, macros are not magic. If such a macro existed, it would need to take the whole source file as input, analyze the dependencies between code blocks, and re-flow them in the correct order. The dependency analysis is non-trivial because of lexical scoping. It's almost like writing a compiler. I don't believe such a macro exists (or has been written by anyone), and the good it could do is, to me at least, not that big.
It is kind of unneeded, but easily possible as an exercise (up to a point). You can write a macro that forward-declares all the top-level functions wrapped in it:
(defmacro with-forward-declaration [& body]
  (let [names (keep #(when (#{'defn 'defn-} (first %))
                       (if (map? (second %)) (nth % 2) (second %)))
                    body)]
    `(do (declare ~@names)
         ~@body)))
With this macro, the following code:
(with-forward-declaration
(defn ^long ff1 [x] (* 2 (ff2)))
(println "just to be sure it doesn't eat other forms")
(defn ^{:name 'xx} ff2 [] (+ 10 (ff3)))
(defn ff3 [] 101)
(defn- ff4 [] :x))
would expand to the following:
(do
(declare ff1 ff2 ff3 ff4)
(defn ff1 [x] (* 2 (ff2)))
(println "just to be sure it doesn't eat other forms")
(defn ff2 [] (+ 10 (ff3)))
(defn ff3 [] 101)
(defn- ff4 [] :x))
So if you wrap all of your namespace's code in this macro, it will predeclare all the functions for you. If you want to go deeper, for example predeclaring defns that are not at the top level, you could extend this toy macro with clojure.walk to find those inner forms. But I guess this is enough to play with Clojure quickly without thinking about function order. I wouldn't do it in production though (at least not without heavy testing).

calling a library function of arbitrary name

say there's a library l, which has two functions (a and b).
Calling both functions and merging the results into a vector could be done like this:
(concat (l/a) (l/b))
Is there a way to make this more generic? I tried something like this, but it threw an exception:
(apply concat (map #(l/%) ['a 'b]))
of course, this would work:
(apply concat [l/a l/b])
Calling both functions and merging the results into a vector could be done like this:
(concat (l/a) (l/b))
No, you will not get a vector. You will get a sequence, and only if those functions themselves return sequences; otherwise this code will fail with a runtime exception.
It sounds like you have a bunch of functions and you want to concatenate the results of them all together? There is no need to quote them, just make a sequence of the functions:
[l/a l/b l/c ...]
And use apply with concat as you already are, or use reduce to accumulate values.
Call vec on the result if you need it to be a vector rather than a sequence.
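For illustration, a short sketch of both options. The functions a and b below are local stand-ins for l/a and l/b, which the question does not show, and are assumed to take no arguments and return sequential collections:

```clojure
;; Stand-ins for l/a and l/b from the question.
(defn a [] [1 2])
(defn b [] '(3 4))

;; #(%) calls each function with no arguments; mapcat concatenates the results
(vec (mapcat #(%) [a b]))            ;; => [1 2 3 4]

;; the reduce variant accumulates straight into a vector
(reduce into [] (map #(%) [a b]))    ;; => [1 2 3 4]
```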
Your other solutions make the code more complex and harder to read than necessary. (Also, you almost never need to quote vars as you are doing.)
It looks like you want a general way of invoking a function inside a namespace. You can construct a symbol and dereference it to find the functions, then combine the results using mapcat e.g.
(mapcat #((find-var (symbol "l" %))) ["a" "b"])
alternatively you could first find the namespace and use ns-resolve to find the vars e.g.
(let [ns (find-ns 'l)]
  (mapcat #((ns-resolve ns %)) ['a 'b]))

What are side-effects in predicates and why are they bad?

I'm wondering what is considered to be a side-effect in predicates for fns like remove or filter. There seems to be a range of possibilities. Clearly, if the predicate writes to a file, this is a side-effect. But consider a situation like this:
(def *big-var-that-might-be-garbage-collected* ...)
(let [my-ref *big-var-that-might-be-garbage-collected*]
  (defn my-pred
    [x]
    (some-operation-on my-ref x)))
Even if some-operation-on is merely a query that does not change state, the fact that my-pred retains a reference to *big... changes the state of the system in that the big var cannot be garbage collected. Is this also considered to be side-effect?
In my case, I'd like to write to a logging system in a predicate. Is this a side effect?
And why are side-effects in predicates discouraged exactly? Is it because filter and remove and their friends work lazily so that you cannot determine when the predicates are called (and - hence - when the side-effects happen)?
GC is not typically considered when evaluating if a function is pure or not, although many actions that make a function impure can have a GC effect.
Logging is a side effect, as is changing any state in the program or the world. A pure function takes data and returns data, without modifying anything else.
https://softwareengineering.stackexchange.com/questions/15269/why-are-side-effects-considered-evil-in-functional-programming covers why side effects are avoided in functional languages.
I found this link helpful
The problem is determining when, or even whether, the side-effects will occur on any given call to the function.
If you only care that the same inputs return the same answer, you are fine. Side-effects are dependent on how the function is executed.
For example,
(first (filter odd? (range 20)))
; 1
But if we arrange for odd? to print its argument as it goes:
(first (filter #(do (print %) (odd? %)) (range 20)))
It will print 012345678910111213141516171819 before returning 1!
The reason is that filter, where it can, deals with its sequence argument in chunks of 32 elements.
If we take the limit off the range:
(first (filter #(do (print %) (odd? %)) (range)))
... we get a full-size chunk printed: 012345678910111213141516171819202122232425262728293031
Just printing the argument is confusing. If the side effects are significant, things could go seriously awry.
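One way to observe this without relying on printed output is to count the predicate calls with an atom. This is only a sketch; calls-until-first-odd is a hypothetical helper name, and (iterate inc 0) is used as an example of a sequence that is not chunked:

```clojure
(defn calls-until-first-odd
  "Returns how many times the predicate ran while finding the first odd number."
  [coll]
  (let [calls (atom 0)]
    (first (filter (fn [x] (swap! calls inc) (odd? x)) coll))
    @calls))

(calls-until-first-odd (range 20))       ;; => 20, the whole chunk is processed
(calls-until-first-odd (iterate inc 0))  ;; => 2, one element at a time
```

The answer is the same in both cases; only the number of times the side effect fires differs, which is exactly why side effects in predicates are hard to reason about.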