Modify vector so it can be invoked with two arguments - clojure

I'm playing with a matrix implementation in Clojure which I'm doing for the fun of doing it and learning more about Clojure, rather than because I want to create the bestest fastest most coolest matrix implementation in the world.
One of the primary operations needed in code like this is the ability to return the value at a given row and column in a matrix, which of course I've written as a function
(mat-getrc m 2 3)
says "Give me the value at row 2, column 3 in matrix m". Perfectly good Clojure, but verbose and ugly. I'd rather write
(m 2 3)
but of course A) vectors (in my package matrices are just vectors) only respond to a single argument, and B) vectors don't know how to use the row and column number to figure out where the correct value is stored.
From looking at the docs for IFn (which vectors are supposed to implement) it appears that a two-argument version of invoke exists - but how do I get my "matrix" vectors to implement and respond to it?
Any suggestions and pointing-in-the-right-direction appreciated.

You can't modify how vectors are invoked as that's built into the implementation of vector, but you can define your own type that wraps a vector, acts as a vector, and is invokable however you like with deftype. You would need to extend many of the same interfaces that vectors implement (this is however a large list):
user=> (ancestors clojure.lang.PersistentVector)
#{clojure.lang.IEditableCollection clojure.lang.ILookup
java.util.concurrent.Callable java.lang.Runnable clojure.lang.IMeta
java.lang.Comparable clojure.lang.IReduceInit
clojure.lang.IPersistentCollection clojure.lang.IHashEq java.lang.Iterable
clojure.lang.IReduce java.util.List clojure.lang.AFn clojure.lang.Indexed
clojure.lang.Sequential clojure.lang.IPersistentStack java.io.Serializable
clojure.lang.Reversible clojure.lang.Counted java.util.Collection
java.util.RandomAccess java.lang.Object clojure.lang.Seqable
clojure.lang.Associative clojure.lang.APersistentVector
clojure.lang.IKVReduce clojure.lang.IPersistentVector clojure.lang.IObj
clojure.lang.IFn}
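A minimal sketch of the deftype approach, assuming a hypothetical type name Matrix that wraps a nested vector and implements only clojure.lang.IFn. A fuller version would also implement the collection interfaces listed above so that it still behaves like a vector:

```clojure
;; Hypothetical wrapper type: `Matrix` and its field `rows` are illustrative names.
(deftype Matrix [rows]
  clojure.lang.IFn
  ;; one argument: return a whole row, like a plain vector would
  (invoke [_ r] (nth rows r))
  ;; two arguments: return the value at row r, column c
  (invoke [_ r c] (get-in rows [r c]))
  ;; applyTo is what `apply` calls; delegate to the standard arity dispatcher
  (applyTo [this args] (clojure.lang.AFn/applyToHelper this args)))

(def m (->Matrix [[1 2 3 4] [5 6 7 8] [9 10 11 12]]))

(m 2)   ;; => [9 10 11 12]
(m 2 3) ;; => 12
```

Because this sketch implements nothing but IFn, functions such as count, nth and seq will not work on m until the corresponding interfaces are added.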

(def matrix [[1 2 3 4][5 6 7 8][9 10 11 12]])
As you say in your question this is possible:
(matrix 2)
But this is not:
(matrix 2 3)
This would be the standard way to look up a value nested inside a vector of vectors:
(get-in matrix [2 3])
You can already nearly get what you want, just with a few more parens:
((matrix 2) 3)
You could define a higher order function:
(defn matrix-hof [matrix]
  (fn [x y]
    (get-in matrix [x y])))
Then put the function rather than the matrix in function position:
(let [m (matrix-hof matrix)]
  (m 2 3))
I don't believe that exactly what you are asking is possible using either a function or a macro.

Related

Clojure Core function argument positions seem rather confusing. What's the logic behind it?

For me, as a new Clojurian, some core functions seem rather counter-intuitive and confusing when it comes to argument order/position. Here's an example:
> (nthrest (range 10) 5)
=> (5 6 7 8 9)
> (take-last 5 (range 10))
=> (5 6 7 8 9)
Perhaps there is some rule/logic behind it that I don't see yet?
I refuse to believe that the Clojure core team made so many brilliant technical decisions and forgot about consistency in function naming/argument ordering.
Or should I just remember it as it is?
Thanks
Slightly offtopic:
rand&rand-int VS random-sample - another example where function naming seems inconsistent but that's a rather rarely used function so it's not a big deal.
There is an FAQ on Clojure.org for this question: https://clojure.org/guides/faq#arg_order
What are the rules of thumb for arg order in core functions?
Primary collection operands come first. That way one can write -> and its ilk, and their position is independent of whether or not they have variable arity parameters. There is a tradition of this in OO languages and Common Lisp (slot-value, aref, elt).
One way to think about sequences is that they are read from the left, and fed from the right:
<- [1 2 3 4]
Most of the sequence functions consume and produce sequences. So one way to visualize that is as a chain:
map <- filter <- [1 2 3 4]
and one way to think about many of the seq functions is that they are parameterized in some way:
(map f) <- (filter pred) <- [1 2 3 4]
So, sequence functions take their source(s) last, and any other parameters before them, and partial allows for direct parameterization as above. There is a tradition of this in functional languages and Lisps.
Note that this is not the same as taking the primary operand last. Some sequence functions have more than one source (concat, interleave). When sequence functions are variadic, it is usually in their sources.
Adapted from comments by Rich Hickey.
Functions that work with seqs usually have the actual seq as their last argument
(map, filter, remove, etc.).
Accessing and "changing" individual elements takes the collection as the first argument: conj, assoc, get, update.
That way, you can use the (->>) macro with a collection consistently,
as well as create transducers consistently.
Only rarely does one have to resort to (as->) to change argument order. And if you have to do so, it might be an opportunity to check if your own functions follow that convention.
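A short illustration of how the two conventions line up with the threading macros, using only core functions (the var names odd-sum, updated and via-xform are made up for the example):

```clojure
;; Seq functions take the seq last, so they compose with ->>
(def odd-sum
  (->> (range 10)
       (filter odd?)   ; (1 3 5 7 9)
       (map inc)       ; (2 4 6 8 10)
       (reduce +)))    ; 30

;; Collection functions take the collection first, so they compose with ->
(def updated
  (-> {:a 1}
      (assoc :b 2)
      (update :a inc))) ; {:a 2, :b 2}

;; Leaving the seq off a seq function yields a transducer
(def via-xform
  (into [] (comp (filter odd?) (map inc)) (range 10))) ; [2 4 6 8 10]
```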
For some functions (especially functions that are "seq in, seq out"), the args are ordered so that one can use partial as follows:
(ns tst.demo.core
  (:use tupelo.core tupelo.test))
(dotest
  (let [dozen      (range 12)
        odds-1     (filterv odd? dozen)
        filter-odd (partial filterv odd?)
        odds-2     (filter-odd dozen)]
    (is= odds-1 odds-2
      [1 3 5 7 9 11])))
For other functions, Clojure often follows the ordering of "biggest-first", or "most-important-first" (usually these have the same result). Thus, we see examples like:
(get <map> <key>)
(get <map> <key> <default-val>)
This also shows that any optional values must, by definition, be last (in order to use "rest" args). This is common in most languages (e.g. Java).
For the record, I really dislike using partial functions, since they have user-defined names (at best) or are used inline (more common). Consider this code:
(let [dozen   (range 12)
      odds    (filterv odd? dozen)
      evens-1 (mapv (partial + 1) odds)
      evens-2 (mapv #(+ 1 %) odds)
      add-1   (fn [arg] (+ 1 arg))
      evens-3 (mapv add-1 odds)]
  (is= evens-1 evens-2 evens-3
    [2 4 6 8 10 12]))
Also
I personally find it really annoying trying to parse out code using partial as with evens-1, especially for the case of user-defined functions, or even standard functions that are not as simple as +.
This is especially so if the partial is used with 2 or more args.
For the 1-arg case, the function literal seen for evens-2 is much more readable to me.
If 2 or more args are present, please make a named function (either local, as shown for evens-3), or a regular (defn some-fn ...) global function.

How to iterate over a clojure eduction - without creating a seq?

For the sake of this question, let's assume I created the following eduction.
(def xform (map inc))
(def input [1 2 3])
(def educt (eduction xform input))
Now I want to pass educt to some function that can then do some kind of reduction. The reason I want to pass educt, rather than xform and input is that I don't want to expose xform and input to that function. If I did, that function could simply do a (transduce xform f init input). But as I don't, that function is left with an eduction that cannot be used with transduce.
I know I can e.g. use doseq on eductions, but I believe this will create a seq - with all its overhead in terms of object instantiation and usage for caching.
So how can I efficiently and idiomatically iterate over an eduction?
As eductions implement java.lang.Iterable, this question probably generalizes to:
How to iterate over a java.lang.Iterable without creating a seq?
reduce can be used to do that.
It works on instances of IReduceInit, which eduction implements.
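For instance, reusing the definitions from the question (sum-all is a hypothetical consumer that only ever sees the eduction, not xform or input):

```clojure
(def xform (map inc))
(def input [1 2 3])
(def educt (eduction xform input))

;; The consumer receives only something reducible.
;; Supplying an init value makes reduce go through IReduceInit.
(defn sum-all [reducible]
  (reduce + 0 reducible))

(sum-all educt)        ;; => 9
(reduce conj [] educt) ;; => [2 3 4]
```

The transformation is re-applied each time the eduction is reduced, so nothing is cached between calls.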

Clojure / Incanter Data Transformations Capabilities

I'm considering Clojure / Incanter as an alternative to R
and just wondering if clojure / incanter have the capabilities to do the following:
Import the result of a SQL statement as a data-set ( I do this in R using dbGetQuery ).
Reshape the data-set - turning rows into columns also known as "pivot" / "unpivot"- I do this in R using the reshape, reshape2 packages ( in the R world it's called melting and casting data ).
Save the reshaped data-set to a SQL table ( I do this in R using dbWriteTable function in RMySQL )
You may be interested in core.matrix - it's a project to bring multi-dimensional array and numerical computation capabilities into Clojure. Still in very active development but already usable.
Features:
A clean, functional API
Proper multi-dimensional arrays
Idiomatic style of working with Clojure data, e.g. the nested vector [[1 2] [3 4]] can be automatically used as a 2x2 matrix.
All of the array reshaping capabilities you might expect.
All of the usual matrix operations (multiplication, scaling, determinants etc.)
Support for multiple back end matrix implementations, e.g. JBLAS for high performance (uses native code)
See some example code here:
;; a matrix can be defined using a nested vector
(def a (matrix [[2 0] [0 2]]))
;; core.matrix.operators overloads operators to work on matrices
(* a a)
;; a wide range of mathematical functions are defined for matrices
(sqrt a)
;; you can get rows and columns of matrices individually
(get-row a 0)
;; Java double arrays can be used as vectors
(* a (double-array [1 2]))
;; you can modify double arrays in place - they are examples of mutable vectors
(let [a (double-array [1 4 9])]
  (sqrt! a) ;; "!" signifies an in-place operator
  (seq a))
;; you can coerce matrices between different formats
(coerce [] (double-array [1 2 3]))
;; scalars can be used in many places that you can use a matrix
(* [1 2 3] 2)
;; operations on scalars alone behave as you would expect
(* 1 2 3 4 5)
;; you can do various functional programming tricks with matrices too
(emap inc [[1 2] [3 4]])
core.matrix has been approved by Rich Hickey as an official Clojure contrib library, and it is likely that Incanter will switch over to using core.matrix in the future.
SQL table support isn't directly included in core.matrix, but it would only be a one-liner to convert a resultset from clojure.java.jdbc into a core.matrix array. Something like the following should do the trick:
(coerce [] (map vals resultset))
Then you can transform and process it with core.matrix however you like.

How can you validate function arguments in an efficient and DRY manner?

Let’s say I have three functions that operate on matrices:
(defn flip [matrix] (...))
(defn rotate [matrix] (...))
(defn inc-all [matrix] (...))
Imagine each function requires a vector of vectors of ints (where each inner vector is the same length) in order to function correctly.
I could provide an assert-matrix function that validates that the matrix data is in the correct format:
(defn assert-matrix [matrix] (...) )
However, the flip function (for example) has no way of knowing whether the data passed to it has been validated (it is entirely up to the user whether they bother to validate it first). Therefore, to guarantee correctness, flip would need to be defined as:
(defn flip [matrix]
(assert-matrix matrix)
(...))
There are two main problems here:
It’s inefficient to have to keep calling assert-matrix every time a matrix function is called.
Whenever I create a matrix function I have to remember to call assert-matrix. Chances are I will forget as it is tedious repeating this.
In an Object Oriented language, I’d create a Class named Matrix with a constructor that checks the validity of the constructor args when the instance is created. There’s no need for methods to re-check the validity as they can be confident the data was validated when the class was initialised.
How would this be achieved in Clojure?
There are several ways to validate a data structure only once. You could, for instance, write a with-matrix macro along the lines of the following:
(defmacro -m> [matrix & forms]
  `(do
     (assert-matrix ~matrix)
     (-> ~matrix
         ~@forms)))
which would allow you to do:
(-m> matrix flip rotate)
The above extends the threading macro to better cope with your use case.
There can be infinite variations of the same approach, but the idea should still be the same: the macro will make sure that a piece of code is executed only if the validation succeeds, with functions operating on matrices without any embedded validation code. Instead of once per method execution, the validation will be executed once per code block.
Another way could be to make sure all the code paths to matrix functions have a validation boundary somewhere.
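One possible shape for such a validation boundary, as a rough sketch: a constructor function checks the data once and marks it with metadata, and the matrix functions only check for the mark. The names valid-matrix?, make-matrix and flip-rows, the ::validated key, and the body of flip-rows are all assumptions made up for this example:

```clojure
;; Hypothetical check: a vector of equal-length vectors of integers.
(defn valid-matrix? [m]
  (and (vector? m)
       (every? vector? m)
       (every? #(every? integer? %) m)
       (or (empty? m) (apply = (map count m)))))

;; The boundary: validate once, then tag the value via metadata.
(defn make-matrix [rows]
  (assert (valid-matrix? rows) "expected a rectangular vector of int vectors")
  (with-meta rows {::validated true}))

;; Matrix functions only look at the cheap tag, not the full data.
;; Illustrative body: reverses the row order and re-tags the result.
(defn flip-rows [m]
  {:pre [(::validated (meta m))]}
  (with-meta (vec (reverse m)) (meta m)))

(flip-rows (make-matrix [[1 2] [3 4]])) ;; => [[3 4] [1 2]]
```

Metadata is easy to lose when a new collection is built, which is why flip-rows re-attaches it explicitly. A dedicated type, as in the protocol approach, avoids that bookkeeping.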
You may also want to check out trammel.
You could use a protocol to represent all the operations on matrix and then create a function that acts like the "constructor" for matrix:
(defprotocol IMatrix
  (m-flip [_])
  (m-rotate [_])
  (m-vals [_]))
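(defn create-matrix [& rows]
  ;; reject ragged input: more than one distinct row length
  (if (< 1 (count (distinct (map count rows))))
    (throw (Exception. "Voila, what are you doing man"))
    (reify
      IMatrix
      ;; placeholder bodies: a real flip/rotate would transform rows first
      (m-flip [_] (apply create-matrix rows))
      (m-rotate [_] (apply create-matrix rows))
      (m-vals [_] (vec rows)))))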
(def m (create-matrix [1 2 3] [4 5 6]))
(m-flip m)

Splice in Clojure

Is there a single function for getting "from x to y" items in a sequence?
For example, given (range 10) I want [5 6 7 8] (take from the 6th to the ninth, or take 4 from the 6th). Of course I can get this with a combination of a couple of functions (e.g. (take 4 (drop 5 (range 10)))), but it seems strange that there's no built-in like Python's mylist[5:9]. Thanks
subvec for vectors, primarily since it is O(1). For seqs you will need to use the O(n) of take/drop.
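Applied to the example from the question (the var names v, from-vector and from-seq are only for illustration):

```clojure
(def v (vec (range 10)))

;; Vectors: subvec takes a start (inclusive) and end (exclusive) index
;; and shares structure with the original vector.
(def from-vector (subvec v 5 9))              ;; => [5 6 7 8]

;; Seqs: drop walks past the first 5 items, take keeps the next 4.
(def from-seq (take 4 (drop 5 (range 10))))   ;; => (5 6 7 8)
```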
From a philosophical point of view, the reason there's no built-in operator is that you don't need a built-in operator to make it feel "natural" like you do in Python.
(defn splice [coll start stop]
  (take (- stop start) (drop start coll)))
(splice coll 6 10)
Feels just like a language built-in, with exactly as much "new syntax" as any feature. In Python, the special [x:y] operator needs language-level support to make it feel as natural as the single-element accessor.
So rather than cluttering up the (already crowded) language core, Clojure simply leaves room for a user or library to implement this if you want it.
(range 5 9), or (vec (range 5 9)).
(Perhaps this syntax for range wasn't available mid-2012.)