Lowest common ancestor in parent/child isa? hierarchy in Clojure - clojure

Let's say we have this parent/child hierarchy:
(derive ::integer ::decimal)
(derive ::positive-integer ::integer)
(derive ::long ::integer)
What is an idiomatic Clojure way to find the lowest common ancestor in such a hierarchy? I.e.:
(lca ::positive-integer ::long) ; => ::integer
My initial thoughts include using a recursive function traversing combinations of parents of each argument, but I suspect there probably is a better approach.
My motivation is to use this as a dispatch function for a multimethod that takes 2 arguments and dispatches to the best suited implementation based on the types of the arguments.

The function ancestors returns a set, so you'll want to (require '[clojure.set :as s]).
Now write:
(defn lca [h1 h2]
  (let [a1 (into #{} (conj (ancestors h1) h1))
        a2 (into #{} (conj (ancestors h2) h2))
        ac (s/intersection a1 a2)]
    (apply (partial max-key (comp count ancestors)) ac)))
Let's try it out!
stack-prj.hierarchy> (derive ::integer ::decimal)
nil
stack-prj.hierarchy> (derive ::positive-integer ::integer)
nil
stack-prj.hierarchy> (derive ::long ::integer)
nil
stack-prj.hierarchy> (lca ::positive-integer ::long)
:stack-prj.hierarchy/integer
Here's how it works: I take the set of ancestors of each type using ancestors. I add the type itself to the resulting set (since I think (lca ::integer ::long) should return integer instead of decimal) using conj, for both types. Using set intersection, I store all common ancestors into the variable ac.
Of the common ancestors, I want the one that itself has the most ancestors, since that is the one deepest in the hierarchy. (comp count ancestors) is a function that takes one type and returns the number of ancestors it has. I partially apply max-key to this function, and then I apply (using apply) the resulting function to the collection ac. The result is the common ancestor with the greatest number of ancestors, that is, the lowest common ancestor.
(Note that lca will give an error if you pass it two types without any common ancestors! You should decide how to handle this case yourself.)
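One possible way to handle the no-common-ancestor case is to return nil instead of throwing. The following is only a sketch under that assumption; the name lca-or-nil and the :demo/ keywords are made up for illustration and are not part of the answer above.

```clojure
(require '[clojure.set :as set])

;; Hypothetical hierarchy mirroring the question, with explicit :demo/ namespaces.
(derive :demo/integer :demo/decimal)
(derive :demo/positive-integer :demo/integer)
(derive :demo/long :demo/integer)

(defn lca-or-nil
  "Like lca above, but returns nil when the two tags share no ancestor."
  [t1 t2]
  (let [self+ancestors (fn [t] (conj (set (ancestors t)) t))
        common (set/intersection (self+ancestors t1) (self+ancestors t2))]
    ;; max-key needs at least one candidate, so guard against the empty set.
    (when (seq common)
      (apply max-key (comp count ancestors) common))))

(lca-or-nil :demo/positive-integer :demo/long) ; => :demo/integer
(lca-or-nil :demo/long :demo/unrelated)        ; => nil
```

With a nil result, a multimethod that dispatches on this function could fall through to its :default method, which may or may not be what you want.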

Related

Mapping two string lists (in a short way) in Lisp?

Lisp beginner here.
I have two string lists in this form with same length:
keys = ("abc" "def" "gh" ...)
values = ("qwe" "opr" "kmn" ...)
I need to construct a hash table or an association list (whichever is easier to construct and faster to get values from) from those lists. The lists are aligned by index, so the i-th key pairs with the i-th value.
I know I can map them by iterating, but I would like a more declarative approach and am looking for a clean way to do this, if there is one.
There is a dedicated function named PAIRLIS that does exactly what you want for building association lists:
USER> (pairlis '("abc" "def" "gh")
'("qwe" "opr" "kmn"))
(("gh" . "kmn") ("def" . "opr") ("abc" . "qwe"))
Note that the order is reversed here, but this depends on the implementation. The order does not matter in this case since your keys are unique.
Then, you can use the popular alexandria library to build a hash-table from that:
USER> (alexandria:alist-hash-table * :test #'equalp)
#<HASH-TABLE :TEST EQUALP :COUNT 3 {101C66ECA3}>
Here I am using a hash-table with test equalp because your keys are strings.
NB. The * symbol refers to the last primary value in a REPL
You could use something such as mapcar, which handles the iteration for you, instead of manually writing some sort of loop. For example:
(defvar *first-names* '("tom" "aaron" "drew"))
(defvar *last-names* '("brady" "rogers" "brees"))
(defvar *names-table* (make-hash-table :test #'equal)) ; string keys need EQUAL; the default EQL test would not find them
We now have the two lists of names and a hash table (or an alist if you prefer). Then we can simply use mapcar to walk through the lists for us instead of manually writing a loop such as do, dolist, dotimes, loop, etc.
(mapcar #'(lambda (first last)
            (setf (gethash first *names-table*) last))
        *first-names*
        *last-names*)
Mapping is particularly useful for lists in Common Lisp.
Note that as well as pairlis &c the normal mapping functions such as mapcar in fact take multiple list arguments and call the function being mapped on each of them. So a simple-minded version of (part of) what pairlis does might be:
(defun kv->alist (keys values)
  (mapcar #'cons keys values))
(In fact this has an advantage over pairlis in some cases: the order of the result is determinate.)
And if you want to make a hashtable:
(defun kv->ht (keys values &key (test #'eql))
  (let ((ht (make-hash-table :test test)))
    (mapc (lambda (k v)
            (setf (gethash k ht) v))
          keys values)
    ht))

Is manipulating a vector of nested maps possible using zippers?

I need to turn the following input into output by applying the following two rules:
remove all vectors that have "nope" as last item
remove each map that does not have at least one vector with "ds1" as last item
(def input
[{:simple1 [:from [:simple1 'ds1]]}
{:simple2 [:from-any [[:simple2 'nope] [:simple2 'ds1]]]}
{:walk1 [:from [:sub1 :sub2 'ds1]]}
{:unaffected [:from [:unaffected 'nope]]}
{:replaced-with-nil [:from [:the-original 'ds1]]}
{:concat1 [:concat [[:simple1 'ds1] [:simple2 'ds1]]]}
{:lookup-word [:lookup [:word 'word :word 'ds1]]}])
(def output
[{:simple1 [:from [:simple1 'ds1]]}
{:simple2 [:from-any [[:simple2 'ds1]]]}
{:walk1 [:from [:sub1 :sub2 'ds1]]}
{:replaced-with-nil [:from [:the-original 'ds1]]}
{:concat1 [:concat [[:simple1 'ds1] [:simple2 'ds1]]]}
{:lookup-word [:lookup [:word 'word :word 'ds1]]}])
I was wondering if performing this transformation is possible with zippers?
I'd recommend clojure.walk instead for this kind of general tree transformation. It can take a bit of fiddling to get the replacement functions right but it works nicely with any nesting of Clojure data structures, which AFAIK can be a bit more challenging in a zipper based approach.
We're looking to shrink our tree, so postwalk is my go-to here. It takes a function f and a tree root and goes through the tree, replacing each leaf value with (f leaf), then their parents and their parents etc. until finally replacing the root. (prewalk is similar but proceeds from root and down to leaves, so it's usually more natural when you're growing the tree by splitting branches.)
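To make the bottom-up order concrete, here is a small sketch that is separate from the solution to the question. The helper name drop-nils and the sample data are invented for illustration.

```clojure
(require '[clojure.walk :refer [postwalk]])

(defn drop-nils
  "Removes nil members from every vector in the tree, innermost vectors first."
  [tree]
  (postwalk (fn [node]
              (if (vector? node)
                (vec (remove nil? node)) ; shrink this vector
                node))                   ; leave every other value unchanged
            tree))

(drop-nils [1 nil [nil 2]]) ; => [1 [2]]
```

By the time the outer vector is handed to the function, its inner vector has already been replaced by the shrunken version.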
The strategy here is to somehow construct a function that prunes any branch which meets our removal criteria, but returns any other value unchanged.
(ns shrink-tree
(:require [clojure.walk :refer [postwalk]]))
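(letfn [(rule-1 [node]
          (and (vector? node)
               (= 'nope (last node))))
        (rule-2 [node]
          (and (map? node)
               (not-any? #(and (vector? %) (= 'ds1 (last %)))
                         (tree-seq vector? seq (-> node vals first)))))
        (remove-marked [node]
          ;; Leave map entries as they are: (empty entry) is nil, so rebuilding
          ;; one with into would yield a list the enclosing map cannot accept.
          (if (and (coll? node) (not (map-entry? node)))
            (into (empty node) (remove (some-fn rule-1 rule-2) node))
            node))]
  (= output (postwalk remove-marked input)))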
;; => true
Here rule-1 and rule-2 turn your two rules into predicates, and remove-marked does one of two things:
If a node is a collection, it returns the same collection minus any members for which rule-1 or rule-2 returns truthy. To check both predicates at once we combine them with some-fn.
Otherwise it returns the node unchanged. This is how we keep values like 'ds1 or :from-any around.
You might also want to consider looking at specter. It supports these sorts of transformations by allowing you to select and transform arbitrarily complex structures.

Is there a complete list of lazy functions of Clojure's core module?

After a while of working with Clojure, I have accumulated some knowledge on its laziness. I know whether a frequently-used API such as map is lazy. However, I still feel dubious when I start using an unfamiliar API such as with-open.
Is there any document that shows a complete list of lazy APIs of Clojure's core module?
You can find functions that return lazy sequences by opening up the Clojure source at https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj and searching for "Returns a lazy".
I am not aware of any curated lists of them.
The rule of thumb is: if a function returns a sequence, it is probably a lazy sequence; if it returns a single value, it has to force evaluation.
When using a new function, macro or special form, read the docstring. Most development environments have a key to show the docstring, or at least navigate to the source (where you can see the docstring), and there is always http://clojure.org/api/api.
In the case of with-open:
with-open
macro
Usage: (with-open bindings & body)
bindings => [name init ...]
Evaluates body in a try expression with names bound to the values
of the inits, and a finally clause that calls (.close name) on each
name in reverse order.
We can see that the result of calling with-open is evaluation of the expression with a final close. So we know that there is nothing lazy about it. However that doesn't mean you don't need to think about laziness inside with-open, quite the opposite!
(with-open [r (io/reader "myfile")]
(line-seq r))
This is a common trap. line-seq returns a lazy sequence! The problem here is that the lazy sequence will be realized after the file is closed, because the file is closed when exiting the scope of with-open. So you need to fully process the lazy sequence before exiting the with-open scope.
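One common way around the trap is to realize the sequence with doall before the reader is closed. The sketch below assumes the whole input fits in memory; the function name read-lines is invented, and a StringReader stands in for a real file.

```clojure
(require '[clojure.java.io :as io])

(defn read-lines
  "Reads all lines eagerly so nothing lazy escapes the with-open scope."
  [source]
  (with-open [r (io/reader source)]
    ;; doall forces the whole lazy seq while the reader is still open.
    (doall (line-seq r))))

;; A StringReader stands in for a real file here.
(read-lines (java.io.StringReader. "first\nsecond")) ; => ("first" "second")
```

For large files it is usually better to do the processing (for example a reduce) inside the with-open body rather than returning every line.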
My advice is to avoid trying to think about your program as having 'lazy bits' and 'immediate bits', but instead just be mindful that when io or side-effects are involved you need to take care of when things happen as well as what should happen.
Building on Timothy Pratley's suggestion to search the docs:
Let's make it fun!
Your REPL has everything you need to build a list of lazy functions.
First of all, there is the clojure.repl/doc macro, which prints documentation to *out* in the REPL:
user> (doc +)
-------------------------
clojure.core/+
([] [x] [x y] [x y & more])
Returns the sum of nums. (+) returns 0. Does not auto-promote
longs, will throw on overflow. See also: +'
nil
Unfortunately doc prints its output instead of returning a string, but we can always rebind *out* to a StringWriter and then read its string value.
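The rebinding of *out* described above is what clojure.core/with-out-str does for a single expression. A minimal sketch, where the var name plus-doc is only for illustration:

```clojure
(require '[clojure.repl :refer [doc]])

;; with-out-str binds *out* to a fresh StringWriter and returns what was printed.
(def plus-doc (with-out-str (doc +)))

(re-find #"Returns the sum of nums" plus-doc) ; => "Returns the sum of nums"
```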
So what we want is to take all the symbols from the clojure.core namespace, get their docs, write them all to a string, and find every one that contains "returns a lazy". This is where clojure.core/ns-publics helps; it returns a map of public names to their vars:
user> (take 10 (ns-publics 'clojure.core))
([primitives-classnames #'clojure.core/primitives-classnames]
[+' #'clojure.core/+']
[decimal? #'clojure.core/decimal?]
[restart-agent #'clojure.core/restart-agent]
[sort-by #'clojure.core/sort-by]
[macroexpand #'clojure.core/macroexpand]
[ensure #'clojure.core/ensure]
[chunk-first #'clojure.core/chunk-first]
[eduction #'clojure.core/eduction]
[tree-seq #'clojure.core/tree-seq])
So we just need to take all the keys from there and look up their docs.
Let's make a macro for that:
user> (defmacro all-docs []
        (let [names (keys (ns-publics 'clojure.core))]
          `(binding [*out* (java.io.StringWriter.)]
             (do ~@(map #(list `doc %) names)) ; splice in one (doc name) call per public name
             (str *out*))))
#'user/all-docs
It does just what I described: it collects the docs of all public vars into one string.
Now we simply process it:
user> (def all-doc-items (clojure.string/split
(all-docs)
#"-------------------------"))
#'user/all-doc-items
user> (nth all-doc-items 10)
"\nclojure.core/tree-seq\n([branch? children root])\n Returns a lazy sequence of the nodes in a tree, via a depth-first walk.\n branch? must be a fn of one arg that returns true if passed a node\n that can have children (but may not). children must be a fn of one\n arg that returns a sequence of the children. Will only be called on\n nodes for which branch? returns true. Root is the root node of the\n tree.\n"
and now just filter them:
user> (def all-lazy-fns (filter #(re-find #"(?i)returns a lazy" %) all-doc-items))
#'user/all-lazy-fns
user> (count all-lazy-fns)
30
user> (println (take 3 all-lazy-fns))
(
clojure.core/tree-seq
([branch? children root])
Returns a lazy sequence of the nodes in a tree, via a depth-first walk.
branch? must be a fn of one arg that returns true if passed a node
that can have children (but may not). children must be a fn of one
arg that returns a sequence of the children. Will only be called on
nodes for which branch? returns true. Root is the root node of the tree.
clojure.core/keep-indexed
([f] [f coll])
Returns a lazy sequence of the non-nil results of (f index item). Note,
this means false return values will be included. f must be free of
side-effects. Returns a stateful transducer when no collection is
provided.
clojure.core/take-nth
([n] [n coll])
Returns a lazy seq of every nth item in coll. Returns a stateful
transducer when no collection is provided.
)
nil
And now use these all-lazy-fns however you want.
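As a possible alternative to parsing printed output, the docstring of each var is also available as :doc metadata. The following sketch (the function name lazy-returning-vars is invented) filters on that directly. It carries the same caveat as the text search: it only finds docstrings that literally contain the phrase, so the count may differ from the one above.

```clojure
(defn lazy-returning-vars
  "Names of public vars in ns-sym whose docstring mentions returning a lazy seq."
  [ns-sym]
  (->> (ns-publics ns-sym)
       (filter (fn [[_ v]]
                 ;; Some vars have no docstring, hence the when-let.
                 (when-let [d (:doc (meta v))]
                   (re-find #"(?i)returns a lazy" d))))
       (map key)
       sort))

(take 5 (lazy-returning-vars 'clojure.core))
```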

Clojure (or any functional language): is there a functional way of building flat lists by a recursive function?

I've got a recursive function building a list:
(defn- traverse-dir
"Traverses the (source) directory, preorder"
[src-dir dst-root dst-step ffc!]
(let [{:keys [options]} *parsed-args*
uname (:unified-name options)
[dirs files] (list-dir-groomed (fs/list-dir src-dir))
... recursive call of traverse-dir is the last expression of dir-handler
(doall (concat (map-indexed (dir-handler) dirs) (map-indexed (file-handler) files))))) ;; traverse-dir
The list built by traverse-dir is nested, while I want a flat one:
flat-list (->> (flatten recursive-list) (partition 2) (map vec))
Is there a way of building the flat list in the first place? Short of using mutable lists, that is.
I don't quite understand your context with a dir-handler that is called with nothing and returns a function which expects indices and directories, list-dir-groomed and all of that, but I'd recommend a look at tree-seq:
(defn tree-seq
"Returns a lazy sequence of the nodes in a tree, via a depth-first walk.
branch? must be a fn of one arg that returns true if passed a node
that can have children (but may not). children must be a fn of one
arg that returns a sequence of the children. Will only be called on
nodes for which branch? returns true. Root is the root node of the
tree."
{:added "1.0"
:static true}
[branch? children root]
(let [walk (fn walk [node]
(lazy-seq
(cons node
(when (branch? node)
(mapcat walk (children node))))))]
(walk root)))
My go-to use here is
(tree-seq #(.isDirectory %) #(.listFiles %) (clojure.java.io/as-file file-name))
but your context might mean that doesn't work. You can change to different functions for getting child files if you need to sanitize those, or you can just use filter on the output. If that's no good, the same pattern of a local fn from nodes into pre-walks that handles children by recursively mapcatting itself over them seems pretty applicable.
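To illustrate the pattern without touching the file system, here is a sketch on plain nested vectors. The data and the names flat-leaves and flat-walk are hypothetical; vectors play the role of directories and strings the role of files.

```clojure
;; Hypothetical nested result: directories are vectors, files are strings.
(def nested ["a.txt" ["b.txt" ["c.txt"]] "d.txt"])

(defn flat-leaves
  "Walks the nested structure depth-first with tree-seq and keeps only the leaves."
  [tree]
  (->> (tree-seq sequential? seq tree)
       (remove sequential?)))

(defn flat-walk
  "Same result, written as a function that recursively mapcats over children."
  [node]
  (if (sequential? node)
    (mapcat flat-walk node)
    [node]))

(flat-leaves nested) ; => ("a.txt" "b.txt" "c.txt" "d.txt")
(flat-walk nested)   ; => ("a.txt" "b.txt" "c.txt" "d.txt")
```

Either way the list comes out flat as it is built, so no flatten pass is needed afterwards.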

How can I improve this Clojure function?

I just wrote my first Clojure function based on my very limited knowledge of the language. I would love some feedback in regards to performance and use of types. For example, I'm not sure if I should be using lists or vectors.
(defn actor-ids-for-subject-id [subject-id]
(sql/with-connection (System/getenv "DATABASE_URL")
(sql/with-query-results results
["SELECT actor_id FROM entries WHERE subject_id = ?" subject-id]
(let [res (into [] results)]
(map (fn [row] (get row :actor_id)) res)))))
It passes the following test (given proper seed data):
(deftest test-actor-ids-for-subject-id
(is (= ["123" "321"] (actor-ids-for-subject-id "123"))))
If it makes a difference (and I imagine it does) my usage characteristics of the returned data will almost exclusively involve generating the union and intersection of another set returned by the same function.
It's slightly more concise to use vec instead of into when the initial vector is empty. It may also express the intent more clearly, though that's more a matter of preference.
(vec (map :actor_id results))
results is a lazy sequence (a clojure.lang.Cons) returned by clojure.java.jdbc/resultset-seq, and each record is a map, so the keyword can be mapped over it directly:
(defn actor-ids-for-subject-id [subject-id]
(sql/with-connection (System/getenv "DATABASE_URL")
(sql/with-query-results results
["SELECT actor_id FROM entries WHERE subject_id = ?" subject-id]
(into [] (map :actor_id results)))))
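Since the question mentions that the results will mostly be combined with union and intersection, collecting into a set instead of a vector may fit better. The sketch below uses made-up rows in place of a real query result, and actor-id-set is a hypothetical helper name.

```clojure
(require '[clojure.set :as set])

;; Hypothetical rows, shaped like the maps a result set yields.
(def rows-a [{:actor_id "123"} {:actor_id "321"}])
(def rows-b [{:actor_id "321"} {:actor_id "555"}])

(defn actor-id-set
  "Collects the :actor_id of every row into a set (eagerly)."
  [rows]
  (into #{} (map :actor_id) rows))

(set/intersection (actor-id-set rows-a) (actor-id-set rows-b)) ; => #{"321"}
(set/union (actor-id-set rows-a) (actor-id-set rows-b))        ; => #{"123" "321" "555"}
```

Note that the test in the question compares against a vector, so it would need to compare against a set if the function returned one.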