Idiomatic expression simplification in Clojure - clojure

Inspired by this excellent post I wanted to implement a simple expression simplifier in Clojure using the algorithm used in the post. The post gives example implementations in F#, Scala, Haskell, C++, and Julia which all appear fairly elegant.
I have come up with two different implementations (see below) but I have a nagging feeling that they are both less than idiomatic.
My question is: What would an idiomatic Clojure implementation look like?
First implementation, based primarily on protocols:
(defprotocol Expr
(simplify1 [e])
(simplify [e]))
(defrecord Const [n]
Expr
(simplify1 [this] this)
(simplify [this] this))
(defrecord Variable [name]
Expr
(simplify1 [this] this)
(simplify [this] this))
(defrecord Add [l r]
Expr
(simplify1 [{:keys [l r] :as expr}]
(let [lclass (class l)
rclass (class r)]
(cond
(= lclass rclass Const)
(Const. (+ (:n l) (:n r)))
(and (= lclass Const) (= (:n l) 0))
r
(and (= rclass Const) (= (:n r) 0))
l
:else expr)))
(simplify [{:keys [l r]}]
(simplify1 (Add. (simplify l) (simplify r)))))
(defrecord Mult [l r]
Expr
(simplify1 [{:keys [l r] :as expr}]
(let [lclass (class l)
rclass (class r)]
(cond
(= lclass rclass Const)
(Const. (* (:n l) (:n r)))
(and (= lclass Const) (= (:n l) 0))
(Const. 0)
(and (= rclass Const) (= (:n r) 0))
(Const. 0)
(and (= lclass Const) (= (:n l) 1))
r
(and (= rclass Const) (= (:n r) 1))
l
:else expr)))
(simplify [{:keys [l r]}]
(simplify1 (Mult. (simplify l) (simplify r)))))
(defmulti print-expr class)
(defmethod print-expr Const [e]
(print-str (.value e)))
(defmethod print-expr ::expr [e]
(print-str "The expression cannot be simplified to a constant"))
(let [e (Add. (Mult. (Add. (Const. 1) (Mult. (Const. 0) (Variable. "X"))) (Const. 3)) (Const. 12))]
(-> e
simplify
print-expr))
Second implementation, primarily based on multimethods and more verbose than the first:
(defrecord Const [value])
(defrecord Variable [name])
(defrecord Add [l r])
(defrecord Mult [l r])
(derive Const ::expr)
(derive Variable ::expr)
(derive Add ::expr)
(derive Mult ::expr)
(defn sim-1-disp [{:keys [l r] :as e}]
(if (some #{(class e)} [Add Mult])
[(class e) (class l) (class r)]
(class e)))
(defmulti simplify class)
(defmulti simplify1 sim-1-disp)
(defmulti print-expr class)
(defmethod simplify Add [{:keys [l r]}]
(simplify1 (Add. (simplify l) (simplify r))))
(defmethod simplify Mult [{:keys [l r]}]
(simplify1 (Mult. (simplify l) (simplify r))))
(defmethod simplify ::expr [e]
e)
(defmethod simplify1 [Add Const Const] [{:keys [l r]}]
(Const. (+ (:value l) (:value r))))
(defmethod simplify1 [Add Const ::expr] [{:keys [l r] :as e}]
(if (= (:value l) 0)
r
e))
(defmethod simplify1 [Add ::expr Const] [{:keys [l r] :as e}]
(if (= (:value r) 0)
l
e))
(defmethod simplify1 [Mult Const Const] [{:keys [l r]}]
(Const. (* (.value l) (.value r))))
(defmethod simplify1 [Mult Const ::expr] [{:keys [l r] :as e}]
(cond (= (:value l) 0)
(Const. 0)
(= (:value l) 1)
r
:else e))
(defmethod simplify1 [Mult ::expr Const] [{:keys [l r] :as e}]
(cond (= (:value r) 0)
(Const. 0)
(= (:value r) 1)
l
:else e))
(defmethod simplify1 ::expr [e]
e)
(defmethod print-expr Const [e]
(print-str (.value e)))
(defmethod print-expr ::expr [e]
(print-str "The expression cannot be simplified to a constant"))
(let [e (Add. (Mult. (Add. (Const. 1) (Mult. (Const. 0) (Variable. "X"))) (Const. 3)) (Const. 12))]
(-> e
simplify
print-expr))

Not sure about being the idiomatic implementation, but I think as Guillermo Winkler mentioned core.match is a pretty natural alternative approach, especially with variants. As your linked article says, sum types are pretty neat.
(ns simplify
(:require [clojure.core.match :refer [match]]))
(defn- simplify-1 [expr]
(match expr
[::add [::const 0] a] a
[::add a [::const 0]] a
[::add [::const a] [::const b]] [::const (+ a b)]
[::mult [::const 0] _] [::const 0]
[::mult _ [::const 0]] [::const 0]
[::mult a [::const 1]] a
[::mult [::const 1] a] a
[::mult [::const a] [::const b]] [::const (* a b)]
_ expr))
(defn simplify [expr]
(match expr
[::add a b ] (simplify-1 [::add (simplify a) (simplify b)])
[::mult a b ] (simplify-1 [::mult (simplify a) (simplify b)])
_ (simplify-1 expr)))
Example:
(simplify [::add
[::mult
[::add [::const 1] [::mult [::const 0] [::var 'x]]]
[::const 3]]
[::const 12]])
;=> [:simplify/const 15]
This lets you leverage pattern matching for terseness and have a similar elegance as some of your linked examples. There is a cost compared to your protocol/multimethod approaches though - those are sum types open to extension, including by other people's code without touching your source code. How useful that is depends on your application.
A few asides:
You can also define simplify in terms of clojure.walk/postwalk with simplify-1 as the function argument. This is maybe a tad easier to extend since simplify no longer needs to know which expr variants are operations and can be simplified beyond calling simplify-1 on them.
I tried to define a core.typed type for this, but my environment seems to have some issues loading that today so I can't check it.
Think this should more or less fit:
(defalias Expr
"A variant type for algebraic expressions."
(Rec [e]
(U [(Value ::const) Number]
[(Value ::add) e e]
[(Value ::mult) e e]
[(Value ::var) Symbol])))

Related

fn and let inside clojure macro

I'm running into some limitations of Clojure macros. I wonder how to optimize the following code?
(defmacro ssplit-7-inefficient [x]
(let [t 7]
;; Duplicated computation here!
`(do [(first (split-with #(not (= '~t %)) '~x))
(drop 1 (second (split-with #(not (= '~t %)) '~x)))])))
(ssplit-7-inefficient (foo 7 bar baz))
;; Returns: [(foo) (bar baz)]
Here are some approaches that don't work:
(defmacro ssplit-7-fails [x]
(let [t 7]
`(do ((fn [[a b]] [a (drop 1 b)]) (split-with #(not (= '~t %)) '~x)))))
(ssplit-7-fails (foo 7 bar baz))
;; Error: Call to clojure.core/fn did not conform to spec.
(defmacro ssplit-7-fails-again [x]
(let [t 7]
`(do
(let [data (split-with #(not (= '~t %)) '~x)]
((fn [[a b]] [a (drop 1 b)]) data)))))
(ssplit-7-fails-again (foo 7 bar baz))
;; Error: Call to clojure.core/let did not conform to spec.
Note that split-with splits only once. You can use some destructuring to get what you want:
(defmacro split-by-7 [arg]
`((fn [[x# [_# & z#]]] [x# z#]) (split-with (complement #{7}) '~arg)))
(split-by-7 (foo 7 bar baz))
=> [(foo) (bar baz)]
In other use cases, partition-by can be also useful:
(defmacro split-by-7 [arg]
`(->> (partition-by #{7} '~arg)
(remove #{[7]})))
(split-by-7 (foo 7 bar baz))
=> ((foo) (bar baz))
It is not so easy to reason about macros in Clojure - (in my view macroexpand-1 alienates the code a lot - in contrast to Common Lisp's macroexpand-1 ...).
My way was first to build a helper function.
(defn %split-7 [x]
(let [y 7]
(let [[a b] (split-with #(not= y %) x)]
[a (drop 1 b)])))
This function uses destructuring so that the split-with is "efficient".
It does nearly exactly what the macro should do. Just that one has to quote
the argument - so that it works.
(%split-7 '(a 7 b c))
;;=> [(a) (b c)]
From this step to the macro is not difficult.
The macro should just automatically quote the argument when inserting into the helper function's call.
(defmacro split-7 [x]
`(%split-7 '~x))
So that we can call:
(split-7 (a 7 b c))
;; => [(a) (b c)]
Using this trick, even generalize the function to:
(defn %split-by [x y]able like this
(let [[a b] (split-with #(not= y %) x)]
[a (drop 1 b)]))
(defmacro split-by [x y]
`(%split-by '~x ~y))
(split-by (a 7 b c) 7)
;; => [(a) (b c)]
(split-by (a 7 b c 9 d e) 9)
;; => [(a 7 b c) (d e)]
The use of (helper) functions in the macro body - and even other macros - or recursive functions or recursive macros - macros which call other macros - shows how powerful lisp macros are. Because it shows that you can use the entirety of lisp when formulating/defining macros. Something what most language's macros usually aren't able to do.

Run length encoding of sequences

So I am trying to solve this problem, and this is the code I have come up with:
First I have a pack function, receives a list and groups same elements into a vector.
(defn pack [lst]
(def a [])
(def vect [])
(cond
(empty? lst)
lst
:else
(loop [i 0]
(def r (get lst i))
(def t (get lst (+ i 1)))
(if (= r t)
(def vect (conj vect r))
)
(if (not= r t)
(and (def vect (conj vect r)) (and (def a (conj a vect)) (def vect [])))
)
(if (= i (- (count lst) 1))
a
(recur (inc i))
)
))
)
for example if I have this vector:
(def tes '[a a a a b c c a a d e e e e])
pack function will return this:
[[a a a a] [b] [c c] [a a] [d] [e e e e]]
Then I tried doing the "encode" part of the problem with this code:
(def v1 [])
(def v2 [])
(conj v2 (conj v1 (count (get (pack tes) 0)) (get (get (pack tes) 0) 0)))
And it returned what I wanted, a vector "v2" with a vector "v1" that has the "encoded" item.
[[4 a]]
So now I try to make the function:
(defn encode [lst]
(loop [index 0 limit (count (pack lst)) v1 [] v2[]]
(if (= index limit)
lst
(conj v2 (conj v1 (count (get (pack tes) index)) (get (get (pack tes) index) index)))
)
(recur (inc index) limit v1 v2)
)
)
(encode tes)
but I get this error:
2021/03/07 00:00:21 got exception from server /usr/local/bin/lein: line 152:
28 Killed "$LEIN_JAVA_CMD" "${BOOTCLASSPATH[#]}" -Dfile.encoding=UTF-8 -Dmaven.wagon.http.ssl.easy=false -Dmaven.wagon.rto=10000 $LEIN_JVM_OPTS
-Dleiningen.original.pwd="$ORIGINAL_PWD" -Dleiningen.script="$0" -classpath "$CLASSPATH" clojure.main -m leiningen.core.main "$#"
2021/03/07 01:42:20 error reading from server EOF
Any way to fix my code or to solve the problem more efficiently but still return a vector?
juxt can be used in the pack function:
(defn pack [xs]
(map (juxt count first) (partition-by identity xs)))
(defn unpack [xs]
(mapcat #(apply repeat %) xs))
Don't use def inside function, because it creates global
variable. Use let instead.
Don't use multiple if in row, there is cond.
Format your code better- for example, put all parentheses on the end together on one line.
Here is more efficient solution:
(defn pack [lst]
(letfn [(pack-help [lst]
(if (empty? lst) '()
(let [elem (first lst)]
(cons (vec (take-while #(= % elem) lst))
(pack-help (drop-while #(= % elem) lst))))))]
(vec (pack-help lst))))
(defn pack-with-count [lst]
(mapv #(vector (count %) (first %))
(pack lst)))
(defn unpack [packed-lst]
(into [] (apply concat packed-lst)))
(pack '[a a a a b c c a a d e e e e])
(pack-with-count '[a a a a b c c a a d e e e e])
(unpack '[[a a a a] [b] [c c] [a a] [d] [e e e e]])
As a rule, whenever you reach for loop/recur, there are some pieces of the standard library which will allow you to get the desired effect using higher-order functions. You avoid needing to implement the wiring and can just concentrate on your intent.
(def tes '[a a a a b c c a a d e e e e])
(partition-by identity tes)
; => ((a a a a) (b) (c c) (a a) (d) (e e e e))
(map (juxt count first) *1)
; => ([4 a] [1 b] [2 c] [2 a] [1 d] [4 e])
(mapcat #(apply repeat %) *1)
; => (a a a a b c c a a d e e e e)
Here *1 is just the REPL shorthand for "previous result" - if you need to compose these into functions, this will be replaced with your argument.
If you really need vectors rather than sequences for the outer collection at each stage, you can wrap with vec (to convert the lazy sequence to a vector), or use mapv instead of map.
Finally - the error message you are getting from lein is a syntax error rather than a logic or code problem. Clojure generally flags an unexpected EOF if there aren't enough closing parens.
(println "because we left them open like this -"
Consider working inside a REPL within an IDE, or if that isn't possible then using a text editor that matches parens for you.

Clojure functions which perform symbolic simplification using 'and'

I am new to Clojure and I'm learning how to write a program that can simplify logical expressions (just 'and' for now to figure out how things work first). For example:
(and-simplify '(and true)) => true
(and-simplify '(and x true)) => x
(and-simplify '(and true false x)) => false
(and-simplify '(and x y z true)) => (and x y z)
I already knew how to simplify two arguments, that everything I can do right now is:
(defn and-simplify []
(def x (and true false))
println x)
(and-simplify)
I've read this post and tried to modify my code a little bit but it doesn't seem to get me anywhere:
(defn and-simplify [&expr]
(def (and &expr))
)
What is the correct way that I should have done?
Here's my take on it.
(defn simplify-and
[[op & forms]]
(let [known-falsy? #(or (false? %) (nil? %))
known-truthy? #(and (not (symbol? %))
(not (seq? %))
(not (known-falsy? %)))
falsy-forms (filter known-falsy? forms)
unknown-forms (remove known-truthy? forms)]
(if (seq falsy-forms)
(first falsy-forms)
(case (count unknown-forms)
0 true
1 (first unknown-forms)
(cons op unknown-forms)))))
(comment (simplify-and `(and true 1 2 a)))
However, we can write a more generic simplify that uses multimethods to simplify lists, so that we can add more optimisations without modifying existing code. Here's that, with optimisations for and, or and + from clojure.core. This simplify only optimises lists based on namespace qualified names.
Check out the examples in the comment form. Hope it makes sense.
(defn- known-falsy? [form]
(or (false? form) (nil? form)))
(defn- known-truthy? [form]
(and (not (symbol? form))
(not (seq? form))
(not (known-falsy? form))))
(declare simplify)
(defmulti simplify-list first)
(defmethod simplify-list :default [form] form)
(defmethod simplify-list 'clojure.core/and
[[op & forms]]
(let [forms (mapv simplify forms)
falsy-forms (filter known-falsy? forms)
unknown-forms (remove known-truthy? forms)]
(if (seq falsy-forms)
(first falsy-forms)
(case (count unknown-forms)
0 true
1 (first unknown-forms)
(cons op unknown-forms)))))
(defmethod simplify-list 'clojure.core/or
[[op & forms]]
(let [forms (mapv simplify forms)
truthy-forms (filter known-truthy? forms)
unknown-forms (remove known-falsy? forms)]
(if (seq truthy-forms)
(first truthy-forms)
(case (count unknown-forms)
0 nil
1 (first unknown-forms)
(cons op unknown-forms)))))
(defmethod simplify-list 'clojure.core/+
[[op & forms]]
(let [{nums true non-nums false} (group-by number? (mapv simplify forms))
sum (apply + nums)]
(if (seq non-nums)
(cons op (cons sum non-nums))
sum)))
(defn simplify
"takes a Clojure form with resolved symbols and performs
peephole optimisations on it"
[form]
(cond (set? form) (into #{} (map simplify) form)
(vector? form) (mapv simplify form)
(map? form) (reduce-kv (fn [m k v] (assoc m (simplify k) (simplify v)))
{} form)
(seq? form) (simplify-list form)
:else form))
(comment
(simplify `(+ 1 2))
(simplify `(foo 1 2))
(simplify `(and true (+ 1 2 3 4 5 foo)))
(simplify `(or false x))
(simplify `(or false x nil y))
(simplify `(or false x (and y nil z) (+ 1 2)))
)

postwalk to evaluate arithmetic expression

I am trying to use Instaparse to make a simple arithmetic expression evaluator. The parser seems to work fine but I cannot figure out how to evaluate the returned nested vector. Currently I am using postwalk, like this
(ns test5.core
(:require [instaparse.core :as insta])
(:require [clojure.walk :refer [postwalk]])
(:gen-class))
(def WS
(insta/parser
"WS = #'\\s+'"))
(def transform-options
{:IntLiteral read-string})
(def parser
(insta/parser
"AddExpr = AddExpr '+' MultExpr
| AddExpr '-' MultExpr
| MultExpr
MultExpr = MultExpr '*' IntLiteral
| MultExpr '/' IntLiteral
| IntLiteral
IntLiteral = #'[0-9]+'"
:auto-whitespace WS))
(defn parse[input]
(->> (parser input)
(insta/transform transform-options)))
(defn visit [node]
(println node)
(cond
(number? node) node
(string? node) (resolve (symbol node))
(vector? node)
(cond
(= :MultExpr (first node)) (visit (rest node))
(= :AddExpr (first node)) (visit (rest node))
:else node)
:else node))
(defn evaluate [tree]
(println tree)
(postwalk visit tree))
(defn -main
[& args]
(evaluate (parse "1 * 2 + 3")))
postwalk does traverse the vector but I get a nested list as the result, eg
((((1) #'clojure.core/* 2)) #'clojure.core/+ (3))
Use org.clojure/core.match. Base on your current grammar, you can write the evaluation function as:
(defn eval-expr [expr]
(match expr
[:MultExpr e1 "*" e2] (* (eval-expr e1)
(eval-expr e2))
[:MultExpr e1 "/" e2] (/ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "+" e2] (+ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "-" e2] (- (eval-expr e1)
(eval-expr e2))
[:MultExpr e1] (eval-expr e1)
[:AddExpr e1] (eval-expr e1)
:else expr))
and evaluate with:
(-> "1 * 2 + 3"
parse
eval-expr)
;; => 5
This doesn't use Instaparse or clojure.walk, but here's something I had for evaluating infix math using only reduce:
(defn evaluate
"Evaluates an infix arithmetic form e.g. (1 + 1 * 2)."
[e]
(let [eval-op (fn [op a b]
(let [f (resolve op)]
(f a b)))]
(reduce
(fn [[v op] elem]
(cond
(coll? elem)
(if op
[(eval-op op v (first (evaluate elem))) nil]
[(first (evaluate elem)) nil])
(and op (number? elem))
[(eval-op op v elem) nil]
(number? elem)
[elem nil]
(symbol? elem)
[v elem]
:else
(throw (ex-info "Invalid evaluation" {:v v :op op :elem (type elem)}))))
[0 nil]
e)))
(first (evaluate (clojure.edn/read-string "(1 * 2 + 3)")))
=> 5
(first (evaluate (clojure.edn/read-string "(1 * 2 + (3 * 5))")))
=> 17
This requires the input string to represent a valid Clojure list. I also had this function for grouping multiplication/division:
(defn pemdas
"Groups division/multiplication operations in e into lists."
[e]
(loop [out []
rem e]
(if (empty? rem)
(seq out)
(let [curr (first rem)
next' (second rem)]
(if (contains? #{'/ '*} next')
(recur (conj out (list curr next' (nth rem 2)))
(drop 3 rem))
(recur (conj out curr) (rest rem)))))))
(pemdas '(9.87 + 4 / 3 * 0.41))
=> (9.87 + (4 / 3) * 0.41)
This exact problem is why I first created the Tupelo Forest library.
Please see the talk from Clojure Conj 2017.
I've started some docs here. You can also see live examples here.
Update
Here is how you could use the Tupelo Forest library to do it:
First, define your Abstract Syntax Tree (AST) data using Hiccup format:
(with-forest (new-forest)
(let [data-hiccup [:rpc
[:fn {:type :+}
[:value 2]
[:value 3]]]
root-hid (add-tree-hiccup data-hiccup)
with result:
(hid->bush root-hid) =>
[{:tag :rpc}
[{:type :+, :tag :fn}
[{:tag :value, :value 2}]
[{:tag :value, :value 3}]]]
Show how walk-tree works using a "display interceptor"
disp-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)]
(spyx curr-node)))}
>> (do
(println "Display walk-tree processing:")
(walk-tree root-hid disp-interceptor))
with result:
Display walk-tree processing:
curr-node => {:tupelo.forest/khids [], :tag :value, :value 2}
curr-node => {:tupelo.forest/khids [], :tag :value, :value 3}
curr-node => {:tupelo.forest/khids [1037 1038], :type :+, :tag :fn}
curr-node => {:tupelo.forest/khids [1039], :tag :rpc}
then define the operators and an interceptor to transform a subtree like (+ 2 3) => 5
op->fn {:+ +
:* *}
math-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)
curr-tag (grab :tag curr-node)]
(when (= :fn curr-tag)
(let [curr-op (grab :type curr-node)
curr-fn (grab curr-op op->fn)
kid-hids (hid->kids curr-hid)
kid-values (mapv hid->value kid-hids)
result-val (apply curr-fn kid-values)]
(set-node curr-hid {:tag :value :value result-val} [])))))}
] ; end of let form
; imperative step replaces old nodes with result of math op
(walk-tree root-hid math-interceptor)
We can then display the modified AST tree which contains the result of (+ 2 3):
(hid->bush root-hid) =>
[{:tag :rpc}
[{:tag :value, :value 5}]]
You can see the live code here.

Nesting macros in Clojure

Consider this pseudo code:
(defrc name
"string"
[a :A]
[:div a])
Where defrc would be a macro, that would expand to the following
(let [a (rum/react (atom :A))]
(rum/defc name < rum/reactive []
[:div a]))
Where rum/defc is itself a macro. I came up with the code below:
(defmacro defrc
[name subj bindings & body]
(let [map-bindings# (apply array-map bindings)
keys# (keys map-bindings#)
vals# (vals map-bindings#)
atomised-vals# (atom-map vals#)]
`(let ~(vec (interleave keys# (map (fn [v] (list 'rum/react v)) (vals atomised-vals#))))
(rum/defc ~name < rum/reactive [] ~#body))))
Which almost works:
(macroexpand-all '(defrc aname
#_=> "string"
#_=> [a :A]
#_=> [:div a]))
(let* [a (rum/react #object[clojure.lang.Atom 0x727ed2e6 {:status :ready, :val nil}])] (rum/defc aname clojure.core/< rum/reactive [] [:div a]))
However when used it results in a syntax error:
ERROR: Syntax error at (clojure.core/< rum.core/reactive [] [:div a])
Is this because the inner macro is not being expanded?
Turns out the macro was working correctly but the problem occurred because < was inside the syntax quote it got expanded to clojure.core/<, and Rum simply looks for a quoted <, relevant snippet from Rum's source:
...(cond
(and (empty? res) (symbol? x))
(recur {:name x} next nil)
(fn-body? xs) (assoc res :bodies (list xs))
(every? fn-body? xs) (assoc res :bodies xs)
(string? x) (recur (assoc res :doc x) next nil)
(= '< x) (recur res next :mixins)
(= mode :mixins)
(recur (update-in res [:mixins] (fnil conj []) x) next :mixins)
:else
(throw (IllegalArgumentException. (str "Syntax error at " xs))))...