I am trying to use Instaparse to make a simple arithmetic expression evaluator. The parser seems to work fine, but I cannot figure out how to evaluate the nested vector it returns. Currently I am using postwalk, like this:
(ns test5.core
(:require [instaparse.core :as insta])
(:require [clojure.walk :refer [postwalk]])
(:gen-class))
(def WS
(insta/parser
"WS = #'\\s+'"))
(def transform-options
{:IntLiteral read-string})
(def parser
(insta/parser
"AddExpr = AddExpr '+' MultExpr
| AddExpr '-' MultExpr
| MultExpr
MultExpr = MultExpr '*' IntLiteral
| MultExpr '/' IntLiteral
| IntLiteral
IntLiteral = #'[0-9]+'"
:auto-whitespace WS))
(defn parse[input]
(->> (parser input)
(insta/transform transform-options)))
(defn visit [node]
(println node)
(cond
(number? node) node
(string? node) (resolve (symbol node))
(vector? node)
(cond
(= :MultExpr (first node)) (visit (rest node))
(= :AddExpr (first node)) (visit (rest node))
:else node)
:else node))
(defn evaluate [tree]
(println tree)
(postwalk visit tree))
(defn -main
[& args]
(evaluate (parse "1 * 2 + 3")))
postwalk does traverse the vector, but I get a nested list as the result, e.g.:
((((1) #'clojure.core/* 2)) #'clojure.core/+ (3))
Use org.clojure/core.match. Based on your current grammar, you can write the evaluation function as:
(defn eval-expr [expr]
(match expr
[:MultExpr e1 "*" e2] (* (eval-expr e1)
(eval-expr e2))
[:MultExpr e1 "/" e2] (/ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "+" e2] (+ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "-" e2] (- (eval-expr e1)
(eval-expr e2))
[:MultExpr e1] (eval-expr e1)
[:AddExpr e1] (eval-expr e1)
:else expr))
and evaluate with:
(-> "1 * 2 + 3"
parse
eval-expr)
;; => 5
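If pulling in core.match is not desirable, roughly the same evaluation can be sketched with plain destructuring. This is only a sketch: it assumes the tree shape shown in the question (numbers already produced by the :IntLiteral transform, operators left as strings), and the names ops and eval-tree are made up for illustration.

```clojure
;; Operator strings as they appear in the parse tree, mapped to functions.
(def ops {"+" + "-" - "*" * "/" /})

(defn eval-tree
  "Evaluates a tree such as [:AddExpr [:MultExpr 1] \"+\" [:MultExpr 2]].
  Assumes every vector node is [tag operand] or [tag left op right]."
  [node]
  (if (vector? node)
    (let [[_tag left op right] node]
      (if op
        ((ops op) (eval-tree left) (eval-tree right))
        (eval-tree left)))
    node))

;; Tree shape assumed for "1 * 2 + 3" (same nesting as printed in the question).
(eval-tree [:AddExpr
            [:AddExpr [:MultExpr [:MultExpr 1] "*" 2]]
            "+"
            [:MultExpr 3]])
;; => 5
```

The operator table keeps the function to a single recursive case, at the cost of less explicit pattern checks than the core.match version.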
This doesn't use Instaparse or clojure.walk, but here's something I had for evaluating infix math using only reduce:
(defn evaluate
"Evaluates an infix arithmetic form e.g. (1 + 1 * 2)."
[e]
(let [eval-op (fn [op a b]
(let [f (resolve op)]
(f a b)))]
(reduce
(fn [[v op] elem]
(cond
(coll? elem)
(if op
[(eval-op op v (first (evaluate elem))) nil]
[(first (evaluate elem)) nil])
(and op (number? elem))
[(eval-op op v elem) nil]
(number? elem)
[elem nil]
(symbol? elem)
[v elem]
:else
(throw (ex-info "Invalid evaluation" {:v v :op op :elem (type elem)}))))
[0 nil]
e)))
(first (evaluate (clojure.edn/read-string "(1 * 2 + 3)")))
=> 5
(first (evaluate (clojure.edn/read-string "(1 * 2 + (3 * 5))")))
=> 17
This requires the input string to represent a valid Clojure list. I also had this function for grouping multiplication/division:
(defn pemdas
"Groups division/multiplication operations in e into lists."
[e]
(loop [out []
rem e]
(if (empty? rem)
(seq out)
(let [curr (first rem)
next' (second rem)]
(if (contains? #{'/ '*} next')
(recur (conj out (list curr next' (nth rem 2)))
(drop 3 rem))
(recur (conj out curr) (rest rem)))))))
(pemdas '(9.87 + 4 / 3 * 0.41))
=> (9.87 + (4 / 3) * 0.41)
This exact problem is why I first created the Tupelo Forest library.
Please see the talk from Clojure Conj 2017.
I've started some docs here. You can also see live examples here.
Update
Here is how you could use the Tupelo Forest library to do it:
First, define your Abstract Syntax Tree (AST) data using Hiccup format:
(with-forest (new-forest)
(let [data-hiccup [:rpc
[:fn {:type :+}
[:value 2]
[:value 3]]]
root-hid (add-tree-hiccup data-hiccup)
with result:
(hid->bush root-hid) =>
[{:tag :rpc}
[{:type :+, :tag :fn}
[{:tag :value, :value 2}]
[{:tag :value, :value 3}]]]
Show how walk-tree works using a "display interceptor"
disp-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)]
(spyx curr-node)))}
>> (do
(println "Display walk-tree processing:")
(walk-tree root-hid disp-interceptor))
with result:
Display walk-tree processing:
curr-node => {:tupelo.forest/khids [], :tag :value, :value 2}
curr-node => {:tupelo.forest/khids [], :tag :value, :value 3}
curr-node => {:tupelo.forest/khids [1037 1038], :type :+, :tag :fn}
curr-node => {:tupelo.forest/khids [1039], :tag :rpc}
Then define the operators and an interceptor to transform a subtree like (+ 2 3) => 5:
op->fn {:+ +
:* *}
math-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)
curr-tag (grab :tag curr-node)]
(when (= :fn curr-tag)
(let [curr-op (grab :type curr-node)
curr-fn (grab curr-op op->fn)
kid-hids (hid->kids curr-hid)
kid-values (mapv hid->value kid-hids)
result-val (apply curr-fn kid-values)]
(set-node curr-hid {:tag :value :value result-val} [])))))}
] ; end of let form
; imperative step replaces old nodes with result of math op
(walk-tree root-hid math-interceptor)
We can then display the modified AST tree which contains the result of (+ 2 3):
(hid->bush root-hid) =>
[{:tag :rpc}
[{:tag :value, :value 5}]]
You can see the live code here.
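For comparison, the same bottom-up collapse can be sketched without Tupelo Forest, using clojure.walk/postwalk directly on the Hiccup vector. This is not the library's API: collapse-fn is a hypothetical helper, and it assumes every :fn node has the shape [:fn {:type op} & children], with children already reduced to [:value n] by the time the node is visited.

```clojure
(require '[clojure.walk :refer [postwalk]])

(def op->fn {:+ +
             :* *})

(defn collapse-fn
  "Replaces [:fn {:type op} [:value a] [:value b] ...] with [:value result].
  Any other form is returned unchanged."
  [node]
  (if (and (vector? node)
           (= :fn (first node))
           (map? (second node)))
    (let [[_tag {op :type} & kids] node
          f (op->fn op)]
      [:value (apply f (map second kids))])
    node))

;; postwalk visits children before parents, so inner :fn nodes collapse first.
(postwalk collapse-fn [:rpc
                       [:fn {:type :+}
                        [:value 2]
                        [:value 3]]])
;; => [:rpc [:value 5]]
```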
Related
I am new to Clojure and I'm learning how to write a program that can simplify logical expressions (just 'and' for now to figure out how things work first). For example:
(and-simplify '(and true)) => true
(and-simplify '(and x true)) => x
(and-simplify '(and true false x)) => false
(and-simplify '(and x y z true)) => (and x y z)
I already know how to simplify two arguments, but all I can do right now is:
(defn and-simplify []
(def x (and true false))
println x)
(and-simplify)
I've read this post and tried to modify my code a little bit but it doesn't seem to get me anywhere:
(defn and-simplify [&expr]
(def (and &expr))
)
What is the correct way to do this?
Here's my take on it.
(defn simplify-and
[[op & forms]]
(let [known-falsy? #(or (false? %) (nil? %))
known-truthy? #(and (not (symbol? %))
(not (seq? %))
(not (known-falsy? %)))
falsy-forms (filter known-falsy? forms)
unknown-forms (remove known-truthy? forms)]
(if (seq falsy-forms)
(first falsy-forms)
(case (count unknown-forms)
0 true
1 (first unknown-forms)
(cons op unknown-forms)))))
(comment (simplify-and `(and true 1 2 a)))
However, we can write a more generic simplify that uses multimethods to simplify lists, so that we can add more optimisations without modifying existing code. Here's that, with optimisations for and, or and + from clojure.core. This simplify only optimises lists based on namespace qualified names.
Check out the examples in the comment form. Hope it makes sense.
(defn- known-falsy? [form]
(or (false? form) (nil? form)))
(defn- known-truthy? [form]
(and (not (symbol? form))
(not (seq? form))
(not (known-falsy? form))))
(declare simplify)
(defmulti simplify-list first)
(defmethod simplify-list :default [form] form)
(defmethod simplify-list 'clojure.core/and
[[op & forms]]
(let [forms (mapv simplify forms)
falsy-forms (filter known-falsy? forms)
unknown-forms (remove known-truthy? forms)]
(if (seq falsy-forms)
(first falsy-forms)
(case (count unknown-forms)
0 true
1 (first unknown-forms)
(cons op unknown-forms)))))
(defmethod simplify-list 'clojure.core/or
[[op & forms]]
(let [forms (mapv simplify forms)
truthy-forms (filter known-truthy? forms)
unknown-forms (remove known-falsy? forms)]
(if (seq truthy-forms)
(first truthy-forms)
(case (count unknown-forms)
0 nil
1 (first unknown-forms)
(cons op unknown-forms)))))
(defmethod simplify-list 'clojure.core/+
[[op & forms]]
(let [{nums true non-nums false} (group-by number? (mapv simplify forms))
sum (apply + nums)]
(if (seq non-nums)
(cons op (cons sum non-nums))
sum)))
(defn simplify
"takes a Clojure form with resolved symbols and performs
peephole optimisations on it"
[form]
(cond (set? form) (into #{} (map simplify) form)
(vector? form) (mapv simplify form)
(map? form) (reduce-kv (fn [m k v] (assoc m (simplify k) (simplify v)))
{} form)
(seq? form) (simplify-list form)
:else form))
(comment
(simplify `(+ 1 2))
(simplify `(foo 1 2))
(simplify `(and true (+ 1 2 3 4 5 foo)))
(simplify `(or false x))
(simplify `(or false x nil y))
(simplify `(or false x (and y nil z) (+ 1 2)))
)
Consider this pseudo code:
(defrc name
"string"
[a :A]
[:div a])
Where defrc would be a macro, that would expand to the following
(let [a (rum/react (atom :A))]
(rum/defc name < rum/reactive []
[:div a]))
Where rum/defc is itself a macro. I came up with the code below:
(defmacro defrc
[name subj bindings & body]
(let [map-bindings# (apply array-map bindings)
keys# (keys map-bindings#)
vals# (vals map-bindings#)
atomised-vals# (atom-map vals#)]
`(let ~(vec (interleave keys# (map (fn [v] (list 'rum/react v)) (vals atomised-vals#))))
(rum/defc ~name < rum/reactive [] ~@body))))
Which almost works:
(macroexpand-all '(defrc aname
#_=> "string"
#_=> [a :A]
#_=> [:div a]))
(let* [a (rum/react #object[clojure.lang.Atom 0x727ed2e6 {:status :ready, :val nil}])] (rum/defc aname clojure.core/< rum/reactive [] [:div a]))
However when used it results in a syntax error:
ERROR: Syntax error at (clojure.core/< rum.core/reactive [] [:div a])
Is this because the inner macro is not being expanded?
It turns out the macro was working correctly. The problem was that < sat inside the syntax quote, so it was expanded to clojure.core/<, whereas Rum looks for a plain quoted <. The relevant snippet from Rum's source:
...(cond
(and (empty? res) (symbol? x))
(recur {:name x} next nil)
(fn-body? xs) (assoc res :bodies (list xs))
(every? fn-body? xs) (assoc res :bodies xs)
(string? x) (recur (assoc res :doc x) next nil)
(= '< x) (recur res next :mixins)
(= mode :mixins)
(recur (update-in res [:mixins] (fnil conj []) x) next :mixins)
:else
(throw (IllegalArgumentException. (str "Syntax error at " xs))))...
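One common way around this is to unquote a quoted symbol, ~'<, so the syntax quote leaves it unqualified. The sketch below only illustrates the reader behaviour. It does not call Rum, and marker? is a hypothetical stand-in for the (= '< x) check in the snippet above.

```clojure
;; Hypothetical stand-in for Rum's (= '< x) test.
(defn marker? [x] (= '< x))

;; Inside a syntax quote, a bare < is resolved to clojure.core/<.
(def qualified-form `(defc ~'label < ~'mixin))

;; ~'< unquotes a quoted symbol, so it stays a plain <.
(def literal-form `(defc ~'label ~'< ~'mixin))

(marker? (nth qualified-form 2)) ;; => false
(marker? (nth literal-form 2))   ;; => true
```

Applied to the defrc macro from the question, this would presumably mean writing ~'< in place of < in the rum/defc form.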
Here is an example from The Joy of Clojure, chapter 8:
(defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
[k `'~v])
ctx)]
~expr)]
(pprint new-expr)
(eval new-expr)))
(pprint (contextual-eval '{a 1 b 2} '(+ a b)))
I find the `' part quite perplexing. What is it for?
I also tried to modify the function a bit:
(defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
[k `~v])
ctx)]
~expr)]
(pprint new-expr)
(eval new-expr)))
(pprint (contextual-eval '{a 1 b 2} '(+ a b)))
(defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(vec (apply
concat
ctx))]
~expr)]
(pprint new-expr)
(eval new-expr)))
(pprint (contextual-eval '{a 1 b 2} '(+ a b)))
All the versions above have similar effect. Why did the author choose to use `' then?
A more detailed look:
(use 'clojure.pprint)
(defmacro epprint [expr]
`(do
(print "==>")
(pprint '~expr)
(pprint ~expr)))
(defmacro epprints [& exprs]
(list* 'do (map (fn [x] (list 'epprint x))
exprs)))
(defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
(epprints
(class v)
v
(class '~v)
'~v
(class `'~v)
`'~v
(class ctx)
ctx)
[k `~v])
ctx)]
~expr)]
(pprint new-expr)
(eval new-expr)))
(pprint (contextual-eval '{a (* 2 3) b (inc 11)} '(+ a b)))
This prints out the following in the repl:
==>(class v)
clojure.lang.PersistentList
==>v
(* 2 3)
==>(class '~v)
clojure.lang.PersistentList
==>'~v
~v
==>(class
(clojure.core/seq
(clojure.core/concat
(clojure.core/list 'quote)
(clojure.core/list v))))
clojure.lang.Cons
==>(clojure.core/seq
(clojure.core/concat (clojure.core/list 'quote) (clojure.core/list v)))
'(* 2 3)
==>(class ctx)
clojure.lang.PersistentArrayMap
==>ctx
{a (* 2 3), b (inc 11)}
==>(class v)
clojure.lang.PersistentList
==>v
(inc 11)
==>(class '~v)
clojure.lang.PersistentList
==>'~v
~v
==>(class
(clojure.core/seq
(clojure.core/concat
(clojure.core/list 'quote)
(clojure.core/list v))))
clojure.lang.Cons
==>(clojure.core/seq
(clojure.core/concat (clojure.core/list 'quote) (clojure.core/list v)))
'(inc 11)
==>(class ctx)
clojure.lang.PersistentArrayMap
==>ctx
{a (* 2 3), b (inc 11)}
==>new-expr
(clojure.core/let [a (* 2 3) b (inc 11)] (+ a b))
18
Again, using a single syntax quote for v seems to get the job done.
In fact, using `'~v might cause you some trouble:
(defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
[k `'~v])
ctx)]
~expr)]
(pprint new-expr)
(eval new-expr)))
(pprint (contextual-eval '{a (inc 3) b (* 3 4)} '(+ a b)))
CompilerException java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to java.lang.Number, compiling:(/Users/kaiyin/personal_config_bin_files/workspace/typedclj/src/typedclj/macros.clj:14:22)
`'~v is a way to return
(list 'quote v)
in this case quoting the actual value of v in the let expression, not the symbol itself.
I don't know The Joy of Clojure, but apparently the authors want to prevent forms passed in ctx from being evaluated in the expanded let form. E.g. (contextual-eval '{a (+ 3 4)} 'a) returns the list (+ 3 4) with the book's version, but 7 with your versions, which behave identically to each other.
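The difference is easy to see at the REPL by building both forms by hand. A small sketch (the var names are only for illustration):

```clojure
(def v '(+ 3 4))

;; `'~v expands to (list 'quote v): the value of v, wrapped in quote.
(def quoted-form `'~v)

;; `~v is just v itself.
(def bare-form `~v)

quoted-form        ;; => (quote (+ 3 4))
(eval quoted-form) ;; => (+ 3 4), the list is kept as data
(eval bare-form)   ;; => 7, the list is evaluated as code
```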
Your modified versions have the same effect only because you're trying them on very simple data. Try instead with a mapping like {'a 'x}, a context in which the binding for a is the symbol x.
user> (defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
[k `'~v])
ctx)]
~expr)]
(eval new-expr)))
#'user/contextual-eval
user> (contextual-eval {'a 'x} '(name a))
"x"
user> (defn contextual-eval [ctx expr]
(let [new-expr
`(let [~@(mapcat (fn [[k v]]
[k `~v])
ctx)]
~expr)]
(eval new-expr)))
#'user/contextual-eval
user> (contextual-eval {'a 'x} '(name a))
; Evaluation aborted.
The problem is that in your version, by neglecting the quote, you are double-evaluating the values bound to your symbols: x shouldn't be evaluated, because the value is actually the symbol x. You get away with this double evaluation in your simple test cases, because 1 evaluates to itself: (eval (eval (eval 1))) would work fine too. But doing that with most data structures is wrong, because they have non-trivial evaluation semantics.
Note also that the following expressions are identical in all cases, so there's never a reason to write any of them but the first one:
x
`~x
`~`~x
```~`~~`~~x
If you syntax-quote and then immediately un-quote, you haven't accomplished anything. So, if you ever find yourself writing a quote followed by an unquote, this should be a big red flag that you are doing something wrong.
I have the following code, defining a type that has an atom in there.
(defprotocol IDeck
(vec-* [dk] "Output to a persistent vector")
(count-* [dk] "Number of elements in the deck")
(conj1-* [dk & es] "Adding multiple elements to the deck"))
(deftype ADeck [#^clojure.lang.Atom val]
IDeck
(vec-* [dk] (->> (.val dk) deref (map deref) vec))
(count-* [dk] (-> (.val dk) deref count))
(conj1-* [dk & es]
(try
(loop [esi es]
(let [e (first esi)]
(cond
(nil? e) dk
:else
(do
(swap! (.val dk) #(conj % (atom e)))
(recur (rest esi))))))
(catch Throwable t (println t)))))
(defn new-*adeck
([] (ADeck. (atom [])))
([v] (ADeck. (atom (vec (map atom v))))))
(defn conj2-* [dk & es]
(try
(loop [esi es]
(let [e (first esi)]
(cond
(nil? e) dk
:else
(do
(swap! (.val dk) #(conj % (atom e)))
(recur (rest esi))))))
(catch Throwable t (println t))))
;; Usage
(def a (new-*adeck [1 2 3 4]))
(count-* a)
;=> 4
(vec-* a)
;=> [1 2 3 4]
(conj1-* a 1 2) ;; The deftype case
;=> IllegalArgumentException java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.lang.Long
(vec-* a)
;=> [1 2 3 4]
(conj2-* a 1 2) ;; The defn case
(vec-* a)
;=> [1 2 3 4 1 2]
Even though the two conj-* methods are exactly the same, except that one is in a deftype and the other is a normal defn, the first gives an error while the second succeeds. Why is this?
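This is because protocols don't support a variable number of arguments. In a protocol signature, & is treated as an ordinary parameter name, so (conj1-* a 1 2) binds es to 2 and (first es) throws.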
What you can do is make:
(conj1-* [dk & es] "Adding multiple elements to the deck"))
into
(conj1-* [dk es] "Adding multiple elements to the deck"))
so that es is passed as a single vector, and call it like:
(conj1-* a [1 2])
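If the variadic call style matters, one option is to keep the protocol method fixed-arity and put a small variadic wrapper function in front of it. A minimal sketch, with conj-all-* and conj-* as made-up names and a cut-down ADeck:

```clojure
(defprotocol IDeck
  (conj-all-* [dk es] "Adds every element of the collection es to the deck"))

(deftype ADeck [val]
  IDeck
  (conj-all-* [dk es]
    ;; Wrap each element in its own atom, as in the original deck.
    (swap! val into (map atom es))
    dk))

;; Ordinary functions do support & rest arguments.
(defn conj-* [dk & es]
  (conj-all-* dk es))

(def a (ADeck. (atom [])))
(conj-* a 1 2)
(mapv deref @(.val a))
;; => [1 2]
```

Unlike the loop in the question, this version does not stop at the first nil element.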
I have a sequence s and a list indexes of positions in that sequence. How do I retain only the items at those positions?
Simple example:
(filter-by-index '(a b c d e f g) '(0 2 3 4)) ; => (a c d e)
My use case:
(filter-by-index '(c c# d d# e f f# g g# a a# b) '(0 2 4 5 7 9 11)) ; => (c d e f g a b)
You can use keep-indexed:
(defn filter-by-index [coll idxs]
(keep-indexed #(when ((set idxs) %1) %2)
coll))
Another version using explicit recursion and lazy-seq:
(defn filter-by-index [coll idxs]
(lazy-seq
(when-let [idx (first idxs)]
(if (zero? idx)
(cons (first coll)
(filter-by-index (rest coll) (rest (map dec idxs))))
(filter-by-index (drop idx coll)
(map #(- % idx) idxs))))))
First, make a list of vectors pairing each item with its index:
(def with-indexes (map #(vector %1 %2 ) ['a 'b 'c 'd 'e 'f] (range)))
#'clojure.core/with-indexes
with-indexes
([a 0] [b 1] [c 2] [d 3] [e 4] [f 5])
Filter this list:
clojure.core=> (def filtered (filter #(#{1 3 5 7} (second % )) with-indexes))
#'clojure.core/filtered
clojure.core=> filtered
([b 1] [d 3] [f 5])
Then remove the indexes:
clojure.core=> (map first filtered)
(b d f)
Then thread it all together with the "thread last" macro:
(defn filter-by-index [coll idxs]
(->> coll
(map #(vector %1 %2)(range))
(filter #(idxs (first %)))
(map second)))
clojure.core=> (filter-by-index ['a 'b 'c 'd 'e 'f 'g] #{2 3 1 6})
(b c d g)
The moral of the story is, break it into small independent parts, test them, then compose them into a working function.
The easiest solution is to use map:
(defn filter-by-index [coll idx]
(map (partial nth coll) idx))
I like Jonas's answer, but neither version will work well for an infinite sequence of indices: the first tries to create an infinite set, and the latter runs into a stack overflow by layering too many unrealized lazy sequences on top of each other. To avoid both problems you have to do slightly more manual work:
(defn filter-by-index [coll idxs]
((fn helper [coll idxs offset]
(lazy-seq
(when-let [idx (first idxs)]
(if (= idx offset)
(cons (first coll)
(helper (rest coll) (rest idxs) (inc offset)))
(helper (rest coll) idxs (inc offset))))))
coll idxs 0))
With this version, both coll and idxs can be infinite and you will still have no problems:
user> (nth (filter-by-index (range) (iterate #(+ 2 %) 0)) 1e6)
2000000
Edit: not trying to single out Jonas's answer: none of the other solutions work for infinite index sequences, which is why I felt a solution that does is needed.
I had a similar use case and came up with another easy solution. This one expects vectors.
I've changed the function name to match other similar clojure functions.
(defn select-indices [coll indices]
(reverse (vals (select-keys coll indices))))
(defn filter-by-index [seq idxs]
(let [idxs (into #{} idxs)]
(reduce (fn [h [char idx]]
(if (contains? idxs idx)
(conj h char) h))
[] (partition 2 (interleave seq (iterate inc 0))))))
(filter-by-index [\a \b \c \d \e \f \g] [0 2 3 4])
=>[\a \c \d \e]
=> (defn filter-by-index [src indexes]
(reduce (fn [a i] (conj a (nth src i))) [] indexes))
=> (filter-by-index '(a b c d e f g) '(0 2 3 4))
[a c d e]
I know this is not what was asked, but after reading these answers, I realized in my own personal use case, what I actually wanted was basically filtering by a mask.
So here was my take. Hopefully this will help someone else.
(defn filter-by-mask [coll mask]
(filter some? (map #(if %1 %2) mask coll)))
(defn make-errors-mask [coll]
(map #(nil? (:error %)) coll))
Usage
(let [v [{} {:error 3} {:ok 2} {:error 4 :yea 7}]
data ["one" "two" "three" "four"]
mask (make-errors-mask v)]
(filter-by-mask data mask))
; ==> ("one" "three")