I have a method that removes common prefixes from pairs of strings, and have been trying to create a generator for it. It is trivial to generate pairs of random strings, but how can I enforce that many of the pairs have common prefixes? Simply filtering the pairs generates an insufficient number of examples, so I'm trying to create a custom generator to satisfy the requirements.
Here is what I have now; it works, but I'd like to generate better parameters:
(ns liblevenshtein.distance
(:require [clojure.spec.alpha :as spec]
[clojure.spec.gen.alpha :as gen]))
(spec/def ::word
(spec/and string? (complement nil?)))
(spec/def ::v-w (spec/cat :v ::word, :w ::word))
(spec/def ::non-empty-v-w
(spec/and ::v-w (fn [{:keys [v w]}]
(and (not-empty v)
(not-empty w)))))
(defn- drop-common-prefix [v w]
(loop [v v, a (.charAt v 0), s (.substring v 1),
w w, b (.charAt w 0), t (.substring w 1)]
(if (and (= a b)
(not-empty s)
(not-empty t))
(recur s (.charAt s 0) (.substring s 1)
t (.charAt t 0) (.substring t 1))
[v a s, w b t])))
(spec/fdef drop-common-prefix
:args ::non-empty-v-w
:ret (spec/tuple string? char? string?, string? char? string?)
:fn (fn [{{:keys [v w]} :args, [v' a s, w' b t] :ret}]
(and (= v' (str a s))
(.endsWith v v')
(= w' (str b t))
(.endsWith w w'))))
Experimenting with generators, I've come up with the following. It generates pairs of strings satisfying my requirement, but I do not know how to split them into the arguments for my function:
user=> (def prefix-pair-gen (gen/fmap (fn [[u v w]] [(str u v) (str u w)]) (spec/gen (spec/coll-of string? :type vector? :count 3))))
#'user/prefix-pair-gen
user=> (spec/def ::prefix-pair (spec/with-gen (spec/coll-of string? :type vector? :count 2) (fn [] prefix-pair-gen)))
:user/prefix-pair
user=> (gen/sample (spec/gen ::prefix-pair))
(["" ""]
["c" "cR"]
["lZ" "2F"]
["8a" "8a4"]
["n1D8CSq" "n1D8Gb1k"]
["X4PO" "X4Pu"]
["eAVM1" "eAVM1qg"]
["5e3DkZ6i" "5e3DkZv4Y"]
["3P7210" "3P7245cHM"]
["1c4D2j4UUK738" "1c4D2joFjd"])
I found the solution, and it is trivial. I should have paid more attention to the docs. The :args documentation for fdef states:
:args A regex spec for the function arguments as they were a list to
be passed to apply - in this way, a single spec can handle functions
with multiple arities
As such, I can provide the generated vectors directly, as follows:
(defn- drop-common-prefix [v w]
(loop [v v, a (.charAt v 0), s (.substring v 1),
w w, b (.charAt w 0), t (.substring w 1)]
(if (and (= a b)
(not-empty s)
(not-empty t))
(recur s (.charAt s 0) (.substring s 1)
t (.charAt t 0) (.substring t 1))
[v a s, w b t])))
(def prefix-pair-gen
(gen/fmap
(fn [[u v w]]
[(str u v) (str u w)])
(spec/gen
(spec/and (spec/coll-of string? :type vector? :count 3)
(fn [[u v w]]
(and (not-empty v)
(not-empty w)))))))
(spec/def ::prefix-pair
(spec/with-gen
(spec/coll-of string? :type vector? :count 2)
(constantly prefix-pair-gen)))
(spec/fdef drop-common-prefix
:args ::prefix-pair
:ret (spec/tuple string? char? string?, string? char? string?)
:fn (fn [{[v w] :args, [v' a s, w' b t] :ret}]
(and (= v' (str a s))
(.endsWith v v')
(= w' (str b t))
(.endsWith w w'))))
I can verify its correctness via:
user> (stest/summarize-results (stest/check `liblevenshtein.distance/drop-common-prefix))
{:sym liblevenshtein.distance/drop-common-prefix}
{:total 1, :check-passed 1}
Related
I have a map and the keys are strings. If the key contains the word "kg" I want to multiply the value by 2.2 and then replace "kg" with "lb" in the key. I can't figure out how to iterate over the map in a way that I can conditionally update it.
Example map:
{"id" ("7215" "74777" "7219"),
"weight-kg" ("150" "220" "530"),
"time-seconds" ("1900" "2" "770")}
Desired output
{"id" ("7215" "74777" "7219"),
"weight-lb" ("330" "485" "1168"),
"time-seconds" ("1900" "2" "770")}
I've tried update, for map and reduce-kv. Project requirement is to not use the string library, which is why there is re-find. These are only attempts at changing the values. Since I can't change the values, I haven't attempted changing the keys.
(defn kg->lb [m k]
(if (re-find #"kg" k)
(map #(update m % * 2.2))))
(defn kg2->lb2 [m]
(reduce-kv #(if (re-find #"kg" %)
(update % * 2.2)) {} m)
(map #(if (re-find #"kg" %)
(update % * 2.2)) m)
(for [k (keys m)]
(if (re-find #"kg" k)
(update m k #(* % 2.2))))
Data:
(def data {"id" ["7215" "74777" "7219"],
"weight-kg" ["150" "220" "530"],
"time-seconds" ["1900" "2" "770"]})
Helper function to convert a string (kg amount) to string (lb amount):
(defn kg->lb [kg-string]
(-> kg-string
parse-long
(* 2.2)
int
str))
The most important function is reduce-kv.
If you find "kg" in key, you will replace that with "lb" and map helper function over all values.
If you don't find "kg" in key, you will just assoc that entry without change.
(reduce-kv (fn [m k v]
(if (re-find #"kg" k)
(assoc m (str/replace k #"kg" "lb")
(map kg->lb v))
(assoc m k v)))
{}
data)
I think I passed Project requirement is to not use the string library, except for (str/replace k #"kg" "lb"), which you can replace with String/replace interop: (.replace k "kg" "lb").
EDIT: Solution with map and into:
(defn update-entry [[k v]]
(if (re-find #"kg" k)
[(.replace k "kg" "lb") (map kg->lb v)]
[k v]))
(->> data
(map update-entry)
(into {}))
Transducer version:
(into {} (map update-entry) data)
I would do it a bit differently. First, I'd convert the strings to integers using the Java function Long/parseLong:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[tupelo.core :as t]
))
(defn parse-vec-longs
[v]
(mapv #(Long/parseLong %) v))
(verify
(is= (parse-vec-longs ["150" "220" "530"])
[150 220 530]))
Then I would write the code to convert the kg vals to lb vals. At the end, just use dissoc to get rid of the kg data, then assoc to add in the new lb data:
(defn convert-kg-lb
[data]
(let-spy-pretty [kg-vals (get data "weight-kg")
lb-vals (mapv #(Math/round (* 2.2 %)) kg-vals)
result (t/it-> data
(dissoc it "weight-kg")
(assoc it "weight-lb" lb-vals))]
result))
(verify
(let [data-str {"id" ["7215" "74777" "7219"]
"weight-kg" ["150" "220" "530"]
"time-seconds" ["1900" "2" "770"]}
data-parsed (t/map-vals data-str #(parse-vec-longs %))
expected-parsed {"id" [7215 74777 7219]
"weight-kg" [150 220 530]
"time-seconds" [1900 2 770]}
expected-out {"id" [7215 74777 7219]
"weight-lb" [330 484 1166]
"time-seconds" [1900 2 770]}
result (convert-kg-lb data-parsed)]
(is= data-parsed expected-parsed)
(is= (spyx-pretty result) (spyx-pretty expected-out))))
Normally, you'd also replace all the string keys with keywords as well, so "lb" => :lb.
I am trying to use Instaparse to make a simple arithmetic expression evaluator. The parser seems to work fine but I cannot figure out how to evaluate the returned nested vector. Currently I am using postwalk, like this
(ns test5.core
(:require [instaparse.core :as insta])
(:require [clojure.walk :refer [postwalk]])
(:gen-class))
(def WS
(insta/parser
"WS = #'\\s+'"))
(def transform-options
{:IntLiteral read-string})
(def parser
(insta/parser
"AddExpr = AddExpr '+' MultExpr
| AddExpr '-' MultExpr
| MultExpr
MultExpr = MultExpr '*' IntLiteral
| MultExpr '/' IntLiteral
| IntLiteral
IntLiteral = #'[0-9]+'"
:auto-whitespace WS))
(defn parse[input]
(->> (parser input)
(insta/transform transform-options)))
(defn visit [node]
(println node)
(cond
(number? node) node
(string? node) (resolve (symbol node))
(vector? node)
(cond
(= :MultExpr (first node)) (visit (rest node))
(= :AddExpr (first node)) (visit (rest node))
:else node)
:else node))
(defn evaluate [tree]
(println tree)
(postwalk visit tree))
(defn -main
[& args]
(evaluate (parse "1 * 2 + 3")))
postwalk does traverse the vector but I get a nested list as the result, eg
((((1) #'clojure.core/* 2)) #'clojure.core/+ (3))
Use org.clojure/core.match. Base on your current grammar, you can write the evaluation function as:
(defn eval-expr [expr]
(match expr
[:MultExpr e1 "*" e2] (* (eval-expr e1)
(eval-expr e2))
[:MultExpr e1 "/" e2] (/ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "+" e2] (+ (eval-expr e1)
(eval-expr e2))
[:AddExpr e1 "-" e2] (- (eval-expr e1)
(eval-expr e2))
[:MultExpr e1] (eval-expr e1)
[:AddExpr e1] (eval-expr e1)
:else expr))
and evaluate with:
(-> "1 * 2 + 3"
parse
eval-expr)
;; => 5
This doesn't use Instaparse or clojure.walk, but here's something I had for evaluating infix math using only reduce:
(defn evaluate
"Evaluates an infix arithmetic form e.g. (1 + 1 * 2)."
[e]
(let [eval-op (fn [op a b]
(let [f (resolve op)]
(f a b)))]
(reduce
(fn [[v op] elem]
(cond
(coll? elem)
(if op
[(eval-op op v (first (evaluate elem))) nil]
[(first (evaluate elem)) nil])
(and op (number? elem))
[(eval-op op v elem) nil]
(number? elem)
[elem nil]
(symbol? elem)
[v elem]
:else
(throw (ex-info "Invalid evaluation" {:v v :op op :elem (type elem)}))))
[0 nil]
e)))
(first (evaluate (clojure.edn/read-string "(1 * 2 + 3)")))
=> 5
(first (evaluate (clojure.edn/read-string "(1 * 2 + (3 * 5))")))
=> 17
This requires the input string to represent a valid Clojure list. I also had this function for grouping multiplication/division:
(defn pemdas
"Groups division/multiplication operations in e into lists."
[e]
(loop [out []
rem e]
(if (empty? rem)
(seq out)
(let [curr (first rem)
next' (second rem)]
(if (contains? #{'/ '*} next')
(recur (conj out (list curr next' (nth rem 2)))
(drop 3 rem))
(recur (conj out curr) (rest rem)))))))
(pemdas '(9.87 + 4 / 3 * 0.41))
=> (9.87 + (4 / 3) * 0.41)
This exact problem is why I first created the Tupelo Forest library.
Please see the talk from Clojure Conj 2017.
I've started some docs here. You can also see live examples here.
Update
Here is how you could use the Tupelo Forest library to do it:
First, define your Abstract Syntax Tree (AST) data using Hiccup format:
(with-forest (new-forest)
(let [data-hiccup [:rpc
[:fn {:type :+}
[:value 2]
[:value 3]]]
root-hid (add-tree-hiccup data-hiccup)
with result:
(hid->bush root-hid) =>
[{:tag :rpc}
[{:type :+, :tag :fn}
[{:tag :value, :value 2}]
[{:tag :value, :value 3}]]]
Show how walk-tree works using a "display interceptor"
disp-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)]
(spyx curr-node)))}
>> (do
(println "Display walk-tree processing:")
(walk-tree root-hid disp-interceptor))
with result:
Display walk-tree processing:
curr-node => {:tupelo.forest/khids [], :tag :value, :value 2}
curr-node => {:tupelo.forest/khids [], :tag :value, :value 3}
curr-node => {:tupelo.forest/khids [1037 1038], :type :+, :tag :fn}
curr-node => {:tupelo.forest/khids [1039], :tag :rpc}
then define the operators and an interceptor to transform a subtree like (+ 2 3) => 5
op->fn {:+ +
:* *}
math-interceptor {:leave (fn [path]
(let [curr-hid (xlast path)
curr-node (hid->node curr-hid)
curr-tag (grab :tag curr-node)]
(when (= :fn curr-tag)
(let [curr-op (grab :type curr-node)
curr-fn (grab curr-op op->fn)
kid-hids (hid->kids curr-hid)
kid-values (mapv hid->value kid-hids)
result-val (apply curr-fn kid-values)]
(set-node curr-hid {:tag :value :value result-val} [])))))}
] ; end of let form
; imperative step replaces old nodes with result of math op
(walk-tree root-hid math-interceptor)
We can then display the modified AST tree which contains the result of (+ 2 3):
(hid->bush root-hid) =>
[{:tag :rpc}
[{:tag :value, :value 5}]]
You can see the live code here.
I have an application with a lot of large maps and other things, which are clumsy to read when printed, so I made a custom print function for them and set up print-method to call it, like this:
(defmethod print-method clojure.lang.PersistentArrayMap [v ^java.io.Writer w]
(.write w (fstr1 v)))
Inside fstr1, how can I call the ordinary print-method if I determine that the map is not one of the kinds that require special treatment?
This answer suggests putting a :type in the metadata, since print-method dispatches on that. I've had some success with that, but I can't always control the metadata, so I'm hoping there's a way to "forward" to the previously defined print-method from within fstr1.
For reference, here's my current implementation of fstr1:
(defn fstr1 ^String [x]
(cond
(ubergraph? x)
(fstr-ubergraph x)
(map? x)
(case (:type x)
:move (fstr-move x)
:workspace "(a workspace)"
:bdx (fstr-bdx x)
:rxn (fstr-rxn x)
(apply str (strip-type x)))
:else
(apply str (strip-type x))))
You can always rebind print-object and tuck the real print-object away so you can call it when appropriate:
user> (let [real-print-method print-method]
(with-redefs [print-method (fn [v w]
(if (and (map? v)
(:foo v))
(do
(real-print-method "{:foo " w)
(real-print-method (:foo v) w)
(real-print-method " ...}" w))
(real-print-method v w)))]
(println {:foo 42 :bar 23} {:baz 11 :quux 0})))
{:foo 42 ...} {:baz 11, :quux 0}
nil
user>
I need to change a value in a nested map where I don't know the values of keys in advance. I have come up with the following to do that.
;; input {String {String [String]}}
;; output {String {String String}}
(defn join-z
[x-to-y-to-z]
(zipmap (keys x-to-y-to-z)
(map (fn [y-to-z] (into {} (map (fn [[y z]] {y (clojure.string/join z)})
(seq y-to-z))))
(seq (vals x-to-y-to-z)))))
(def example
{"a" {"b" ["c" "d" "e"]}
"m" {"n" ["o" "p"]}})
;; (join-z example) => {"m" {"n" "op"}, "a" {"b" "cde"}}
This seems to be a hack. What is idiomatic clojure to do this? Or, is there something like Haskell's lens library to use?
UPDATE: based on user5187212 answer
(defn update-vals [f m0]
(reduce-kv (fn [m k v] (assoc m k (f v)))
{}
m0))
;; (update-vals clojure.string/join {"b" ["c" "d" "e"]}) => {"b" "cde"}
(defn join-z [x-to-y-to-z]
(update-vals (partial update-vals clojure.string/join) x-to-y-to-z))
;; (join-z example) => {"m" {"n" "op"}, "a" {"b" "cde"}}
This seems much more elegant. Thanks!
I would suggest reduce-kv.
For the last layer you can use something like:
(defn foo [x]
(reduce-kv
(fn [m k v]
(assoc m k (clojure.string/join v)))
{}
x))
then call it as many times as you need...
(reduce-kv
(fn [m k v]
(assoc m k (foo v)))
{}
example)
An other approach could be over all nested keys and then
(reduce
(fn [m ks]
(update-in m ks clojure.string/join))
example
all-nested-keys)
The short answer is yes, that is how you do it :)
I would go for something more like this:
(into {} (for [[k v] example]
[k (into {} (for [[k2 v2] v]
[k2 (string/join v2)]))]))
Which is pretty much the same thing.
There is a library called Specter
https://github.com/nathanmarz/specter
for queries and transformations:
(ns specter.core
(:require
[clojure.string :as string]
[com.rpl.specter :as s]))
(def example
{"a" {"b" ["c" "d" "e"]}
"m" {"n" ["o" "p"]}})
(s/transform
[s/ALL s/LAST s/ALL s/LAST]
string/join
example)
Which I think is a pretty neat way to express it.
Inspired by this excellent post I wanted to implement a simple expression simplifier in Clojure using the algorithm used in the post. The post gives example implementations in F#, Scala, Haskell, C++, and Julia which all appear fairly elegant.
I have come up with two different implementations (see below) but I have a nagging feeling that they are both less than idiomatic.
My question is: What would an idiomatic Clojure implementation look like?
First implementation, based primarily on protocols:
(defprotocol Expr
(simplify1 [e])
(simplify [e]))
(defrecord Const [n]
Expr
(simplify1 [this] this)
(simplify [this] this))
(defrecord Variable [name]
Expr
(simplify1 [this] this)
(simplify [this] this))
(defrecord Add [l r]
Expr
(simplify1 [{:keys [l r] :as expr}]
(let [lclass (class l)
rclass (class r)]
(cond
(= lclass rclass Const)
(Const. (+ (:n l) (:n r)))
(and (= lclass Const) (= (:n l) 0))
r
(and (= rclass Const) (= (:n r) 0))
l
:else expr)))
(simplify [{:keys [l r]}]
(simplify1 (Add. (simplify l) (simplify r)))))
(defrecord Mult [l r]
Expr
(simplify1 [{:keys [l r] :as expr}]
(let [lclass (class l)
rclass (class r)]
(cond
(= lclass rclass Const)
(Const. (* (:n l) (:n r)))
(and (= lclass Const) (= (:n l) 0))
(Const. 0)
(and (= rclass Const) (= (:n r) 0))
(Const. 0)
(and (= lclass Const) (= (:n l) 1))
r
(and (= rclass Const) (= (:n r) 1))
l
:else expr)))
(simplify [{:keys [l r]}]
(simplify1 (Mult. (simplify l) (simplify r)))))
(defmulti print-expr class)
(defmethod print-expr Const [e]
(print-str (.value e)))
(defmethod print-expr ::expr [e]
(print-str "The expression cannot be simplified to a constant"))
(let [e (Add. (Mult. (Add. (Const. 1) (Mult. (Const. 0) (Variable. "X"))) (Const. 3)) (Const. 12))]
(-> e
simplify
print-expr))
Second implementation, primarily based on multimethods and more verbose than the first:
(defrecord Const [value])
(defrecord Variable [name])
(defrecord Add [l r])
(defrecord Mult [l r])
(derive Const ::expr)
(derive Variable ::expr)
(derive Add ::expr)
(derive Mult ::expr)
(defn sim-1-disp [{:keys [l r] :as e}]
(if (some #{(class e)} [Add Mult])
[(class e) (class l) (class r)]
(class e)))
(defmulti simplify class)
(defmulti simplify1 sim-1-disp)
(defmulti print-expr class)
(defmethod simplify Add [{:keys [l r]}]
(simplify1 (Add. (simplify l) (simplify r))))
(defmethod simplify Mult [{:keys [l r]}]
(simplify1 (Mult. (simplify l) (simplify r))))
(defmethod simplify ::expr [e]
e)
(defmethod simplify1 [Add Const Const] [{:keys [l r]}]
(Const. (+ (:value l) (:value r))))
(defmethod simplify1 [Add Const ::expr] [{:keys [l r] :as e}]
(if (= (:value l) 0)
r
e))
(defmethod simplify1 [Add ::expr Const] [{:keys [l r] :as e}]
(if (= (:value r) 0)
l
e))
(defmethod simplify1 [Mult Const Const] [{:keys [l r]}]
(Const. (* (.value l) (.value r))))
(defmethod simplify1 [Mult Const ::expr] [{:keys [l r] :as e}]
(cond (= (:value l) 0)
(Const. 0)
(= (:value l) 1)
r
:else e))
(defmethod simplify1 [Mult ::expr Const] [{:keys [l r] :as e}]
(cond (= (:value r) 0)
(Const. 0)
(= (:value r) 1)
l
:else e))
(defmethod simplify1 ::expr [e]
e)
(defmethod print-expr Const [e]
(print-str (.value e)))
(defmethod print-expr ::expr [e]
(print-str "The expression cannot be simplified to a constant"))
(let [e (Add. (Mult. (Add. (Const. 1) (Mult. (Const. 0) (Variable. "X"))) (Const. 3)) (Const. 12))]
(-> e
simplify
print-expr))
Not sure about being the idiomatic implementation, but I think as Guillermo Winkler mentioned core.match is a pretty natural alternative approach, especially with variants. As your linked article says, sum types are pretty neat.
(ns simplify
(:require [clojure.core.match :refer [match]]))
(defn- simplify-1 [expr]
(match expr
[::add [::const 0] a] a
[::add a [::const 0]] a
[::add [::const a] [::const b]] [::const (+ a b)]
[::mult [::const 0] _] [::const 0]
[::mult _ [::const 0]] [::const 0]
[::mult a [::const 1]] a
[::mult [::const 1] a] a
[::mult [::const a] [::const b]] [::const (* a b)]
_ expr))
(defn simplify [expr]
(match expr
[::add a b ] (simplify-1 [::add (simplify a) (simplify b)])
[::mult a b ] (simplify-1 [::mult (simplify a) (simplify b)])
_ (simplify-1 expr)))
Example:
(simplify [::add
[::mult
[::add [::const 1] [::mult [::const 0] [::var 'x]]]
[::const 3]]
[::const 12]])
;=> [:simplify/const 15]
This lets you leverage pattern matching for terseness and have a similar elegance as some of your linked examples. There is a cost compared to your protocol/multimethod approaches though - those are sum types open to extension, including by other people's code without touching your source code. How useful that is depends on your application.
A few asides:
You can also define simplify in terms of clojure.walk/postwalk with simplify-1 as the function argument. This is maybe a tad easier to extend since simplify no longer needs to know which expr variants are operations and can be simplified beyond calling simplify-1 on them.
I tried to define a core.typed type for this, but my environment seems to have some issues loading that today so I can't check it.
Think this should more or less fit:
(defalias Expr
"A variant type for algebraic expressions."
(Rec [e]
(U [(Value ::const) Number]
[(Value ::add) e e]
[(Value ::mult) e e]
[(Value ::var) Symbol])))