Extend Clojure Regular Expressions with IFn to support map - regex

I want to be able to call map on regular expressions, like so:
(map #"ab+c*" ["abbb" "ac" "abbcc"])
=> ("abbb" "abbcc")
How do I extend regular expressions to support the IFn interface? Or is there a different way to do it?

ClojureScript:
(extend-type js/RegExp
IFn
(-invoke
([match s] (re-find match s))
([match replacement s]
(clojure.string/replace s match replacement))))
Now you can call regular expressions as functions and even pass them to map:
(#"abc+" "abcccc")
=> "abcccc"
(map #"abc+" ["abcccc" "abcccccccc"])
=> ("abcccc" "abcccccccc")
Unfortunately, in Clojure on the JVM IFn is a Java interface, not a protocol, so you cannot extend it to an existing class such as java.util.regex.Pattern.

Since IFn isn't a protocol in core Clojure, I don't believe that this is possible.
The closest I could get is creating a wrapper type that implements IFn:
(defrecord R [^java.util.regex.Pattern regex]
clojure.lang.IFn
(invoke [this s]
(re-find regex s))
(invoke [this replacement s]
(clojure.string/replace s regex replacement)))
(map (->R #"abc+") ["abcccc" "abcccccccc"])
=> ("abcccc" "abcccccccc")

The trouble with this approach is that it's not obvious what you're trying to do with the regular expression, particularly when most of your production code will look like (map #"ab+" entries)
Regular expressions are only about pattern matching; they don't imply what transformation you want from them, so you really should steer clear of trying to shoehorn that into them.
If it's a once-off, just use
(map #(clojure.string/replace % #"ab+c*" "ab") ["ab" "ac" "abbcc"])
=> ("ab" "ac" "ab")
(It's not immediately obvious how your example is supposed to work. You have fewer elements in your result: are you filtering and transforming? How are you getting to the "abbb" element?)
If you're using this a lot, I would recommend simply creating a helper function in a common namespace that you can use with map instead of trying to extend the IFn interface. Creating a function is, in effect, a direct way to implement IFn, but you get a named function with very specific semantics that you can customize precisely.
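A minimal sketch of such a helper, assuming the goal is to get the matched text back for each string. The name matcher-fn is hypothetical and not part of any library:

```clojure
(defn matcher-fn
  "Hypothetical helper: wraps a regex in a plain function of one string.
  Returns the first match, or nil when the string does not match."
  [re]
  (fn [s] (re-find re s)))

;; map keeps one result per input, so non-matching strings become nil
(map (matcher-fn #"ab+c*") ["abbb" "ac" "abbcc"])
;; => ("abbb" nil "abbcc")

;; keep drops the nils, which gives the result shown in the question
(keep (matcher-fn #"ab+c*") ["abbb" "ac" "abbcc"])
;; => ("abbb" "abbcc")
```

Because the helper returns an ordinary function, it composes with map, keep, and filter without touching IFn at all.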

As CmdrDats says, using re-find in an anonymous function is definitely the way to go:
(filter #(re-find #"ab+c*" %) ["abbb" "ac" "abbcc"])
=> ("abbb" "abbcc")
I sometimes use a helper function to emphasize that I want just true/false output (neither the match nor a sequence of matches), and because I always forget the differences between the re-xxx functions:
(ns demo.core
(:require [schema.core :as s]))
(s/defn contains-match? :- s/Bool
"Returns true if the regex matches any portion of the input string."
[search-str :- s/Str
re :- s/Any]
#?(:clj (assert (instance? java.util.regex.Pattern re)))
(boolean (re-find re search-str)))

Related

Clojure evaluating string variable as a symbol, is this use of read-string okay?

I can use memfn to create a clojure function that invokes a java function.
(macroexpand '(memfn startsWith prefix))
=> (fn* ([target2780 prefix] (. target2780 (startsWith prefix))))
((memfn startsWith prefix) "abc" "a")
=> true
memfn requires that the function name be a symbol. I'm wondering if I can write a macro to invoke an arbitrary method whose name is provided as a string. That is, I'd like to be able to invoke the following:
(def fn-name "startsWith")
=> #'user/fn-name
(macroexpand '(memfn' fn-name "prefix"))
=> (fn* ([target2780 prefix] (. target2780 (startsWith prefix))))
((memfn' fn-name "prefix") "abc" "a")
=> true
The only way I can think to do this involves using read-string.
(defmacro memfn' [fn-name arg-name]
`(memfn ~(read-string fn-name) ~arg-name))
Edit: A version using read-string and eval that actually works the way I want it to.
(defn memfn' [fn-name arg-name]
(eval (read-string (str "(memfn " fn-name " " arg-name ")"))))
Am I missing a fundamental macro building tool to take the string that a symbol references and turn it into a literal without potentially executing code, as read-string might?
Thanks for any ideas!
There's no way to do this, with or without read-string. Your proposed solution doesn't work. The distinction you're really trying to make is not between string and symbol, but between runtime data and compile-time literals. Macros do not evaluate the arguments they receive, so even if fn-name is the name of a var whose value is "startsWith", memfn (or your memfn' macro) will only ever see fn-name.
If you are interested in calling java methods only then you can rely on java.lang.reflect.Method and its invoke method.
Something like this should work for parameterless methods and would not require a macro.
(defn memfn' [m]
(fn [o] (.invoke (.getMethod (-> o .getClass) m nil) o nil)))
((memfn' "length") "clojure")
;=>7
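For methods that take arguments, one possible sketch uses clojure.lang.Reflector, which picks a matching overload at runtime. Note the assumptions: Reflector is an implementation class rather than documented public API, and the name method-fn is hypothetical.

```clojure
(defn method-fn
  "Hypothetical helper: returns a function that reflectively calls the
  instance method named by the string method-name on its first argument."
  [method-name]
  (fn [target & args]
    ;; Assumption: clojure.lang.Reflector is an implementation detail of
    ;; Clojure, not a documented public API.
    (clojure.lang.Reflector/invokeInstanceMethod
      target method-name (to-array args))))

(def fn-name "startsWith")

((method-fn fn-name) "abc" "a")
;; => true

((method-fn "length") "clojure")
;; => 7
```

Since method-fn is a plain function, fn-name is evaluated at runtime like any other argument, which is exactly what a macro cannot do.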

How can I iterate over a list with a macro?

I am trying to print the documentation for all functions in a given namespace by invoking the following expression in a REPL:
(doseq
[f (dir-fn 'clojure.repl)]
(doc f))
However the invocation of this expression returns nil without printing the documentation to the REPL. I know this might have to do with doc being a macro, but I'm a Clojure novice and am not entirely sure how to understand the problem.
Why does this expression return nil without printing the documentation?
How can this expression be modified so that it prints the documentation for each function in a given namespace?
Thanks!
Update: Combined both provided answers:
(defn ns-docs [ns']
(doseq [[symbol var] (ns-interns ns')]
(newline)
(println symbol)
(print " ")
(println (:doc (meta var)))))
(ns-docs 'clojure.repl)
I would, instead, start here:
The Clojure CheatSheet
ClojureDocs.org
Clojure-Doc.org (similar name, but different)
The API & Reference sections at Clojure.org
Note that doc is in the namespace clojure.repl, which reflects its intended usage (by a human in a repl). Here is some code that will also iterate on a namespace & print doc strings (using a different technique):
(doseq [[fn-symbol fn-var] (ns-interns 'demo.core)]
(newline)
(println fn-symbol)
(println (:doc (meta fn-var))))
where demo.core is the namespace of interest.
Note that ns-interns gives you both a symbol and var like:
fn-symbol => <#clojure.lang.Symbol -main>
fn-var => <#clojure.lang.Var #'demo.core/-main>
The meta function has lots of other info you may want to use someday:
(meta fn-var) =>
<#clojure.lang.PersistentArrayMap
{ :arglists ([& args]),
:doc "The Main Man!",
:line 9, :column 1,
:file "demo/core.clj",
:name -main,
:ns #object[clojure.lang.Namespace 0x14c35a06 "demo.core"]}>
While this probably won't help you with answering your question, the problem of evaluating macros comes up a lot when you are learning Clojure.
Macros are responsible for the evaluation of their arguments. In this case clojure.repl/doc will ignore the current lexical context and assume that the symbol f that you're giving it is the name of a function you want to see the documentation for. It does this because it's intended to be used at the REPL, and is assuming you wouldn't want to type quotes all the time.
As there is no var named f, it prints nothing. Then doseq returns nil, since it exists only to perform side effects (hence the name starting with do). In order to pass an argument to a macro that refuses to respect the lexical context like this, you need to write the code for each element in the list.
You can do this by hand, or by constructing the code as data, and passing it to eval to execute. You can do this in an imperative style, using doseq:
(doseq [f (ns-interns 'clojure.repl)]
(eval `(doc ~(symbol "clojure.repl" (str (first f))))))
or in a slightly more Clojurey way (which will allow you to see the code that it would execute by removing eval from the end and running it at the REPL):
(->> (ns-interns 'clojure.repl)
(map #(list 'clojure.repl/doc (symbol "clojure.repl" (str (first %)))))
(cons `do)
eval)
In both of these we use quote and syntax-quote to construct some code from the list of symbols reflected from the namespace, and pass it to eval to actually execute it. This page on Clojure's weird characters should point you in the right direction for understanding what's going on here.
This is an example of why you shouldn't write macros unless you've got no other options. Macros do not compose, and are often difficult to work with. For a more in-depth discussion, Fogus's talk and Christophe Grand's talk are both good.
Why does this expression return nil without printing the documentation?
Because the doc macro is receiving the symbol f from your loop, instead of a function symbol directly.
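A small sketch makes this visible. The macro arg-form is hypothetical and only returns the form it was handed, without evaluating it:

```clojure
(defmacro arg-form
  "Hypothetical macro: expands to the quoted, unevaluated argument form."
  [x]
  `(quote ~x))

;; The macro sees the symbol f, not the value bound to f.
(let [f 'map]
  (arg-form f))
;; => f

;; The same happens inside doseq: every iteration hands the macro the symbol f.
(doseq [f ['map 'filter]]
  (println (arg-form f)))
;; prints f twice
```

doc behaves the same way: it is handed the symbol f and looks for a var with that name, regardless of what the loop has bound f to.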
How can this expression be modified so that it prints the documentation for each function in a given namespace?
(defn ns-docs [ns']
(let [metas (->> (ns-interns ns') (vals) (map meta) (sort-by :name))]
(for [m metas :when (:doc m)] ;; you could filter here if you want fns only
(select-keys m [:name :doc]))))
(ns-docs 'clojure.repl)
=>
({:name apropos,
:doc "Given a regular expression or stringable thing, return a seq of all
public definitions in all currently-loaded namespaces that match the
str-or-pattern."}
...
)
Then you can print those maps/strings if you want.

How to replace substrings?

I have this:
(defn page1 []
(layout/render
"index.html"
{:articles (db/get-articles)}))
The function db/get-articles returns a list of objects which have the key body. I need to parse the body of the articles and replace, if exists, a substring "aaa12aaa" with "bbb13bbb", "aaa22aaa" with "bbb23bbb" and so on in the bodies. How can I do that so it also won't consume plenty of RAM? Is using regex effective?
UPDATE:
The pattern I need to replace is : "[something="X" something else/]". where X is a number and it's unknown. I need to change X.
There can be many such patterns to replace or none.
I would just use Java's String.replace or String.replaceAll or clojure.string functions: replace/replace-first.
I wouldn't waste time on premature optimisation; first measure whether the simple solution works well enough. I am not sure how big the article contents are, but I guess it shouldn't be an issue.
If it turns out you really need to optimise then maybe you should switch to streaming the contents of your articles from your data storage and either implement replace manually or using a library like streamflyer to perform modifications on the fly before sending the article contents to the HTTP response stream.
Something like this should be plenty fast:
(require '[clojure.string :as string])
(mapv
(fn [{:keys [body] :as m}]
(assoc m :body
(reduce-kv
(fn [body re repl]
(string/replace body re repl))
body
{"aaa12aaa" "bbb13bbb",
"aaa22aaa" "bbb23bbb"})))
[{:body "xy aaa12aaa fo aaa22aaa"}])
If you can guarantee that the string only occurs once you can replace replace by replace-first.
Regex works great in clojure:
(ns clj.core
(:use tupelo.core)
(:require
[clojure.string :as str]))
(spyx (str/replace "xyz-aaa12aaa-def" #"aaa12aaa" "bbb13bbb"))
;=> (str/replace "xyz-aaa12aaa-def" #"aaa12aaa" "bbb13bbb") => "xyz-bbb13bbb-def"
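For the pattern described in the update, where the number X is not known in advance, clojure.string/replace also accepts a function as the replacement. The sketch below assumes the new value is X + 1 (as the 12 -> 13 and 22 -> 23 examples suggest) and that the attribute name consists of word characters; the name bump-numbers and the sample text are hypothetical.

```clojure
(require '[clojure.string :as str])

(defn bump-numbers
  "Hypothetical helper: finds every [name=\"X\" prefix where X is a number
  and replaces X with X + 1. The +1 rule is an assumption for illustration."
  [body]
  (str/replace body
               #"\[(\w+)=\"(\d+)\""
               ;; With capture groups, the function receives a vector:
               ;; [whole-match group-1 group-2]
               (fn [[_ attr n]]
                 (str "[" attr "=\"" (inc (Long/parseLong n)) "\""))))

(bump-numbers "text [id=\"12\" foo/] more [id=\"22\" bar/]")
;; => "text [id=\"13\" foo/] more [id=\"23\" bar/]"

;; A body without the pattern is returned unchanged.
(bump-numbers "nothing to replace")
;; => "nothing to replace"
```

This handles any number of occurrences in a single pass over each body.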

Creating a Clojure macro that uses a string to call a java function

So I'm trying to make a Clojure macro that makes it easy to interop with Java classes utilizing the Builder pattern.
Here's what I've tried so far.
(defmacro test-macro
[]
(list
(symbol ".queryParam")
(-> (ClientBuilder/newClient)
(.target "https://www.test.com"))
"key1"
(object-array ["val1"])))
Which expands to the below
(.
#object[org.glassfish.jersey.client.JerseyWebTarget 0x107a5073 "org.glassfish.jersey.client.JerseyWebTarget#107a5073"]
queryParam
"key1"
#object["[Ljava.lang.Object;" 0x16751ba2 "[Ljava.lang.Object;#16751ba2"])
The desired result is:
(.queryParam
#object[org.glassfish.jersey.client.JerseyWebTarget 0x107a5073 "org.glassfish.jersey.client.JerseyWebTarget#107a5073"]
"key1"
#object["[Ljava.lang.Object;" 0x16751ba2 "[Ljava.lang.Object;#16751ba2"])
I guess the . is causing something to get evaluated and moved around? In which case the solution would be to quote it. But how can I quote the results of an evaluated expression?
My goal is to convert maps into code that builds the object, by having the map's keys be the functions to be called and the values be the arguments passed into the Java functions.
I understand how to use the threading and doto macros, but I am trying to make the request-building function data driven. I want to be able to take in a map with the key as "queryParam" and the values as the arguments. That way I can leverage the entirety of the Java class's functions while only having to write one function myself, and there is enough of a 1 to 1 mapping that I don't believe others will find it magical.
(def test-map {"target" ["https://www.test.com"]
"path" ["qa" "rest/service"]
"queryParam" [["key1" (object-array ["val1"])]
["key2" (object-array ["val21" "val22" "val23"])]] })
(-> (ClientBuilder/newClient)
(.target "https://www.test.com")
(.path "qa")
(.path "rest/service")
(.queryParam "key1" (object-array ["val1"]))
(.queryParam "key2" (object-array ["val21" "val22" "val23"])))
From your question it's not clear if you have to use a map as your builder data structure. I would recommend using the threading macro for working directly with Java classes implementing the builder pattern:
(-> (ClientBuilder.)
(.forEndpoint "http://example.com")
(.withQueryParam "key1" "value1")
(.build))
For classes that don't implement the builder pattern and whose methods return void (e.g. setter methods) you can use the doto macro:
(doto (Client.)
(.setEndpoint "http://example.com")
(.setQueryParam "key1" "value1"))
Implementing a macro using a map for encoding Java method calls is possible but awkward. You would have to keep each method's arguments inside a sequence (in the map values) to be able to call methods with multiple parameters, or have some convention for storing arguments for single-parameter methods. You would also have to handle varargs, and using a map to specify method calls doesn't guarantee the order in which they will be invoked. It will add much complexity and magic to your code.
This is how you could implement it:
(defmacro builder [b m]
(let [method-calls
(map (fn [[k v]] `(. (~(symbol k) ~@v))) m)]
`(-> ~b
~@method-calls)))
(macroexpand-1
'(builder (StringBuilder.) {"append" ["a"]}))
;; => (clojure.core/-> (StringBuilder.) (. (append "a")))
(str
(builder (StringBuilder.) {"append" ["a"] }))
;; => "a"
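If the method names are only known at runtime, a function can do the same job without a macro. This is a sketch under assumptions: it relies on clojure.lang.Reflector (an implementation class, not documented public API), the name build-with is hypothetical, and the calls are given as a vector of [method-name args] pairs so that their order is preserved.

```clojure
(defn build-with
  "Hypothetical helper: threads target through a sequence of
  [method-name args] pairs, calling each method reflectively and
  passing the returned object to the next call."
  [target calls]
  (reduce (fn [obj [method-name args]]
            ;; Assumption: Reflector picks the overload matching the
            ;; runtime argument types.
            (clojure.lang.Reflector/invokeInstanceMethod
              obj method-name (to-array args)))
          target
          calls))

(str (build-with (StringBuilder.)
                 [["append" ["a"]]
                  ["append" ["b"]]]))
;; => "ab"
```

Using a vector of pairs rather than a map also allows the same method to be called more than once, as with the repeated .path and .queryParam calls in the question.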

How can I evaluate "symbol" and "(symbol 1)" with the same name?

I want to get the following results when I evaluate edit-url and (edit-url 1).
edit-url --> "/articles/:id/edit"
(edit-url 1) --> "/articles/1/edit"
Is it possible to define such a Var or something?
Currently I use the following function, but I don't want to write (edit-url) to get the constant string.
(defn edit-url
([] "/articles/:id/edit")
([id] (str "/articles/" id "/edit")))
Thanks in advance.
If those behaviors are exactly what you want, print-method and tagged literals may be used to imitate them.
(defrecord Path [path]
clojure.lang.IFn
(invoke [this n]
(clojure.string/replace path ":id" (str n))))
(defmethod print-method Path [o ^java.io.Writer w]
(.write w (str "#path\"" (:path o) "\"")))
(set! *data-readers* (assoc *data-readers* 'path ->Path))
(comment
user=> (def p #path"/articles/:id/edit")
#'user/p
user=> p
#path"/articles/:id/edit"
user=> (p 1)
"/articles/1/edit"
user=>
)
edit-url will either have the value of an immutable string or function. Not both.
The problem will fade when you write a function with better abstraction that takes a string and a map of keywords to replace with words. It should work like this
(generate-url "/articles/:id/edit" {:id 1})
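A minimal sketch of how such a function could look. generate-url is not an existing library function; this version assumes placeholders are written as a colon followed by word characters and leaves unknown placeholders untouched.

```clojure
(require '[clojure.string :as str])

(defn generate-url
  "Hypothetical helper: replaces each :name placeholder in template with
  the value stored under the keyword :name in params."
  [template params]
  (str/replace template
               #":(\w+)"
               ;; whole is the full placeholder (e.g. \":id\"), k its name
               (fn [[whole k]]
                 (str (get params (keyword k) whole)))))

(generate-url "/articles/:id/edit" {:id 1})
;; => "/articles/1/edit"

;; With no params the template itself comes back, which covers the
;; constant-string case from the question.
(generate-url "/articles/:id/edit" {})
;; => "/articles/:id/edit"
```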
Clojure is a "Lisp 1", which means that it has a single namespace for all symbols, including both data scalars and functions. What you have written shows the functionality of both a string and a function under a single name, which you can do in Common Lisp but not in Clojure (note that a "Lisp 2" has its own inconveniences as well).
In general this type of "problem" is a non issue if you organize your vars better. Why not just make edit-url a function with variable arity? Without arguments it returns something, with arguments it returns something else. Really the possibilities are endless, even more so when you consider making a macro instead of a function (not that I'm advocating that).