I have this:
(defn page1 []
  (layout/render
    "index.html"
    {:articles (db/get-articles)}))
The function db/get-articles returns a list of objects which have the key body. I need to go through the body of each article and replace the substring "aaa12aaa" with "bbb13bbb", "aaa22aaa" with "bbb23bbb", and so on, wherever such a substring exists. How can I do that without consuming a lot of RAM? Is using a regex efficient?
UPDATE:
The pattern I need to replace is "[something="X" something else/]", where X is a number that is not known in advance. I need to change X.
A body may contain many such patterns, or none.
I would just use Java's String.replace or String.replaceAll or clojure.string functions: replace/replace-first.
I wouldn't spend time on premature optimisation; first measure whether the simple solution is good enough. I am not sure how big the article contents are, but I guess it shouldn't be an issue.
If it turns out you really need to optimise, then maybe you should switch to streaming the contents of your articles from your data storage, and either implement the replacement manually or use a library like streamflyer to perform the modifications on the fly before sending the article contents to the HTTP response stream.
Something like this should be plenty fast:
(mapv
  (fn [{:keys [body] :as m}]
    (assoc m :body
           (reduce-kv
             (fn [body re repl]
               (string/replace body re repl))
             body
             {"aaa12aaa" "bbb13bbb",
              "aaa22aaa" "bbb23bbb"})))
  [{:body "xy aaa12aaa fo aaa22aaa"}])
If you can guarantee that each string occurs at most once, you can use replace-first instead of replace.
Regex works great in Clojure:
(ns clj.core
  (:use tupelo.core)
  (:require
    [clojure.string :as str]))
(spyx (str/replace "xyz-aaa12aaa-def" #"aaa12aaa" "bbb13bbb"))
;=> (str/replace "xyz-aaa12aaa-def" #"aaa12aaa" "bbb13bbb") => "xyz-bbb13bbb-def"
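For the pattern described in the question's UPDATE, clojure.string/replace also accepts a function as the replacement, which receives the match and its groups. The following is only a sketch under assumptions: the tag text is taken literally as [something="X" something else/], and the new number is assumed to be X+1 (as in the aaa12aaa to bbb13bbb example). The names tag-re, bump-tag-numbers and bump-articles are made up for illustration.

```clojure
(require '[clojure.string :as str])

;; Assumed tag shape, copied from the question's UPDATE:
;; [something="X" something else/], where X is one or more digits.
(def tag-re #"\[something=\"(\d+)\" something else/\]")

(defn bump-tag-numbers
  "Replaces X with X+1 in every matching tag; all other text is untouched."
  [body]
  (str/replace body tag-re
               (fn [[_ n]]
                 ;; n is the captured digit string; rebuild the tag around n+1
                 (str "[something=\"" (inc (Long/parseLong n)) "\" something else/]"))))

(defn bump-articles
  "Applies bump-tag-numbers to the :body of each article map."
  [articles]
  (mapv #(update % :body bump-tag-numbers) articles))
```

A body with no matching tag comes back unchanged, and a body with several tags is handled in a single call, so there is no need to know the numbers in advance.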
I want to be able to call map on regular expressions, like so:
(map #"ab+c*" ["abbb" "ac" "abbcc"])
=> ("abbb" "abbcc")
How do I extend regular expressions to support the IFn interface? Or is there a different way to do it?
In ClojureScript, IFn is a protocol, so you can extend it to js/RegExp:
(extend-type js/RegExp
  IFn
  (-invoke
    ([match s] (re-find match s))
    ([match replacement s]
     (clojure.string/replace s match replacement))))
Now you can call regular expressions as functions and even pass them to map:
(#"abc+" "abcccc")
=> "abcccc"
(map #"abc+" ["abcccc" "abcccccccc"])
=> ("abcccc" "abcccccccc")
Unfortunately, in Clojure on the JVM IFn is a Java interface rather than a protocol, so you cannot extend it to an existing class such as java.util.regex.Pattern.
Since IFn isn't a protocol in core Clojure, I don't believe that this is possible.
The closest I could get is creating a wrapper type that implements IFn:
(defrecord R [^java.util.regex.Pattern regex]
  clojure.lang.IFn
  (invoke [this s]
    (re-find regex s))
  (invoke [this replacement s]
    (clojure.string/replace s regex replacement)))
(map (->R #"abc+") ["abcccc" "abcccccccc"])
=> ("abcccc" "abcccccccc")
The trouble with trying to do this is that it is not obvious what you intend to do with the regular expression, particularly when most of your production code will look like (map #"ab+" entries).
Regular expressions are only about pattern matching; they don't imply what transformation you want from them, so you really should steer clear of trying to shoehorn that into them.
If it's a once-off, just use
(map #(clojure.string/replace % #"ab+c*" "ab") ["ab" "ac" "abbcc"])
=> ("ab" "ac" "ab")
(It's not immediately obvious how your example is supposed to work. You have fewer elements in your result: are you filtering and transforming? How are you getting to the "abbb" element?)
If you're using this a lot, I would recommend simply creating a helper function in a common namespace that you can use with map, instead of trying to extend the IFn interface. Creating a function is, in effect, a direct way to extend IFn, but you get a named function with very specific semantics that you can customize precisely.
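As a rough illustration of that suggestion, here are two small named helpers that make the intent (filtering versus transforming) explicit. The names matching and replace-all-in are hypothetical, not from any library.

```clojure
(require '[clojure.string :as str])

(defn matching
  "Returns the elements of coll in which re finds a match."
  [re coll]
  (filter #(re-find re %) coll))

(defn replace-all-in
  "Returns coll with every match of re in each string replaced by replacement."
  [re replacement coll]
  (map #(str/replace % re replacement) coll))

(matching #"ab+c*" ["abbb" "ac" "abbcc"])
;; => ("abbb" "abbcc")

(replace-all-in #"ab+c*" "ab" ["ab" "ac" "abbcc"])
;; => ("ab" "ac" "ab")
```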
As CmdrDats says, using re-find in an anonymous function is definitely the way to go:
(filter #(re-find #"ab+c*" %) ["abbb" "ac" "abbcc"])
=> ("abbb" "abbcc")
I sometimes use a helper function to emphasize that I want just true/false output (not the match nor a sequence of matches), and since I'm always forgetting the differences between the re-xxx functions:
(ns demo.core
  (:require [schema.core :as s]))
(s/defn contains-match? :- s/Bool
  "Returns true if the regex matches any portion of the input string."
  [search-str :- s/Str
   re :- s/Any]
  #?(:clj (assert (instance? java.util.regex.Pattern re)))
  (boolean (re-find re search-str)))
I'm new to Clojure, and hickory, and the idea of zippers.
I want to use selectors to go to one location in an HTML document, then navigate from that location up to the parent element, and then get the 2nd sibling from that point.
Is this possible to do with hickory? From what I understand, it seems as though I only have the option of using selectors, or navigating the HTML in a zipper structure, but I can't figure out how to do both, or if that's even possible.
You could do something like this:
(:require
[hickory.select :as s]
[hickory.convert :as convert]
[clojure.zip :as z]
...
(let [html (convert/hiccup-to-hickory
             (list [:div
                    [:div {:class "didya"} "nevertheless"]]
                   [:div "possible"]
                   [:div "geometric"]))]
  (-> (s/select-locs (s/class "didya") html)
      (first)
      (z/up)
      (z/right)
      (z/right)
      (z/node)))
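The navigation part of that answer is plain clojure.zip, so it can be tried without hickory. The sketch below uses a nested vector as a stand-in for the HTML tree (an assumption made purely for illustration): it starts at a leaf, goes up to its parent, and then moves two siblings to the right.

```clojure
(require '[clojure.zip :as z])

;; A toy tree: three sibling branches under one root.
(def tree [[:a :b] [:c] [:d]])

(def start-loc
  ;; Walk down to the leaf :a, standing in for a location found by a selector.
  (-> (z/vector-zip tree) z/down z/down))

(-> start-loc
    z/up      ; parent of :a, i.e. [:a :b]
    z/right   ; first sibling to the right, [:c]
    z/right   ; second sibling to the right, [:d]
    z/node)
;; => [:d]
```

Note that z/up and z/right return nil when there is no parent or no further sibling, so real code may want to check for that before calling z/node.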
The forest library can do this easily. There is a video from the last Clojure Conj and there are also many examples; the docs are ongoing.
I'm trying to open a file that is too large to slurp. I then want to edit the contents to remove all characters except digits, and write the result to a new file.
So far I have
(:require [clojure.java.io :as io])
(:require [clojure.string :as str])
:jvm-opts ["-Xmx2G"]
(with-open [rdr (io/reader "/Myfile.txt")
            wrt (io/writer "/Myfile2.txt")]
  (doseq [line (line-seq rdr)]
    (.write wrt (str line "\n"))))
This reads and writes, but I'm unsure of the best way to go about the editing. Any help is much appreciated. I'm very new to the language.
Looks like you just need to modify the line value before writing it. If you want to modify a string to remove all non-numeric characters, a regular expression is a pretty easy route. You could make a function to do this:
(defn numbers-only [s]
  (clojure.string/replace s #"[^\d]" ""))
(numbers-only "this is 4 words")
=> "4"
Then use that function in your example:
(str (numbers-only line) "\n")
Alternatively, you could map numbers-only over the output of line-seq, and because both map and line-seq are lazy you'll get the same lazy/on-demand behavior:
(map numbers-only (line-seq rdr))
And then your doseq would stay the same. I would probably opt for this approach as it keeps your "stream" processing together, and your imperative/side-effect loop is only concerned with writing its inputs.
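Putting the pieces of this answer together, a complete version might look like the sketch below. The function name strip-non-digits! and its two path parameters are assumptions for illustration; the reader, writer and lazy line processing follow the question's code.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn numbers-only
  "Removes every character that is not a digit."
  [s]
  (str/replace s #"[^\d]" ""))

(defn strip-non-digits!
  "Copies in-path to out-path line by line, keeping only digits on each line."
  [in-path out-path]
  (with-open [rdr (io/reader in-path)
              wrt (io/writer out-path)]
    ;; line-seq and map are lazy, so only a small part of the file
    ;; is held in memory at any time.
    (doseq [line (map numbers-only (line-seq rdr))]
      (.write wrt (str line "\n")))))
```

The doseq has to stay inside with-open: the lazy sequence can only be consumed while the reader is still open.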
I want to change certain keys in a large map in Clojure.
These keys can be present at any level in the map, but will always be within a :required-key map.
I was looking at using the camel-snake-kebab library, but I need it to change only a given set of keys inside the :required-key map. It doesn't matter whether the change is made in the JSON or in the map.
(def my-map {:allow_kebab_or-snake  {:required-key {:must_be_kebab ""}}
             :allow_kebab_or-snake2 {:optional-key {:required-key {:must_be_kebab ""}}}})
I am currently using walk/postwalk-replace, but I fear it may change keys that are not nested within the :required-key map:
(walk/postwalk-replace {:must_be_kebab :must-be-kebab} my-map)
Could you clarify: do you want to change the keys of the map, or their associated values?
Off-topic: your map above is not correct (it has two identical keys, :allow_kebab_or_snake). I'm assuming you're just illustrating the point and not showing the actual data.
postwalk-replace WILL replace every occurrence of the key, at any depth.
So if you know the exact structure of the map, you could first select the sub-map with get-in and then apply postwalk-replace to it:
(walk/postwalk-replace {:must_be_kebab :must-be-kebab}
                       (get-in my-map [:allow_kebab_or-snake :required-key]))
But then you'll have to assoc the result back into your initial map.
You should also consider the walk function and write your own traversal if the nested data structure is too complex.
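When the path to the sub-map is known, update-in can do the selection and the re-association in one step. This is a small sketch using the my-map value from the question; it only handles the one path given, so it is not a general solution for keys at arbitrary depth.

```clojure
(require '[clojure.walk :as walk])

(def my-map
  {:allow_kebab_or-snake  {:required-key {:must_be_kebab ""}}
   :allow_kebab_or-snake2 {:optional-key {:required-key {:must_be_kebab ""}}}})

(def converted
  ;; Apply postwalk-replace only to the sub-map at the given path and
  ;; put the result back in place; the rest of my-map is untouched.
  (update-in my-map [:allow_kebab_or-snake :required-key]
             #(walk/postwalk-replace {:must_be_kebab :must-be-kebab} %)))
```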
Here is a solution. Since you need to control when the conversion does/doesn't occur, you can't just use postwalk. You need to implement your own recursion and change the context from non-convert -> convert when your condition is found.
(ns tst.clj.core
(:use clj.core clojure.test tupelo.test)
(:require
[clojure.string :as str]
[clojure.pprint :refer [pprint]]
[tupelo.core :as t]
[tupelo.string :as ts]
))
(t/refer-tupelo)
(t/print-versions)
(def my-map
{:allow_kebab_or-snake {:required-key {:must_be_kebab ""}}
:allow_kebab_or-snake2 {:optional-key {:required-key {:must_be_kebab ""}}}})
(defn children->kabob? [kw]
  (= kw :required-key))
(defn proc-child-maps
  [ctx map-arg]
  (apply t/glue
    (for [curr-key (keys map-arg)]
      (let [curr-val (grab curr-key map-arg)
            new-ctx  (if (children->kabob? curr-key)
                       (assoc ctx :snake->kabob true)
                       ctx)
            out-key  (if (grab :snake->kabob ctx)
                       (ts/kw-snake->kabob curr-key)
                       curr-key)
            out-val  (if (map? curr-val)
                       (proc-child-maps new-ctx curr-val)
                       curr-val)]
        {out-key out-val}))))
(defn nested-keys->snake
  [arg]
  (let [ctx {:snake->kabob false}]
    (if (map? arg)
      (proc-child-maps ctx arg)
      arg)))
The final result is shown in the unit test:
(is= (nested-keys->snake my-map)
  {:allow_kebab_or-snake
   {:required-key
    {:must-be-kebab ""}},
   :allow_kebab_or-snake2
   {:optional-key
    {:required-key
     {:must-be-kebab ""}}}})
For this solution I used some of the convenience functions in the Tupelo library.
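The same idea (recurse yourself and flip a "converting" flag once the trigger key is seen) can also be sketched with core Clojure only. This is an illustrative variant, not the code above: the names snake->kebab-key and convert-under are made up, it only descends into nested maps (not vectors or lists of maps), and it drops any namespace on the converted keywords.

```clojure
(require '[clojure.string :as str])

(defn snake->kebab-key
  "Turns :some_key into :some-key; non-keyword keys are returned as-is."
  [k]
  (if (keyword? k)
    (keyword (str/replace (name k) "_" "-"))
    k))

(defn convert-under
  "Returns m with keyword keys converted to kebab-case, but only inside
  the value of an entry whose key is trigger-key, at any depth."
  ([trigger-key m] (convert-under trigger-key false m))
  ([trigger-key converting? m]
   (if (map? m)
     (into {}
           (map (fn [[k v]]
                  [(if converting? (snake->kebab-key k) k)
                   ;; once below trigger-key, keep converting all the way down
                   (convert-under trigger-key
                                  (or converting? (= k trigger-key))
                                  v)]))
           m)
     m)))

(convert-under :required-key
               {:allow_kebab_or-snake {:required-key {:must_be_kebab ""}}})
;; => {:allow_kebab_or-snake {:required-key {:must-be-kebab ""}}}
```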
Just a left-field suggestion, which may or may not work for you. This problem often comes up when dealing with SQL databases, because '-' is the subtraction operator in SQL and cannot be used in unquoted identifiers. However, it is common to use '-' in keywords in Clojure. Many abstraction layers used when working with SQL in Clojure take maps as arguments/bindings for prepared statements etc.
Ideally, what is needed is another layer of abstraction that converts between kebab and snake case as needed, depending on the direction you are going, i.e. to SQL or from SQL. The advantage of this approach is that you're not walking through maps making conversions; you do the conversion on the fly, when it is needed.
Have a look at https://pupeno.com/2015/10/23/automatically-converting-case-between-sql-and-clojure/
I need to read a large file (~1GB), process it, and save it to a database. My solution looks like this:
data.txt
format: [id],[title]\n
1,Foo
2,Bar
...
code
(ns test.core
  (:require [clojure.java.io :as io]
            [clojure.string :refer [split]]))
(defn parse-line
  [line]
  (let [values (split line #",")]
    (zipmap [:id :title] values)))
(defn run
  []
  (with-open [reader (io/reader "~/data.txt")]
    (insert-batch (map parse-line (line-seq reader)))))
; insert-batch just saves a vector of records into the database
But this code does not work well, because it first parses all the lines and only then sends them to the database.
I think the ideal solution would be: read a line -> parse the line -> collect 1000 parsed lines -> batch insert them into the database -> repeat until there are no more lines. Unfortunately, I have no idea how to implement this.
One suggestion:
Use line-seq to get a lazy sequence of lines,
use map to parse each line,
(so far this matches what you are doing)
use partition-all to partition your lazy sequence of parsed lines into batches, and then
use insert-batch with doseq to write each batch to the database.
And an example:
(->> (line-seq reader)
     (map parse-line)
     (partition-all 1000)
     (#(doseq [batch %]
         (insert-batch batch))))
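For context, here is how those steps might fit into a complete function. This is a sketch under assumptions: import-file! is a made-up name, the insert function is passed in as an argument (insert-batch! stands for whatever saves a vector of records in your code), and parse-line splits on the first comma only so that titles containing commas stay intact, which differs slightly from the question's version.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn parse-line
  "Parses \"id,title\" into {:id \"id\" :title \"title\"}."
  [line]
  (zipmap [:id :title] (str/split line #"," 2)))

(defn import-file!
  "Reads path lazily and calls insert-batch! once per batch of at most
  batch-size parsed records."
  [path batch-size insert-batch!]
  (with-open [reader (io/reader path)]
    ;; doseq consumes the lazy sequence inside with-open, one batch at a
    ;; time, so the whole file is never held in memory.
    (doseq [batch (->> (line-seq reader)
                       (map parse-line)
                       (partition-all batch-size))]
      (insert-batch! (vec batch)))))

;; Usage, assuming an insert-batch function like the one in the question:
;; (import-file! "/path/to/data.txt" 1000 insert-batch)
```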