Conditionals in Hiccup, can I make this more idiomatic? - clojure

Clojure beginner here! I added flash message support to my Hiccup code in a Noir project.
What I'm trying to do is check if the message string for each specific was set or not. If there's no message, then I don't want to display the specific flash element containing that message.
(defpartial success-flash [msg]
[:div.alert.notice.alert-success
[:a.close {:data-dismiss "alert"} "x"]
[:div#flash_notice msg]])
(defpartial error-flash [msg]
[:div.alert.notice.alert-error
[:a.close {:data-dismiss "alert"} "x"]
[:div#flash_notice msg]])
[..]
(defpartial layout [& content]
(html5
[:head
[...]
[:body
(list
[...]
[:div.container
(let [error-msg (session/flash-get :error-message)
error-div (if (nil? error-msg) () (error-flash error-msg))
success-msg (session/flash-get :success-message)
success-div (if (nil? success-msg) () (success-flash success-msg))]
warning-msg (session/flash-get :warning-message)
warning-div (if (nil? warning-msg) () (warning-flash warning-msg))]
(list error-div success-div warning-div content))])]))
Disclaimer: I completely agree that you won't likely ever be in a situation where you'll need more than one of those specific flashes on at once, but indulge me in my attempt at figuring out a better and more functional way of implementing this.
I'm confident that there's a pattern out there for handling similar situations. Basically I check the value of several expressions, do a bunch of stuff with those values, and then act based on the results. You could pull this off with a progressively more and more monstrous (cond), but my solution is at least somewhat cleaner.
Tips?

You could also use when-let.
(defpartial layout
[& contents]
(html5
[:body
(when-let [msg (session/flash-get :error-message)]
(error-flash msg))
(when-let [msg (session/flash-get :warning-message)]
(warning-flash msg))
(when-let [msg (session/flash-get :success-message)]
(success-flash msg))
contents))
I'm not a hiccup expert, but I think this should work. I find it a little clearer on what's going on, although it's slightly more verbose.

The pattern is called mapping value. Below is an example that uses keep function to apply the pattern of mapping values and then filtering them
(use 'clojure.contrib.core)
(def flash-message
[[:error-message error-flash]
[:success-message success-flash]
[:warning-message warning-flash]])
(keep (fn [m f] (-?>> m (session/flash-get) (f))) flash-message)

Related

Core.async: Take all values from collection of promise-chans

Consider a dataset like this:
(def data [{:url "http://www.url1.com" :type :a}
{:url "http://www.url2.com" :type :a}
{:url "http://www.url3.com" :type :a}
{:url "http://www.url4.com" :type :b}])
The contents of those URL's should be requested in parallel. Depending on the item's :type value those contents should be parsed by corresponding functions. The parsing functions return collections, which should be concatenated, once all the responses have arrived.
So let's assume that there are functions parse-a and parse-b, which both return a collection of strings when they are passed a string containing HTML content.
It looks like core.async could be a good tool for this. One could either have separate channels for each item ore one single channel. I'm not sure which way would be preferable here. With several channels one could use transducers for the postprocessing/parsing. There is also a special promise-chan which might be proper here.
Here is a code-sketch, I'm using a callback based HTTP kit function. Unfortunately, I could not find a generic solution inside the go block.
(defn f [data]
(let [chans (map (fn [{:keys [url type]}]
(let [c (promise-chan (map ({:a parse-a :b parse-b} type)))]
(http/get url {} #(put! c %))
c))
data)
result-c (promise-chan)]
(go (put! result-c (concat (<! (nth chans 0))
(<! (nth chans 1))
(<! (nth chans 2))
(<! (nth chans 3)))))
result-c))
The result can be read like so:
(go (prn (<! (f data))))
I'd say that promise-chan does more harm than good here. The problem is that most of core.async API (a/merge, a/reduce etc.) relies on fact that channels will close at some point, promise-chans in turn never close.
So, if sticking with core.async is crucial for you, the better solution will be not to use promise-chan, but ordinary channel instead, which will be closed after first put!:
...
(let [c (chan 1 (map ({:a parse-a :b parse-b} type)))]
(http/get url {} #(do (put! c %) (close! c)))
c)
...
At this point, you're working with closed channels and things become a bit simpler. To collect all values you could do something like this:
;; (go (put! result-c (concat (<! (nth chans 0))
;; (<! (nth chans 1))
;; (<! (nth chans 2))
;; (<! (nth chans 3)))))
;; instead of above, now you can do this:
(->> chans
async/merge
(async/reduce into []))
UPD (below are my personal opinions):
Seems, that using core.async channels as promises (either in form of promise-chan or channel that closes after single put!) is not the best approach. When things grow, it turns out that core.async API overall is (you may have noticed that) not that pleasant as it could be. Also there are several unsupported constructs, that may force you to write less idiomatic code than it could be. In addition, there is no built-in error handling (if error occurs within go-block, go-block will silently return nil) and to address this you'll need to come up with something of your own (reinvent the wheel). Therefore, if you need promises, I'd recommend to use specific library for that, for example manifold or promesa.
I wanted this functionality as well because I really like core.async but I also wanted to use it in certain places like traditional JavaScript promises. I came up with a solution using macros. In the code below, <? is the same thing as <! but it throws if there's an error. It behaves like Promise.all() in that it returns a vector of all the returned values from the channels if they all are successful; otherwise it will return the first error (since <? will cause it to throw that value).
(defmacro <<? [chans]
`(let [res# (atom [])]
(doseq [c# ~chans]
(swap! res# conj (serverless.core.async/<? c#)))
#res#))
If you'd like to see the full context of the function it's located on GitHub. It's heavily inspired from David Nolen's blog post.
Use pipeline-async in async.core to launch asynchronous operations like http/get concurrently while delivering the result in the same order as the input:
(let [result (chan)]
(pipeline-async
20 result
(fn [{:keys [url type]} ch]
(let [parse ({:a parse-a :b parse-b} type)
callback #(put! ch (parse %)(partial close! ch))]
(http/get url {} callback)))
(to-chan data))
result)
if anyone is still looking at this, adding on to the answer by #OlegTheCat:
You can use a separate channel for errors.
(:require [cljs.core.async :as async]
[cljs-http.client :as http])
(:require-macros [cljs.core.async.macros :refer [go]])
(go (as-> [(http/post <url1> <params1>)
(http/post <url2> <params2>)
...]
chans
(async/merge chans (count chans))
(async/reduce conj [] chans)
(async/<! chans)
(<callback> chans)))

Why in this example calling (f arg) and calling the body of f explicitly yields different results?

First, I have no experience with CS and Clojure is my first language, so pardon if the following problem has a solution, that is immediately apparent for a programmer.
The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. My approach revolves around a) storing temporarily the names of the atoms as strings in an atom itself; b) changing those strings to symbols with a function; c) using a function to add and create new atoms. The problem pertains to step "c": calling the function does not create new atoms, but using its body does create them.
All steps taken in the REPL are below (comments follow code blocks):
user=> (def atom-pool
#_=> (atom ["a1" "a2"]))
#'user/atom-pool
'atom-pool is the atom that stores intermediate to-be atoms as strings.
user=> (defn atom-symbols []
#_=> (mapv symbol (deref atom-pool)))
#'user/atom-symbols
user=> (defmacro populate-atoms []
#_=> (let [qs (vec (remove #(resolve %) (atom-symbols)))]
#_=> `(do ~#(for [s qs]
#_=> `(def ~s (atom #{}))))))
#'user/populate-atoms
'populate-atoms is the macro, that defines those atoms. Note, the purpose of (remove #(resolve %) (atom-symbols)) is to create only yet non-existing atoms. 'atom-symbols reads 'atom-pool and turns its content to symbols.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(nil nil nil)
Here it is confirmed that there are no 'a1', 'a2', 'a-new' atoms as of yet.
user=> (defn new-atom [a]
#_=> (do
#_=> (swap! atom-pool conj a)
#_=> (populate-atoms)))
#'user/new-atom
'new-atom is the function, that first adds new to-be atom as string to `atom-pool. Then 'populate-atoms creates all the atoms from 'atom-symbols function.
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Here we see that 'a1 'a2 were created as clojure.lang.Var$Unbound just by defining a function, why?
user=> (new-atom "a-new")
#'user/a2
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 nil)
Calling (new-atom "a-new") did not create the 'a-new atom!
user=> (do
#_=> (swap! atom-pool conj "a-new")
#_=> (populate-atoms))
#'user/a-new
user=> (for [s ['a1 'a2 'a-new]]
#_=> (resolve s))
(#'user/a1 #'user/a2 #'user/a-new)
user=>
Here we see that resorting explicitly to 'new-atom's body did create the 'a-new atom. 'a-new is a type of clojure.lang.Atom, but 'a1 and 'a2 were skipped due to already being present in the namespace as clojure.lang.Var$Unbound.
Appreciate any help how to make it work!
EDIT: Note, this is an example. In my project the 'atom-pool is actually a collection of maps (atom with maps). Those maps have keys {:name val}. If a new map is added, then I create a corresponding atom for this map by parsing its :name key.
"The summary of the question is as follows: one needs to create atoms at will with unknown yet symbols at unknown times. "
This sounds like a solution looking for a problem. I would generally suggest you try another way of achieving whatever the actual functionality is without generating vars at runtime, but if you must, you should use intern and leave out the macro stuff.
You cannot solve this with macros since macros are expanded at compile time, meaning that in
(defn new-atom [a]
(do
(swap! atom-pool conj a)
(populate-atoms)))
populate-atoms is expanded only once; when the (defn new-atom ...) form is compiled, but you're attempting to change its expansion when new-atom is called (which necessarily happens later).
#JoostDiepenmaat is right about why populate-atoms is not behaving as expected. You simply cannot do this using macros, and it is generally best to avoid generating vars at runtime. A better solution would be to define your atom-pool as a map of keywords to atoms:
(def atom-pool
(atom {:a1 (atom #{}) :a2 (atom #{})}))
Then you don't need atom-symbols or populate-atoms because you're not dealing with vars at compile-time, but typical data structures at run-time. Your new-atom function could look like this:
(defn new-atom [kw]
(swap! atom-pool assoc kw (atom #{})))
EDIT: If you don't want your new-atom function to override existing atoms which might contain actual data instead of just #{}, you can check first to see if the atom exists in the atom-pool:
(defn new-atom [kw]
(when-not (kw #atom-pool)
(swap! atom-pool assoc kw (atom #{}))))
I've already submitted one answer to this question, and I think that that answer is better, but here is a radically different approach based on eval:
(def atom-pool (atom ["a1" "a2"]))
(defn new-atom! [name]
(load-string (format "(def %s (atom #{}))" name)))
(defn populate-atoms! []
(doseq [x atom-pool]
(new-atom x)))
format builds up a string where %s is substituted with the name you're passing in. load-string reads the resulting string (def "name" (atom #{})) in as a data structure and evals it (this is equivalent to (eval (read-string "(def ...)
Of course, then we're stuck with the problem of only defining atoms that don't already exist. We could change the our new-atom! function to make it so that we only create an atom if it doesn't already exist:
(defn new-atom! [name]
(when-not (resolve (symbol name))
(load-string (format "(def %s (atom #{}))" name name))))
The Clojure community seems to be against using eval in most cases, as it is usually not needed (macros or functions will do what you want in 99% of cases*), and eval can be potentially unsafe, especially if user input is involved -- see Brian Carper's answer to this question.
*After attempting to solve this particular problem using macros, I came to the conclusion that it either cannot be done without relying on eval, or my macro-writing skills just aren't good enough to get the job done with a macro!
At any rate, I still think my other answer is a better solution here -- generally when you're getting way down into the nuts & bolts of writing macros or using eval, there is probably a simpler approach that doesn't involve metaprogramming.

How to make '() to be nil?

How to make clojure to count '() as nil?
For example:
How to make something like
(if '() :true :false)
;to be
:false
;Or easier
(my-fun/macro/namespace/... (if '() :true :false))
:false
And not just if. In every way.
(= nil '()) or (my-something (= nil '()))
true
And every code to be (= '() nil) save.
(something (+ 1 (if (= nil '()) 1 2)))
2
I was thinking about some kind of regural expression. Which will look on code and replace '() by nil, but there are some things like (rest '(1)) and many others which are '() and I am not sure how to handle it.
I was told that macros allow you to build your own languages. I want to try it by changing clojure. So this is much about "How clojure works and how to change it?" than "I really need it to for my work."
Thank you for help.
'() just isn't the same thing as nil - why would you want it do be?
What you might be looking for though is the seq function, which returns nil if given an empty collection:
(seq [1 2 3])
=> (1 2 3)
(seq [])
=> nil
(seq '())
=> nil
seq is therefore often used to test for "emptiness", with idioms like:
(if (seq coll)
(do-something-with coll)
(get-empty-result))
You say you would like to change Clojure using the macros. Presently, as far as I know, this is not something you could do with the "regular" macro system (terminology fix anyone?). What you would really need (I think) is a reader macro. Things I have seen online (here, for example) seem to say that there exists something like reader macros in Clojure 1.4--but I have no familiarity with this because I really like using clooj as my IDE, and it currently is not using Clojure 1.4. Maybe somebody else has better info on this "extensible reader" magic.
Regardless, I don't really like the idea of changing the language in that way, and I think there is a potentially very good alternative: namely, the Clojure function not-empty.
This function takes any collection and either returns that collection as is, or returns nil if that collection is empty. This means that anywhere you will want () to return nil, you should wrap it not-empty. This answer is very similar to mikera's answer above, except that you don't have to convert your collections to sequences (which can be nice).
Both using seq and not-empty are pretty silly in cases where you have a "hand-written" collection. After all, if you are writing it by hand (or rather, typing it manually), then you are going to know for sure whether or not it is empty. The cases in which this is useful is when you have an expression or a symbol that returns a collection, and you do not know whether the returned collection will be empty or not.
Example:
=> (if-let [c (not-empty (take (rand-int 5) [:a :b :c :d]))]
(println c)
(println "Twas empty"))
;//80% of the time, this will print some non-empty sub-list of [:a :b :c :d]
;//The other 20% of the time, this will return...
Twas empty
=> nil
What about empty? ? It's the most expressive.
(if (empty? '())
:true
:false)
You can override macros and functions. For instance:
(defn classic-lisp [arg]
(if (seq? arg) (seq arg) arg))
(defn = [& args]
(apply clojure.core/= (map classic-lisp args)))
(defmacro when [cond & args]
`(when (classic-lisp ~cond) ~#args))
Unfortunately, you can't override if, as it is a special form and not a macro. You will have to wrap your code with another macro.
Let's make an if* macro to be an if with common-lisp behavior:
(defmacro if* [cond & args]
`(if (classic-lisp ~cond) ~#args)
With this, we can replace all ifs with if*s:
(use 'clojure.walk)
(defn replace-ifs [code]
(postwalk-replace '{if if*} (macroexpand-all code)))
(defmacro clojure-the-old-way [& body]
`(do ~#(map replace-ifs body)))
Now:
=> (clojure-the-old-way (if '() :true :false) )
:false
You should be able to load files and replace ifs in them too:
(defn read-clj-file [filename]
;; loads list of clojure expressions from file *filename*
(read-string (str "(" (slurp filename) ")")))
(defn load-clj-file-the-old-way [filename]
(doseq [line (replace-ifs (read-clj-file filename))] (eval line))
Note that I didn't test the code to load files and it might be incompatible with leiningen or namespaces. I believe it should work with overriden = though.

Huge file in Clojure and Java heap space error

I posted before on a huge XML file - it's a 287GB XML with Wikipedia dump I want ot put into CSV file (revisions authors and timestamps). I managed to do that till some point. Before I got the StackOverflow Error, but now after solving the first problem I get: java.lang.OutOfMemoryError: Java heap space error.
My code (partly taken from Justin Kramer answer) looks like that:
(defn process-pages
[page]
(let [title (article-title page)
revisions (filter #(= :revision (:tag %)) (:content page))]
(for [revision revisions]
(let [user (revision-user revision)
time (revision-timestamp revision)]
(spit "files/data.csv"
(str "\"" time "\";\"" user "\";\"" title "\"\n" )
:append true)))))
(defn open-file
[file-name]
(let [rdr (BufferedReader. (FileReader. file-name))]
(->> (:content (data.xml/parse rdr :coalescing false))
(filter #(= :page (:tag %)))
(map process-pages))))
I don't show article-title, revision-user and revision-title functions, because they just simply take data from a specific place in the page or revision hash. Anyone could help me with this - I'm really new in Clojure and don't get the problem.
Just to be clear, (:content (data.xml/parse rdr :coalescing false)) IS lazy. Check its class or pull the first item (it will return instantly) if you're not convinced.
That said, a couple things to watch out for when processing large sequences: holding onto the head, and unrealized/nested laziness. I think your code suffers from the latter.
Here's what I recommend:
1) Add (dorun) to the end of the ->> chain of calls. This will force the sequence to be fully realized without holding onto the head.
2) Change for in process-page to doseq. You're spitting to a file, which is a side effect, and you don't want to do that lazily here.
As Arthur recommends, you may want to open an output file once and keep writing to it, rather than opening & writing (spit) for every Wikipedia entry.
UPDATE:
Here's a rewrite which attempts to separate concerns more clearly:
(defn filter-tag [tag xml]
(filter #(= tag (:tag %)) xml))
;; lazy
(defn revision-seq [xml]
(for [page (filter-tag :page (:content xml))
:let [title (article-title page)]
revision (filter-tag :revision (:content page))
:let [user (revision-user revision)
time (revision-timestamp revision)]]
[time user title]))
;; eager
(defn transform [in out]
(with-open [r (io/input-stream in)
w (io/writer out)]
(binding [*out* out]
(let [xml (data.xml/parse r :coalescing false)]
(doseq [[time user title] (revision-seq xml)]
(println (str "\"" time "\";\"" user "\";\"" title "\"\n")))))))
(transform "dump.xml" "data.csv")
I don't see anything here that would cause excessive memory use.
Unfortunately data.xml/parse is not lazy, it attempts to read the whole file into memory and then parse it.
Instead use the this (lazy) xml library which holds only the part it is currently working on in ram. You will then need to re-structure your code to write the output as it reads the input instead of gathering all the xml, then outputting it.
your line
(:content (data.xml/parse rdr :coalescing false)
will load all the xml into memory and then request the content key from it. which will blow the heap.
a rough outline of a lazy answer would look something like this:
(with-open [input (java.io.FileInputStream. "/tmp/foo.xml")
output (java.io.FileInputStream. "/tmp/foo.csv"]
(map #(write-to-file output %)
(filter is-the-tag-i-want? (parse input))))
Have patience, working with (> data ram) always takes time :)
I don't know about Clojure but in plain Java one could use a SAX event based parser like http://docs.oracle.com/javase/1.4.2/docs/api/org/xml/sax/XMLReader.html
that doesn't need to load the XML to RAM

Rescraping data with Enlive

I tried to create function to scrape and tags from HTML page, whose URL I provide to a function, and this works as it should. I get sequence of <h3> and <table> elements, when I try to use select function to extract only table or h3 tags from resulting sequence,
I get (), or if I try to map those tags I get (nil nil nil ...).
Could you please help me to resolve this issue, or explain me what am I doing wrong?
Here is the code:
(ns Test2
(:require [net.cgrand.enlive-html :as html])
(:require [clojure.string :as string]))
(defn get-page
"Gets the html page from passed url"
[url]
(html/html-resource (java.net.URL. url)))
(defn h3+table
"returns sequence of <h3> and <table> tags"
[url]
(html/select (get-page url)
{[:div#wrap :div#middle :div#content :div#prospekt :div#prospekt_container :h3]
[:div#wrap :div#middle :div#content :div#prospekt :div#prospekt_container :table]}
))
(def url "http://www.belex.rs/trgovanje/prospekt/VZAS/show")
This line gives me headache :
(html/select (h3+table url) [:table])
Could you please tell me what am I doing wrong?
Just to clarify my question: is it possible to use enlive's select function to extract only table tags from result of (h3+table url) ?
As #Julien pointed out, you will probably have to work with the deeply nested tree structure that you get from applying (html/select raw-html selectors) on the raw html. It seems like you try to apply html/select multiple times, but this doesn't work. html/select parses html into a clojure datastructure, so you can't apply it on that datastructure again.
I found that parsing the website was actually a little involved, but I thought that it might be a nice use case for multimethods, so I hacked something together, maybe this will get you started:
(The code is ugly here, you can also checkout this gist)
(ns tutorial.scrape1
(:require [net.cgrand.enlive-html :as html]))
(def *url* "http://www.belex.rs/trgovanje/prospekt/VZAS/show")
(defn get-page [url]
(html/html-resource (java.net.URL. url)))
(defn content->string [content]
(cond
(nil? content) ""
(string? content) content
(map? content) (content->string (:content content))
(coll? content) (apply str (map content->string content))
:else (str content)))
(derive clojure.lang.PersistentStructMap ::Map)
(derive clojure.lang.PersistentArrayMap ::Map)
(derive java.lang.String ::String)
(derive clojure.lang.ISeq ::Collection)
(derive clojure.lang.PersistentList ::Collection)
(derive clojure.lang.LazySeq ::Collection)
(defn tag-type [node]
(case (:tag node)
:tr ::CompoundNode
:table ::CompoundNode
:th ::TerminalNode
:td ::TerminalNode
:h3 ::TerminalNode
:tbody ::IgnoreNode
::IgnoreNode))
(defmulti parse-node
(fn [node]
(let [cls (class node)] [cls (if (isa? cls ::Map) (tag-type node) nil)])))
(defmethod parse-node [::Map ::TerminalNode] [node]
(content->string (:content node)))
(defmethod parse-node [::Map ::CompoundNode] [node]
(map parse-node (:content node)))
(defmethod parse-node [::Map ::IgnoreNode] [node]
(parse-node (:content node)))
(defmethod parse-node [::String nil] [node]
node)
(defmethod parse-node [::Collection nil] [node]
(map parse-node node))
(defn h3+table [url]
(let [ws-content (get-page url)
h3s+tables (html/select ws-content #{[:div#prospekt_container :h3]
[:div#prospekt_container :table]})]
(for [node h3s+tables] (parse-node node))))
A few words on what's going on:
content->string takes a data structure and collects its content into a string and returns that so you can apply this to content that may still contain nested subtags (like <br/>) that you want to ignore.
The derive statements establish an ad hoc hierarchy which we will later use in the multi-method parse-node. This is handy because we never quite know which data structures we're going to encounter and we could easily add more cases later on.
The tag-type function is actually a hack that mimics the hierarchy statements - AFAIK you can't create a hierarchy out of non-namespace qualified keywords, so I did it like this.
The multi-method parse-node dispatches on the class of the node and if the node is a map additionally on the tag-type.
Now all we have to do is define the appropriate methods: If we're at a terminal node we convert the contents to a string, otherwise we either recur on the content or map the parse-node function on the collection we're dealing with. The method for ::String is actually not even used, but I left it in for safety.
The h3+table function is pretty much what you had before, I simplified the selectors a bit and put them into a set, not sure if putting them into a map as you did works as intended.
Happy scraping!
Your question is hard to understand, but I think your last line should simply be
(h3+table url)
This will return a deeply nested data structure containing scraped HTML that you can then dive into with the usual Clojure sequence APIs. Good luck.