Trouble with building up a string in Clojure - clojure

[this may seem like my problem is with Compojure, but it isn't - it's with Clojure]
I've been pulling my hair out on this seemingly simple issue - but am getting nowhere.
I am playing with Compojure (a light web framework for Clojure) and I would just like to generate a web page showing showing my list of todos that are in a PostgreSQL database.
The code snippets are below (left out the database connection, query, etc - but that part isn't needed because specific issue is that the resulting HTML shows nothing between the <body> and </body> tags).
As a test, I tried hard-coding the string in the call to main-layout, like this:
(html (main-layout "Aki's Todos" "Haircut<br>Study Clojure<br>Answer a question on Stackoverfolw")) - and it works fine.
So the real issue is that I do not believe I know how to build up a string in Clojure. Not the idiomatic way, and not by calling out to Java's StringBuilder either - as I have attempted to do in the code below.
A virtual beer, and a big upvote to whoever can solve it! Many thanks!
=============================================================
;The master template (a very simple POC for now, but can expand on it later)
(defn main-layout
"This is one of the html layouts for the pages assets - just like a master page"
[title body]
(html
[:html
[:head
[:title title]
(include-js "todos.js")
(include-css "todos.css")]
[:body body]]))
(defn show-all-todos
"This function will generate the todos HTML table and call the layout function"
[]
(let [rs (select-all-todos)
sbHTML (new StringBuilder)]
(for [rec rs]
(.append sbHTML (str rec "<br><br>")))
(html (main-layout "Aki's Todos" (.toString sbHTML)))))
=============================================================
Again, the result is a web page but with nothing between the body tags. If I replace the code in the for loop with println statements, and direct the code to the repl - forgetting about the web page stuff (ie. the call to main-layout), the resultset gets printed - BUT - the issue is with building up the string.
Thanks again.
~Aki

for is lazy, and in your function it's never being evaluated. Change for to doseq.
user> (let [rs ["foo" "bar"]
sbHTML (new StringBuilder)]
(for [rec rs]
(.append sbHTML (str rec "<br><br>")))
(.toString sbHTML))
""
user> (let [rs ["foo" "bar"]
sbHTML (new StringBuilder)]
(doseq [rec rs]
(.append sbHTML (str rec "<br><br>")))
(.toString sbHTML))
"foo<br><br>bar<br><br>"
You could also use reduce and interpose, or clojure.string/join from clojure.string, or probably some other options.
user> (let [rs ["foo" "bar"]]
(reduce str (interpose "<br><br>" rs)))
"foo<br><br>bar"
user> (require 'clojure.string)
nil
user> (let [rs ["foo" "bar"]]
(clojure.string/join "<br><br>" rs))
"foo<br><br>bar"

You would like to use the re-gsub like this:
(require 'clojure.contrib.str-utils) ;;put in head for enabling us to use re-gsub later on
(clojure.contrib.str-utils/re-gsub #"\newline" "<br><br>" your-string-with-todos-separated-with-newlines)
This last line will result in the string you like. The require-part is, as you maybe already know, there to enable the compiler to reach the powerful clojure.contrib.str-utils library without importing it to your current namespace (which could potentially lead to unnescessary collisions when the program grows).
re- is for reg-exp, and lets you define a reg-exp of the form #"regexp", which to replace all instances that is hit by the regexp with the argument afterwards, applied to the third argument. The \newline is in this case clojures way of expressing newlines in regexps as well as strings and the character we are looking for.
What I think you really wanted to do is to make a nifty ordered or unordered list in html-format. These can be done with [hiccup-page-helpers][2] (if you don't have them you probably have a compojure from the time before it got splited up in compojure, hiccup and more, since you use the html-function).
If you want to use hiccup-page-helpers, use the command re-split from the clojure.contrib.str-utils mentioned above in this fashion:
(use 'hiccup.page-helpers) ;;watch out for namespace collisions, since all the functions in hiccup.page-helpers got into your current namespace.
(unordered-list (clojure.contrib.str-utils/re-split #"\newline" your-string-with-todos-separated-with-newlines))
which should render a neat
<ul>
<li>todo-item1</li>
<li>todo-item2</li>
</ul>
(and yes, there is an ordered-list command that works the same way!)
In the last line of clojure code above, all you todos gets into a (list "todo1" "todo2") which is immediately consumed by hiccup.page-helpers unordered-list function and is there converted to an html-ized list.
Good luck with compojure and friends!

Related

How can I iterate over a list with a macro?

I am trying to print the documentation for all functions in a given namespace by invoking the following expression in a REPL:
(doseq
[f (dir-fn 'clojure.repl)]
(doc f))
However the invocation of this expression returns nil without printing the documentation to the REPL. I know this might have to do with doc being a macro, but I'm a Clojure novice and am not entirely sure how to understand the problem.
Why does this expression return nil without printing the documentation?
How can this expression be modified so that it prints the documentation for each function in a given namespace?
Thanks!
Update: Combined both provided answers:
(defn ns-docs [ns']
(doseq [[symbol var] (ns-interns ns')]
(newline)
(println symbol)
(print " ")
(println (:doc (meta var)))))
(ns-docs 'clojure.repl)
I would, instead, start here:
The Clojure CheatSheet
ClojureDocs.org
Clojure-Doc.org (similar name, but different)
The API & Reference sections at Clojure.org
Note that doc is in the namespace clojure.repl, which reflects its intended usage (by a human in a repl). Here is some code that will also iterate on a namespace & print doc strings (using a different technique):
(doseq [[fn-symbol fn-var] (ns-interns 'demo.core)]
(newline)
(println fn-symbol)
(println (:doc (meta fn-var))))
where demo.core is the namespace of interest.
Note that ns-interns gives you both a symbol and var like:
fn-symbol => <#clojure.lang.Symbol -main>
fn-var => <#clojure.lang.Var #'demo.core/-main>
The meta function has lots of other info you may want to use someday:
(meta fn-var) =>
<#clojure.lang.PersistentArrayMap
{ :arglists ([& args]),
:doc "The Main Man!",
:line 9, :column 1,
:file "demo/core.clj",
:name -main,
:ns #object[clojure.lang.Namespace 0x14c35a06 "demo.core"]}>
While this probably won't help you with answering your question, the problem of evaluating macro's comes up a lot when you are learning Clojure.
Macros are responsible for the evaluation of their arguments. In this case clojure.repl/doc will ignore the current lexical context and assume that the symbol f that you're giving it is the name of a function you want to see the documentation for. It does this because it's intended to be used at the REPL, and is assuming you wouldn't want to type quotes all the time.
As f doesn't exist, it prints nothing. Then doseq returns nil, since it exists to do something for side effects only - hence starting in do. In order to pass an argument to a macro that refuses to respect the lexical context like this, you need to write the code for each element in the list.
You can do this by hand, or by constructing the code as data, and passing it to eval to execute. You can do this in an imperative style, using doseq:
(doseq [f (ns-interns 'clojure.repl)]
(eval `(doc ~(symbol "clojure.repl" (str (first f))))))
or in a slightly more Clojurey way (which will allow you to see the code that it would execute by removing eval from the end and running it at the REPL):
(->> (ns-interns 'clojure.repl)
(map #(list 'clojure.repl/doc (symbol "clojure.repl" (str (first %)))))
(cons `do)
eval)
In both of these we use quote and syntax-quote to construct some code from the list of symbols reflected from the namespace, and pass it to eval to actually execute it. This page on Clojure's weird characters should point you in the right direction for understanding what's going on here.
This an example of why you shouldn't write macro's, unless you've got no other options. Macro's do not compose, and are often difficult to work with. For a more in depth discussion, Fogus's talk and Christophe Grand's talk are both good talks.
Why does this expression return nil without printing the documentation?
Because the doc macro is receiving the symbol f from your loop, instead of a function symbol directly.
How can this expression be modified so that it prints the documentation for each function in a given namespace?
(defn ns-docs [ns']
(let [metas (->> (ns-interns ns') (vals) (map meta) (sort-by :name))]
(for [m metas :when (:doc m)] ;; you could filter here if you want fns only
(select-keys m [:name :doc]))))
(ns-docs 'clojure.repl)
=>
({:name apropos,
:doc "Given a regular expression or stringable thing, return a seq of all
public definitions in all currently-loaded namespaces that match the
str-or-pattern."}
...
)
Then you can print those maps/strings if you want.

Why are multi-methods not working as functions for Reagent/Re-frame?

In a small app I'm building that uses Reagent and Re-frame I'm using multi-methods to dispatch which page should be shown based on a value in the app state:
(defmulti pages :name)
(defn main-panel []
(let [current-route (re-frame/subscribe [:current-route])]
(fn []
;...
(pages #current-route))))
and then I have methods such as:
(defmethod layout/pages :register [_] [register-page])
where the register-page function would generate the actual view:
(defn register-page []
(let [registration-form (re-frame/subscribe [:registration-form])]
(fn []
[:div
[:h1 "Register"]
;...
])))
I tried changing my app so that the methods generated the pages directly, as in:
(defmethod layout/pages :register [_]
(let [registration-form (re-frame/subscribe [:registration-form])]
(fn []
[:div
[:h1 "Register"]
;...
])))
and that caused no page to ever be rendered. In my main panel I changed the call to pages to square brackets so that Reagent would have visibility into it:
(defn main-panel []
(let [current-route (re-frame/subscribe [:current-route])]
(fn []
;...
[pages #current-route])))
and that caused the first visited page to work, but after that, clicking on links (which causes current-route to change) has no effect.
All the namespaces defining the individual methods are required in the file that is loaded first, that contains the init function, and the fact that I can pick any single page and have it displayed proves the code is loading (then, switching to another page doesn't work):
https://github.com/carouselapps/ninjatools/blob/master/src/cljs/ninjatools/core.cljs#L8-L12
In an effort to debug what's going on, I defined two routes, :about and :about2, one as a function and one as a method:
(defn about-page []
(fn []
[:div "This is the About Page."]))
(defmethod layout/pages :about [_]
[about-page])
(defmethod layout/pages :about2 [_]
(fn []
[:div "This is the About 2 Page."]))
and made the layout print the result of calling pages (had to use the explicit call instead of the square brackets of course). The wrapped function, the one that works, returns:
[#object[ninjatools$pages$about_page "function ninjatools$pages$about_page(){
return (function (){
return new cljs.core.PersistentVector(null, 2, 5, cljs.core.PersistentVector.EMPTY_NODE, [new cljs.core.Keyword(null,"div","div",1057191632),"This is the About Page."], null);
});
}"]]
while the method returns:
#object[Function "function (){
return new cljs.core.PersistentVector(null, 2, 5, cljs.core.PersistentVector.EMPTY_NODE, [new cljs.core.Keyword(null,"div","div",1057191632),"This is the About 2 Page."], null);
}"]
If I change the method to be:
(defmethod layout/pages :about2 [_]
[(fn []
[:div "This is the About 2 Page."])])
that is, returning the function in a vector, then, it starts to work. And if I make the reverse change to the wrapped function, it starts to fail in the same manner as the method:
(defn about-page []
(fn []
[:div "This is the About Page."]))
(defmethod layout/pages :about [_]
about-page)
Makes a bit of sense as Reagent's syntax is [function] but it was supposed to call the function automatically.
I also started outputting #current-route to the browser, as in:
[:main.container
[alerts/view]
[pages #current-route]
[:div (pr-str #current-route)]]
and I verified #current-route is being modified correctly and the output updated, just not [pages #current-route].
The full source code for my app can be found here: https://github.com/carouselapps/ninjatools/tree/multi-methods
Update: corrected the arity of the methods following Michał Marczyk's answer.
So, a component like this: [pages #some-ratom]
will rerender when pages changes or #some-ratom changes.
From reagent's point of view, pages hasn't changed since last time, it is still the same multi-method it was before. But #some-ratom might change, so that could trigger a rerender.
But when this rerender happens it will be done using a cached version of pages. After all, it does not appear to reagent that pages has changed. It is still the same multimethod it was before.
The cached version of pages will, of course, be the first version of pages which was rendered - the first version of the mutlimethod and not the new version we expect to see used.
Reagent does this caching because it must handle Form-2 functions. It has to keep the returned render function.
Bottom line: because of the caching, multimethods won't work very well, unless you find a way to completely blow up the component and start again, which is what the currently-top-voted approach does:
^{:key #current-route} [pages #current-route]
Of course, blowing up the component and starting again might have its own unwelcome implications (depending on what local state is held in that component).
Vaguely Related Background:
https://github.com/Day8/re-frame/wiki/Creating-Reagent-Components#appendix-a---lifting-the-lid-slightly
https://github.com/Day8/re-frame/wiki/When-do-components-update%3F
I don't have all the details, but apparently, when I was rendering pages like this:
[:main.container
[alerts/view]
[pages #current-route]]
Reagent was failing to notice that pages depended on the value of #current-route. The Chrome React plugin helped me figure it out. I tried using a ratom instead of a subscription and that seemed to work fine. Thankfully, telling Reagent/React the key to an element is easy enough:
[:main.container
[alerts/view]
^{:key #current-route} [pages #current-route]]
That works just fine.
The first problem that jumps out at me is that your methods take no arguments:
(defmethod layout/pages :register [] [register-page])
^ arglist
Here you have an empty arglist, but presumably you'll be calling this multimethod with one or two arguments (since its dispatch function is a keyword and keywords can be called with one or two arguments).
If you want to call this multimethod with a single argument and just ignore it inside the body of the :register method, change the above to
(defmethod layout/pages :register [_] [register-page])
^ argument to be ignored
Also, I expect you'll probably want to call pages yourself like you previously did (that is, revert the change to square brackets that you mentioned in the question).
This may or may not fix the app – there may be other problems – but it should get you started. (The multimethod will definitely not work with those empty arglists if you pass in any arguments.)
How about if you instead have a wrapper pages-component function which is a regular function that can be cached by reagent. It would look like this:
(defn pages-component [state]
(layout/pages #state))

Send Clojure 'doc' output through pager?

The docstrings for some Clojure functions (e.g. defrecord) are quite long. When running Clojure at in a terminal window, I'd like to be able to in effect send doc's output through a pager such as more (or less). If someone has written a pager function in Clojure, then I think I could use it along with something like:
(with-out-str (doc defrecord))
Or if there is a standard Java class that implements a pager, I can figure out how to send the output to that.
Alternatively, how can I send the output of doc to a shell command? This doesn't do the job:
(clojure.java.shell/sh "more" :in (with-out-str (doc defrecord))))
[This topic is difficult to search: "more", "less", and "doc" are obviously very common terms, and things like "java pager" bring up pages discussing ways to break text up into pages for formatting documents.]
You can use jline for this. If you call setPaginationEnabled, with true, and use the printColumns method, on your jline ConsoleReader, it will page.
But, if you are trying to do this at a standard Leiningen REPL, things get more complicated. The current version of Leiningen v2 uses REPL-y, which uses jline internally, but doesn't use printColumns, so the jline pagination is ignored.
You can, however, get the current Leiningen REPL height through REPL-y's ConsoleReader, in reply.reader.simple-jline/jline-state, and use it to partition the doc string.
(defmacro doc2 [x]
`(let [h# (-> #reply.reader.simple-jline/jline-state :reader (.. getTerminal getHeight) (- 4))
[s1# s2#] (split-at h# (-> ~x clojure.repl/doc with-out-str clojure.string/split-lines))]
(doseq [x# s1#] (println x#))
(doseq [i# (partition-all h# s2#)]
(println "\n<more>")
(read-line)
(doseq [x# i#] (println x#)))))
You would want to put this macro in your profiles.clj under the :repl profile.
{:user {:plugins [...]}
:repl {:repl-options {:init (defmacro doc2 [x] ...)}}}
This would put the doc2 macro in the user namespace when you load the repl.
echo "(doc defrecord)" |clj|more

Huge file in Clojure and Java heap space error

I posted before on a huge XML file - it's a 287GB XML with Wikipedia dump I want ot put into CSV file (revisions authors and timestamps). I managed to do that till some point. Before I got the StackOverflow Error, but now after solving the first problem I get: java.lang.OutOfMemoryError: Java heap space error.
My code (partly taken from Justin Kramer answer) looks like that:
(defn process-pages
[page]
(let [title (article-title page)
revisions (filter #(= :revision (:tag %)) (:content page))]
(for [revision revisions]
(let [user (revision-user revision)
time (revision-timestamp revision)]
(spit "files/data.csv"
(str "\"" time "\";\"" user "\";\"" title "\"\n" )
:append true)))))
(defn open-file
[file-name]
(let [rdr (BufferedReader. (FileReader. file-name))]
(->> (:content (data.xml/parse rdr :coalescing false))
(filter #(= :page (:tag %)))
(map process-pages))))
I don't show article-title, revision-user and revision-title functions, because they just simply take data from a specific place in the page or revision hash. Anyone could help me with this - I'm really new in Clojure and don't get the problem.
Just to be clear, (:content (data.xml/parse rdr :coalescing false)) IS lazy. Check its class or pull the first item (it will return instantly) if you're not convinced.
That said, a couple things to watch out for when processing large sequences: holding onto the head, and unrealized/nested laziness. I think your code suffers from the latter.
Here's what I recommend:
1) Add (dorun) to the end of the ->> chain of calls. This will force the sequence to be fully realized without holding onto the head.
2) Change for in process-page to doseq. You're spitting to a file, which is a side effect, and you don't want to do that lazily here.
As Arthur recommends, you may want to open an output file once and keep writing to it, rather than opening & writing (spit) for every Wikipedia entry.
UPDATE:
Here's a rewrite which attempts to separate concerns more clearly:
(defn filter-tag [tag xml]
(filter #(= tag (:tag %)) xml))
;; lazy
(defn revision-seq [xml]
(for [page (filter-tag :page (:content xml))
:let [title (article-title page)]
revision (filter-tag :revision (:content page))
:let [user (revision-user revision)
time (revision-timestamp revision)]]
[time user title]))
;; eager
(defn transform [in out]
(with-open [r (io/input-stream in)
w (io/writer out)]
(binding [*out* out]
(let [xml (data.xml/parse r :coalescing false)]
(doseq [[time user title] (revision-seq xml)]
(println (str "\"" time "\";\"" user "\";\"" title "\"\n")))))))
(transform "dump.xml" "data.csv")
I don't see anything here that would cause excessive memory use.
Unfortunately data.xml/parse is not lazy, it attempts to read the whole file into memory and then parse it.
Instead use the this (lazy) xml library which holds only the part it is currently working on in ram. You will then need to re-structure your code to write the output as it reads the input instead of gathering all the xml, then outputting it.
your line
(:content (data.xml/parse rdr :coalescing false)
will load all the xml into memory and then request the content key from it. which will blow the heap.
a rough outline of a lazy answer would look something like this:
(with-open [input (java.io.FileInputStream. "/tmp/foo.xml")
output (java.io.FileInputStream. "/tmp/foo.csv"]
(map #(write-to-file output %)
(filter is-the-tag-i-want? (parse input))))
Have patience, working with (> data ram) always takes time :)
I don't know about Clojure but in plain Java one could use a SAX event based parser like http://docs.oracle.com/javase/1.4.2/docs/api/org/xml/sax/XMLReader.html
that doesn't need to load the XML to RAM

One argument, many functions

I have an incoming lazy stream lines from a file I'm reading with tail-seq (to contrib - now!) and I want to process those lines one after one with several "listener-functions" that takes action depending on re-seq-hits (or other things) in the lines.
I tried the following:
(defn info-listener [logstr]
(if (re-seq #"INFO" logstr) (println "Got an INFO-statement")))
(defn debug-listener [logstr]
(if (re-seq #"DEBUG" logstr) (println "Got a DEBUG-statement")))
(doseq [line (tail-seq "/var/log/any/java.log")]
(do (info-listener logstr)
(debug-listener logstr)))
and it works as expected. However, there is a LOT of code-duplication and other sins in the code, and it's boring to update the code.
One important step seems to be to apply many functions to one argument, ie
(listen-line line '(info-listener debug-listener))
and use that instead of the boring and error prone do-statement.
I've tried the following seemingly clever approach:
(defn listen-line [logstr listener-collection]
(map #(% logstr) listener-collection))
but this only renders
(nil) (nil)
there is lazyiness or first class functions biting me for sure, but where do I put the apply?
I'm also open to a radically different approach to the problem, but this seems to be a quite sane way to start with. Macros/multi methods seems to be overkill/wrong for now.
Making a single function out of a group of functions to be called with the same argument can be done with the core function juxt:
=>(def juxted-fn (juxt identity str (partial / 100)))
=>(juxted-fn 50)
[50 "50" 2]
Combining juxt with partial can be very useful:
(defn listener [re message logstr]
(if (re-seq re logstr) (println message)))
(def juxted-listener
(apply juxt (map (fn [[re message]] (partial listner re message))
[[#"INFO","Got INFO"],
[#"DEBUG", "Got DEBUG"]]))
(doseq [logstr ["INFO statement", "OTHER statement", "DEBUG statement"]]
(juxted-listener logstr))
You need to change
(listen-line line '(info-listener debug-listener))
to
(listen-line line [info-listener debug-listener])
In the first version, listen-line ends up using the symbols info-listener and debug-listener themselves as functions because of the quoting. Symbols implement clojure.lang.IFn (the interface behind Clojure function invocation) like keywords do, i.e. they look themselves up in a map-like argument (actually a clojure.lang.ILookup) and return nil if applied to something which is not a map.
Also note that you need to wrap the body of listen-line in dorun to ensure it actually gets executed (as map returns a lazy sequence). Better yet, switch to doseq:
(defn listen-line [logstr listener-collection]
(doseq [listener listener-collection]
(listener logstr)))