Is there an idiomatic way to avoid long Clojure string literals? - clojure

Various Clojure style guides recommend avoiding lines longer than 80 characters. I am wondering if there is an idiomatic way to avoid long String literals.
While it's common these days to have wide screens, I still agree that long lines should be avoided.
Here are some examples (I'm tempted to follow the first):
;; break the String literal with `str`
(println (str
          "The quick brown fox "
          "jumps over the lazy dog"))
;; break the String literal with `join`
(println (join " " [
                    "The quick brown fox"
                    "jumps over the lazy dog"]))
I am aware that Clojure supports multi-line String literals, but using this approach has the undesired effect of the newline characters being interpreted, e.g. using the repl:
user=> (println "The quick brown fox
#_=> jumps over the lazy dog")
The quick brown fox
jumps over the lazy dog

You should probably store the string inside an external text file, and read the file from your code. If you still feel the need to store the string in your code, go ahead and use str.
EDIT:
As requested, I will demonstrate how you can read long strings at compile time.
(defmacro compile-time-slurp [file]
  (slurp file))
Use it like this:
(def long-string (compile-time-slurp "longString.txt"))
You may invent similar macros to handle Java Properties files, XML/JSON configuration, SQL queries, HTML, or whatever else you need.

I find it convenient to use str to create strings and use character literals such as \newline or \tab instead of "\n" to break them.
I rarely violate the 80-column rule this way.
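A minimal sketch of the character-literal approach described above. The message text and the name `report` are made up for illustration.

```clojure
;; \newline and \tab are character literals; str turns each into a
;; one-character string while concatenating.
(def report
  (str "Name:" \tab "fox" \newline
       "Action:" \tab "jumps over the lazy dog"))

(println report)
```

Each source line stays short, and the line break in the resulting string is explicit rather than depending on where the literal happens to wrap.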

The most idiomatic ways I know of are the following:
1) Use (str) to split the string over multiple lines.
(str "User " (:user context)
     " is now logged in.")
This is probably the most idiomatic usage. I've seen this done in multiple libraries and projects. It is fast, since (str) uses a StringBuilder under the hood. It also allows you to mix code in transparently, as I've done in the example.
2) Allow strings to break the 80 char limit by themselves, when it makes sense.
(format
  "User %s is now logged in."
  (:user context))
Basically, it's OK to break the 80-character limit for strings. You are less likely to care about reading the string while working on the code, and on the rare occasion you do, you can scroll horizontally.
I've wrapped the string in a (format) here to be able to inject code similarly to my previous example. You don't need to.
Less idiomatic ways would be:
3) Put your strings in files and load them from there.
(slurp "/path/to/userLoggedIn.txt")
With a file: /path/to/userLoggedIn.txt containing:
User logged in.
I advise against this because:
It introduces IO side effects
It has the potential to fail, say the path is wrong, the resource is missing or corrupted, the disk errors, etc.
It has performance implications, disk reads are slow.
It's hard to inject content from code if you need to.
I would say do this only if your text is really big. Or if the content of the string needs to be changed by non devs. Or if the content is obtained externally.
4) Have a namespace where you def all your strings in, and load them from there.
(ns msgs)
(defn logged-in-msg [user]
  (format
    "User %s is now logged in."
    user))
Which you then use like this:
(msgs/logged-in-msg (:user context))
I prefer this over #3. You still need to allow to use #2 here, where it's ok to have strings break the 80 char limit. In fact, here you put strings by themselves on a line, so they are easy to format. If you use code analysis like checkstyle, you can exclude this file from the rule. It also does not suffer from the issues of #3.
If you are going with #3 or #4, you probably have a special use case for your strings, like internationalization, or having business edit them, etc. In those cases, you might be better served building a more robust solution, that could be inspired from the above methods, or using a library that specializes in those use cases.
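As a rough side-by-side of options 1 and 2 above, the sketch below builds the same message both ways. The `context` map is a hypothetical stand-in for real application state.

```clojure
;; `context` is a hypothetical map used only for this illustration.
(def context {:user "alice"})

;; Option 1: split the literal across lines with str.
(def via-str
  (str "User " (:user context)
       " is now logged in."))

;; Option 2: keep the literal whole and inject values with format.
(def via-format
  (format "User %s is now logged in." (:user context)))

;; Both approaches produce the same string.
(println (= via-str via-format))
```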

(defmacro strs
  ([]
   "")
  ([a]
   (if (string? a) a `(str ~a)))
  ([a & more]
   `(str
     ~@(->> (cons a more)
            (partition-by string?)
            (mapcat #(if (string? (first %)) (cons (apply str %) nil) %))))))
Usage:
(strs "one "
"two "
3
" four" " five")
; => "one two 3 four five"
Neighbouring literal strings will be concatenated at compile time.
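To see the compile-time concatenation, one can inspect the macro expansion. The sketch below repeats the macro so that it is self-contained (using `list` in place of `(cons ... nil)`, which is equivalent here) and assumes the splice operator is `~@`. The name `expansion` is only for illustration.

```clojure
(defmacro strs
  ([] "")
  ([a] (if (string? a) a `(str ~a)))
  ([a & more]
   `(str
     ~@(->> (cons a more)
            (partition-by string?)
            ;; Merge each run of adjacent string literals into one literal.
            (mapcat #(if (string? (first %)) (list (apply str %)) %))))))

;; Look at the code the macro generates rather than its result.
(def expansion
  (macroexpand-1 '(strs "one " "two " 3 " four" " five")))

(println expansion)
```

The expansion should contain only three arguments to `str`: the two merged literals and the number between them.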

Related

How to speed up this Clojure code?

I am a beginner in clojure. I am trying to solve this simple problem on codechef using clojure. Below is my clojure code but this code is taking too long to run and gives TimeoutException. Can someone please help me to optimize this code and make it run faster.
(defn checkCase [str]
  (let [len (count str)]
    (and (> len 1) (re-matches #"[A-Z]+" str))))
(println (count (filter checkCase (.split (read-line) " "))))
Note: My program is not getting timedout due to input error. On codechef input is handled automatically (probably through input redirection. Please read the question for more details)
Thank you!
Most text-finding exercises are exercises in regexps, and this one is no different. It's usually pretty hard, in any programming language, to find a more efficient way that will outpace a good regexp implementation.
In this case re-seq, look around regexps, repetition limiting and the multiline regexp flag (?m) are your friends
(defn find-acronyms
  [s]
  (re-seq #"(?m)(?<=\W|^)[A-Z]{2,}(?=\W|$)" s))
(find-acronyms "I like coding and will participate in IOI Then there is ICPC")
=> ("IOI" "ICPC")
Let's dissect the regex:
(?m) The multiline flag: lets you match your regex over multiple lines, so no need to split into multiple strings
(?<=\W|^) The match should follow a non-word character or the beginning of the (multiline) string
[A-Z]{2,} Match consecutive capital letters, a minimum of 2
(?=\W|$) The match should be followed by a non-word character or the end of the (multiline) string
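Putting the pieces of the breakdown together, a small sketch for counting the matches might look like this. The names `acronym-re` and `count-acronyms` and the sample input are assumptions for illustration, and the `{2,}` quantifier from the breakdown is used.

```clojure
;; The pattern assembled from the four parts described above.
(def acronym-re #"(?m)(?<=\W|^)[A-Z]{2,}(?=\W|$)")

(defn count-acronyms
  "Counts runs of two or more capital letters that stand alone as words."
  [s]
  (count (re-seq acronym-re s)))

;; The input spans two lines; no splitting is needed.
(println (count-acronyms "NASA and\nESA signed an MoU with I"))
```

Here "MoU" and the lone "I" are skipped because neither is a standalone run of at least two capitals.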
I can only guess that wherever you run this snippet of code, it doesn't feed anything to your read-line invocation. Or maybe it does, but doesn't send a newline as the last thing. So it hangs waiting.
(defn checkCase [str]
  (let [len (count str)]
    (and (> len 1) (re-matches #"[A-Z]+" str))))
(defn answer [str]
  (println (count (filter checkCase (.split str " ")))))
So at the REPL:
=> (answer "GGG fff TTT")
;-> 2
;-> nil
The answer is being printed to the screen. But probably best to have your function return the answer rather than print it out:
(defn answer [str]
  (count (filter checkCase (.split str " "))))
All I have done is replaced your (read-line) with an argument. (read-line) is expecting input from stdin and waiting for it forever - or until a timeout happens in your case.
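If you still want to exercise the (read-line) path without typing input by hand, one option is to rebind standard input with with-in-str. This is only a sketch: `count-upper-words` and `result` are hypothetical names, and the function is a variant of checkCase plus the counting step rather than the original code.

```clojure
(require '[clojure.string :as string])

(defn count-upper-words
  "Counts space-separated words made of at least two capital letters."
  [line]
  (count (filter #(re-matches #"[A-Z]{2,}" %)
                 (string/split line #" "))))

;; with-in-str rebinds *in* to read from the given string, so
;; read-line returns immediately instead of waiting on the terminal.
(def result
  (with-in-str "GGG fff TTT\n"
    (count-upper-words (read-line))))

(println result)
```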
I am not sure if this is the slow part of your code, but if it is, you could try to split up the execution and safeguard the very slow regexp part by executing it only when necessary. I think the current version with `and` already does that. If it does not, you can try something else, like this:
(defn checkCase [^String str]
  (cond
    (< (.length str) 2)
    false
    (re-matches #"[A-Z]+" str)
    true
    :else
    false))
Maybe you could try using re-seq instead of splitting the string and checking every item? That way you lose the filter, the .split, and the additional function call. Something like this:
(println (count (re-seq #"\b[A-Z]{2,}?\b" (read-line))))
You need to submit a Java program. You can test it on the command line before you submit it. You can but don't need to use redirection symbols (<,>). Just type the input and see that every time you do it returns the count after you have typed enter.
You will need aot compilation (Ahead Of Time, which means that .class files are included) and a main that is exported. Only then will it become a Java program.
Actually when they ask for a Java program they probably mean a .class file. You can run a .class file with the java program (which I imagine is what their test-runner does). Put it in a shell or batch file when testing, but just submit the .class file.

manipulate regexp-search matches using `query-replace-regexp` in a defun

Since version 22 of Emacs, we can use \,(function) for manipulating (parts of) the regexp search result before replacing it. But (this is mentioned often, and is nonetheless still true) we can use this construct only in the standard interactive way. (Interactive like: by pressing C-M-% or calling query-replace-regexp with M-x.)
As an example:
If we have
[Foo Bar 1900]
and want to get
[Foo Bar \function{foo1900}{1900}]
we can use:
M-x query-replace-regexp <return>
\[\([A-Za-z-]+\)\([^0-9]*\) \([0-9]\{4\}\)\]
[\1\2 \\function{\,(downcase \1)\3}{\3}]
to get it done. So this can be done pretty easily.
In my own defun, I can either query while replacing, without freely modifying the match, or modify the prepared replacement string without any querying. The only way I see is to serialize it like this:
(defun form-to-function ()
  (interactive)
  (goto-char (point-min))
  (while (query-replace-regexp
          "\\[\\([A-Za-z-]+\\)\\([^0-9]*\\) \\([0-9]\\{4\\}\\)\\]"
          "[\\1\\2 \\\\function{\\1\\3}{\\3}]" ))
  (goto-char (point-min))
  (while (search-forward-regexp "\\([a-z0-9]\\)" nil t)
    (replace-match (downcase (match-string 1)) t nil)))
For me the query is important, because I can't be sure, what the buffer offers me (= I can't be sure, the author used this kind of string always in the same manner).
I want to use an elisp function, because it is not the only recurring replacement (and also not only one buffer (I know about dired-do-query-replace-regexp but I prefer working buffer-by-buffer with replace-defuns)).
At first I thought I was only missing something like a query-replace-match to use instead of replace-match. But I fear I am also missing the easy and flexible way of rearranging the string that query-replace-regexp offers.
So I think I need a \, for use in a defun. And I really wonder if I am the only one who is missing this feature.
If you want your search and replace to prompt the user, that means you want it to be interactive, so it's perfectly OK to call query-replace-regexp (even if the byte-compiler will tell you that this is meant for interactive use only). If the warning bothers you, you can either wrap the call in with-no-warnings or call perform-replace instead.
The docstring of perform-replace sadly doesn't (or rather "didn't" until today) say what is the format of the replacements argument, but you can see it in the function's code:
;; REPLACEMENTS is either a string, a list of strings, or a cons cell
;; containing a function and its first argument. The function is
;; called to generate each replacement like this:
;; (funcall (car replacements) (cdr replacements) replace-count)
;; It must return a string.
query-replace-regexp can take the replacement not only as a string, but also as a list that includes the manipulating elements. concat is used to build a string from the various elements.
So anyone who wants to manipulate the search match with a function before inserting the replacement can also use query-replace-regexp in a defun.
(defun form-to-function ()
  (interactive)
  (goto-char (point-min))
  (query-replace-regexp
   "\\[\\([A-Za-z-]+\\)\\([^0-9]*\\) \\([0-9]\\{4\\}\\)\\]"
   (quote (replace-eval-replacement concat "[\\1\\2 \\\\function{"
                                    (replace-quote (downcase (match-string 1))) "\\3}{\\3}]")) nil ))
match-string 1 returns the first subexpression of our regexp search.
replace-quote quotes the result of the following expression so that it is inserted literally.
concat forms a string from the following elements.
and
replace-eval-replacement is not documented.
It is used here nevertheless because Emacs seems to use it internally when performing the first "interactive" query-replace-regexp call. At least that is what Emacs shows when asked with repeat-complex-command.
I came across repeat-complex-command (bound to [C-x M-:]) while searching for an answer in the source code of query-replace-regexp.
So an easy way to create such a defun is to perform the standard search and replace as described in the question and, after the first success, press [C-x M-:]. This yields the command already in Lisp form, which can be copied and pasted into a defun.
Edit (perform-replace)
As Stefan mentioned, one can use perform-replace to avoid using query-replace-regexp.
Such a function could be:
(defun form-to-function ()
  (interactive)
  (goto-char (point-min))
  (while (perform-replace
          "\\[\\([A-Za-z-]+\\)\\([^0-9]*\\) \\([0-9]\\{4\\}\\)\\]"
          (quote (replace-eval-replacement concat "[\\1\\2 \\\\function{"
                                           (replace-quote (downcase (match-string 1))) "\\3}{\\3}]"))
          t t nil)))
The first boolean (t) is the query flag, the second is the regexp switch. So this also works perfectly, but it doesn't make finding the replacement expression as easy as using \, does.

Clojure re-find reg-ex OR

I've been trying to get a simple reg-ex working in Clojure to test a string for some SQL reserved words (select, from, where etc.) but just can't get it to work:
(defn areserved? [c]
  (re-find #"select|from|where|order by|group by" c))
(I split a string by spaces then go over all the words)
Help would be greatly appreciated,
Thanks!
EDIT: My first goal (after only reading some examples and basic Clojure materials) is to parse a string and return for each part of it (i.e. words) what "job" they have in the statement (a reserved word, a string etc.).
What I have so far:
(use '[clojure.string :only (join split)])
(defn isdigit? [c]
  (re-find #"[0-9]" c))
(defn isletter? [c]
  (re-find #"[a-zA-Z]" c))
(defn issymbol? [c]
  (re-find #"[\(\)\[\]!\.+-><=\?*]" c))
(defn isstring? [c]
  (re-find #"[\"']" c))
(defn areserved? [c]
  (if (re-find #"select|from|where|order by|group by" c)
    true
    false))
(defn what-is [token]
  (let [c (subs token 0 1)]
    (cond
      (isletter? c) :word
      (areserved? c) :reserved
      (isdigit? c) :number
      (issymbol? c) :symbol
      (isstring? c) :string)))
(defn checkr [token]
  {:token token
   :type (what-is token)})
(defn goparse [sql-str]
  (map checkr (.split sql-str " ")))
Thanks for all the help guys! it's great to see so much support for such a relatively new language (at least for me :) )
I'm not entirely sure what you want exactly, but here's a couple of variations to coerce your first regex match to a boolean:
(defn areserved? [c]
  (string?
    (re-find #"select|from|where|order by|group by" c)))
(defn areserved? [c]
  (if (re-find #"select|from|where|order by|group by" c)
    true
    false))
UPDATE in response to question edit:
Thanks for posting more code. Unfortunately there are a number of issues here that we could
try to address by patching your existing code in a simplistic and naïve fashion, but it will
only get you so far, before you hit the next problem with this single iteration approach.
@alex is correct that your areserved? function will fail to match order by if you have already
split your string by white space. That said, a simple fix is to treat order and by as separate keywords (which they are, even though they always appear together).
The next issue is that the areserved? function will match keywords in a string, but you are dispatching it against a single character in the what-is function. You nearly always get a match in your cond for isletter? first, so everything is marked as a 'word'.
All in all, it looks like you are trying to do too much work in a single application of map.
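As a small sketch of the two fixes just described (treating order and by as separate keywords, and testing the whole token before falling back to character-level checks), something like the following could work. The names `reserved-words` and `classify`, the token categories, and the sample query are assumptions for illustration, not a real SQL lexer.

```clojure
(require '[clojure.string :as string])

;; "order by" and "group by" are split into single-word keywords,
;; because tokens produced by splitting on spaces never contain a space.
(def reserved-words #{"select" "from" "where" "order" "group" "by"})

(defn classify
  "Classifies a whole token; the reserved-word test runs first."
  [token]
  (cond
    (contains? reserved-words (string/lower-case token)) :reserved
    (re-matches #"[0-9]+" token)                         :number
    (re-matches #"[a-zA-Z_]\w*" token)                   :word
    (re-matches #"[\"'].*" token)                        :string
    :else                                                :symbol))

(println (map classify
              (string/split "select name from users where age > 21" #" ")))
```

This still breaks down on quoted strings that contain spaces, which is one reason a proper lexer is the better long-term route.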
I'm not sure if you are just doing this for fun to play with Clojure (which is admirable - keep going!), in which case maybe it doesn't matter if you press on with this simple parsing approach; you'll definitely learn something. But if you would like to take it further and parse SQL more successfully, then I would suggest that you may find it helpful to read a little on lexing, parsing and building Abstract Syntax Trees (ASTs).
Brian Carper has written about using the Java parser generator "ANTLR" from Clojure - it's a few years old, but might be worth looking at.
You also might be able to get some transferrable ideas from this chapter from the F# programming book on lexing and parsing SQL.

How do I beautify lisp source code?

My code is a mess, with many long lines like the following:
(defn check-if-installed[x] (:exit(sh "sh" "-c" (str "command -v " x " >/dev/null 2>&1 || { echo >&2 \"\"; exit 1; }"))))
or
(def Open-Action (action :handler (fn [e] (choose-file :type :open :selection-mode :files-only :dir ListDir :success-fn (fn [fc file](setup-list file)))) :name "Open" :key "menu O" :tip "Open spelling list"))
which is terrible. I would like to format it like so
(if (= a something)
  (if (= b otherthing)
    (foo)))
How can I beautify the source code in a better way?
The real answer hinges on whether you're willing to insert the newlines yourself. Many systems
can indent the lines for you in an idiomatic way, once you've broken it up into lines.
If you don't want to insert them manually, Racket provides a "pretty-print" that does some of what you want:
#lang racket
(require racket/pretty)
(parameterize ([pretty-print-columns 20])
  (pretty-print '(b aosentuh onethunoteh (nte huna) oehnatoe unathoe)))
==>
'(b
  aosentuh
  onethunoteh
  (nte huna)
  oehnatoe
  unathoe)
... but I'd be the first to admit that inserting newlines in the right places is hard, because
the choice of line breaks has a lot to do with how you want people to read your code.
I often use clojure.pprint for making generated code more palatable to humans.
It works well for reporting, though it is targeted at producing text. The formatting built into the clojure-mode Emacs package produces very nicely formatted Clojure if you put the newlines in yourself.
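A minimal sketch of the clojure.pprint approach, assuming the code is available as a quoted form. The form below is a shortened variant of the question's first example, and the 40-column margin and the names `form` and `formatted` are arbitrary choices for illustration.

```clojure
(require '[clojure.pprint :as pp])

;; A quoted one-line form, loosely based on the question's example.
(def form
  '(defn check-if-installed [x]
     (:exit (sh "sh" "-c" (str "command -v " x " >/dev/null 2>&1")))))

;; code-dispatch formats the data as Clojure code rather than plain data;
;; *print-right-margin* sets the column at which lines are wrapped.
(def formatted
  (with-out-str
    (binding [pp/*print-right-margin* 40]
      (pp/with-pprint-dispatch pp/code-dispatch
        (pp/pprint form)))))

(print formatted)
```

The exact line breaks depend on the margin and on the pretty printer's rules, so treat the output as a starting point for manual cleanup.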
Now you can do it with Srefactor package.
Some demos:
Formatting whole buffer demo in Emacs Lisp (applicable in Common Lisp as well).
Transform between one line <--> Multiline demo
Available Commands:
srefactor-lisp-format-buffer: format whole buffer
srefactor-lisp-format-defun: format current defun cursor is in
srefactor-lisp-format-sexp: format the current sexp cursor is in.
srefactor-lisp-one-line: turn the current sexp of the same level into one line; with prefix argument, recursively turn all inner sexps into one line.
Scheme variants are not as polished as Emacs Lisp and Common Lisp yet but work for simple and small sexp. If there is any problem, please submit an issue report and I will be happy to fix it.

What is the difference between ; and ;; in Clojure code comments?

What is the difference between ; and ;; when starting a comment in Clojure? I see that my text editor colours them differently, so I'm assuming there is notionally some difference.
I also see that Marginalia treats them differently:
; Stripped entirely
;; Appears in text section of marginalia
(defn foobar []
  ; Appears in code section of marginalia output
  ;; Again, appears in code section of marginalia output
  6)
There is no difference as far as the interpreter is concerned. Think of ; ;; ;;; and ;;;; as different heading levels.
Here is my personal use convention:
;;;; Top-of-file level comments, such as a description of the whole file/module/namespace
;;; Documentation for major code sections (i.e. groups of functions) within the file.
;; Documentation for single functions that extends beyond the doc string (e.g. an explanation of the algorithm within the function)
; In-line comments possibly on a single line, and possibly tailing a line of code
Check out the official description of the meaning of ; vs ;; in elisp: since the Clojure indenter is basically the same, it will treat them similarly. Basically, use ; if you are writing a long sentence/description "in the margins" that will span multiple lines but should be considered a single entity. Their example is:
(setq base-version-list                 ; there was a base
      (assoc (substring fn 0 start-vn)  ; version to which
             file-version-assoc-list))  ; this looks like
                                        ; a subversion
The indenter will make sure those stay lined up next to each other. If, instead, you want to make several unrelated single-line comments next to each other, use ;;.
(let [x 99 ;; as per ticket #425
      y "test"] ;; remember to test this
  (str x y)) ;; TODO actually write this function
Emacs expects ; to be used for end-of-line comments and will indent it in surprising ways if that is not your intent. ;; does not have this problem, so I usually use ;;.
Clojure doesn't care - any line is ignored from the ; to EOL.
I believe there is a tradition in CL of using increasing numbers of ; to indicate more important comments/sections.
It has no meaning for the language. ; is a reader macro for comments.
Perhaps other tools parse them, but "within Clojure" they are the same.
There is no difference from a Clojure-perspective. I find that ;; stands out a little better than ;, but that's only my opinion.
Marginalia on the other hand treats them differently because there are times when a comment should remain in the code section (e.g. license) and those are flagged with ;. This is an arbitrary decision and may change in the future.
In Emacs Lisp modes, including clojure-mode, ;; is formatted with the convention of being at the beginning of a line, and is indented like any other line, based on the context. ; is expected to be used at the end of a line, so Emacs will not do what you want if you put a single-semicolon comment at the beginning of a line expecting it to tab to the indentation for the present context.
Example:
(let [foo 1]
  ;; a comment
  foo) ; a comment
I'm not sure (not used Clojure and never heard of this before), but this thread might help.