Toggle case sensitivity for a huge chunk of Clojure code - clojure

There is a large chunk of code (mostly not mine) that does the following with user input (which is, more or less, a space-separated list of commands with some arguments/options):
Remove all unsupported characters
Split on space into a vector
Recursively apply first item in vector on the rest of the vector (function uses whatever arguments it needs, and returns vector without itself and its arguments to the loop).
Functions themselves, as far as input is concerned, have a mix of (case), (cond), (condp), (=) and (compare) with some nasty (keyword) comparisons mixed in.
Everyone was fine with the fact that this all is strictly case-sensitive until very recently. Now some (previously unknown) ancient integration bits acting as users appeared and are having some casing issues that I have no control over.
Question: is there a viable way (shortcut before there will be more time to redo it all) to make string comparison case insensitive for some sort of a scope, based on some variable?
I considered 3 options:
Fixing the code (will be done sometime, anyway, but not viable at the moment).
Extracting some low level comparison function (hopefully just one) and rebinding it for the local scope (sounds great, but catching cases might be difficult and error-prone).
Standardize input (might not be possible without some hacks since some data, outside comparisons, NEEDS to be case sensitive).
After some research, the answer is probably no (and planning for major changes should start), but I figured asking would not hurt, maybe someone thought of it before.
Edit: sample problematic input:
"Command1 ARG1 aRG2 Command3 command9 Arg4 Arg9 aRg5 COMMAND4 arg8"
Breaking it down:
"Commands" with broken case I need to be able, on demand, to match case insensitively. Arguments are matched case insensitively on another level - so they do not concern this piece of code, but their case inside this bit of code should be preserved to be sent further along.
NB! It is not possible at the start of the processing to tell what is in the input a command and what is argument.

For what it's worth, here is a case-insensitive wrapper for simple case forms:
(ns lexer.core
  (:require [clojure.string]))
(defn- standardize [thing]
  (assert (string? thing) (str thing " should be a string"))
  (clojure.string/lower-case thing))
(defmacro case-insensitive-case [expr & pairs+default?]
  (let [pairs (partition 2 pairs+default?)
        ;; lower-case each test constant at macro-expansion time
        convert (fn [[const form]]
                  (list (standardize const) form))
        ;; ~@ splices the converted clauses into the case form
        most-of-it `(case (standardize ~expr) ~@(mapcat convert pairs))]
    ;; an odd number of clauses means the last one is the default
    (if (-> pairs+default? count even?)
      most-of-it
      (concat most-of-it [(last pairs+default?)]))))
For example,
(macroexpand-1 '(case-insensitive-case (test expression)
"Blam!" (1 + 1)
(whatever works)))
=> (clojure.core/case (lexer.core/standardize (test expression)) "blam!" (1 + 1) (whatever works))
The assert in standardize is necessary because lower-case turns things into strings:
(clojure.string/lower-case 22)
=> "22"

As per Alan Thompson's comment, str/lower-case was the right first half of the approach - I just needed to find the right place to apply it to just the command name.
Afterwards redefining = and couple of functions used inside cond and condp (credit to ClojureMostly) solved the matching part.
All that was left were the string literals inside case statements which I just find-and-replaced with lower case.
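A rough sketch of the "lower-case only the command name" idea, in case it helps someone. Everything here is made up for illustration (the *ignore-case?* flag, the arities table, the parse function); the real code dispatches differently, but the point is that only the token in command position is normalised and the arguments pass through untouched:

```clojure
(require '[clojure.string :as str])

;; Hypothetical switch: bind to true for the callers that send broken casing.
(def ^:dynamic *ignore-case?* false)

(defn normalize-command
  "Lower-cases a command token only when *ignore-case?* is on."
  [token]
  (if *ignore-case?* (str/lower-case token) token))

;; Hypothetical command table: command name -> number of arguments it consumes.
(def arities {"command1" 2, "command3" 0})

(defn parse
  "Walks the tokens; the head is treated as a command, its arguments keep their case."
  [tokens]
  (loop [tokens (seq tokens), out []]
    (if-let [[head & more] tokens]
      (let [cmd (normalize-command head)
            n   (or (arities cmd)
                    (throw (ex-info "Unknown command" {:token head})))]
        (recur (seq (drop n more))
               (conj out [cmd (vec (take n more))])))
      out)))

(binding [*ignore-case?* true]
  (parse (str/split "Command1 ARG1 aRG2 COMMAND3" #" ")))
;; => [["command1" ["ARG1" "aRG2"]] ["command3" []]]
```

parse is eager (loop/recur), so the binding is still in effect while the tokens are consumed; with a lazy sequence you would have to realise it inside the binding.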

Related

Clojure pipe collection one by one

How can I process collections in Clojure the way Java streams do: one element at a time through all the functions, instead of evaluating every element at each stage? I would also describe it as Unix pipes (the next program pulls chunk by chunk from the previous one).
As far as I understand your question, you may want to look into two things.
First, understand the sequence abstraction. This is a way of looking at collections which consumes them one by one and lazily. It is an important Clojure idiom and you'll meet well known functions like map, filter, reduce, and many more. Also the macro ->>, which was already mentioned in a comment, will be important.
After that, when you want to dig deeper, you probably want to look into transducers and reducers. In a grossly oversimplifying summary, they allow you to combine several lazy functions into one function and then process a collection with less laziness, less memory consumption, more performance, and possibly on several threads. I consider these to be advanced topics, though. Maybe the sequences are already what you were looking for.
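To make that a bit more concrete, here is a small sketch of both styles on the same (made-up) task: increment numbers, keep the even ones, stop after three. The names lazy-result and xf are just for illustration:

```clojure
;; Lazy sequence pipeline: elements are pulled through on demand,
;; so an infinite source is fine as long as we only take a few.
(def lazy-result
  (->> (range)          ; 0, 1, 2, ... (infinite)
       (map inc)
       (filter even?)
       (take 3)))
;; lazy-result => (2 4 6)

;; The same steps as one composed transducer: no intermediate sequences.
(def xf (comp (map inc) (filter even?) (take 3)))

(into [] xf (range))        ; => [2 4 6]
(transduce xf + 0 (range))  ; => 12
```

Note that comp over transducers reads top to bottom, in the same order as the ->> pipeline.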
Here is a simple example from ClojureDocs.org
;; Use of `->` (the "thread-first" macro) can help make code
;; more readable by removing nesting. It can be especially
;; useful when using host methods:
;; Arguably a bit cumbersome to read:
user=> (first (.split (.replace (.toUpperCase "a b c d") "A" "X") " "))
"X"
;; Perhaps easier to read:
user=> (-> "a b c d"
.toUpperCase
(.replace "A" "X")
(.split " ")
first)
"X"
As always, don't forget the Clojure CheatSheet or Clojure for the Brave and True.

Clojure.spec - Why is it useful and when is it used

I recently watched Rich Hickey's talk at Clojure/conj 2016 and, although it was very interesting, I didn't really understand the point of clojure.spec or when you'd use it. It seemed like most of the ideas, such as conform, valid?, etc., had similar functions in Clojure already.
I have only been learning Clojure for around 3 months now, so maybe this is due to lack of programming/Clojure experience.
Do clojure.spec and cljs.spec work in similar ways to Clojure and Cljs in that, although they are not 100% the same, they are based on the same underlying principles?
Are you tired of documenting your programs?
Does the prospect of making up yet more tests cause procrastination?
When the boss says "test coverage", do you cower with fear?
Do you forget what your data names mean?
For smooth expression of hard specifications, you need Clojure.Spec!
Clojure.spec gives you a uniform method of documenting, specifying, and automatically testing your programs, and of validating your live data.
It steals virtually every one of its ideas. And it does nothing you can't do for yourself.
But in my - barely informed - opinion, it changes the economy of specification, making it worth while doing properly. A game-changer? - quite possibly.
At the clojure/conj conference last week, probably half of the presentations featured spec in some way, and it's not even out of alpha yet. spec is a major feature of clojure; it is here to stay, and it is powerful.
As an example of its power, take static type checking, hailed as a kind of safety net by so many, and a defining characteristic of so many programming languages. It is incredibly limited in that it's only good at compile time, and it only checks types. spec, on the other hand, validates and conforms any predicate (not just type) for the args, the return, and can also validate relationships between the two. All of this is external to the function's code, separating the logic of the function from being commingled with validation and documentation about the code.
Regarding WORKFLOW:
One archetypal example of the benefits of relationship-checking, versus only type-checking, is a function which computes the substring of a string. Type checking ensures that in (subs s start end) the s is a string and start and end are integers. However, additional checking must be done within the function to ensure that start and end are positive integers, that end is greater than start, and that the resulting substring is no larger than the original string. All of these things can be spec'd out, for example (forgive me if some of this is a bit redundant or maybe even inaccurate):
(s/fdef clojure.core/subs
  :args (s/and (s/cat :s string? :start nat-int? :end (s/? nat-int?))
               (fn [{:keys [s start end]}]
                 (if end
                   (<= 0 start end (count s))
                   (<= 0 start (count s)))))
  :ret string?
  :fn (fn [{{:keys [s start end]} :args, substring :ret}]
        (and (if end
               (= (- end start) (count substring))
               (= (- (count s) start) (count substring)))
             (<= (count substring) (count s)))))
Call the function with sample data meeting the above args spec:
(s/exercise-fn `subs)
Or run 1000 tests (this may fail a few times, but keep running and it will work--this is due to the built-in generator not being able to satisfy the second part of the :args predicate; a custom generator can be written if needed):
(stest/check `subs)
Or, want to see if your app makes calls to subs that are invalid while it's running in real time? Just run this, and you'll get a spec exception if the function is called and the specs are not met:
(stest/instrument `subs)
We have not integrated this into our work flow yet, and can't in production since it's still alpha, but the first goal is to write specs. I'm putting them in the same namespace but in separate files currently.
I foresee our work flow being to run the tests for spec'd functions using this (found in the clojure spec guide):
(-> (stest/enumerate-namespace 'user) stest/check)
Then, it would be advantageous to turn on instrumenting for all functions, and run the app under load as we normally would test it, and ensure that "real world" data works.
You can also use s/conform to destructure complex data in functions themselves, or use s/valid? in pre- and post-conditions for functions. I'm not too keen on this, as it's overhead in a production system, but it is a possibility.
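A minimal sketch of those two uses. It assumes the clojure.spec.alpha namespace (at the time of writing the namespace was plain clojure.spec), and the ::range-args spec and safe-subs function are invented for the example:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: a start index with an optional end index.
(s/def ::range-args (s/cat :start nat-int? :end (s/? nat-int?)))

;; s/conform destructures valid data into a map keyed by the s/cat tags.
(s/conform ::range-args [1 5])   ; => {:start 1, :end 5}
(s/valid? ::range-args ["x"])    ; => false

;; s/valid? inside :pre/:post conditions (these run on every call,
;; which is the production overhead mentioned above).
(defn safe-subs [s start end]
  {:pre  [(s/valid? string? s)
          (s/valid? nat-int? start)
          (<= start end (count s))]
   :post [(s/valid? string? %)]}
  (subs s start end))

(safe-subs "abcd" 1 3)   ; => "bc"
```

A failed :pre or :post condition throws an AssertionError, so this only protects you while assertions are enabled.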
The sky's the limit, and we've just scratched the surface! Cool things coming in the next months and years with spec!

Managing number of brackets in clojure

I am new to clojure and the main thing I am struggling with is writing readable code. I often end up with functions like the one below.
(fn rep
  ([lst n]
   (rep (rest lst)
        n
        (take n
              (repeat (first lst)))))
  ([lst n out]
   (if
     (empty? lst)
     out
     (rep
       (rest lst) n
       (concat out (take n
                         (repeat
                           (first lst))))))))
with lots of build ups of end brackets. What are the best ways of reducing this or formatting it in a way that makes it easier to spot missing brackets?
Using Emacs's paredit mode (emulated in a few other editors too) means you're generally - unless you're copy/pasting with mouse/forced-unstructured selections - dealing with matched brackets/braces/parentheses and related indenting with no counting needed.
Emacs with https://github.com/technomancy/emacs-starter-kit (highly recommended!) has paredit enabled for clojure by default. Otherwise, see http://emacswiki.org/emacs/ParEdit
In addition to having an editor that supports brace matching, you can also try to make your code less nested. I believe that your function could be rewritten as:
(defn rep [coll n] (mapcat (partial repeat n) coll))
Of course this is more of an art (craft) than science, but some pointers (in random order):
Problems on 4clojure and their solutions by top users (visible after solving particular problems) - I believe that Chris Houser is there under the handle chouser
Speaking of CH - "The Joy of Clojure" is a very useful read
Browsing docs on clojure.core - there are a lot of useful functions there
-> and ->> threading macros are very useful for flattening nested code
stackoverflow - some of the brightest and most helpful people in the world answer questions there ;-)
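A quick sketch of two of the pointers above: the one-line rep built from core functions, and ->> flattening a nested expression. The sum-of-odd-squares example is made up purely to show the shape:

```clojure
;; The flattened rewrite of the function from the question.
(defn rep [coll n] (mapcat (partial repeat n) coll))

(rep [1 2 3] 2)   ; => (1 1 2 2 3 3)

;; Nested: has to be read inside-out, closing parens pile up at the end.
(reduce + (map #(* % %) (filter odd? (range 10))))   ; => 165

;; Threaded with ->>: reads top to bottom, one step per line.
(->> (range 10)
     (filter odd?)
     (map #(* % %))
     (reduce +))   ; => 165
```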
An editor that colors the parentheses is extremely helpful in this case. For example, here's what your code looks like in my vim editor (using vimclojure):
Since you didn't say which editor you use, you'll have to find the rainbow-coloring feature for your editor appropriately.
I cannot echo strongly enough how valuable it is to use paredit, or some similar feature in another editor. It frees you from caring at all about parens - they always match themselves up perfectly, and tedious, error-prone editing tasks like "change (foo (bar x) y) into (foo (bar x y))" become a single keystroke. For a week or so paredit will frustrate you beyond belief as it prevents you from doing things manually, but once you learn the automatic ways of handling parens, you will never be able to go back.
I recently heard someone say, and I think it's roughly accurate, that writing lisp without paredit is like writing java without auto-complete (possible, but not very pleasant).
(fn rep
  ([lst n]
   (rep lst n nil))
  ([lst n acc]
   (if-let [s (seq lst)]
     (recur (rest s) n (concat acc (repeat n (first s))))
     acc)))
that's more readable, i think. note that:
you should use recur when tail recursing
you should test with seq - see http://clojure.org/lazy
repeat can take a count
concat will drop nil, which saves repeating yourself
you don't need to start a new line for every open paren
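for reference, the small facts the notes above rely on can be checked at the REPL (just a sketch, nothing here is specific to the rep function):

```clojure
;; seq returns nil for an empty collection, so it doubles as the emptiness test
(seq [])              ; => nil
(seq [1 2])           ; => (1 2)

;; repeat takes an optional count, no need for (take n (repeat x))
(repeat 3 :a)         ; => (:a :a :a)

;; concat treats nil as an empty sequence, so nil works as the initial accumulator
(concat nil [1 2])    ; => (1 2)
```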
as for the parens - your editor/ide should take care of that. i am typing blind here, so forgive me if it's wrong...
[Rafał Dowgird's code is shorter; i am learning too...]
[updated:] after re-reading the "lazy" link, i think i have been handling lazy sequences incorrectly,
I'm not sure you can avoid all the brackets. However, what I've seen Lispers do is use an editor with paren matching/highlight and maybe even rainbow brackets: http://emacs-fu.blogspot.com/2011/05/toward-balanced-and-colorful-delimiters.html
Frankly, these are the kind of features that would be useful for non-Lisp editors too :)
Always use 100% recycled closing parentheses made from at least 75% post-consumer materials; then you don't have to feel so bad about using so many.
Format it however you like. It is the editor's job to display code in whatever style the reader prefers. I like the C-style hierarchical tree-shaped format with single brackets on their own lines (all the LISPers boil with rage at that :-)))))))))))))
But, I sometimes use this style:
(fn rep
([lst n]
(rep (rest lst)
n
(take n
(repeat (first lst)) ) ) ) )
which is an update on the traditional style in which brackets are spaced (log2 branch-level)
The reason I like space is that my eyesight is poor and I simply cannot read dense text. So to the angry LISPers who are about to tell me to do things the traditional way I say, well, everyone has their own way, relax, it's ok.
Can't wait for someone to write a decent editor in Clojure though, which is not a text editor but an expression editor**, then the issue of formatting goes away. I'm writing one myself but it takes time. The idea is to edit expressions by applying functions to them, and I navigate the code with a zipper, expression-by-expression, not by words or characters or lines. The code is represented by whatever display function you want.
** yes, I know there's emacs/paredit, but I tried emacs and didn't like it sorry.

Can I use the clojure 'for' macro to reverse a string?

This is a follow up to my question "Recursively reverse a sequence in Clojure".
Is it possible to reverse a sequence using the Clojure "for" macro? I'm trying to better understand the limitations and use-cases of this macro.
Here is the code I'm starting from:
(defn reverse-with-for [s]
  (for [c s] c))
Possible?
If so, I assume the solution may require wrapping the for macro in some expression that defines a mutable var, or that the body-expr of the for macro will somehow pass a sequence to the next iteration (similar to map).
The Clojure for macro works with arbitrary Clojure sequences.
These sequences may or may not expose random access like vectors do. So, in general case, you do not have access to the last element of a Clojure sequence without traversing all the way to it, which would make making a pass through it in reverse order not possible.
I'm assuming you had something like this in mind (Java-like pseudocode):
for (int i = n - 1; i >= 0; i--) {
    doSomething(array[i]);
}
In this example we know array size n in advance and we can access elements by its index. With Clojure sequences we don't know that. In Java it makes sense to do that with arrays and ArrayLists. Clojure sequences are however much more like linked lists - you have an element, and a reference to next one.
Btw, even if there were a (probably non-idiomatic)* way to do that, its time complexity would be something like O(n^2), which is just not worth the effort compared to the much easier solution in the linked post. That solution is quite elegant and idiomatic and runs in O(n); in fact, the official reverse has that implementation (it reduces the collection with conj onto an empty list).
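For reference, a minimal sketch of that idea: conj onto a list prepends, so reducing with conj onto an empty list reverses any sequence in one pass. As far as I know this is how clojure.core/reverse works; my-reverse is just an illustrative name. For vectors there is also rseq, which gives a reversed view without copying:

```clojure
;; One pass over the input; each conj onto a list is a cheap prepend.
(defn my-reverse [coll]
  (reduce conj () coll))

(my-reverse [1 2 3])            ; => (3 2 1)
(apply str (my-reverse "abc"))  ; => "cba"

;; Vectors support reverse traversal directly.
(rseq [1 2 3])                  ; => (3 2 1)
```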
EDIT:
General advice: don't try to do imperative programming in Clojure, it wasn't designed for it. Although many things may seem strange or counter-intuitive (as opposed to well-known idioms from imperative programming), once you get used to the functional way of doing things it is a lot, and I mean a lot, easier.
Specifically for this question: despite the same name, Java (and other C-like) for and Clojure for are not the same thing! The first is an actual loop - it defines flow control. The second is a comprehension - look at it conceptually as a higher-order function that takes a sequence and a function f to be applied to each of its elements, and returns another sequence of the f(element) values. Java for is a statement, it doesn't evaluate to anything; Clojure for (like everything else in Clojure) is an expression - it evaluates to the sequence of f(element) values.
Probably the easiest way to get the idea is to play with sequence functions library: http://clojure.org/sequences. Also, you can solve some problems on http://www.4clojure.com/. The first problems are very easy but they gradually get harder as you progress through them.
*As shown in Alexandre's answer the solution to the problem in fact is idiomatic and quite clever. Kudos for that! :)
Here's how you could reverse a string with for:
(defn reverse-with-for [s]
  (apply str
         (for [i (range (dec (count s)) -1 -1)]
           (get s i))))
Note that this code is mutation free. It's the same as:
(defn reverse-with-map [s]
  (apply str
         (map (partial get s) (range (dec (count s)) -1 -1))))
A simpler solution would be:
(apply str (reverse s))
First of all, as Goran said, for is not a statement - it is an expression, namely a sequence comprehension. It constructs sequences by iterating through other sequences. So, in the form it is meant to be used, it is a pure function (without side effects). for can be seen as an enhanced map infused with filter. Because of this it cannot be used to hold iteration state the way e.g. reduce does.
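A small sketch of what "map infused with filter" means in practice (the numbers are arbitrary): a for with a :when clause produces the same sequence as filter followed by map, and for can additionally nest several bindings.

```clojure
;; for with :when ...
(for [x (range 6) :when (odd? x)]
  (* x x))
;; => (1 9 25)

;; ... is the same as filter + map
(map #(* % %) (filter odd? (range 6)))
;; => (1 9 25)

;; Several bindings give nested iteration, still as a plain expression.
(for [x [1 2] y [:a :b]]
  [x y])
;; => ([1 :a] [1 :b] [2 :a] [2 :b])
```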
Secondly, you can express sequence reversal using for and mutable state, e.g. using an atom, which is a rough equivalent (not taking into account its concurrency properties) of a Java variable. But in doing so you face several problems:
You are breaking the main language paradigm, so you will definitely get worse-looking and worse-behaving code.
Since all Clojure mutable state cells are designed to be thread-safe, they all carry some kind of protection against conflicting concurrent modification, and there is no way to remove it. Consequently, you will get poorer performance characteristics.
In this particular case, like Goran said, sequences are one of the wide-used Clojure abstractions. For example, there are lazy sequences, which could be potentially infinite, so you just cannot walk them to the end. You certainly will have difficulties trying to work with such sequences with imperative techniques.
So don't do it, at least in Clojure :)
EDIT: I forgot to mention it. for returns lazy sequence, so you have to evaluate it in some way in order to apply all state mutations you do in it. Another reason not to do so :)
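Purely to illustrate the point (not as a recommendation), here is a sketch of the atom-based version. The function name is made up. The first form shows the laziness trap: the for is never realised, so the atom is never touched.

```clojure
;; The lazy seq returned by for is discarded unrealised: no swap! ever runs.
(let [acc (atom ())]
  (for [c "abc"] (swap! acc conj c))
  @acc)
;; => ()

;; Forcing the sequence with dorun makes the side effects happen.
(defn reverse-with-for-and-atom [s]
  (let [acc (atom ())]
    (dorun (for [c s] (swap! acc conj c)))  ; conj onto a list prepends
    (apply str @acc)))

(reverse-with-for-and-atom "abc")   ; => "cba"
```

If you really want side effects per element, doseq has the same binding syntax as for and is eager.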

How to rename an operation in Clojure?

In my list, addition, the operation + appears as #. How can I make this appear exactly as +? When I eval it, it should also work exactly the same as +.
I guess this would also apply in all kinds of functions in Clojure...
Thanks guys.
The # character is simply not a valid character in symbol names in Clojure (see this page for a list of valid characters) and while it might work sometimes (as it often will), it is not a good practice to use it. Also, it will definitely not work at the beginning of a symbol (actually a literal, you could still do (symbol "#"), though there's probably no point in that). As the Clojure reader currently stands, there's nothing to be done about it (except possibly hacking the reader open to have it treat # (that's '#' followed by a space) as the symbol # -- or simply + -- though that's something you really shouldn't do, so I almost feel guilty for providing a link to instructions on how to do it).
Should you want to alias a name to some other name which is legal in Clojure, you may find it convenient to use the clojure.contrib.def/defalias macro instead of plain def; this has the added benefit of setting metadata for you (and should handle macros, though it appears to have a bug which prevents that at this time, at least in 1.2 HEAD).
And in case you'd like to redefine some built-in names when creating your aliases... (If you don't, the rest of this may not be relevant to you.)
Firstly, if you work with Clojure 1.1 or earlier and you want to provide your own binding for a name from clojure.core, you'll need to use :refer-clojure when defining your namespace. E.g. if you want to provide your own +:
(ns foo.bar
(:refer-clojure :exclude [+]))
;; you can now define your own +
(defn + [x y]
(if (zero? y)
x
(recur (inc x) (dec y))))
;; examples
(+ 3 5)
; => 8
(+ 3 -1)
; => infinite loop
(clojure.core/+ 3 -1)
; => 2
The need for this results from Clojure 1.1 prohibiting rebinding of names which refer to Vars in other namespaces; ns-unmap provides a way around it appropriate for REPL use, whereas (:refer-clojure :exclude ...), (:use :exclude ...) and (:use :only ...) provide the means systematically to prevent unwanted names from being imported from other namespaces in the first place.
In current 1.2 snapshots there's a "last-var-in wins" policy, so you could do without the :refer-clojure; it still generates a compiler warning, though, and it's better style to use :refer-clojure, plus there's no guarantee that this policy will survive in the actual 1.2 release.
An operation is just a piece of code, assigned to a variable. If you want to rebind an operation, you just rebind that variable:
(def - +)
(- 1 2 3)
; => 6
The only problem here, is that the # character is special in Clojure. I'm not sure whether you can use # as a variable name at all, at the very least you will need to quote it when binding it and probably also when calling it:
(def # +)
; => java.lang.Exception: No dispatch macro for:
Unfortunately, I'm not familiar enough with Clojure to know how to quote it.
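A small sketch of what does work, under the assumption that any legal name is acceptable as the alias. The names plus and ops are invented for the example; the map shows one way to still use "#" as a label without making it a symbol:

```clojure
;; Alias + under a legal symbol name; the alias behaves exactly like +.
(def plus +)
(plus 1 2 3)          ; => 6
(identical? plus +)   ; => true

;; If "#" is only needed as a label, keep it as a string key instead.
(def ops {"#" +})
((ops "#") 1 2)       ; => 3
```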