I'm looking for guidance on when to use Clojure's BigInt versus Java's BigInteger in Clojure. Both work just fine, and I am assuming that the main reason to use BigInt is to take advantage of operators like + and =, which with BigInteger have to be accessed via the Java instance methods .add and .equals, for instance. But there are a few operations, such as isProbablePrime, that I can only access from BigInteger.
It seems pretty easy to shift from BigInt to BigInteger or vice versa, but the presence of both makes the use-cases unclear for me. My knee-jerk reaction is just to stick with BigInteger in the absence of clear criteria since some of the suggested usages seem not to work. From clojuredocs here:
user=> (def x (bigint 97))
user=> (.isProbablePrime x 1)
IllegalArgumentException No matching method found: isProbablePrime for class
clojure.lang.BigInt clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:53)
In "Clojure Programming" by C. Emerick et al., p. 428, there is a sidebar topic, "Why Does Clojure Have Its Own BigInt Class When Java Already Provides One in BigInteger?"
They note two reasons to prefer BigInt to Java's BigInteger. First, the latter's .hashCode implementation is inconsistent with that of Long (the same number expressed in each type gives a different hash value). This is generally not what you want when comparing equivalent values in e.g. hash maps.
The other reason is that BigInts are optimized to use primitive types when possible, so performance should be better for many cases.
I would use Clojure's numeric types unless you have a good reason not to (your use of .isProbablePrime suggests you might have a good enough reason).
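As a rough illustration of that advice (a sketch only, assuming a reasonably recent Clojure; the Var names are made up for the example), a value can stay a BigInt for ordinary arithmetic and be converted with biginteger just at the point where a Java-only method is needed:

```clojure
;; Sketch: keep values as BigInt and convert only where a
;; java.math.BigInteger method is required.
(def x (bigint 97))

(class x)   ; clojure.lang.BigInt
(+ x 1)     ; ordinary Clojure arithmetic works on BigInt
(= x 97)    ; a BigInt compares equal to the long 97

;; isProbablePrime lives on BigInteger, so convert first.
(def x-prime? (.isProbablePrime (biginteger x) 10))

;; Converting back is equally short.
(def y (bigint (biginteger x)))
```

The conversion in either direction is a single call, so reaching for a BigInteger method now and then does not force the rest of the code off Clojure's numeric types.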
Related
I've been learning Clojure and was a good way through a book on it when I realized how much I'm still struggling to interpret the code. What I'm looking for is the abstract structure, interface, or rules Clojure uses to parse code. I think it looks something like:
(some-operation optional-args)
optional-args can be nearly anything and that's where I start getting confused.
(operation optional-name-string [vector of optional args]) would equal (defn newfn [argA, argB])
I think this pattern holds for all lists () but with so much flexibility and variation in Clojure, I'm not sure. It would be really helpful to see the rules the interpreter follows.
You are not crazy. Sure, it's easy to point out how "easy" ("simple"? but that's another discussion) Clojure syntax is, but there are two things for a new learner to be aware of that are not pointed out very clearly in beginning tutorials and that greatly complicate understanding what you are seeing:
Destructuring. Spend some quality time with guides on destructuring in Clojure. I will say that this adds complexity to the language and is not dissimilar from "*args" and "**kwargs" arguments in Python or from the use of the "..." spread operator in JavaScript. They are all complicated enough to require some dedicated time to read. This relates to the optional-args you reference above.
Macros and metaprogramming. In the some-operation you reference above, you wish to "see the rules the interpreter follows". In the majority of cases it is a function, but Clojure gives you no indication of whether you are looking at a function, a macro, or a special form. In the standard library, you will just need to know some standard macros and special forms and how they affect the syntax they headline (e.g. if, defn, etc.). For included libraries, there will typically be a small set of macros that are core to understanding that library. Any macro will modify, dare I say complicate, the syntax in the parens you are looking at, so be on your toes.
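Both points can be poked at from a REPL. The following is a small sketch (the function describe and its arguments are invented for illustration): the first part shows sequential and map destructuring in a parameter vector, the second shows how to ask Clojure whether the head of a form is a macro or a special form.

```clojure
;; Destructuring: the parameter vector takes its arguments apart.
;; [x y & more] splits a sequence; {:keys [label] :or {...}} reads a map.
(defn describe
  [[x y & more] {:keys [label] :or {label "pt"}}]
  (str label ": " x "," y " +" (count more)))

(describe [1 2 3 4] {:label "p"})   ; => "p: 1,2 +2"
(describe [1 2] {})                 ; => "pt: 1,2 +0"

;; Macros: inspect what a form turns into before it is evaluated.
(macroexpand-1 '(when true :a))     ; => (if true (do :a))

;; Asking what kind of thing heads a form.
(:macro (meta #'when))              ; => true, when is a macro
(:macro (meta #'defn))              ; => true, defn is a macro
(special-symbol? 'if)               ; => true, if is a special form
```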
Clojure is fantastic and easy to learn but those two points are not to be glossed over IMHO.
Before you start coding with Clojure, I highly recommend studying functional programming and Lisp. In Clojure, everything is written in prefix form: when you want to run a specific function, you call it and then feed it some arguments. For example, 1+2+3 will be (+ 1 2 3) in Clojure. In other words, every function you call goes at the start of a parenthesized form, and all of its arguments follow the function name.
If you define a function, you may do as follows:
(defn newfunc [a1 a2]
  (+ 100 a1 a2))
Here newfunc adds 100, a1, and a2. When you call it, you should do this:
(newfunc 1 2)
and the result will be 103.
In the first example, + is a function, so we put it at the beginning of the parenthesized form.
Clojure is a beautiful world full of simplicity. Please learn it deeply.
A question was posted about chained comparison operators and how they are interpreted in different languages.
Chaining comparison operators means that (x < y < z) would be interpreted as ((x < y) && (y < z)) instead of as ((x < y) < z).
The comments on that question show that Python, Perl 6, and Mathematica support chaining comparison operators, but what other languages support this feature and why is it not more common?
A quick look at the Python documentation shows that this feature has been available since at least 1996. Is there a reason more languages have not added this syntax?
A statically typed language would have problems with type conversion, but are there other reasons this is not more common?
It should be more common, but I suspect it is not because it makes parsing languages more complex.
Benefits:
Upholds the principle of least surprise
Reads like math is taught
Reduces cognitive load (see previous 2 points)
Drawbacks:
Grammar is more complex for the language
Special case syntactic sugar
As to why not, my guesses are:
Language author(s) didn't think of it
Is on the 'nice to have' list
Was decided that it wasn't useful enough to justify implementing
The benefit is too small to justify complicating the language.
You don't need it that often, and it is easy to get the same effect cleanly with a few characters more.
Scheme (and probably most other Lisp family languages) supports multiple comparison efficiently within its grammar:
(< x y z)
This can be considered an ordinary function application of the < function with three arguments. See 6.2.5 Numerical Operations in the specification.
Clojure supports chained comparison too.
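A quick sketch of what that looks like in Clojure (the Var names x, y, and z are just placeholders): the comparison functions are variadic, so the chained form is an ordinary call rather than special syntax.

```clojure
(def x 1)
(def y 2)
(def z 3)

(< x y z)               ; => true, each argument is less than the next
(< x z y)               ; => false, 3 is not less than 2
(<= 1 1 2)              ; => true

;; The same test spelled out with two-argument comparisons.
(and (< x y) (< y z))   ; => true
```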
Chained comparison is a feature of BCPL, since the late 1960s.
I think ICON is the original language to have this, and in ICON it falls out of the way that booleans are handled as special 'fail' tags with all other values being treated as true.
So I want to use java.awt.Color for something, and I'd like to be able to write code like this:
(import 'java.awt.Color)
(= Color/BLUE (- Color/WHITE Color/RED Color/GREEN))
Looking at the core implementation of -, it talks specifically about clojure.lang.Numbers, which to me implies that there is nothing I can do to 'hook' into the core implementation and extend it.
Looking around on the Internet, there seem to be two different things people do:
Write their own defn - function, which only knows about the data type they're interested in. To use you'd probably end up prefixing a namespace, so something like:
(= Color/BLUE (scdf.color/- Color/WHITE Color/RED Color/GREEN))
Or alternatively useing the namespace and use clojure.core/- when you want number math.
Code a special case into your - implementation that passes through to clojure.core/- when your implementation is passed a Number.
Unfortunately, I don't like either of these. The first is probably the cleanest, as the second makes the presumption that the only things you care about doing maths on are their new datatype and numbers.
I'm new to Clojure, but shouldn't we be able to use Protocols or Multimethods here, so that when people create / use custom types they can 'extend' these functions so they work seamlessly? Is there a reason that +, - etc. don't support this? (Or do they? They don't seem to from my reading of the code, but maybe I'm reading it wrong.)
If I want to write my own extensions to common existing functions such as + for other datatypes, how should I do it so it plays nicely with existing functions and potentially other datatypes?
It wasn't exactly designed for this, but core.matrix might be of interest to you here, for a few reasons:
The source code provides examples of how to use protocols to define operations that work with various different types. For example, (+ [1 2] [3 4]) => [4 6]. It's worth studying how this is done: basically the operators are regular functions that call a protocol, and each data type provides an implementation of the protocol via extend-protocol.
You might be interested in making java.awt.Color work as a core.matrix implementation (i.e. as a 4D RGBA vector). I did something similar with BufferedImage here: https://github.com/clojure-numerics/image-matrix. If you implement the basic core.matrix protocols, then you will get the whole core.matrix API to work with Color objects, which will save you a lot of work implementing different operations.
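The pattern described in the first point can be sketched in a few lines. This is not core.matrix's actual code; the protocol Addable and the functions add and plus are hypothetical names chosen for the example. It only shows the shape: a regular variadic function that delegates to a protocol, with each type supplying its own implementation through extend-protocol.

```clojure
;; Hypothetical protocol, not part of core.matrix.
(defprotocol Addable
  (add [a b] "Return the sum of a and b."))

(extend-protocol Addable
  Number
  (add [a b] (+ a b))
  clojure.lang.IPersistentVector
  (add [a b] (mapv add a b)))   ; element-wise, recursing into nested vectors

;; The user-facing operator is an ordinary function that calls the protocol.
(defn plus
  ([a] a)
  ([a b] (add a b))
  ([a b & more] (reduce add (add a b) more)))

(plus [1 2] [3 4])   ; => [4 6]
(plus 1 2 3)         ; => 6
```

Supporting another type, such as a color, would then mean adding one more clause to extend-protocol rather than editing plus.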
The probable reason for not basing the arithmetic operations in core on protocols (and making them work only on numbers) is performance. A protocol call requires an additional lookup to choose the correct implementation of the desired function. From a design point of view it may feel nice to have protocol-based implementations and extend them whenever required, but when you have a tight loop that does these operations many times (a very common use case with arithmetic operations) you will start to feel the cost of the additional lookup that happens at runtime on each operation.
If you have separate implementations for your own data types (e.g. color/-) in their own namespace, then they will be more performant because the function is called directly, and it also makes things more explicit and customizable for specific cases.
Another issue with these functions is their variadic nature (i.e. they can take any number of arguments). This is a serious issue for a protocol implementation, as protocol dispatch only considers the type of the first argument.
You can have a look at algo.generic.arithmetic in algo.generic. It uses multimethods.
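To give a feel for the multimethod approach without claiming anything about algo.generic's actual API, here is a hand-rolled sketch. The multimethod sub and the record RGB (a stand-in for java.awt.Color) are hypothetical names. Unlike a protocol, the dispatch function can look at the types of both arguments.

```clojure
;; Hypothetical stand-in for a color type.
(defrecord RGB [r g b])

;; Dispatch on the classes of both arguments.
(defmulti sub (fn [a b] [(class a) (class b)]))

;; Matches any pair of numbers, because isa? checks vectors element-wise
;; against the class hierarchy (Long is a Number, and so on).
(defmethod sub [Number Number] [a b]
  (- a b))

(defmethod sub [RGB RGB] [a b]
  (->RGB (- (:r a) (:r b))
         (- (:g a) (:g b))
         (- (:b a) (:b b))))

(sub 5 3)                               ; => 2
(sub (->RGB 255 255 255) (->RGB 255 0 0))
;; => an RGB with r 0, g 255, b 255
```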
I'm trying to use protocols to create an engineering number type (a "knumber"), so I can say (+ "1k" "2Meg") and get something like "2.001Meg". I should be able to get the floating point value from the knumber like so: (:val my-knumber), but normally the printer should display the string, which is also accessible like so: (:string my-knumber). This number will support all the usual p, n, u, m, k, Meg, G suffixes, and convert as required among them, such as (/ "1Meg" "1G") -> "1m". I want to be able to pass this to any function which expects a number.
Anyway, can someone suggest a strategy for this? I think I need to use protocols. I currently have a (defrecord knumber [val string]) but I'm not sure what's next.
What protocols do Clojure numbers satisfy? I'm thinking I need to extend some existing protocols/interfaces for this.
Thanks
I think your strategy should probably be as follows:
Define the record KNumber as something like (defrecord knumber [value unit-map])
Make unit-map a map of units to integer exponents (you are going to want units like m/s^2 if these are engineering numbers, right?). It might look something like {"m" 1 "s" -2}.
Have KNumber behave as a java.lang.Number so that you can use it with other mathematical functions that already exist in Clojure. You'll need to implement doubleValue, longValue etc. Note that java.lang.Number is an abstract class rather than an interface, so a defrecord cannot extend it directly; this step needs something like proxy or gen-class.
Define a protocol NumberWithUnits that you can extend to both KNumbers and normal clojure numbers. At a minimum it should have methods (numeric-value [number]) and (get-units [number])
Then define your mathematical functions +, *, - etc. in your own namespace that operate on anything that implements the NumberWithUnits protocol and return a KNumber.
Regarding different unit scales (e.g. "m" vs. "km") I would suggest standardising on a single scale for internal representation for each unit type (e.g. "m" for distances) but providing options for conversion to other unit scales for input/output purposes.
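A minimal sketch of the record-plus-protocol part of this strategy might look as follows. It deliberately leaves out the java.lang.Number step and unit-scale conversion, and the multiplication function k* is a hypothetical name; only KNumber, NumberWithUnits, numeric-value and get-units come from the steps above.

```clojure
(defrecord KNumber [value unit-map])

(defprotocol NumberWithUnits
  (numeric-value [number] "The plain numeric value.")
  (get-units [number] "Map of unit name to integer exponent."))

(extend-protocol NumberWithUnits
  KNumber
  (numeric-value [n] (:value n))
  (get-units [n] (:unit-map n))
  Number                          ; plain numbers are dimensionless
  (numeric-value [n] n)
  (get-units [n] {}))

;; Hypothetical multiply: values multiply, unit exponents add,
;; and units whose exponent cancels to zero are dropped.
(defn k* [a b]
  (->KNumber (* (numeric-value a) (numeric-value b))
             (into {} (remove (comp zero? val)
                              (merge-with + (get-units a) (get-units b))))))

(k* (->KNumber 10 {"m" 1 "s" -1}) (->KNumber 2 {"s" 1}))
;; => KNumber with value 20 and unit-map {"m" 1}
(k* 2 (->KNumber 1.5 {"m" 1}))
;; => KNumber with value 3.0 and unit-map {"m" 1}
```

Addition and subtraction would additionally need to check that both operands carry the same unit-map before combining the values.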
The frinj library is a Clojure library for calculations with units. Looking into the source will probably give you some nice ideas.
As a side project I'm creating a Clojure DSL for image synthesis (clisk).
I'm a little unsure on the best approach to function naming where I have functions in the DSL that are analogous to functions in Clojure core, for example the + function or something similar is needed in my DSL to additively compose images / perform vector maths operations.
As far as I can see it there are a few options:
Use the same name (+) in my own namespace. Looks nice in DSL code but will override the clojure.core version, which may cause issues. People could get confused.
Use the same name but require it to be qualified (my-ns/+). Avoids conflicts, but prevents people from useing the namespace for convenience and looks a bit ugly.
Use a different short name e.g. (v+). Can be used easily and avoid clashes, but the name is a bit ugly and might prove hard to remember.
Use a different long name e.g. (vector-add). Verbose but descriptive, no clashes.
Exclude clojure.core/+ and redefine with a multimethod + (as georgek suggests).
Example code might look something like:
(show (v+ [0.9 0.6 0.3]
          (dot [0.2 0.2 0]
               (vgradient (vseamless 1.0 plasma)))))
What is the best/most idiomatic approach?
first, the repeated appearance of operators in an infix expression requires a nice syntax, but for a lisp, with prefix syntax, i don't think this is as important. so it's not such a crime to have the user type a few more characters for an explicit namespace. and clojure has very good support for namespaces and aliasing. so it's very easy for a user to select their own short prefix: (x/+ ...) for example.
second, looking at the reader docs there are not many non-alphanumeric symbols to play with, so something like :+ is out. so there's no "cute" solution - if you choose a prefix it's going to have to be a letter. that means something like x+ - better to let the user choose an alias, at the price of one more character, and have x/+.
so i would say: ignore core, but expect the user to (:require .... :as ...). if they love your package so much they want it to be default then they can (:use ...) and handle core explicitly. but you choosing a prefix to operators seems like a poor compromise.
(and i don't think i have seen any library that does use single letter prefixes).
one other possibility is to provide the above and also a separate package with long names instead of operators (which are simply def'ed to match the values in the original package). then if people do want to (:use ...) but want to avoid clashes, they can use that (but really what's the advantage of (vector-add ...) over (vector/+ ...)?)
and finally i would check how + is implemented, since if it already involves some kind of dispatch on types then georgek's comment makes a lot of sense.
(by "operator" above i just mean single-character, non-alphanumeric symbol)
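a rough sketch of the alias approach, with made-up namespace names (demo.vec, demo.app). the library excludes clojure.core/+ and defines its own; the client reaches it through a short alias and keeps the core + unqualified. to stay runnable as a single file this uses alias, where separate files would use (:require [demo.vec :as v]) in the ns form.

```clojure
;; Library namespace: define + without touching clojure.core/+.
(ns demo.vec
  (:refer-clojure :exclude [+]))

(defn +
  "Element-wise sum of two vectors."
  [a b]
  (mapv clojure.core/+ a b))

;; Client namespace. With separate files this would be
;; (ns demo.app (:require [demo.vec :as v])).
(ns demo.app)
(alias 'v 'demo.vec)

(def summed (v/+ [1 2] [3 4]))   ; the library's +, => [4 6]
(def plain  (+ 1 2))             ; still clojure.core/+, => 3
```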