Check for character equality in clojure

I'm new to Clojure and trying to compare lists of characters, and I've encountered some confusing behavior. Why is it difficult (impossible?) to compare two lists of characters for equality when it is straightforward to compare the concatenated string versions?
(identical? (\A \T \C \G) (\A \T \C \G) )
; ClassCastException java.lang.Character cannot be cast to clojure.lang.IFn user/eval672
;(NO_SOURCE_FILE:1)
(identical? '(\A \T \C \G) '(\A \T \C \G) )
;false
;convert to string
(identical? "ATCG" "ATCG" )
;true

From the REPL:
user=> (doc identical?)
-------------------------
clojure.core/identical?
([x y])
Tests if 2 arguments are the same object
If you know the Java programming language, identical? behaves like Java's == operator applied to references.
You can try this:
(= '(\A \T \C \G) '(\A \T \C \G) )
=> true
Again, in the REPL:
user=> (doc =)
-------------------------
clojure.core/=
([x] [x y] [x y & more])
Equality. Returns true if x equals y, false if not. Same as
Java x.equals(y) except it also works for nil, and compares
numbers and collections in a type-independent manner. Clojure's immutable data
structures define equals() (and thus =) as a value, not an identity,
comparison.
So no, it is not impossible, and definitely not difficult, to compare lists for equality in Clojure. The REPL is your best friend.
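To connect this back to the string version in the question, here is a small sketch of how = treats different representations of the same characters. The var names (from-string, from-list, from-vector) are only illustrative.

```clojure
;; A string can be viewed as a sequence of characters with seq.
(def from-string (seq "ATCG"))
(def from-list   '(\A \T \C \G))
(def from-vector [\A \T \C \G])

(= from-string from-list)        ;; true
(= from-list from-vector)        ;; true, lists and vectors compare as sequences
(= "ATCG" from-list)             ;; false, a String is not a sequential collection
(= "ATCG" (apply str from-list)) ;; true
```

So if one side is a string and the other a list, convert one of them first (seq or apply str) before comparing.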

As an addendum to @Chiron's answer: you have three kinds of equality in Clojure.
Clojure Equality or Equivalence
= (and == for numbers) is specific to Clojure and is the one you'll use the most. It performs a type-independent value comparison, which means that numbers of the same category (integers vs decimals) and similar data structures (e.g. lists, vectors and sets or maps and sequences of pairs) can be equal under that definition. It works with pure Java types too.
(= 5 5N) ;; true
(import 'java.util.ArrayList)
(= '(:a :b)
   (let [l (ArrayList.)
         _ (.add l :a)
         _ (.add l :b)]
     l)) ;; true
Java Equality
All Clojure types are Java classes under the hood, so all non-nil Clojure entities implement .equals, and for most Clojure types it behaves just like = would. But it is a less type-blind comparison; for instance, most Clojure numeric types are pure Java types, and for all Java numeric types .equals is type-specific.
(.equals 5 5N) ;; false
Beware: there are many pitfalls in writing equality methods in Java, and many library developers have fallen into them.
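A quick side-by-side of the comparisons discussed so far may help. This is only a sketch using small literals, not an exhaustive account of the numeric rules.

```clojure
;; = compares numbers within the same category (integers vs floating point)
(= 5 5N)        ;; true,  Long and BigInt are both integers
(= 5 5.0)       ;; false, integer vs floating point

;; == is numeric equivalence across categories
(== 5 5.0)      ;; true

;; .equals is Java's type-specific comparison
(.equals 5 5)   ;; true
(.equals 5 5N)  ;; false, Long vs clojure.lang.BigInt
```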
Object Identity
identical? behaves just like Java's == operator, which returns true if and only if both parameters are the same object instance; it's "address in memory" equality, so it is the strictest tool available. But sometimes that's exactly what you need.
Regarding the behaviors you encountered:
(identical? '(\A \T \C \G) '(\A \T \C \G)) returns false because two distinct list instances are created in the compiled code (the same would happen with vector, map and set literals); as a counter-example, this returns true:
(let [a '(\A \T \C \G)]
  (identical? a a))
The compiler doesn't treat the two list literals as the same object; it only sees two arguments and compiles each of them as a new list, hence two distinct list instances. Their contents are identical at runtime, though, because these character literals are cached; so are boolean literals and some (but not all) numeric literals:
(identical? \A \A)
(identical? true true)
(identical? 5 5) ;; Long
(identical? 0N 0N) ;; BigInt
(identical? (byte 6) (byte 6))
(identical? (short 7) (short 7))
(identical? (int 8) (int 8))
(identical? (biginteger 9) (biginteger 9))
(identical? (bigdec 10) (bigdec 10))
Most type caches have limits, though; none of the following are cached (and thus identical? returns false):
;; Java cache limitations
(identical? (char 128) ;; Characters with codepoint outside of 0..127
(char 128)) ;; i.e. non-"C0 Control/Basic Latin" Characters
(identical? 128 128) ;; Longs/Integers/Shorts outside of -128..127
(identical? 0. 0.) ;; Doubles and Floats are not cached
(identical? (biginteger 17)
(biginteger 17)) ;; BigIntegers outside of -16..16
(identical? (bigdec 11)
(bigdec 11)) ;; BigDecimals outside of 0..10
;; Clojure cache limitations
(identical? 1N 1N) ;; 0N is the only cached BigInt literal
(identical? 0M 0M) ;; BigDecimal literals are not cached
(identical? 1/2 1/2) ;; Ratios are not cached
;; Note that Ratio literals of the form X/1
;; and any other reducible to an integer
;; e.g. 10/10 are compiled as integer types
;; (Long/BigInt)
(identical? "ATCG" "ATCG") returns true because Clojure String literals are interned; so are keywords:
(identical? :foo :foo)
;; but not symbol literals
(identical? 'foo 'foo) ;; false
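Note that the interning argument applies to string literals. As a sketch of the difference, a string built at runtime with str is a fresh object until it is explicitly interned, whereas keywords are interned even when created at runtime:

```clojure
;; A string assembled at runtime is a new object, not the interned literal
(def built (str "AT" "CG"))

(= "ATCG" built)                    ;; true,  same value
(identical? "ATCG" built)           ;; false, different objects
(identical? "ATCG" (.intern built)) ;; true,  String.intern returns the canonical copy

;; Keywords are interned regardless of how they are created
(identical? :foo (keyword "foo"))   ;; true
```

This is one more reason to reach for = unless object identity is really what you mean.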

Related

Why do Clojure numbers end with "N" in the REPL?

So, I grabbed the latest numeric tower for a couple quick calculations and noticed that the numbers returned have "N" at the end. Why? What does it mean?
clojure.math.numeric-tower=> (expt 64 20)
1329227995784915872903807060280344576N
clojure.math.numeric-tower=> (expt 36 20)
13367494538843734067838845976576N

That is the literal form of BigInt:
user=> (type 1N)
clojure.lang.BigInt
versus, for example:
user=> (type 1)
java.lang.Long
or
user=> (type 1.0)
java.lang.Double
There's also the M suffix for BigDecimal.
user=> (type 1M)
java.math.BigDecimal
I'm not sure of all the rules for promotion to arbitrary precision (BigInt, BigDecimal). I think most of the "regular" math functions won't promote to arbitrary precision, but there are a few that do (e.g. +', -', *', inc', dec').
e.g. Regular + overflows:
user=> (+ Long/MAX_VALUE 1)
ArithmeticException integer overflow clojure.lang.Numbers.throwIntOverflow (Numbers.java:1388)
but +' promotes:
user=> (+' Long/MAX_VALUE 1)
9223372036854775808N
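For what it's worth, here is a small sketch of what type comes back in a few common cases. It is not a full statement of the promotion rules, only some cases that are easy to check at the REPL.

```clojure
(type (+ 1 2))                ;; java.lang.Long
(type (+' 1 2))               ;; java.lang.Long, no overflow so no promotion
(type (+' Long/MAX_VALUE 1))  ;; clojure.lang.BigInt
(type (+ 1N 2))               ;; clojure.lang.BigInt, a BigInt operand keeps the result a BigInt
(type (+ 1 2.5M))             ;; java.math.BigDecimal
(type (bigint 42))            ;; clojure.lang.BigInt, explicit conversion
```

So an N in the output usually means either an operand was already a BigInt or a promoting operation overflowed a long.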

Printing the binary value of a number in clojure

We can represent the number 12 as 2r001100 in clojure.
Is there a built-in function to print 2r001100 when given the number 12?

java.lang.Integer/toString will print numbers with arbitrary radix:
(Integer/toString 0xf2 2) ==> "11110010"
(Integer/toString 0xf2 16) ==> "f2"
(Integer/toString 0xf2 27) ==> "8q"

See cl-format:
user=> (require '[clojure.pprint :refer (cl-format)])
nil
user=> (cl-format nil "2r~6,'0',B" 12)
"2r001100"

format and printf generate and print strings using java.util.Formatter, but they don't do binary, so the best I could come up with is:
(fn [i] (str "2r" (Integer/toBinaryString i)))

All of these answers are good, but either won't support two's-complement for negative numbers (cl-format), or won't print out the correct number of bits based on the width of the data itself (e.g., calling Integer/toBinaryString or Integer/toString on a byte will not do what you want, especially for negative numbers).
Here's a solution that will correctly print out the exact bits of the underlying data:
(require '[clojure.string :as str])

(defn print-bits [b]
  (let [class-name (.getName (class b))
        is-byte (= "java.lang.Byte" class-name)
        ;; SIZE is the bit width declared by the boxed type (e.g. Long/SIZE is 64)
        num-bits (clojure.lang.Reflector/getStaticField class-name "SIZE")
        format-string (str "~" num-bits "'0b")
        ;; Byte has no toBinaryString, so bytes go through Integer
        bin-str-fn #(clojure.lang.Reflector/invokeStaticMethod
                      (if is-byte "java.lang.Integer" class-name)
                      "toBinaryString"
                      (to-array [%]))
        bit-string (if is-byte
                     (str/join (take-last 8 (bin-str-fn (Byte/toUnsignedInt b))))
                     (bin-str-fn b))]
    ;; left-pad with zeros up to the full width of the type
    (println (str (str/join (repeat (- num-bits (count bit-string)) \0))
                  bit-string))))

Test of extremes here, using (bit-shift-left 1 63), or 1000000000000000000000000000000000000000000000000000000000000000.
The cl-format solution provided gives me an integer overflow.
Integer/toBinaryString gives me Value out of range for int: -9223372036854775808.
But Long/toBinaryString gives me the string that I expected.
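Building on Long/toBinaryString, here is a minimal sketch that produces the 2r... form asked for in the question. The name binary-literal and its width parameter are my own, not a built-in. It assumes a non-negative number; negative values come out as 64-bit two's complement, which does not read back as the same number.

```clojure
(defn binary-literal
  "Returns n as a string in 2r notation, left-padded with zeros to width digits."
  [^long n ^long width]
  (let [bits (Long/toBinaryString n)
        pad  (max 0 (- width (count bits)))]
    (str "2r" (apply str (repeat pad \0)) bits)))

(binary-literal 12 6) ;; "2r001100"
(binary-literal 12 0) ;; "2r1100", no padding when the width is too small
```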

implement in Clojure integer? in scheme

I'm new to Clojure, and can't find an equivalent of integer? in Chez scheme 8.4, mainly for test cases as below:
(integer? 39.0)
=> #t
The function I've come up with so far is:
(defn actual-integer? [x] (or (= 0.0 (- x (int x))) (integer? x)))
Does it work when x is arbitrary number types or is there a better solution?
Thanks.

Well, strictly speaking 39.0 isn't an integer literal because it has the .0 part at the end. A simple implementation of the procedure would be:
(defn actual-integer? [x] (== (int x) x))
Notice that the == operator:
Returns non-nil if nums all have the equivalent value (type-independent), otherwise false
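One caveat with going through int: as far as I know, (int x) throws for values outside the 32-bit range (for example 1e10), and it will not help with BigDecimal. A possible sketch that avoids the cast is below; whole-number? is a hypothetical name, and the choice to treat infinities and NaN as non-integers is my own assumption.

```clojure
(defn whole-number?
  "True when x is a number with no fractional part, whatever its type."
  [x]
  (cond
    (integer? x) true                     ; Long, BigInt, Integer, Short, Byte, BigInteger
    (ratio? x)   false                    ; whole ratios such as 4/2 are read as integers
    (float? x)   (let [d (double x)]      ; Double or Float
                   (and (not (Double/isInfinite d))
                        (== d (Math/rint d))))
    (decimal? x) (zero? (.remainder ^BigDecimal x BigDecimal/ONE))
    :else        false))

(whole-number? 39.0)  ;; true
(whole-number? 39.5)  ;; false
(whole-number? 1e10)  ;; true
(whole-number? 1/2)   ;; false
(whole-number? 39.0M) ;; true
```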

How to match against a hierarchy with clojure.core.match?

Assume I have an ad-hoc hierarchy of entities represented as Clojure keywords, like this:
(def h
  (-> (make-hierarchy)
      (derive :feline :animal)
      (derive :cat :feline)
      (derive :lion :feline)
      (derive :tiger :feline)))
What would be the best way (preferably using clojure.core.match) to write a match-hierarchy function so that these contrived examples return expected results:
(defn example [x y]
(match-hierarchy h [x y]
[:cat :cat] 0
[:cat :feline] 1
[:cat :animal] 2
[:animal :animal] 3))
(example :cat :cat)
;=> 0
(example :cat :tiger)
;=> 1
(example :tiger :tiger)
;=> 3
i.e., match-hierarchy should return the value corresponding to the first clause whose elements are all either equal to the corresponding element of the value being matched, or its (direct or indirect) ancestors?
I'm happy with using something custom instead of make-hierarchy to create the hierarchies. I'm also happy with not using core.match if another option is available. However, I need it to work with objects of pre-existing classes, like numbers (e.g., I want to be able to say 3 is a :positive-integer is an :integer is a :number).
Background: I am writing a toy x86 assembler in Clojure, where I need to assemble my instruction based on its name and operands. Currently my code includes something like:
(match [(-> instr name keyword) (operand-type op1) (operand-type op2)]
;; other options
[:int :imm nil] (if (= op1 3)
{:opcode 0xcc}
{:opcode 0xcd, :immediate (imm 8 op1)})
(i.e., an INT instruction assembles to two bytes, 0xcd followed by the interrupt number, unless it's an INT 3 in which case it's only one byte, 0xcc). I find this somewhat ugly, however, and am looking for a more elegant approach. So instead I'd like to say something along the lines of
(asm-match [(-> instr name keyword) op1 op2]
;; other options
[:int 3 nil] {:opcode 0xcc}
[:int :imm nil] {:opcode 0xcd, :immediate (imm 8 op1)})

(match [(-> instr name keyword) op1 (operand-type op1) (operand-type op2)]
[:int 3 :imm nil] {:opcode 0xcc}
[:int _ :imm nil] {:opcode 0xcd, :immediate (imm 8 op1)})
Does this not work for you?
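If core.match is not a hard requirement, the hierarchy part of the question can be sketched with plain isa?, which already compares vectors element by element against a hierarchy. This is only a sketch under some assumptions: match-hierarchy here is a function rather than a macro, so every result expression is evaluated eagerly, and the hierarchy is built with derive's tag-then-parent argument order. It does not cover the requirement about plain numbers, since derive only accepts classes and named things as tags.

```clojure
(def h
  (-> (make-hierarchy)
      (derive :feline :animal)
      (derive :cat :feline)
      (derive :lion :feline)
      (derive :tiger :feline)))

(defn match-hierarchy
  "Returns the result paired with the first pattern that value isa? in hierarchy,
   or nil when no pattern matches. clauses are alternating pattern/result pairs."
  [hierarchy value & clauses]
  (let [[_ result] (first (filter (fn [[pattern _]] (isa? hierarchy value pattern))
                                  (partition 2 clauses)))]
    result))

(defn example [x y]
  (match-hierarchy h [x y]
    [:cat :cat]       0
    [:cat :feline]    1
    [:cat :animal]    2
    [:animal :animal] 3))

(example :cat :cat)     ;; 0
(example :cat :tiger)   ;; 1
(example :tiger :tiger) ;; 3
```

If the results are expensive or have side effects, the same idea could be wrapped in a macro that expands to a cond over isa? tests.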

Single predicate to test for "self-evaluating" atoms in Clojure

At the home site of Clojure, there is the following statement:
Strings, numbers, characters, true,
false, nil and keywords evaluate to
themselves.
Is there a single combined predicate that tests for any of these, combining string?, number?, char?, true?, false?, nil?, and keyword?, or should I just use (complement symbol?)?

Maybe I'm missing something, but you could use the following to test for any of those conditions and return true if one is true:
(defn self-eval?
  [x]
  (or (string? x)
      (number? x)
      (char? x)
      (keyword? x)
      (true? x)
      (false? x)
      (nil? x)))
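The same check can be written more compactly with some-fn, which combines predicates and returns the first truthy result. This is just a sketch; self-evaluating-atom? is a made-up name, and boolean is composed on top because some-fn returns nil rather than false when nothing matches.

```clojure
(def self-evaluating-atom?
  (comp boolean
        (some-fn string? number? char? keyword? true? false? nil?)))

(self-evaluating-atom? "ATCG") ;; true
(self-evaluating-atom? nil)    ;; true
(self-evaluating-atom? false)  ;; true
(self-evaluating-atom? 'foo)   ;; false
(self-evaluating-atom? [1 2])  ;; false
```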

It's easy enough to write a macro that asks "does the given expression evaluate to itself". In fact this is a good example of tasks that can only be done with a macro because they need to see the argument both evaluated and unevaluated.
user> (defmacro selfp [a] `(= ~a (quote ~a)))
#'user/selfp
user> (selfp 1)
true
user> (selfp +)
false
user> (selfp [1 2])
true
user> (selfp '(+ 1 2 3))
false
While strings, numbers, characters, keywords, and the booleans are all self-evaluating, other things such as [1 2] are as well, so this may not be a useful test in general.

Another option is to create a function that uses a set:
(defn myclassifier? [x]
  (let [types-I-care-about #{java.lang.String ...}]
    (if (types-I-care-about (type x))
      true
      false)))
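Another option, which may have better performance, is to use Java's dynamism:
(defprotocol IMyClassifier
  (myclassifier? [x]))
(extend-type Object
  IMyClassifier
  (myclassifier? [x]
    (let [c (.getClass x)]
      ;; c is already the class, so look it up in the set directly
      (if (types-I-care-about c)
        (do
          ;; extend the concrete class so later calls skip the set lookup
          (extend-type (.getClass x)
            IMyClassifier
            (myclassifier? [x] true))
          true)
        false))))
where types-I-care-about is a set of types you care about.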
