What are Clojure Intrinsics - clojure

Browsing the Clojure source code I came across an Intrinsics.java file. It looks like it is a mapping of some clojure runtime functions to JVM opcodes.
However, I am not sure where they get applied. The following code
(def ^:const pi 3.141592)
(defn circumference [^double r] (* r 2.0 pi))
compiles to
public static java.lang.Object invokeStatic(double r);
0 dload_0 [r]
1 ldc2_w <Double 2.0> [14]
4 dmul
5 ldc2_w <Double 3.141592> [16]
8 invokestatic clojure.lang.Numbers.multiply(double, double) : double [23]
11 invokestatic java.lang.Double.valueOf(double) : java.lang.Double [29]
14 areturn
and I see that clojure.lang.Numbers.multiply(double, double) : double was not replaced with dmul.
How exactly are intrinsics used? Thank you.

Currently intrinsics are only used where the expression being compiled is meant to remain unboxed. Thus the (* r 2.0) multiplication in your example does receive the intrinsic treatment (resulting in the one dmul in your invokeStatic), but the outer (* #<result of (* r 2.0)> 3.141592) multiplication does not, because its result is boxed for the Object return value.
You can get the clojure.lang.Numbers.multiply(double, double) : double intrinsic applied to the multiplication by pi as well by declaring the return type as double.
For example this:
(def ^:const pi 3.141592)
(defn circumference ^double [^double r] (* r 2.0 pi))
compiles to the following:
public static double invokeStatic(double r);
0 dload_0 [r]
1 ldc2_w <Double 2.0> [14]
4 dmul
5 ldc2_w <Double 3.141592> [16]
8 dmul
9 dreturn

Related

Why does my hash function fail with "ArithmeticException integer overflow" even when using unchecked math

I am using the following function to try to create a 64-bit hash of a string, but it is failing with an ArithmeticException even though I am using the "unchecked" version of the arithmetic operators.
user> (reduce (fn [h c]
                (unchecked-add (unchecked-multiply h 31) (long c)))
              1125899906842597
              "hello")
ArithmeticException integer overflow clojure.lang.Numbers.throwIntOverflow (Numbers.java:1388)
What am I doing wrong here?
Here is a hint: for some reason the first parameter of the function is not treated as a primitive long. Adding a type hint solves the problem:
user> (reduce (fn [^long h c]
                (unchecked-add (unchecked-multiply h 31) (long c)))
              1125899906842597
              "hello")
7096547112155234317
Update:
Moreover, it looks like the exception comes from unchecked-multiply:
user> (reduce (fn [h c]
                (unchecked-add (unchecked-multiply ^long h 31) (long c)))
              1125899906842597
              "hello")
7096547112155234317
I will do some additional research and update here if I find anything new.
Update 2:
OK, this is what I found out.
Looking at the Clojure source at https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Numbers.java
we can see the following.
Our case,
static public Number unchecked_multiply(Object x, long y){return multiply(x,y);}
leads to:
static public Number multiply(Object x, long y){
    return multiply(x,(Object)y);
}
then:
static public Number multiply(Object x, Object y){
    return ops(x).combine(ops(y)).multiply((Number)x, (Number)y);
}
So in the end it calls the multiply method of the LongOps inner class:
final public Number multiply(Number x, Number y){
    return num(Numbers.multiply(x.longValue(), y.longValue()));
}
which finally leads to a plain checked multiply:
static public long multiply(long x, long y){
    if (x == Long.MIN_VALUE && y < 0)
        return throwIntOverflow();
    long ret = x * y;
    if (y != 0 && ret/y != x)
        return throwIntOverflow();
    return ret;
}
Kaboom!
I don't know whether this is a bug or the desired behavior, but it looks really weird to me.
The only thing I can advise is to always type-hint your values when using unchecked math in Clojure.
You can get the behaviour you want by avoiding the function calls:
(loop [h 1125899906842597
       cs "hello"]
  (let [c (first cs)]
    (if c
      (recur (unchecked-add (unchecked-multiply h 31) (long c))
             (rest cs))
      h)))
;7096547112155234317
Why this is so, I don't know.
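To make the difference between the two code paths concrete, here is a minimal sketch in Python rather than Clojure. Python integers never overflow, so 64-bit wraparound is emulated by masking; the names hash64, checked_multiply and to_signed64 are made up for this illustration and are not part of Clojure. The checked branch is only meant to model the behaviour of the checked multiply(long, long) quoted above, not to reproduce it exactly.

```python
MASK64 = (1 << 64) - 1  # keeps the low 64 bits
LONG_MIN, LONG_MAX = -(1 << 63), (1 << 63) - 1


def to_signed64(n):
    """Reinterpret the low 64 bits of n as a signed two's-complement long."""
    n &= MASK64
    return n - (1 << 64) if n > LONG_MAX else n


def checked_multiply(x, y):
    """Multiply and raise if the result does not fit in a signed 64-bit long."""
    result = x * y
    if not LONG_MIN <= result <= LONG_MAX:
        raise OverflowError("integer overflow")
    return result


def hash64(s, seed=1125899906842597, checked=False):
    """31-based string hash using either wrapping or checked multiplication."""
    h = seed
    for ch in s:
        product = checked_multiply(h, 31) if checked else to_signed64(h * 31)
        h = to_signed64(product + ord(ch))  # the addition always wraps
    return h


print(hash64("hello"))  # 7096547112155234317
```

With checked=True the same call raises on the third character, which is the situation the untyped h ends up in: the intermediate product no longer fits in 64 bits, and only the wrapping path produces the expected hash.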

Creating a pixel given r g b in Clojure

Using imagez I can get a pixel from an image as [r g b]. Using this colour wheel I have verified that this extraction part is almost certainly working. This is the imagez code that does the extraction:
(defn components-rgb
  "Return the RGB components of a colour value, in a 3-element vector of long values"
  ([^long rgb]
   [(bit-shift-right (bit-and rgb 0x00FF0000) 16)
    (bit-shift-right (bit-and rgb 0x0000FF00) 8)
    (bit-and rgb 0x000000FF)]))
I need to do the opposite of this extraction. Here are some examples of the 'colour value' (or pixel) being extracted:
First pixel: -6700606 (in HEX: FFFFFFFFFF99C1C2)
Last pixel: -11449516 (in HEX: FFFFFFFFFF514B54)
First as colour: [153 193 194] (in HEX: 99 C1 C2)
Last as colour: [81 75 84] (in HEX: 51 4B 54)
Doing the opposite would mean that [153 193 194] becomes -6700606. This question has been asked before on SO, for example here. Here are two of my attempts which do not work:
;rgb = 0xFFFF * r + 0xFF * g + b
(defn my-rgb-1 [[r g b]]
  (+ (* 0xFFFF r) (* 0xFF g) b))
;int rgb = ((r&0x0ff)<<16)|((g&0x0ff)<<8)|(b&0x0ff);
(defn my-rgb-2 [[r g b]]
  (let [red (bit-shift-left 16 (bit-and r 0x0FF))
        green (bit-shift-left 8 (bit-and g 0x0FF))
        blue (bit-and b 0x0FF)]
    (bit-or red green blue)))
image --1--> extracted-pixel --2--> rgb colour --3--> to-write-pixel --4--> image
Steps 1 and 2 are working, but step 3 is not. If step 3 were working extracted-pixel would be the same as to-write-pixel.
There is an rgb function in imagez, but it too does not work for me. (The author updated me to say it is not supposed to. See here). I might also add that the imagez function get-pixel is first used to get the pixel (step 1), followed by components-rgb (step 2) as shown above.
Take a look here where I have outlined the steps.
If I understand your problem correctly, the issue is this:
"First pixel" in hex is 0xFFFFFFFFFF99C1C2
you convert it to decimal RGB via `components-rgb` = [153 193 194]
you convert [153 193 194] back to an int via `rgb` and get
0x99C1C2
which is by all means correct.
Since Clojure uses long by default, the only thing you need to do
is use unchecked-int to truncate the long to an int, like so:
(unchecked-int 0xFFFFFFFFFF99C1C2)
-6700606
which is what you want - right ?
I had the arguments to bit-shift-left in the wrong order. So this is a basic version of the improved my-rgb-2:
(defn my-rgb-2 [[r g b]]
  (let [red (bit-shift-left r 16)
        green (bit-shift-left g 8)
        blue b]
    (bit-or red green blue)))
I have left off (bit-and a 0x0FF) (where a is r, g or b) simply because it is only necessary if the number is more than 255, which would not be a proper value anyway. For this answer it doesn't really matter whether it is there or not.
The next thing to do is fill the leading hex digits with F (all 4 bits set). This nearly gets me there:
(defn my-rgb-2 [[r g b]]
  (let [red (bit-shift-left (bit-and r 0x0FF) 16)
        green (bit-shift-left (bit-and g 0x0FF) 8)
        blue (bit-and b 0x0FF)]
    (bit-or 0xFFFFFFFFF000000 (bit-or red green blue))))
Unfortunately, if I put one more F on the last line I get:
Exception in thread "main" java.lang.IllegalArgumentException: bit operation not supported for: class clojure.lang.BigInt
But that can be fixed by borrowing from the answer @birdspider gave:
(defn my-rgb-2 [[r g b]]
  (let [red (bit-shift-left (bit-and r 0x0FF) 16)
        green (bit-shift-left (bit-and g 0x0FF) 8)
        blue (bit-and b 0x0FF)]
    (bit-or (bit-or red green blue) (unchecked-int 0xFFFFFFFFFF000000))))
The imagez library now has this function, but done as a macro for performance reasons.
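As a cross-check of the arithmetic, here is a small sketch in Python rather than Clojure. It assumes the top byte of the pixel is a fully opaque alpha channel (0xFF), which is what the sample values in the question suggest; pack_argb is a hypothetical name, not an imagez function. Python integers are unbounded, so the final step converts the unsigned 32-bit pattern to the signed value a JVM int would hold.

```python
def pack_argb(r, g, b, alpha=0xFF):
    """Pack 0-255 components into a signed 32-bit ARGB value."""
    value = (
        ((alpha & 0xFF) << 24)
        | ((r & 0xFF) << 16)
        | ((g & 0xFF) << 8)
        | (b & 0xFF)
    )
    # Reinterpret the unsigned 32-bit pattern as a signed int.
    return value - (1 << 32) if value >= (1 << 31) else value


def components_rgb(pixel):
    """Inverse direction, mirroring the components-rgb function quoted earlier."""
    return [(pixel & 0x00FF0000) >> 16, (pixel & 0x0000FF00) >> 8, pixel & 0x000000FF]


print(pack_argb(153, 193, 194))  # -6700606
print(pack_argb(81, 75, 84))     # -11449516
```

Both sample pixels from the question round-trip: unpacking -6700606 gives [153, 193, 194], and packing that vector gives -6700606 again.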

Transform Incanter matrix to nested vector

Consider a function that outputs an Incanter matrix.
Here is an example matrix containing output from the function:
A 6x4 matrix
-4.77e-01 8.45e-01 1.39e-01 -9.83e-18
8.55e-01 2.49e-01 1.33e-01 2.57e-17
-2.94e-03 6.60e-03 -9.63e-01 1.16e-16
...
6.64e-09 2.55e-08 1.16e-07 -1.11e-16
-1.44e-01 -3.33e-01 1.32e-01 -7.07e-01
-1.44e-01 -3.33e-01 1.32e-01 7.07e-01
I'd like to continue analyzing the rows of the matrix, which represent points. The function that I want to feed the Incanter matrix to takes nested vectors as inputs.
So the function would need the above data in the form
[[-4.77e-01 8.45e-01 1.39e-01 -9.83e-18] [8.55e-01 2.49e-01 1.33e-01 2.57e-17]
[-2.94e-03 6.60e-03 -9.63e-01 1.16e-16] [6.64e-09 2.55e-08 1.16e-07 -1.11e-16]
[-1.44e-01 -3.33e-01 1.32e-01 -7.07e-01] [-1.44e-01 -3.33e-01 1.32e-01 7.07e-01]]
It is the transformation from the Incanter matrix representation to the nested vector structure that I am unsure how to perform. Is there a simple way to convert the data's representation?
You can do it with the built-in to-vect function:
(to-vect m)
or with the built-in to-list function:
(to-list m)
Both functions produce a vector of vectors when given a matrix:
=> (def m (matrix [[1 2] [3 4]]))
A 2x2 matrix
-------------
1.00e+00 2.00e+00
3.00e+00 4.00e+00
=> (to-vect m)
[[1.0 2.0] [3.0 4.0]]
=> (to-list m)
[[1.0 2.0] [3.0 4.0]]

Tupled function versus curried function performance in SML/NJ

I'm learning functional programming using the SML language. While reading my study notes, I came across a question that asks which kind of function (tupled or curried) performs faster.
I've looked at the video here, where the instructor says that this is a matter of language implementation and states (at 5:25) that SML/NJ performs faster with tupled functions, but doesn't say why.
I think my instructor once said that it's because the curried function creates more closures, but I may not have heard right.
Can someone, please, elaborate on this?
There's some more intermediate evaluation for curried functions. Let's say we want a function to sum three numbers. Consider the following two definitions:
fun sum (x,y,z) = x + y + z
Alternatively,
fun sum x y z = x + y + z
Consider the following rough evaluation trace on the first version:
:> sum(1,2,3)
1 + 2 + 3 (substitution using pattern matching on the contents of the tuple)
(1 + 2) + 3
3 + 3
6
On the other hand, with the curried version SML will construct some anonymous functions on the fly as it is evaluating the expression. This is because curried functions take advantage of the fact that anonymous functions can be returned as the results of other functions in order to capture the behavior of applying multiple arguments to a single function. Constructing the functions takes some constant amount of time.
:> sum 1 2 3
((sum 1) 2) 3
(((fn x => (fn y => (fn z => x + y + z))) 1) 2) 3
((fn y => (fn z => 1 + y + z)) 2) 3
(fn z => 1 + 2 + z) 3
1 + 2 + 3
(1 + 2) + 3
3 + 3
6
So there are some extra steps involved. It certainly should not cause performance issues in your program, however.
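The same shape can be sketched in Python, where the intermediate functions are explicit. This is only a model of the evaluation trace above; it says nothing about how SML/NJ actually compiles or optimizes either form, and the names sum_tupled and sum_curried are made up for the illustration.

```python
def sum_tupled(args):
    """Takes one tuple argument, like `fun sum (x,y,z)`."""
    x, y, z = args
    return x + y + z


def sum_curried(x):
    """Takes one argument at a time, like `fun sum x y z`."""
    def take_y(y):          # closure capturing x
        def take_z(z):      # closure capturing x and y
            return x + y + z
        return take_z
    return take_y


print(sum_tupled((1, 2, 3)))  # 6
print(sum_curried(1)(2)(3))   # 6
```

The curried call creates two function objects before any addition happens, which corresponds to the extra steps in the trace. The upside is partial application: sum_curried(1)(2) is a reusable function that adds 3 to its argument.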

Struggling with BFGS minimization algorithm for Logistic regression in Clojure with Incanter

I'm trying to implement a simple logistic regression example in Clojure using the Incanter data analysis library. I've successfully coded the Sigmoid and Cost functions, but Incanter's BFGS minimization function seems to be causing me quite some trouble.
(ns ml-clj.logistic
  (:require [incanter.core :refer :all]
            [incanter.optimize :refer :all]))
(defn sigmoid
  "compute the inverse logit function, large positive numbers should be
  close to 1, large negative numbers near 0,
  z can be a scalar, vector or matrix.
  sanity check: (sigmoid 0) should always evaluate to 0.5"
  [z]
  (div 1 (plus 1 (exp (minus z)))))
(defn cost-func
  "computes the cost function (J) that will be minimized
  inputs:params theta X matrix and Y vector"
  [X y]
  (let
    [m (nrow X)
     init-vals (matrix (take (ncol X) (repeat 0)))
     z (mmult X init-vals)
     h (sigmoid z)
     f-half (mult (matrix (map - y)) (log (sigmoid (mmult X init-vals))))
     s-half (mult (minus 1 y) (log (minus 1 (sigmoid (mmult X init-vals)))))
     sub-tmp (minus f-half s-half)
     J (mmult (/ 1 m) (reduce + sub-tmp))]
    J))
When I try (minimize (cost-func X y) (matrix [0 0])) giving minimize a function and starting params the REPL throws an error.
ArityException Wrong number of args (2) passed to: optimize$minimize clojure.lang.AFn.throwArity (AFn.java:437)
I'm very confused as to what exactly the minimize function is expecting.
For reference, I rewrote it all in python, and all of the code runs as expected, using the same minimization algorithm.
import numpy as np
import scipy as sp
data = np.loadtxt('testSet.txt', delimiter='\t')
X = data[:,0:2]
y = data[:, 2]
def sigmoid(X):
    return 1.0 / (1.0 + np.e**(-1.0 * X))
def compute_cost(theta, X, y):
    m = y.shape[0]
    h = sigmoid(X.dot(theta.T))
    J = y.T.dot(np.log(h)) + (1.0 - y.T).dot(np.log(1.0 - h))
    cost = (-1.0 / m) * J.sum()
    return cost
def fit_logistic(X,y):
    initial_thetas = np.zeros((len(X[0]), 1))
    myargs = (X, y)
    theta = sp.optimize.fmin_bfgs(compute_cost, x0=initial_thetas,
                                  args=myargs)
    return theta
outputting
Current function value: 0.594902
Iterations: 6
Function evaluations: 36
Gradient evaluations: 9
array([ 0.08108673, -0.12334958])
I don't understand why the Python code can run successfully, but my Clojure implementation fails. Any suggestions?
Update
Rereading the docstring for minimize, I've been trying to calculate the derivative of cost-func, which throws a new error.
(def grad (gradient cost-func (matrix [0 0])))
(minimize cost-func (matrix [0 0]) (grad (matrix [0 0]) X))
ExceptionInfo throw+: {:exception "Matrices of different sizes cannot be differenced.", :asize [2 1], :bsize [1 2]} clatrix.core/- (core.clj:950)
Using trans to transpose the matrix just yields the same error with the sizes swapped:
:asize [1 2], :bsize [2 1]}
I'm pretty lost here.
I can't say anything about your implementation, but incanter.optimize/minimize expects (at least) three parameters and you're giving it only two:
Arguments:
f -- Objective function. Takes a collection of values and returns a scalar
of the value of the function.
start -- Collection of initial guesses for the minimum
f-prime -- partial derivative of the objective function. Takes
a collection of values and returns a collection of partial
derivatives with respect to each variable. If this is not
provided it will be estimated using gradient-fn.
Unfortunately, I'm not able to tell you directly what to supply (for f-prime?) here but maybe someone else is. Btw, I think the ArityException Wrong number of args (2) passed to [...] is actually quite helpful here.
Edit: Actually, I think the docstring above is not correct, since the source code does not use gradient-fn to estimate f-prime. Maybe, you can use incanter.optimize/gradient to generate your own?
First, your cost function should take theta as a parameter, as your Python implementation does; your Clojure implementation instead fixes it to the initial values.
Second, if your cost-func is correct, then you can call optimize like this:
(optimize (fn [theta] (cost-func theta X y)) [0 0 0])
Hope this can help you.
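To illustrate the first point, here is a minimal sketch in plain Python (no NumPy) of a cost function that takes theta as an argument and is then wrapped in a closure over the data, so the optimizer only ever sees a function of theta. The helper make_objective and the toy data are hypothetical and are not taken from the question's testSet.txt.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(theta, X, y):
    """Mean logistic loss; theta is a parameter rather than a fixed start value."""
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, row)))
        total += label * math.log(h) + (1.0 - label) * math.log(1.0 - h)
    return -total / len(y)


def make_objective(X, y):
    """Close over the data so the result is a function of theta only."""
    return lambda theta: cost(theta, X, y)


# Hypothetical toy data, for illustration only.
X = [[1.0, 2.0], [2.0, -1.0], [-1.0, 0.5], [0.5, 1.5]]
y = [1.0, 0.0, 1.0, 0.0]

objective = make_objective(X, y)
print(objective([0.0, 0.0]))  # log(2), roughly 0.693
```

At theta = 0 every prediction is 0.5, so the cost is log(2) whatever the labels are, which makes a handy sanity check before handing the objective to a minimizer.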