I've been trying to implement Spam Classifier. I wrote one function to get some probability but when I call this function with two arguments I get clojure.lang.ArityException "Wrong number of args passed to function". Here is my function:
(defn weightedprob
[f cat]
(let [weight 1
ap 0.5
basicprob (atom 0)
totals (atom 0)
bp (atom 0)]
(swap! basicprob #(fprob f cat))
(swap! totals #(reduce + (vals (get #fc f))))
(swap! bp #(/ (+ (* weight ap) (* totals basicprob)) (+ weight totals)))
bp))
And here is the call:
(weightedprob "money" "good")
It works now. If you have better idea how to implement this function, I would be happy to see that. Here is version that works:
(defn weightedprob
[f cat]
(let [weight 1
ap 0.5
basicprob (atom 0)
totals (atom 0)
bp (atom 0)]
(reset! basicprob (fprob f cat))
(reset! totals (reduce + (vals (get #fc f))))
(reset! bp (/ (+ (* weight ap) (* #totals #basicprob)) (+ weight
#totals)))
#bp))
The function in Python I have been implementing looks as folows:
def weightedprob(self,f,cat,prf,weight=1.0,ap=0.5):
# Calculate current probability
basicprob=prf(f,cat)
# Count the number of times this feature has appeared in
# all categories
totals=sum([self.fcount(f,c) for c in self.categories()])
# Calculate the weighted average
bp=((weight*ap)+(totals*basicprob))/(weight+totals)
return bp
This is from Collective Intelligence book, chapter 6, Document Filtering
Function without atoms:
(defn weightedprob1
[f cat]
(let [weight 1
ap 0.5
basicprob (fprob f cat)
totals (reduce + (vals (get #fc f)))
bp (/ (+ (* weight ap) (* totals basicprob)) (+ weight totals))]
bp))
Your original problem was caused by swap! expecting a function that takes a single argument. You were supplying a function that didn't take any arguments though, thus the error.
The corrected code would be what you've already posted:
(defn weightedprob [f cat]
(let [weight 1
ap 0.5
basicprob (atom 0)
totals (atom 0)
bp (atom 0)]
(reset! basicprob (fprob f cat))
(reset! totals (reduce + (vals (get #fc f))))
(reset! bp (/ (+ (* weight ap) (* #totals #basicprob)) (+ weight #totals)))
#bp))
This doesn't require atoms at all though. Atoms are mainly just used for managing data between threads. You don't need any kind of mutation at all here. Just continue using let:
(defn weightedprob [f cat]
(let [weight 1
ap 0.5
basicprob (fprob f cat)
totals (reduce + (vals (get #fc f)))]
(/ (+ (* weight ap) (* totals basicprob)) (+ weight totals))))
And it's not clear what fc is, but it likely doesn't need to be an atom either.
Related
I was trying to modify a vector of vector but ended up with lazy-seq inside. I am new to clojure. Can someone help me to get this correctly?
(require '[clojure.string :as str])
;;READ CUST.TXT
(def my_str(slurp "cust.txt"))
(defn esplit [x] (str/split x #"\|" ))
(def cust (vec (sort-by first (vec (map esplit (vec (str/split my_str #"\n")))))))
;;func to print
(for [i cust] (do (println (str (subs (str i) 2 3) ": [" (subs (str i) 5 (count (str i)))))))
;;CODE TO SEARCH CUST
(defn cust_find [x] (for [i cust :when (= (first i) x )] (do (nth i 1))))
(type (cust_find "2"))
;;READ PROD.TXT
(def my_str(slurp "prod.txt"))
(def prod (vec (sort-by first (vec (map esplit (vec (str/split my_str #"\n")))))))
;;func to print
(for [i prod] (do (println (str (subs (str i) 2 3) ": [" (subs (str i) 5 (count (str i)))))))
;;CODE TO SEARCH PROD
(defn prod_find [x y] (for [i prod :when (= (first i) x )] (nth i y)))
(prod_find "2" 1)
(def my_str(slurp "sales.txt"))
(def sales (vec (sort-by first (vec (map esplit (vec (str/split my_str #"\n")))))))
; (for [i (range(count sales))] (cust_find (nth (nth sales i) 1)))
; (defn fill_sales_1 [x]
; (assoc x 1
; (cust_find (nth x 1))))
; (def sales(map fill_sales_1 (sales)))
(def sales (vec (for [i (range(count sales))] (assoc (nth sales i) 1 (str (cust_find (nth (nth sales i) 1)))))))
; (for [i (range(count sales))] (assoc (nth sales i) 2 (str (prod_find (nth (nth sales i) 2) 1))))
(for [i sales] (println i))
When I print sales vector I get
[1 clojure.lang.LazySeq#10ae5ccd 1 3]
[2 clojure.lang.LazySeq#a5d0ddf9 2 3]
[3 clojure.lang.LazySeq#a5d0ddf9 1 1]
[4 clojure.lang.LazySeq#d80cb028 3 4]
If you need the text files I will upload them as well.
In Clojure, for and map, as well as other functions and macros working with sequences, generate a lazy sequence instead of a vector.
In the REPL, lazy sequences are usually fully computed when printing - to have it printed, it's enough to remove the str in your second to last line:
(def sales (vec (for [i (range(count sales))] (assoc (nth sales i) 1 (cust_find (nth (nth sales i) 1))))))
Just in case, note that your code can be prettified / simplified to convey the meaning better. For example, you are just iterating over a sequence of sales - you don't need to iterate over the indices and then get each item using nth:
(def sales
(vec (for [rec sales])
(assoc rec 1 (cust_find (nth rec 1)))))
Second, you can replace nth ... 1 with second - it will be easier to understand:
(def sales
(vec (for [rec sales])
(assoc rec 1 (cust_find (second rec))))
Or, alternatively, you can just use update instead of assoc:
(def sales
(vec (for [rec sales])
(update rec 1 cust_find)))
And, do you really need the outer vec here? You can do most of what you intend without it:
(def sales
(for [rec sales])
(update rec 1 cust_find))
Also, using underscores in Clojure function names is considered bad style: dashes (as in cust-find instead of cust_find) are easier to read and easier to type.
(for [i sales] (println (doall i)))
doall realizes a lazy sequence. Bare in mind that if the sequence is huge, you might not want to do this.
I've pasted the code on this page in an IDE and it works. The problem is that when I replace the definition of target-data with this vector of pairs* it gives me this error**.
(vector [[1 2]
[2 3]
[3 4]
[4 5]] ) ; *
UnsupportedOperationException nth not supported on this type: core$vector clojure.lang.RT.nthFrom (RT.java:857) **
What should I do to use my own target-data?
UPDATED FULL CODE:
(ns evolvefn.core)
;(def target-data
; (map #(vector % (+ (* % %) % 1))
; (range -1.0 1.0 0.1)))
;; We'll use input (x) values ranging from -1.0 to 1.0 in increments
;; of 0.1, and we'll generate the target [x y] pairs algorithmically.
;; If you want to evolve a function to fit your own data then you could
;; just paste a vector of pairs into the definition of target-data instead.
(def target-data
(vec[1 2]
[2 3]
[3 4]
[4 5]))
;; An individual will be an expression made of functions +, -, *, and
;; pd (protected division), along with terminals x and randomly chosen
;; constants between -5.0 and 5.0. Note that for this problem the
;; presence of the constants actually makes it much harder, but that
;; may not be the case for other problems.
(defn random-function
[]
(rand-nth '(+ - * pd)))
(defn random-terminal
[]
(rand-nth (list 'x (- (rand 10) 5))))
(defn random-code
[depth]
(if (or (zero? depth)
(zero? (rand-int 2)))
(random-terminal)
(list (random-function)
(random-code (dec depth))
(random-code (dec depth)))))
;; And we have to define pd (protected division):
(defn pd
"Protected division; returns 0 if the denominator is zero."
[num denom]
(if (zero? denom)
0
(/ num denom)))
;; We can now evaluate the error of an individual by creating a function
;; built around the individual, calling it on all of the x values, and
;; adding up all of the differences between the results and the
;; corresponding y values.
(defn error
[individual]
(let [value-function (eval (list 'fn '[x] individual))]
(reduce + (map (fn [[x y]]
(Math/abs
(- (value-function x) y)))
target-data))))
;; We can now generate and evaluate random small programs, as with:
;; (let [i (random-code 3)] (println (error i) "from individual" i))
;; To help write mutation and crossover functions we'll write a utility
;; function that injects something into an expression and another that
;; extracts something from an expression.
(defn codesize [c]
(if (seq? c)
(count (flatten c))
1))
(defn inject
"Returns a copy of individual i with new inserted randomly somwhere within it (replacing something else)."
[new i]
(if (seq? i)
(if (zero? (rand-int (count (flatten i))))
new
(if (< (rand)
(/ (codesize (nth i 1))
(- (codesize i) 1)))
(list (nth i 0) (inject new (nth i 1)) (nth i 2))
(list (nth i 0) (nth i 1) (inject new (nth i 2)))))
new))
(defn extract
"Returns a random subexpression of individual i."
[i]
(if (seq? i)
(if (zero? (rand-int (count (flatten i))))
i
(if (< (rand) (/ (codesize (nth i 1))
(- (codesize i)) 1))
(extract (nth i 1))
(extract (nth i 2))))
i))
;; Now the mutate and crossover functions are easy to write:
(defn mutate
[i]
(inject (random-code 2) i))
(defn crossover
[i j]
(inject (extract j) i))
;; We can see some mutations with:
;; (let [i (random-code 2)] (println (mutate i) "from individual" i))
;; and crossovers with:
;; (let [i (random-code 2) j (random-code 2)]
;; (println (crossover i j) "from" i "and" j))
;; We'll also want a way to sort a populaty by error that doesn't require
;; lots of error re-computation:
(defn sort-by-error
[population]
(vec (map second
(sort (fn [[err1 ind1] [err2 ind2]] (< err1 err2))
(map #(vector (error %) %) population)))))
;; Finally, we'll define a function to select an individual from a sorted
;; population using tournaments of a given size.
(defn select
[population tournament-size]
(let [size (count population)]
(nth population
(apply min (repeatedly tournament-size #(rand-int size))))))
;; Now we can evolve a solution by starting with a random population and
;; repeatedly sorting, checking for a solution, and producing a new
;; population.
(defn evolve
[popsize]
(println "Starting evolution...")
(loop [generation 0
population (sort-by-error (repeatedly popsize #(random-code 2)))]
(let [best (first population)
best-error (error best)]
(println "======================")
(println "Generation:" generation)
(println "Best error:" best-error)
(println "Best program:" best)
(println " Median error:" (error (nth population
(int (/ popsize 2)))))
(println " Average program size:"
(float (/ (reduce + (map count (map flatten population)))
(count population))))
(if (< best-error 0.1) ;; good enough to count as success
(println "Success:" best)
(recur
(inc generation)
(sort-by-error
(concat
(repeatedly (* 1/2 popsize) #(mutate (select population 7)))
(repeatedly (* 1/4 popsize) #(crossover (select population 7)
(select population 7)))
(repeatedly (* 1/4 popsize) #(select population 7)))))))))
;; Run it with a population of 1000:
(evolve 1000)
And the error is:
(evolve 1000)
Starting evolution...
IllegalArgumentException No matching method found: abs clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:80)
evolvefn.core=>
In Clojure I want to find the result of multiple reductions while only consuming the sequence once. In Java I would do something like the following:
double min = Double.MIN_VALUE;
double max = Double.MAX_VALUE;
for (Item item : items) {
double price = item.getPrice();
if (price > min) {
min = price;
}
if (price < max) {
max = price;
}
}
In Clojure I could do much the same thing by using loop and recur, but it's not very composable - I'd like to do something that lets you add in other aggregation functions as needed.
I've written the following function to do this:
(defn reduce-multi
"Given a sequence of fns and a coll, returns a vector of the result of each fn
when reduced over the coll."
[fns coll]
(let [n (count fns)
r (rest coll)
initial-v (transient (into [] (repeat n (first coll))))
fns (into [] fns)
reduction-fn
(fn [v x]
(loop [v-current v, i 0]
(let [y (nth v-current i)
f (nth fns i)
v-new (assoc! v-current i (f y x))]
(if (= i (- n 1))
v-new
(recur v-new (inc i))))))]
(persistent! (reduce reduction-fn initial-v r))))
This can be used in the following way:
(reduce-multi [max min] [4 3 6 7 0 1 8 2 5 9])
=> [9 0]
I appreciate that it's not implemented in the most idiomatic way, but the main problem is that it's about 10x as slow as doing the reductions one at at time. This might be useful for lots performing lots of reductions where the seq is doing heavy IO, but surely this could be better.
Is there something in an existing Clojure library that would do what I want? If not, where am I going wrong in my function?
that's what i would do: simply delegate this task to a core reduce function, like this:
(defn multi-reduce
([fs accs xs] (reduce (fn [accs x] (doall (map #(%1 %2 x) fs accs)))
accs xs))
([fs xs] (when (seq xs)
(multi-reduce fs (repeat (count fs) (first xs))
(rest xs)))))
in repl:
user> (multi-reduce [+ * min max] (range 1 10))
(45 362880 1 9)
user> (multi-reduce [+ * min max] [10])
(10 10 10 10)
user> (multi-reduce [+ * min max] [])
nil
user> (multi-reduce [+ * min max] [1 1 1000 0] [])
[1 1 1000 0]
user> (multi-reduce [+ * min max] [1 1 1000 0] [1])
(2 1 1 1)
user> (multi-reduce [+ * min max] [1 1 1000 0] (range 1 10))
(46 362880 1 9)
user> (multi-reduce [max min] (range 1000000))
(999999 0)
The code for reduce is fast for reducible collections. So it's worth trying to base multi-reduce on core reduce. To do so, we have to be able to construct reducing functions of the right shape. An ancillary function to do so is ...
(defn juxt-reducer [f g]
(fn [[fa ga] x] [(f fa x) (g ga x)]))
Now we can define the function you want, which combines juxt with reduce as ...
(defn juxt-reduce
([[f g] coll]
(if-let [[x & xs] (seq coll)]
(juxt-reduce (list f g) [x x] xs)
[(f) (g)]))
([[f g] init coll]
(reduce (juxt-reducer f g) init coll)))
For example,
(juxt-reduce [max min] [4 3 6 7 0 1 8 2 5 9]) ;=> [9 0]
The above follows the shape of core reduce. It can clearly be extended to cope with more than two functions. And I'd expect it to be faster than yours for reducible collections.
Here is how I would do it:
(ns clj.core
(:require [clojure.string :as str] )
(:use tupelo.core))
(def data (flatten [ (range 5 10) (range 5) ] ))
(spyx data)
(def result (reduce (fn [cum-result curr-val] ; reducing (accumulator) fn
(it-> cum-result
(update it :min-val min curr-val)
(update it :max-val max curr-val)))
{ :min-val (first data) :max-val (first data) } ; inital value
data)) ; seq to reduce
(spyx result)
(defn -main [] )
;=> data => (5 6 7 8 9 0 1 2 3 4)
;=> result => {:min-val 0, :max-val 9}
So the reducing function (fn ...) carries along a map like {:min-val xxx :max-val yyy} through each element of the sequence, updating the min & max values as required at each step.
While this does make only one pass through the data, it is doing a lot of extra work calling update twice per element. Unless your sequence is very unusual, it is probably more efficient to make two (very efficient) passes through the data like:
(def min-val (apply min data))
(def max-val (apply max data))
(spyx min-val)
(spyx max-val)
;=> min-val => 0
;=> max-val => 9
What is the best way of implementing map function together with an updatable state between applications of function to each element of sequence? To illustrate the issue let's suppose that we have a following problem:
I have a vector of the numbers. I want a new sequence where each element is multiplied by 2 and then added number of 10's in the sequence up to and including the current element. For example from:
[20 30 40 10 20 10 30]
I want to generate:
[40 60 80 21 41 22 62]
Without adding the count of 10 the solution can be formulated using a high level of abstraction:
(map #(* 2 %) [20 30 40 10 20 10 30])
Having count to update forced me to "go to basic" and the solution I came up with is:
(defn my-update-state [x state]
(if (= x 10) (+ state 1) state)
)
(defn my-combine-with-state [x state]
(+ x state))
(defn map-and-update-state [vec fun state update-state combine-with-state]
(when-not (empty? vec)
(let [[f & other] vec
g (fun f)
new-state (update-state f state)]
(cons (combine-with-state g new-state) (map-and-update-state other fun new-state update-state combine-with-state))
)))
(map-and-update-state [20 30 40 50 10 20 10 30 ] #(* 2 %) 0 my-update-state my-combine-with-state )
My question: is it the appropriate/canonical way to solve the problem or I overlooked some important concepts/functions.
PS:
The original problem is walking AST (abstract syntax tree) and generating new AST together with updating symbol table, so when proposing the solution to the problem above please keep it in mind.
I do not worry about blowing up stack, so replacement with loop+recur is not
my concern here.
Is using global Vars or Refs instead of passing state as an argument a definite no-no?
You can use reduce to accumulate a pair of the number of 10s seen so far and the current vector of results.:
(defn map-update [v]
(letfn [(update [[ntens v] i]
(let [ntens (+ ntens (if (= 10 i) 1 0))]
[ntens (conj v (+ ntens (* 2 i)))]))]
(second (reduce update [0 []] v))))
To count # of 10 you can do
(defn count-10[col]
(reductions + (map #(if (= % 10) 1 0) col)))
Example:
user=> (count-10 [1 2 10 20 30 10 1])
(0 0 1 1 1 2 2)
And then a simple map for the final result
(map + col col (count-10 col)))
Reduce and reductions are good ways to traverse a sequence keeping a state. If you feel your code is not clear you can always use recursion with loop/recur or lazy-seq like this
(defn twice-plus-ntens
([coll] (twice-plus-ntens coll 0))
([coll ntens]
(lazy-seq
(when-let [s (seq coll)]
(let [item (first s)
new-ntens (if (= 10 item) (inc ntens) ntens)]
(cons (+ (* 2 item) new-ntens)
(twice-plus-ntens (rest s) new-ntens)))))))
have a look at map source code evaluating this at your repl
(source map)
I've skipped chunked optimization and multiple collection support.
You can make it a higher-order function this way
(defn map-update
([mf uf coll] (map-update mf uf (uf) coll))
([mf uf val coll]
(lazy-seq
(when-let [s (seq coll)]
(let [item (first s)
new-status (uf item val)]
(cons (mf item new-status)
(map-update mf uf new-status (rest s))))))))
(defn update-f
([] 0)
([item status]
(if (= item 10) (inc status) status)))
(defn map-f [item status]
(+ (* 2 item) status))
(map-update map-f update-f in)
The most appropriate way is to use function with state
(into
[]
(map
(let [mem (atom 0)]
(fn [val]
(when (== val 10) (swap! mem inc))
(+ #mem (* val 2)))))
[20 30 40 10 20 10 30])
also see
memoize
standard function
I know that the -> form can be used to pass the results of one function result to another:
(f1 (f2 (f3 x)))
(-> x f3 f2 f1) ; equivalent to the line above
(taken from the excellent Clojure tutorial at ociweb)
However this form requires that you know the functions you want to use at design time. I'd like to do the same thing, but at run time with a list of arbitrary functions.
I've written this looping function that does it, but I have a feeling there's a better way:
(defn pipe [initialData, functions]
(loop [
frontFunc (first functions)
restFuncs (rest functions)
data initialData ]
(if frontFunc
(recur (first restFuncs) (rest restFuncs) (frontFunc data) )
data )
) )
What's the best way to go about this?
I must admit I'm really new to clojure and I might be missing the point here completely, but can't this just be done using comp and apply?
user> (defn fn1 [x] (+ 2 x))
user> (defn fn2 [x] (/ x 3))
user> (defn fn3 [x] (* 1.2 x))
user> (defn pipe [initial-data my-functions] ((apply comp my-functions) initial-data))
user> (pipe 2 [fn1 fn2 fn3])
2.8
You can do this with a plain old reduce:
(defn pipe [x fs] (reduce (fn [acc f] (f acc)) x fs))
That can be shortened to:
(defn pipe [x fs] (reduce #(%2 %1) x fs))
Used like this:
user> (pipe [1 2 3] [#(conj % 77) rest reverse (partial map inc) vec])
[78 4 3]
If functions is a sequence of functions, you can reduce it using comp to get a composed function. At a REPL:
user> (def functions (list #(* % 5) #(+ % 1) #(/ % 3)))
#'user/my-list
user> ((reduce comp functions) 9)
20
apply also works in this case because comp takes a variable number of arguments:
user> (def functions (list #(* % 5) #(+ % 1) #(/ % 3)))
#'user/my-list
user> ((apply comp functions) 9)
20