I'm trying to build an XML structure using the internal data types from BaseX from Clojure.
(defn basex-elem [token-name dict]
  (let [elem (org.basex.query.item.FElem.
               (org.basex.query.item.QNm. token-name))]
    (for [[k v] dict]
      (do
        (println "THIS IS REACHED")
        (let [k-name (org.basex.query.item.QNm. (.getName k))
              k-attr (org.basex.query.item.FAttr.
                       k-name
                       (org.basex.util.Token/token v))]
          (.add elem k-attr))))
    elem))
When using this to try to create an element, "THIS IS REACHED" is never printed:
(def test-elem (basex-elem "element-name" {:key1 "value1", :key2 "value2"}))
; => #'user/test-elem
And thus the value comes back without any attributes:
test-elem
; => #<FElem <element-name/>>
But adding attributes works otherwise.
(.add test-elem
(org.basex.query.item.FAttr.
(org.basex.query.item.QNm. "foo")
(org.basex.util.Token/token "bar")))
; => #<FElem <element-name foo="bar"/>>
Thus, presumably I'm doing something wrong with the loop. Any pointers?
for is not a loop construct in Clojure; rather, it's a list comprehension that produces a lazy sequence.
Use doseq instead when side effects are intended.
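A minimal sketch of the difference, using an atom as a stand-in for the mutable BaseX element so that it runs without BaseX on the classpath. The function names and the atom are made up for illustration:

```clojure
;; Hypothetical stand-in for the mutable FElem: an atom holding a vector.
(defn add-attrs-with-for [dict]
  (let [attrs (atom [])]
    ;; `for` only builds a lazy seq; nothing consumes it, so the body never runs.
    (for [[k v] dict]
      (swap! attrs conj [(name k) v]))
    @attrs))

(defn add-attrs-with-doseq [dict]
  (let [attrs (atom [])]
    ;; `doseq` runs its body eagerly for every entry, purely for side effects.
    (doseq [[k v] dict]
      (swap! attrs conj [(name k) v]))
    @attrs))

(add-attrs-with-for {:key1 "value1"})   ;; => []
(add-attrs-with-doseq {:key1 "value1"}) ;; => [["key1" "value1"]]
```

In the original function, swapping for for doseq should be the only change needed, since the element is still returned by the enclosing let.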
(defn shuffle-letters
[word]
(let [letters (clojure.string/split word #"")
shuffled-letters (shuffle letters)]
(clojure.string/join "" shuffled-letters)))
But if you put in "test", you can sometimes get "test" back.
How can I modify the code to make sure the output is never equal to the input?
I feel embarrassed; I could solve this easily in Python, but Clojure is so different to me...
Thank you.
P.S. I think we can close the topic now... The loop is in fact all I needed...
You can use loop. When the shuffled letters are the same as the original, recur back up to the start of the loop:
(defn shuffle-letters [word]
(let [letters (clojure.string/split word #"")]
(loop [] ; Start a loop
(let [shuffled-letters (shuffle letters)]
(if (= shuffled-letters letters) ; Check if they're equal
(recur) ; If they're equal, loop and try again
(clojure.string/join "" shuffled-letters)))))) ; Else, return the joined letters
There are many ways this could be written, but I think this is as plain as it gets. You could also get rid of the loop and make shuffle-letters itself recursive. This would lead to unnecessary work though. You could also use letfn to create a local recursive function, but at that point, loop would likely be cleaner.
Things to note though:
Obviously, if you try to shuffle something like "H" or "HH", it will get stuck and loop forever since no amount of shuffling will cause them to differ. You could do a check ahead of time, or add a parameter to loop that limits how many times it tries.
This will actually make your shuffle less random. If you disallow it from returning the original string, you're reducing the amount of possible outputs.
The call to split is unnecessary. You can just call vec on the string:
(defn shuffle-letters [word]
(let [letters (vec word)]
(loop []
(let [shuffled-letters (shuffle letters)]
(if (= shuffled-letters letters)
(recur)
(clojure.string/join "" shuffled-letters))))))
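To address the infinite-loop caveat mentioned above, here is one possible sketch that both checks the input ahead of time and limits the number of attempts. The name shuffle-letters-bounded and the max-tries parameter are assumptions for illustration, not part of the code above:

```clojure
(require '[clojure.string :as str])

;; `max-tries` is a hypothetical parameter, not part of the original answer.
(defn shuffle-letters-bounded
  [word max-tries]
  (let [letters (vec word)]
    ;; With fewer than two distinct letters no different ordering exists.
    (when (> (count (distinct letters)) 1)
      (loop [tries-left max-tries]
        ;; Give up and return nil once the attempts are used up.
        (when (pos? tries-left)
          (let [shuffled (shuffle letters)]
            (if (= shuffled letters)
              (recur (dec tries-left))
              (str/join shuffled))))))))

(shuffle-letters-bounded "HH" 10) ;; => nil
```

Returning nil for inputs that cannot be shuffled is a design choice; throwing an exception would work equally well.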
Here's another solution (using transducers):
(defn shuffle-strict [s]
(let [letters (seq s)
xform (comp (map clojure.string/join)
(filter (fn[v] (not= v s))))]
(when (> (count (into #{} letters)) 1)
(first (eduction xform (iterate shuffle letters))))))
(for [_ (range 20)]
(shuffle-strict "test"))
;; => ("etts" "etts" "stte" "etts" "sett" "tste" "tste" "sett" "ttse" "sett" "ttse" "tset" "stte" "ttes" "ttes" "stte" "stte" "etts" "estt" "stet")
(shuffle-strict "t")
;; => nil
(shuffle-strict "ttttt")
;; => nil
We basically create a lazy list of possible shuffles, and then we take the first of them to be different from the input. We also make sure that there are at least 2 different characters in the input, so as not to hang (we return nil here since you don't want to have the input string as a possible result).
If you want your function to return a sequence:
(defn my-shuffle [input]
(when (-> input set count (> 1))
(->> input
(iterate #(apply str (shuffle (seq %))))
(remove #(= input %)))))
(->> "abc" my-shuffle (take 5))
;; => ("acb" "cba" "bca" "acb" "cab")
(->> "bbb" my-shuffle (take 5))
;; => ()
So being new to Clojure and functional programming in general, I sometimes (to quote a book) "feel like your favourite tool has been taken from you". Trying to get a better grasp on this stuff I'm doing string manipulation problems.
So knowing the functional paradigm is all about recursion (and other things) I've been using tail recursive functions to do things I'd normally do with loops, then trying to implement using map or reduce. For those more experienced, does this sound like a sane thing to do?
I'm starting to get frustrated because I'm running into problems where I need to keep track of the index of each character when iterating over strings but that's proving difficult because reduce and map feel "isolated". I can't increment a value while a string is being reduced...
Is there something I'm missing; a function for exactly this.. Or can this specific case just not be implemented using these core functions? Or is the way I'm going about it just wrong and un-functional-like which is why I'm stuck?
Here's an example I'm having:
This function takes five separate strings then using reduce, builds a vector containing all the characters at position char-at in each string. How could you change this code so that char-at (in the anonymous function) gets incremented after each string gets passed? This is what I mean by it feels "isolated" and I don't know how to get around this.
(defn new-string-from-five
"This function takes a character at position char-at from each of the strings to make a new vector"
[five-strings char-at]
(reduce (fn [result string]
(conj result (get-char-at string char-at)))
[]
five-strings))
Old :
"abc" "def" "ghi" "jkl" "mno" -> [a d g j m] (always taken from index 0)
Modified :
"abc" "def" "ghi" "jkl" "mno" ->[a e i j n] (index gets incremented and loops back around)
I don't think there's anything insane about writing string manipulation functions to get your head around things, though it's certainly not the only way. I personally found Clojure for the Brave and True, 4clojure, and the Clojurians Slack channel most helpful when learning Clojure.
On your question, probably the most common thing to do would be to add an index to your initial collection (in this case a string) using map-indexed:
user=> (map-indexed vector [9 9 9])
([0 9] [1 9] [2 9])
So for your example
(defn new-string-from-five
"This function takes a character at position char-at from each of the strings to make a new vector"
[five-strings char-at]
(reduce (fn [result [string-idx string]]
(conj result (get-char-at string (+ string-idx char-at))))
[]
(map-indexed vector five-strings)))
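Since get-char-at is not shown in the question, here is a self-contained sketch of the same idea that also wraps the index around, as the "Modified" example asks. The helper char-at-wrapped and the name new-string-wrapping are hypothetical, and the strings are assumed to be non-empty:

```clojure
;; Hypothetical helper: index into s, wrapping around past the end.
;; Assumes s is non-empty (mod by zero would throw).
(defn char-at-wrapped [s i]
  (nth s (mod i (count s))))

(defn new-string-wrapping
  "Takes the character at (start + position of the string) from each string."
  [strings start]
  (vec (map-indexed (fn [idx s] (char-at-wrapped s (+ idx start)))
                    strings)))

(new-string-wrapping ["abc" "def" "ghi" "jkl" "mno"] 0)
;; => [\a \e \i \j \n]
```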
But how would I build map-indexed? Well
Non-lazily:
(defn map-indexed' [f coll]
(loop [idx 0
res []
rest-coll coll]
(if (empty? rest-coll)
res
(recur (inc idx) (conj res (f idx (first rest-coll))) (rest rest-coll)))))
Lazily (recommend not trying to understand this yet):
(defn map-indexed' [f coll]
(letfn [(map-indexed'' [idx f coll]
(if (empty? coll)
'()
(lazy-seq (conj (map-indexed'' (inc idx) f (rest coll)) (f idx (first coll))))))]
(map-indexed'' 0 f coll)))
You can use reductions:
(defn new-string-from-five
[five-strings]
(->> five-strings
(reductions
(fn [[res i] string]
[(get-char-at string i) (inc i)])
[nil 0])
rest
(mapv first)))
But in this case, I think map, mapv or map-indexed is cleaner. E.g.
(map-indexed
(fn [i s] (get-char-at s i))
["abc" "def" "ghi" "jkl" "mno"])
I'd like to create a list depending on the results of some functions. In Java (my background), I'd do something like:
List<String> messages = ...
if(condition 1)
messages.add(message 1);
if(condition 2)
messages.add(message 2);
...
if(condition N)
messages.add(message N);
In Clojure, I think I'll need to create a list using let like the following (just a dummy example):
(let [result
(vec
(if (= 1 1) "message1" "message2")
(if (= 1 0) "message3" "message4"))]
result)
I've also checked cond but I need to be appending the elements to the list considering all the validations (and cond breaks after one condition is satisfied).
Which way should I follow to achieve this?
If you want them to be conditionally added like in the Java example, you could use cond->, which does not short circuit:
(let [messages []]
(cond-> messages ; Conditionally thread through conj
(= 1 1) (conj "Message1")
(= 0 1) (conj "Message2")
(= 0 0) (conj "Message3")))
=> ["Message1" "Message3"]
If you want to conditionally add one or the other like your second example suggests however, you could just use plain conj with some if expressions:
(let [messages []]
(conj messages
(if (= 1 1) "Message1" "Message2")
(if (= 0 1) "Message3" "Message4")))
=> ["Message1" "Message4"]
And I'll note that your original attempt almost worked. Instead of vec, you could have used vector, or just a vector literal:
(let [messages [(if (= 1 1) "Message1" "Message2")
(if (= 1 0) "Message3" "Message4")]]
messages)
=> ["Message1" "Message4"]
Although, this would only be beneficial if you didn't already have a populated messages vector that you wanted to add to. If that were the case, you'd have to use concat or into:
(let [old-messages ["old stuff"]
messages [(if (= 1 1) "Message1" "Message2")
(if (= 1 0) "Message3" "Message4")]]
(into old-messages messages))
=> ["old stuff" "Message1" "Message4"]
Take a look at cond->.
For example, your Java example could be written like:
(cond-> (some-fn-returning-messages)
(= 1 1) (conj "message1")
(= 1 2) (conj "message2")
...
(= 1 n) (conj "messagen"))
I see several answers pointing to the cond-> macro which appears to match your request most closely in that it is nearest to the style outlined in your question.
Depending on the number of conditions you have, your question seems like a good candidate for simply using filter.
(def nums (range 10))
(filter #(or (even? %) (= 7 %)) nums)
If you have a bunch of conditions (functions), and "or-ing" them together would be unwieldy, you can use some-fn.
Numbers from 0-19 that are either even, divisible by 7, greater than 17, or exactly equal to 1. Stupid example I know, just wanted to show a simple use-case.
(filter (some-fn
even?
#(zero? (mod % 7))
#(> % 17)
#(= 1 %))
(range 20))
Looks like everyone had the same idea! I did mine with keywords:
(ns tst.demo.core
(:use tupelo.core demo.core tupelo.test))
(defn accum
[conds]
(cond-> [] ; append to the vector in order 1,2,3
(contains? conds :cond-1) (conj :msg-1)
(contains? conds :cond-2) (conj :msg-2)
(contains? conds :cond-3) (conj :msg-3)))
(dotest
(is= [:msg-1] (accum #{:cond-1}))
(is= [:msg-1 :msg-3] (accum #{:cond-1 :cond-3}))
(is= [:msg-1 :msg-2] (accum #{:cond-2 :cond-1}))
(is= [:msg-2 :msg-3] (accum #{:cond-2 :cond-3}))
(is= [:msg-1 :msg-2 :msg-3] (accum #{:cond-3 :cond-2 :cond-1 })) ; note sets are unsorted
)
If you want more power, you can use cond-it-> from the Tupelo library. It threads the target value through both the condition and the action forms, and uses the special symbol it to show where the threaded value is to be placed. This modified example shows a 4th condition where, "msg-3 is jealous of msg-1" and always boots it out of the result:
(ns tst.demo.core
(:use tupelo.core demo.core tupelo.test))
(defn accum
[conds]
(cond-it-> #{} ; accumulate result in a set
(contains? conds :cond-1) (conj it :msg-1)
(contains? conds :cond-2) (conj it :msg-2)
(contains? conds :cond-3) (conj it :msg-3)
(contains? it :msg-3) (disj it :msg-1) ; :msg-3 doesn't like :msg-1
))
; remember that sets are unsorted
(dotest
(is= #{:msg-1} (accum #{:cond-1}))
(is= #{:msg-3} (accum #{:cond-1 :cond-3}))
(is= #{:msg-1 :msg-2} (accum #{:cond-2 :cond-1}))
(is= #{:msg-2 :msg-3} (accum #{:cond-2 :cond-3}))
(is= #{:msg-2 :msg-3} (accum #{:cond-3 :cond-2 :cond-1 }))
)
Not necessarily relevant to your use case, and certainly not a mainstream solution, but once in a while I like cl-format's conditional expressions:
(require '[clojure.pprint :refer [cl-format]])
(require '[clojure.data.generators :as g])
(cl-format nil
"~:[He~;She~] ~:[did~;did not~] ~:[thought about it~;care~]"
(g/boolean) (g/boolean) (g/boolean))
A typical case would be validating a piece of data to produce an error list.
I would construct a table that maps condition to message:
(def error->message-table
{condition1 message1
condition2 message2
...})
Note that the conditions are functions. Since we can never properly recognise functions by value, you could make this table a sequence of pairs.
However you implement the table, all we have to do is collect the messages for the predicates that apply:
(defn messages [stuff]
(->> error->message-table
(filter (fn [pair] ((first pair) stuff)))
(map second)))
Without a coherent example, it's difficult to be more explicit.
First-class functions and the packaged control structures within filter and map give us the means to express the algorithm briefly and clearly, isolating the content into a data structure.
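As a concrete illustration of the table idea, here is a small sketch that uses a sequence of [predicate message] pairs. The rules, keys, and messages are invented for the example, and collect-messages is an illustrative name:

```clojure
;; Hypothetical validation rules: each entry is [predicate message].
(def error-rules
  [[#(not (string? (:name %))) "name must be a string"]
   [#(neg? (:age % 0))         "age must not be negative"]
   [#(empty? (:email % ""))    "email is required"]])

(defn collect-messages
  "Returns the messages of all rules whose predicate holds for data."
  [data]
  (->> error-rules
       (filter (fn [[pred _]] (pred data)))
       (mapv second)))

(collect-messages {:name "Ann" :age -3})
;; => ["age must not be negative" "email is required"]
```

Using a vector of pairs rather than a map keeps the messages in a predictable order.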
I developed a function in Clojure to fill in an empty column from the last non-empty value. I'm assuming it works, given:
(:require [flambo.api :as f])
(defn replicate-val
[ rdd input ]
(let [{:keys [ col ]} input
result (reductions (fn [a b]
(if (empty? (nth b col))
(assoc b col (nth a col))
b)) rdd )]
(println "Result type is: "(type result))))
Got this:
;=> "Result type is: clojure.lang.LazySeq"
The question is how do I convert this back to type JavaRDD using flambo (a Spark wrapper)?
I tried (f/map result #(.toJavaRDD %)) in the let form to attempt the conversion to JavaRDD.
I got this error
"No matching method found: map for class clojure.lang.LazySeq"
which is expected because result is of type clojure.lang.LazySeq
The question is how do I make this conversion, or how can I refactor the code to accommodate this?
Here is a sample input rdd:
(type rdd) ;=> "org.apache.spark.api.java.JavaRDD"
But looks like:
[["04" "2" "3"] ["04" "" "5"] ["5" "16" ""] ["07" "" "36"] ["07" "" "34"] ["07" "25" "34"]]
Required output is:
[["04" "2" "3"] ["04" "2" "5"] ["5" "16" ""] ["07" "16" "36"] ["07" "16" "34"] ["07" "25" "34"]]
Thanks.
First of all, RDDs are not iterable (they don't implement ISeq), so you cannot use reductions on them. Even ignoring that, the whole idea of accessing the previous record is rather tricky: you cannot directly access values from another partition, and only transformations that don't require shuffling preserve order.
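For comparison, the fill logic from the question does behave as intended on an ordinary in-memory collection. A minimal sketch on a plain vector, assuming the data is small enough to hold locally (fill-down is an illustrative name, not a flambo function):

```clojure
;; Assumption: `rows` is a plain Clojure vector of row vectors, not an RDD.
(defn fill-down [rows col]
  (if (empty? rows)
    []
    (vec (reductions (fn [prev row]
                       (if (empty? (nth row col))
                         ;; Copy the value from the previous, already filled row.
                         (assoc row col (nth prev col))
                         row))
                     rows))))

(fill-down [["04" "2" "3"] ["04" "" "5"] ["5" "16" ""]
            ["07" "" "36"] ["07" "" "34"] ["07" "25" "34"]]
           1)
;; => [["04" "2" "3"] ["04" "2" "5"] ["5" "16" ""]
;;     ["07" "16" "36"] ["07" "16" "34"] ["07" "25" "34"]]
```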
The simplest approach here would be to use Data Frames and window functions with an explicit ordering, but as far as I know Flambo doesn't implement the required methods. It is always possible to use raw SQL or access the Java/Scala API, but if you want to avoid this you can try the following pipeline.
First, let's create a broadcast variable with the last value of each partition:
(require '[flambo.broadcast :as bd])
(import org.apache.spark.TaskContext)
(def last-per-part (f/fn [it]
(let [context (TaskContext/get) xs (iterator-seq it)]
[[(.partitionId context) (last xs)]])))
(def last-vals-bd
(bd/broadcast sc
(into {} (-> rdd (f/map-partitions last-per-part) (f/collect)))))
Next some helper for the actual job:
(defn fill-pair [col]
(fn [x] (let [[a b] x] (if (empty? (nth b col)) (assoc b col (nth a col)) b))))
(def fill-pairs
(f/fn [it] (let [part-id (.partitionId (TaskContext/get)) ;; Get partition ID
xs (iterator-seq it) ;; Convert input to seq
prev (if (zero? part-id) ;; Find previous element
(first xs) ((bd/value last-vals-bd) part-id))
;; Create seq of pairs (prev, current)
pairs (partition 2 1 (cons prev xs))
;; Same as before
{:keys [ col ]} input
;; Prepare mapping function
mapper (fill-pair col)]
(map mapper pairs))))
Finally you can use fill-pairs to map-partitions:
(-> rdd (f/map-partitions fill-pairs) (f/collect))
A hidden assumption here is that the order of the partitions follows the order of the values. That may or may not hold in the general case, but without an explicit ordering it is probably the best you can get.
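To make the pairing step easier to follow, here is a plain-Clojure sketch of what happens inside one partition, with no Spark involved. fill-partition and its prev argument are illustrative names; fill-pair is repeated so that the sketch is self-contained:

```clojure
;; Same shape as fill-pair above, written with destructuring.
(defn fill-pair [col]
  (fn [[a b]]
    (if (empty? (nth b col)) (assoc b col (nth a col)) b)))

(defn fill-partition
  "prev stands for the last row of the preceding partition (hypothetical argument)."
  [prev xs col]
  (map (fill-pair col) (partition 2 1 (cons prev xs))))

;; partition 2 1 yields overlapping (previous, current) pairs:
(partition 2 1 (cons :prev [:a :b :c]))
;; => ((:prev :a) (:a :b) (:b :c))

(fill-partition ["04" "2" "3"] [["04" "" "5"] ["5" "16" ""]] 1)
;; => (["04" "2" "5"] ["5" "16" ""])
```

Note that each pair holds the original rows, so in this sketch only the first blank in a run of consecutive blanks is filled; using reductions inside the partition would carry the value further.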
An alternative approach is to zipWithIndex, swap the order of the values, and perform a join with an offset.
(require '[flambo.tuple :as tp])
(def rdd-idx (f/map-to-pair (.zipWithIndex rdd) #(.swap %)))
(def rdd-idx-offset
(f/map-to-pair rdd-idx
(fn [t] (let [p (f/untuple t)] (tp/tuple (dec' (first p)) (second p))))))
(f/map (f/values (.rightOuterJoin rdd-idx-offset rdd-idx)) f/untuple)
Next you can map using similar approach as before.
Edit
Quick note on using atoms. The problem there is the lack of referential transparency, and that you're relying on incidental properties of a given implementation rather than a contract. Nothing in the semantics of map requires elements to be processed in a given order, so if the internal implementation changes, the code may no longer be valid. Consider, in plain Clojure:
(def a (atom 0))
(defn foo [x] (let [aa @a] (swap! a (fn [& args] x)) aa))
(map foo (range 1 20))
compared to:
(def a (atom 0))
(pmap foo (range 1 20))
I'm getting unexpected behaviour in some monads I'm writing.
I've created a parser-m monad with
(def parser-m (state-t maybe-m))
which is pretty much the example given everywhere (here, here and here)
I'm using m-plus to act as a kind of fall-through query mechanism, in my case, it first reads values from a cache (database), if that returns nil, the next method is to read from "live" (a REST call).
However, the second value in the m-plus list is always called, even though its value is discarded (if the cache hit was good) and the final return is that of the first monadic function.
Here's a cut-down version of the issue I'm seeing, along with some workarounds I found, although I don't know why they work.
My questions are:
Is this expected behaviour or a bug in m-plus? That is, will the 2nd method in an m-plus list always be evaluated even if the first item returns a value?
Minor in comparison to the above, but if I remove the call _ (fetch-state) from checker, then evaluating that definition prints out the messages for the functions m-plus is calling (when I don't think it should). Is this also a bug?
Here's a cut-down version of the code in question highlighting the problem. It simply checks key/value pairs passed in are same as the initial state values, and updates the state to mark what it actually ran.
(ns monads.monad-test
(:require [clojure.algo.monads :refer :all]))
(def parser-m (state-t maybe-m))
(defn check-k-v [k v]
(println "calling with k,v:" k v)
(domonad parser-m
[kv (fetch-val k)
_ (do (println "k v kv (= kv v)" k v kv (= kv v)) (m-result 0))
:when (= kv v)
_ (do (println "passed") (m-result 0))
_ (update-val :ran #(conj % (str "[" k " = " v "]")))
]
[k v]))
(defn filler []
(println "filler called")
(domonad parser-m
[_ (fetch-state)
_ (do (println "filling") (m-result 0))
:when nil]
nil))
(def checker
(domonad parser-m
[_ (fetch-state)
result (m-plus
;; (filler) ;; initially commented out deliberately
(check-k-v :a 1)
(check-k-v :b 2)
(check-k-v :c 3))]
result))
(checker {:a 1 :b 2 :c 3 :ran []})
When I run this as is, the output is:
> (checker {:a 1 :b 2 :c 3 :ran []})
calling with k,v: :a 1
calling with k,v: :b 2
calling with k,v: :c 3
k v kv (= kv v) :a 1 1 true
passed
k v kv (= kv v) :b 2 2 true
passed
[[:a 1] {:a 1, :b 2, :c 3, :ran ["[:a = 1]"]}]
I don't expect the line k v kv (= kv v) :b 2 2 true to show at all. The final result is the value returned from the first function to m-plus, as I expect, but I don't expect the second function to even be called.
Now, I've found if I pass a filler into m-plus that does nothing (i.e. uncomment the (filler) line) then the output is correct, the :b value isn't evaluated.
If I don't have the filler method, and make the first method test fail (i.e. change it to (check-k-v :a 2)), then again everything is good: I don't get a call to check :c, only :a and :b are tested.
From my understanding of what the state-t maybe-m transformation is giving me, then the m-plus function should look like:
(defn m-plus
[left right]
(fn [state]
(if-let [result (left state)]
result
(right state))))
which would mean that right isn't called unless left returns nil/false.
Edit:
After looking at state-t and maybe-m source, the m-plus looks more like:
(fn [& statements]
(fn [state]
(apply (fn [& xs]
(first (drop-while nil? xs)))
(map #(% state) statements))))
But the principle is the same: (first (drop-while nil? ...)) should only execute over the items that return a valid value.
I'd be interested to know if my understanding is correct or not, and why I have to put the filler method in to stop the extra evaluation (whose effects I don't want to happen).
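For reference, the short-circuit semantics I'm expecting can be sketched outside the monad library as follows. This is not the clojure.algo.monads implementation and does not explain the behaviour above; first-success, check, and the calls atom are hypothetical names used only to show which parsers actually run:

```clojure
;; Hypothetical helper, not part of clojure.algo.monads.
;; Each parser is a function: state -> [value new-state] or nil.
(defn first-success [& parsers]
  (fn [state]
    ;; `some` calls each parser in turn and stops at the first truthy result.
    (some #(% state) parsers)))

(def calls (atom []))

(defn check [k v]
  (fn [state]
    (swap! calls conj k)            ; record that this parser actually ran
    (when (= (get state k) v)
      [[k v] state])))

((first-success (check :a 1) (check :b 2) (check :c 3)) {:a 1 :b 2 :c 3})
;; => [[:a 1] {:a 1, :b 2, :c 3}]
@calls
;; => [:a]
```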
Edit:
If I switch over to using Jim Duey's hand-written implementation of parser-m (from his excellent blogs), there is no evaluation of the second function in m-plus, which seems to imply the transformation monad is breaking m-plus. However, even in this implementation, if I remove the initial (fetch-state) call in the checker function, the domonad definition causes the output of the creation of the m-plus functions, suggesting something is going on in domonad's implementation that I'm not expecting.
Apologies for the long winded post!