I'm trying to build a set of functions to compare sentences to one another. So I wrote a function called split-to-sentences that takes an input like this:
"This is a sentence. And so is this. And this one too."
and returns:
["This is a sentence" "And so is this" "And this one too."]
What I am struggling with is how to iterate over this vector and get the items that aren't the current value. I tried nosing around with drop and remove but haven't quite figured it out.
I guess one thing I could do is use first and rest in the loop and conj the previous value to the output of rest.
(remove #{current-value} sentences-vector)
Just use filter:
(filter #(not= current-value %) sentences-vector)
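One caveat with removing by value: if the same sentence occurs more than once, every copy is dropped, not just the current one. The sketch below contrasts that with removal by position. The data and the helper name others-at are made up for illustration, and keep-indexed would also discard nil items, which should not matter for a vector of strings.

```clojure
;; Hypothetical data: the same sentence appears twice.
(def sentences ["Same text." "Other text." "Same text."])

;; Value-based removal drops every sentence equal to the current one.
(remove #{"Same text."} sentences)
;; => ("Other text.")

;; Position-based removal (others-at is a made-up helper name)
;; keeps equal sentences that sit at a different index.
(defn others-at [coll i]
  (keep-indexed (fn [j x] (when (not= i j) x)) coll))

(others-at sentences 0)
;; => ("Other text." "Same text.")
```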
I believe you may want something like this function:
(defn without-each [x]
  (map (fn [i] (concat (subvec x 0 i) (subvec x (inc i))))
       (range (count x))))
Use it like this:
>>> (def x ["foo" "bar" "baz"])
>>> (without-each x)
==> (("bar" "baz") ("foo" "baz") ("foo" "bar"))
The returned elements are lazily concatenated, which is why they are not vectors. This is desirable, since true vector concatenation (e.g. (into a b)) is O(n).
Because subvec shares structure with the original vector, this should not use an excessive amount of memory.
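If each sentence needs to be compared against the others, it may help to keep the current item next to its "rest". This is a minimal sketch of the same subvec idea; each-with-others is a hypothetical helper name, not part of the answer above.

```clojure
;; each-with-others is a hypothetical helper, not from the answer above.
(defn each-with-others
  "Returns a lazy seq of [item others] pairs for the vector v."
  [v]
  (map-indexed
    (fn [i item]
      [item (concat (subvec v 0 i) (subvec v (inc i)))])
    v))

(each-with-others ["foo" "bar" "baz"])
;; => (["foo" ("bar" "baz")] ["bar" ("foo" "baz")] ["baz" ("foo" "bar")])
```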
The trick is to pass your sentences twice into the reduce function...
(def sentences ["abcd" "efg" "hijk" "lmnop" "qrs" "tuv" "wx" "y&z"])
(reduce
  (fn [[prev [curr & foll]] _]
    (let [aren't-current-value (concat prev foll)]
      (println aren't-current-value) ;use it here
      [(conj prev curr) foll]))
  [[] sentences]
  sentences)
...once to see the following ones, and once to iterate.
You might consider using subvec or pop because both operate very quickly on vectors.
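For reference, a small sketch of what those two functions return on a vector (the data is made up):

```clojure
(def sentences ["abcd" "efg" "hijk"])

(subvec sentences 1)   ;; => ["efg" "hijk"]   everything from index 1 on
(subvec sentences 0 2) ;; => ["abcd" "efg"]   indices 0 (inclusive) to 2 (exclusive)
(pop sentences)        ;; => ["abcd" "efg"]   the vector without its last item
(peek sentences)       ;; => "hijk"           the last item itself
```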
My try:
(defn inc-by-f [v]
map #(+ (first v) %) v)
EDIT
(The original question was stupid; I missed the parenthesis. I am still leaving the question, so that perhaps I learn some new ways to deal with it.)
(defn inc-by-f [v]
(map #(+ (first v) %) v))
What other cool “Clojure” ways to achieve the desired result?
"Cooler" way (answered later than https://stackoverflow.com/a/62536870/823470 by Bob Jarvis):
(defn inc-by-f
[[v1 :as v]]
(map (partial + v1) v))
This uses
sequential destructuring to extract the first element of the input vector while still maintaining a reference to the entire vector using :as
partial to avoid the need for an anonymous function literal, which increases readability in some people's opinion (count me in!)
Note that the vector destructuring is only useful if the increment value is in a place that is easily accessible by destructuring. It could work if the value was the "2nd in the vector" ([_ v2 :as v]), for example, but not if the value was "the maximum element in the vector". In that case, the max would have to be obtained explicitly, e.g.
(defn inc-by-max
[v]
(map (partial + (apply max v)) v))
Also note that the body of an anonymous function is evaluated on every call, whereas the arguments to partial are evaluated only once, when partial itself is called. In other words, if we take the first element of a 1000-element v inside the anonymous function, that results in 1000 calls to first, instead of just one if we get the first element and pass it to partial. Demonstration:
user=> (dorun (map #(+ (do (println "called") 42) %) (range 3)))
called
called
called
=> nil
user=> (dorun (map (partial + (do (println "called") 42)) (range 3)))
called
=> nil
You're missing parentheses around the map invocation. The following works as you expect:
(defn inc-by-f [v]
(map #(+ (first v) %) v))
(defn shuffle-letters
  [word]
  (let [letters (clojure.string/split word #"")
        shuffled-letters (shuffle letters)]
    (clojure.string/join "" shuffled-letters)))
But if you put in "test" you can get "test" back sometimes.
How can I modify the code so that the output is never equal to the input?
I feel embarrassed; I can solve this easily in Python, but Clojure is so different to me...
Thank you.
P.S. I think we can close the topic now... The loop is in fact all I needed...
You can use loop. When the shuffled letters are the same as the original, recur back up to the start of the loop:
(defn shuffle-letters [word]
  (let [letters (clojure.string/split word #"")]
    (loop [] ; Start a loop
      (let [shuffled-letters (shuffle letters)]
        (if (= shuffled-letters letters) ; Check if they're equal
          (recur) ; If they're equal, loop and try again
          (clojure.string/join "" shuffled-letters)))))) ; Else, return the joined letters
There are many ways this could be written, but I think this is as plain as it gets. You could also get rid of the loop and make shuffle-letters itself recursive, but that would repeat the split on every attempt. You could also use letfn to create a local recursive function, but at that point loop would likely be cleaner.
Things to note though:
Obviously, if you try to shuffle something like "H" or "HH", it will get stuck and loop forever since no amount of shuffling will cause them to differ. You could do a check ahead of time, or add a parameter to loop that limits how many times it tries.
This will actually make your shuffle less random. If you disallow it from returning the original string, you're reducing the number of possible outputs.
The call to split is unnecessary. You can just call vec on the string:
(defn shuffle-letters [word]
  (let [letters (vec word)]
    (loop []
      (let [shuffled-letters (shuffle letters)]
        (if (= shuffled-letters letters)
          (recur)
          (clojure.string/join "" shuffled-letters))))))
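The first caveat above (inputs such as "H" or "HH" looping forever) could be handled with both suggested guards at once. This is only a sketch: the name shuffle-letters-bounded and the max-tries parameter are hypothetical, and returning nil on failure is one possible choice among several.

```clojure
(require '[clojure.string :as str])

;; Hypothetical variant of shuffle-letters with two guards:
;; an up-front check and a retry limit.
(defn shuffle-letters-bounded
  "Returns a shuffled version of word that differs from word,
  or nil if that is impossible or max-tries attempts all failed."
  [word max-tries]
  (let [letters (vec word)]
    ;; With fewer than two distinct letters no shuffle can differ.
    (when (> (count (distinct letters)) 1)
      (loop [tries max-tries]
        (when (pos? tries)
          (let [shuffled (shuffle letters)]
            (if (= shuffled letters)
              (recur (dec tries))      ; same order, try again
              (str/join shuffled))))))))

(shuffle-letters-bounded "HH" 10)
;; => nil
```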
Here's another solution (using transducers):
(defn shuffle-strict [s]
  (let [letters (seq s)
        xform (comp (map clojure.string/join)
                    (filter (fn [v] (not= v s))))]
    (when (> (count (into #{} letters)) 1)
      (first (eduction xform (iterate shuffle letters))))))
(for [_ (range 20)]
(shuffle-strict "test"))
;; => ("etts" "etts" "stte" "etts" "sett" "tste" "tste" "sett" "ttse" "sett" "ttse" "tset" "stte" "ttes" "ttes" "stte" "stte" "etts" "estt" "stet")
(shuffle-strict "t")
;; => nil
(shuffle-strict "ttttt")
;; => nil
We basically create a lazy list of possible shuffles, and then we take the first of them to be different from the input. We also make sure that there are at least 2 different characters in the input, so as not to hang (we return nil here since you don't want to have the input string as a possible result).
If you want your function to return a sequence:
(defn my-shuffle [input]
  (when (-> input set count (> 1))
    (->> input
         (iterate #(apply str (shuffle (seq %))))
         (remove #(= input %)))))
(->> "abc" my-shuffle (take 5))
;; => ("acb" "cba" "bca" "acb" "cab")
(->> "bbb" my-shuffle (take 5))
;; => ()
So being new to Clojure and functional programming in general, I sometimes (to quote a book) "feel like your favourite tool has been taken from you". Trying to get a better grasp on this stuff I'm doing string manipulation problems.
So knowing the functional paradigm is all about recursion (and other things) I've been using tail recursive functions to do things I'd normally do with loops, then trying to implement using map or reduce. For those more experienced, does this sound like a sane thing to do?
I'm starting to get frustrated because I'm running into problems where I need to keep track of the index of each character when iterating over strings but that's proving difficult because reduce and map feel "isolated". I can't increment a value while a string is being reduced...
Is there something I'm missing, such as a function for exactly this? Or can this specific case just not be implemented using these core functions? Or is the way I'm going about it just wrong and un-functional, which is why I'm stuck?
Here's an example of the problem I'm having:
This function takes five separate strings then using reduce, builds a vector containing all the characters at position char-at in each string. How could you change this code so that char-at (in the anonymous function) gets incremented after each string gets passed? This is what I mean by it feels "isolated" and I don't know how to get around this.
(defn new-string-from-five
  "This function takes a character at position char-at from each of the strings to make a new vector"
  [five-strings char-at]
  (reduce (fn [result string]
            (conj result (get-char-at string char-at)))
          []
          five-strings))
Old :
"abc" "def" "ghi" "jkl" "mno" -> [a d g j m] (always taken from index 0)
Modified :
"abc" "def" "ghi" "jkl" "mno" ->[a e i j n] (index gets incremented and loops back around)
I don't think there's anything insane about writing string manipulation functions to get your head around things, though it's certainly not the only way. I personally found Clojure for the Brave and True, 4clojure, and the Clojurians Slack channel most helpful when learning Clojure.
On your question, probably the most common thing to do would be to add an index to your initial collection (in this case a string) using map-indexed:
user=> (map-indexed vector [9 9 9])
([0 9] [1 9] [2 9])
So for your example
(defn new-string-from-five
  "This function takes a character at position char-at from each of the strings to make a new vector"
  [five-strings char-at]
  (reduce (fn [result [string-idx string]]
            (conj result (get-char-at string (+ string-idx char-at))))
          []
          (map-indexed vector five-strings)))
But how would I build map-indexed? Well
Non-lazily:
(defn map-indexed' [f coll]
  (loop [idx 0
         res []
         rest-coll coll]
    (if (empty? rest-coll)
      res
      (recur (inc idx) (conj res (f idx (first rest-coll))) (rest rest-coll)))))
Lazily (recommend not trying to understand this yet):
(defn map-indexed' [f coll]
  (letfn [(map-indexed'' [idx f coll]
            (if (empty? coll)
              '()
              (lazy-seq (conj (map-indexed'' (inc idx) f (rest coll)) (f idx (first coll))))))]
    (map-indexed'' 0 f coll)))
You can use reductions:
(defn new-string-from-five
  [five-strings]
  (->> five-strings
       (reductions
         (fn [[res i] string]
           [(get-char-at string i) (inc i)])
         [nil 0])
       rest
       (mapv first)))
But in this case, I think map, mapv or map-indexed is cleaner. E.g.
(map-indexed
  (fn [i s] (get-char-at s i))
  ["abc" "def" "ghi" "jkl" "mno"])
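The question also asks for the index to loop back around, which none of the snippets above do explicitly, since get-char-at is the asker's own helper and its behaviour is unknown. A self-contained sketch, assuming plain nth in place of get-char-at and non-empty strings (mod would fail on a string of length 0); diagonal-chars is a made-up name:

```clojure
;; diagonal-chars is a hypothetical name; nth stands in for get-char-at.
(defn diagonal-chars
  "Takes the character at index (start + i) from the i-th string,
  wrapping around with mod when the index runs past the end."
  [strings start]
  (vec (map-indexed
         (fn [i s] (nth s (mod (+ start i) (count s))))
         strings)))

(diagonal-chars ["abc" "def" "ghi" "jkl" "mno"] 0)
;; => [\a \e \i \j \n]
```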
I'm trying to split a string into n chunks of variable sizes.
As input I have a seq of the sizes of the different chunks:
(10 6 12)
And a string:
"firstchunksecondthirdandlast"
I would like to split the string using the sizes as so:
("firstchunk" "second" "thirdandlast")
As a newbie I still have a hard time wrapping my head around the most idiomatic way to do this.
Here are two ways to do this:
One version uses reduce, which is often a good fit when you need to carry some state along (here, the part of the string that has not been split yet). The reduce returns a pair, so apply second to its result to get the chunks in the form you want.
;; Simply take second as a result:
(let [s "firstchunksecondthirdandlast"]
  (reduce
    (fn [[s xs] len]
      [(subs s len)
       (conj xs (subs s 0 len))])
    [s []]
    [10 6 12]))
The other version first builds up the start and end indices and then uses destructuring to get them out of the sequence:
(let [s "firstchunksecondthirdandlast"]
  (mapv
    (fn [[start end]]
      (subs s start end))
    ;; Build up the start-end indices:
    (partition 2 1 (reductions + (cons 0 [10 6 12])))))
Note that neither of these is robust: both throw ugly errors if the string is too short. You should be more defensive and add some asserts.
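One way to add that defensiveness is to validate the sizes before slicing. This is a sketch only: split-by-sizes is a made-up name, and throwing ex-info (rather than using assert or returning nil) is an assumption about what the caller wants.

```clojure
;; split-by-sizes is a hypothetical wrapper around the index-based version.
(defn split-by-sizes
  "Splits s into consecutive chunks of the given sizes.
  Throws ex-info if a size is negative or the sizes exceed the string."
  [s sizes]
  (let [total (reduce + sizes)]
    (when (or (some neg? sizes) (> total (count s)))
      (throw (ex-info "Chunk sizes do not fit the string"
                      {:sizes sizes :length (count s)})))
    (mapv (fn [[start end]] (subs s start end))
          ;; Running totals give the chunk boundaries: (0 10 16 28)
          (partition 2 1 (reductions + 0 sizes)))))

(split-by-sizes "firstchunksecondthirdandlast" [10 6 12])
;; => ["firstchunk" "second" "thirdandlast"]
```

Anything left over after the last chunk is silently ignored here; whether that should also be an error depends on the use case.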
Here is my go at the problem (still a beginner with the language), it uses an anonymous function and recursion until the chunks list is empty. I have found this pattern useful when wanting to accumulate results until a condition is met.
str-orig chunks-orig [] sets the initial arguments for the anonymous function: the full string, full list of chunks and an empty vec to collect results into.
(defn split-chunks [str-orig chunks-orig]
  ((fn [str chunks result]
     (if-let [len (first chunks)]
       (recur
         (subs str len)
         (rest chunks)
         (conj result (subs str 0 len)))
       result))
   str-orig chunks-orig []))
(split-chunks "firstchunksecondthirdandlast" '(10 6 12))
; ["firstchunk" "second" "thirdandlast"]
How do I get the index of any of the elements on a list of strings as so:
(list "a" "b" "c")
For example, (function "a") would have to return 0, (function "b") 1, (function "c") 2 and so on.
and... will it be better to use any other type of collection if dealing with a very long list of data?
Christian Berg's answer is fine. Also it is possible to just fall back on Java's indexOf method of class String:
(.indexOf (apply str (list "a" "b" "c")) "c")
; => 2
Of course, this will only work with lists (or, more generally, seqs) of single-character strings or of characters.
A more general approach would be:
(defn index-of [e coll] (first (keep-indexed #(if (= e %2) %1) coll)))
More idiomatic would be to lazily return all indexes and only ask for the ones you need:
(defn indexes-of [e coll] (keep-indexed #(if (= e %2) %1) coll))
(first (indexes-of "a" (list "a" "a" "b"))) ;; => 0
I'm not sure I understand your question. Do you want the nth letter of each of the strings in a list? That could be accomplished like this:
(map #(nth % 1) (list "abc" "def" "ghi"))
The result is:
(\b \e \h)
Update
After reading your comment on my initial answer, I assume your question is "How do I find the index (position) of a search string in a list?"
One possibility is to search for the string from the beginning of the list and count all the entries you have to skip:
(defn index-of [item coll]
(count (take-while (partial not= item) coll)))
Example: (index-of "b" (list "a" "b" "c")) returns 1.
If you have to do a lot of look-ups, it might be more efficient to construct a hash-map of all strings and their indices:
(def my-list (list "a" "b" "c"))
(def index-map (zipmap my-list (range)))
(index-map "b") ;; returns 1
Note that with the above definitions, when there are duplicate entries in the list index-of will return the first index, while index-map will return the last.
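A short sketch of that difference, using a made-up list with a duplicate. It also shows what each approach returns for a missing item, and one possible way (an assumption, not part of the answer) to build a map that keeps the first index instead:

```clojure
(def dupes (list "a" "b" "a"))

(defn index-of [item coll]
  (count (take-while (partial not= item) coll)))

(index-of "a" dupes)          ;; => 0   first occurrence
((zipmap dupes (range)) "a")  ;; => 2   later entries overwrite earlier ones

(index-of "z" dupes)          ;; => 3   the length of the list, not -1 or nil
((zipmap dupes (range)) "z")  ;; => nil

;; Adding the pairs in reverse order lets the first index win.
(def first-index-map
  (into {} (reverse (map vector dupes (range)))))

(first-index-map "a")         ;; => 0
```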
You can use the Java .indexOf method reliably for strings and vectors, but not for lists. This solution should work for all collections, I think:
(defn index-of
  "Clojure doesn't have an index-of function. The Java .indexOf method
  works reliably for vectors and strings, but not for lists. This solution
  works for all three."
  [item coll]
  (let [v (if (or (vector? coll) (string? coll))
            coll
            (apply vector coll))]
    ;; Search the normalized collection v, not the original coll
    (.indexOf v item)))
Do you mean, how do you get the nth element of a list?
For example, if you want the element at index 2 (zero-based, so the third element of the list):
(nth (list "a" "b" "c") 2)
yields
"c"
Cat-skinning is fun. Here's a low-level approach.
(defn index-of
  ([item coll]
   (index-of item coll 0))
  ([item coll from-idx]
   (loop [idx from-idx coll (seq (drop from-idx coll))]
     (if coll
       (if (= item (first coll))
         idx
         (recur (inc idx) (next coll)))
       -1))))
This is a Lispy answer; I suspect Clojure experts could do it better:
(defn position
  "Returns the position of elt in this list, or nil if not present"
  ([list elt n]
   (cond
     (empty? list) nil
     (= (first list) elt) n
     true (position (rest list) elt (inc n))))
  ([list elt]
   (position list elt 0)))
You seem to want to use the nth function.
From the docs for that function:
clojure.core/nth
([coll index] [coll index not-found])
Returns the value at the index. get returns nil if index out of
bounds, nth throws an exception unless not-found is supplied. nth
also works for strings, Java arrays, regex Matchers and Lists, and,
in O(n) time, for sequences.
That last clause means that in practice, nth is slower for elements "farther off" in sequences, with no guarantee that it works more quickly for collections that in principle support faster (~ O(1)) access to indexed elements. For (Clojure) sequences, this makes sense; the Clojure seq API is based on the linked-list API, and in a linked list you can only access the nth item by traversing every item before it. Keeping that restriction is what makes concrete list implementations interchangeable with lazy sequences.
Clojure collection access functions are generally designed this way; functions that do have significantly better access times on specific collections have separate names and cannot be used "by accident" on slower collections.
As an example of a collection type that supports fast "random" access to items, clojure vectors are callable; (vector-collection index-number) yields the item at index index-number - and note that clojure seqs are not callable.
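A few REPL-style lines illustrating the calls mentioned above, on a made-up vector:

```clojure
(def v ["a" "b" "c"])

(v 1)             ;; => "b"    vectors can be called with an index
(nth v 1)         ;; => "b"
(get v 10)        ;; => nil    get returns nil when out of bounds
(nth v 10 :none)  ;; => :none  nth with a not-found value

;; (v 10) throws IndexOutOfBoundsException.
;; ((seq v) 1) throws ClassCastException, because seqs are not functions.
```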
I know this question has been answered a million times, but here is a recursive solution that leverages destructuring.
(defn index-of-coll
  ([coll elm]
   (index-of-coll coll elm 0))
  ([[first & rest :as coll] elm idx]
   (cond (empty? coll) -1
         (= first elm) idx
         :else (recur rest elm (inc idx)))))
(defn index-of [item items]
  (or (last (first (filter (fn [x] (= (first x) item))
                           (map list items (range (count items))))))
      -1))
seems to work - but I only have like three items in my list