How to find a list item which contains a specific substring - Clojure

I have a list of strings, e.g. '("abc" "def" "gih"), and I would like to be able to search the list for any items containing e.g. "ef" and get the item or index returned.
How is this done?

Combining filter and re-find can do this nicely.
user> (def fx '("abc" "def" "gih"))
#'user/fx
user> (filter (partial re-find #"ef") fx)
("def")
user> (filter (partial re-find #"a") fx)
("abc")
In this case I like to combine them with partial, though defining an anonymous function works fine as well. It is also useful to use re-pattern if you don't know the search string in advance:
user> (filter (partial re-find (re-pattern "a")) fx)
("abc")
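One caveat when the search string comes from user input: re-pattern treats characters such as . or * as regex metacharacters. A minimal sketch of two ways around that, assuming Clojure 1.8 or later for clojure.string/includes? (the names items, containing and matching-literal are made up for this example):

```clojure
(require '[clojure.string :as str])
(import 'java.util.regex.Pattern)

;; Hypothetical sample data; "a.c" contains a regex metacharacter.
(def items '("abc" "def" "gih" "a.c"))

;; Plain substring test, no regex involved.
(defn containing [needle coll]
  (filter #(str/includes? % needle) coll))

;; Regex built at runtime, with the needle quoted so it matches literally.
(defn matching-literal [needle coll]
  (filter (partial re-find (re-pattern (Pattern/quote needle))) coll))

(containing "ef" items)        ;; => ("def")
(matching-literal "a.c" items) ;; => ("a.c"), "abc" is not matched
```

Without the quoting, (re-pattern "a.c") would also match "abc".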

If you want to retrieve all the indexes of the matching positions along with the element you can try this:
(filter #(re-find #"ef" (second %)) (map-indexed vector '("abc" "def" "gih")))
=>([1 "def"])
map-indexed vector generates an index/value lazy sequence
user> (map-indexed vector '("abc" "def" "gih"))
([0 "abc"] [1 "def"] [2 "gih"])
Which you can then filter using a regular expression against the second element of each list member.
#(re-find #"ef" (second %))

Just indices:
Lazily:
(keep-indexed #(if (re-find #"ef" %2)
%1) '("abc" "def" "gih"))
=> (1)
Using loop/recur
(loop [[str & strs] '("abc" "def" "gih")
idx 0
acc []]
(if str
(recur strs
(inc idx)
(cond-> acc
(re-find #"ef" str) (conj idx)))
acc))
For just the element, refer to Arthur Ulfeldt's answer.

Here is a traditional recursive definition that returns the index. It's easy to modify to return the corresponding string as well.
(defn strs-index [re lis]
(let [f (fn [ls n]
(cond
(empty? ls) nil
(re-find re (first ls)) n
:else (recur (rest ls) (inc n))))]
(f lis 0)))
user=> (strs-index #"de" ["abc" "def" "gih"])
1
user=> (strs-index #"ih" ["abc" "def" "gih"])
2
user=> (strs-index #"xy" ["abc" "def" "gih"])
nil
(Explanation: The helper function f is defined as a binding in let, and then is called at the end. If the sequence of strings passed to it is not empty, it searches for the regular expression in the first element of the sequence and returns the index if it finds the string. This uses the fact that re-find's result counts as true unless it fails, in which case it returns nil. If the previous steps don't succeed, the function starts over with the rest of the sequence and an incremented index. If it gets to the end of the sequence, it returns nil.)

Related

clojure.lang.LazySeq cannot be cast to java.lang.CharSequence

I am writing a function that, for any given string, replaces any digits within that String with the same number of '.' characters.
Examples:
AT2X -> AT..X
QW3G45 -> QW...G.........
T3Z1 -> T...Z.
I've written the following Clojure function but I am getting an error I don't quite understand:
java.lang.ClassCastException: clojure.lang.LazySeq (in module: Unnamed Module) cannot be cast to java.lang.CharSequence
I'm interpreting from the error that I need to force an evaluation of a lazy sequence back into a String (or CharSequence) but I can't figure out where to do so or if this is correct.
(defn dotify
;;Replaces digits with the same number of '.'s for use in traditional board formats
[FEN]
(let [values (doall (filter isDigit (seq FEN)))]
(fn [values]
(let [value (first values)]
(str/replace FEN value (fn dots [number]
(fn [s times]
(if (> times 0)
(recur (str s ".") (dec times)))) "" (Character/digit number 10)) value))
(recur (rest values))) values))
There is a standard clojure.string/replace function that may handle that case. Its last argument might be not just a string or a pattern but also a function that turns a found fragment into what you want.
Let's prepare such a function first:
(defn replacer [num-str]
  ;; num-str is the matched run of digits, e.g. "2" or "11"
  (let [num (read-string num-str)]
    (apply str (repeat num \.))))
You may try it in this way:
user> (replacer "2")
..
user> (replacer "9")
.........
user> (replacer "22")
......................
user>
Now pass it into replace as follows:
user> (clojure.string/replace "a2b3c11" #"\d+" replacer)
a..b...c...........
Here's a way to do this using reduce:
(defn dotify [s]
(->> s
(reduce (fn [acc elem]
(if (Character/isDigit elem)
(let [dots (Integer/parseInt (str elem))]
(apply conj acc (repeat dots \.)))
(conj acc elem)))
[])
(apply str)))
(dotify "zx4g1z2h")
=> "zx....g.z..h"
And another version using mapcat:
(defn dotify-mapcat [s]
(apply str
(mapcat (fn [c]
(if (Character/isDigit c)
(repeat (Integer/parseInt (str c)) \.)
[c]))
s)))
There are some issues in your example:
Many of the internal forms are themselves functions, but it looks like you just want their bodies or implementations instead of wrapping them in functions.
It's hard to tell by the indentation/whitespace, but the entire function is just recur-ing, the fn above it is not being used or returned.
One of the arguments to str/replace is a function that returns a function.
It helps to break the problem down into smaller pieces. For one, you know you'll need to examine each character in a string and decide whether to just return it or expand it into a sequence of dots. So you can start with a function:
(defn expand-char [^Character c]
(if (Character/isDigit c)
(repeat (Integer/parseInt (str c)) \.)
[c]))
Then use that function that operates on one character at a time in a higher-order function that operates on the entire string:
(apply str (mapcat expand-char s))
=> "zx....g.z..h"
Note this is also ~5x faster than the examples above because of the ^Character type-hint in expand-char function.
You can do this with str/replace too:
(defn expand-char [s]
(if (Character/isDigit ^Character (first s))
(apply str (repeat (Integer/parseInt s) \.))
s))
(str/replace "zx4g1z2h" #"." expand-char)
=> "zx....g.z..h"

Clojure for loop not returning updated values of atom

I'm trying to write a function that counts the number of vowels and consonants in a given string. The return value is a map with two keys, vowels and consonants. The values for each respective key are simply the counts.
The function that I have been able to develop so far is
(defn count-vowels-consenants [s]
(let [m (atom {"vowels" 0 "consenants" 0})
v #{"a" "e" "i" "o" "u"}]
(for [xs s]
(if
(contains? v (str xs))
(swap! m update-in ["vowels"] inc)
(swap! m update-in ["consenants"] inc)
))
@m))
however (count-vowels-consenants "sldkfjlskjwe") returns {"vowels":0 "consenants": 0}
What am I doing wrong?
EDIT: changed my input from str to s as str is a function in Clojure.
I think for is lazy so you're not going to actually do anything until you try to realize it. I added a first onto the for loop which realized the list and resulted in an error which you made by overwriting the str function with the str string. Ideally, you would just do this without the atom rigmarole.
(defn count-vowels-consonants [s]
(let [v #{\a \e \i \o \u}
vowels (filter v s)
consonants (remove v s)]
{:consonants (count consonants)
:vowels (count vowels)}))
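To see the laziness of for in isolation, here is a small sketch (counter, pending, before and after are names invented for this example). Nothing is swapped into the atom until something walks the sequence:

```clojure
(def counter (atom 0))

;; Building the for expression runs nothing yet; it only returns a lazy seq.
(def pending (for [c "abc"] (swap! counter inc)))
(def before @counter)   ; 0, no element has been realized

;; dorun walks the whole sequence and forces the side effects.
(dorun pending)
(def after @counter)    ; 3, one swap! per character
```

In the question's function the lazy seq returned by for is simply discarded, so the swap! calls never run before the atom is dereferenced.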
If the atom is what you want, then use doseq instead of for and it will update the atom for every character in the string. Also make sure you don't shadow the str function by using it as a parameter name in your function.
If this side-effecting scheme is unavoidable (for some educational reason, I suppose), just replace for with doseq, which is a side-effecting, eager equivalent of for
(by the way: there is a mistake in your initial code: you use str as an input param name, and then try to use it as a function. So you are shadowing the def from clojure.core; just try to avoid naming params after core functions):
(defn count-vowels-consenants [input]
(let [m (atom {"vowels" 0 "consenants" 0})
v #{"a" "e" "i" "o" "u"}]
(doseq [s input]
(if (contains? v (str s))
(swap! m update-in ["vowels"] inc)
(swap! m update-in ["consenants"] inc)))
@m))
#'user/count-vowels-consenants
user> (count-vowels-consenants "asdfg")
;; {"vowels" 1, "consenants" 4}
otherwise you could do something like this:
user> (reduce #(update %1
(if (#{\a \e \i \o \u} %2)
"vowels" "consonants")
(fnil inc 0))
{} "qwertyui")
;;{"consonants" 5, "vowels" 3}
or
user> (frequencies (map #(if (#{\a \e \i \o \u} %)
"vowels" "consonants")
"qwertyui"))
;;{"consonants" 5, "vowels" 3}
or this (if you're good with having true/false instead of "vowels/consonants"):
user> (frequencies (map (comp some? #{\a \e \i \o \u}) "qwertyui"))
;;{false 5, true 3}
for is lazy, as mentioned by @Brandon H. You can use loop/recur if you want. Here I replace for with loop/recur.
(defn count-vowels-consenants [input]
  (let [m (atom {"vowels" 0 "consenants" 0})
        v #{"a" "e" "i" "o" "u"}]
    (loop [s input]
      ;; stop when the remaining input is empty
      (when (seq s)
        ;; v holds one-letter strings, so convert the character first
        (if (contains? v (str (first s)))
          (swap! m update-in ["vowels"] inc)
          (swap! m update-in ["consenants"] inc))
        (recur (rest s))))
    @m))
The question, and every extant answer, assumes that every character is a vowel or a consonant: not so. And even in ASCII, there are lower and upper case letters. I'd do it as follows ...
(defn count-vowels-consonants [s]
(let [vowels #{\a \e \i \o \u
\A \E \I \O \U}
classify (fn [c]
(if (Character/isLetter c)
(if (vowels c) :vowel :consonant)))]
(map-v count (dissoc (group-by classify s) nil))))
... where map-v is a function that maps the values of a map:
(defn map-v [f m] (reduce (fn [a [k v]] (assoc a k (f v))) {} m))
For example,
(count-vowels-consonants "s2a Boo!")
;{:vowel 3, :consonant 2}
This traverses the string just once.

Clojure zip function

I need to build a seq of seqs (vec of vecs) by combining first, second, etc elements of the given seqs.
After a quick searching and looking at the cheat sheet. I haven't found one and finished with writing my own:
(defn zip
"From the sequence of sequences return another sequence of sequences
where first result sequence consists of first elements of input sequences
second element consists of second elements of input sequences etc.
Example:
[[:a 0 \\a] [:b 1 \\b] [:c 2 \\c]] => ([:a :b :c] [0 1 2] [\\a \\b \\c])"
[coll]
(let [num-elems (count (first coll))
inits (for [_ (range num-elems)] [])]
(reduce (fn [cols elems] (map-indexed
(fn [idx coll] (conj coll (elems idx))) cols))
inits coll)))
I'm interested if there is a standard method for this?
(apply map vector [[:a 0 \a] [:b 1 \b] [:c 2 \c]])
;; ([:a :b :c] [0 1 2] [\a \b \c])
You can use the variable arity of map to accomplish this.
From the map docstring:
... Returns a lazy sequence consisting of the result of applying f to
the set of first items of each coll, followed by applying f to the set
of second items in each coll, until any one of the colls is exhausted.
Any remaining items in other colls are ignored....
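The "until any one of the colls is exhausted" part of that docstring matters when the inner sequences have different lengths. A short sketch (zip-colls is a name chosen for this example, not a core function):

```clojure
;; Hypothetical helper: transpose a collection of collections.
(defn zip-colls [colls]
  (apply map vector colls))

(zip-colls [[:a 0 \a] [:b 1 \b] [:c 2 \c]])
;; => ([:a :b :c] [0 1 2] [\a \b \c])

;; The second row is one element short, so the third column is dropped.
(zip-colls [[:a 0 \a] [:b 1] [:c 2 \c]])
;; => ([:a :b :c] [0 1 2])
```

Note that calling it with an empty collection ends up as (map vector), which returns a transducer rather than an empty sequence, so guard for that case if it can occur.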
Kyle's solution is a great one and I see no reason why not to use it, but if you want to write such a function from scratch you could write something like the following:
(defn zip
([ret s]
(let [a (map first s)]
(if (every? nil? a)
ret
(recur (conj ret a) (map rest s)))))
([s]
(reverse (zip nil s))))

seq to vec conversion - Key must be integer

I want to get the indices of nil elements in a vector eg.
[1 nil 3 nil nil 4 3 nil] => [1 3 4 7]
(defn nil-indices [vec]
(vec (remove nil? (map
#(if (= (second %) nil) (first %))
(partition-all 2 (interleave (range (count vec)) vec)))))
)
Running this code results in
java.lang.IllegalArgumentException: Key must be integer
(NO_SOURCE_FILE:0)
If I leave out the (vec) call surrounding everything, it seems to work, but returns a sequence instead of a vector.
Thank you!
Try this instead:
(defn nil-indices [v]
(vec (remove nil? (map
#(if (= (second %) nil) (first %))
(partition-all 2 (interleave (range (count v)) v))))))
Clojure is a LISP-1: It has a single namespace for both functions and data, so when you called (vec ...), you were trying to pass your result sequence to your data as a parameter, not to the standard-library vec function.
See other answer for your problem (you are shadowing vec), but consider using a simpler approach.
map can take multiple arguments, in which case they are passed as additional arguments to the map function, e.g. (map f c1 c2 ...) calls (f (first c1) (first c2) ...) etc, until one of the sequence arguments is exhausted.
This means your (partition-all 2 (interleave ...)) is a very verbose way of saying (map list (range) v). There is also a function map-indexed which does the same thing. However, it only takes one sequence argument, so (map-indexed f c1 c2) is not legal.
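As a quick check of that equivalence, the three forms below produce the same index/value pairs (the var names are only for illustration):

```clojure
(def v [1 nil 3 nil])

;; The original, verbose pairing.
(def pairs-a (partition-all 2 (interleave (range (count v)) v)))

;; map with two sequence arguments; the infinite (range) is cut off
;; when v runs out.
(def pairs-b (map list (range) v))

;; map-indexed passes the index as the first argument.
(def pairs-c (map-indexed list v))

;; Each of them is ((0 1) (1 nil) (2 3) (3 nil))
(= pairs-a pairs-b pairs-c) ;; => true
```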
Here is your function rewritten for clarity using map-indexed, threading, and nil?:
(defn nil-indices [v]
; Note: map fn called like (f range-item v-item)
; Not like (f (range-item v-item)) as in your code.
(->> (map-indexed #(when (nil? %2) %1) v) ;; like (map #(when ...) (range) v)
(remove nil?)
vec))
However, you can do this instead with reduction and the reduce-kv function. This function is like reduce, except the reduction function receives three arguments instead of two: the accumulator, the key of the item in the collection (index for vectors, key for maps), and the item itself. Using reduce-kv you can rewrite this function even more clearly (and it will probably run faster, especially with transients):
(defn nil-indices [v]
(reduce-kv #(if (nil? %3) (conj %1 %2) %1) [] v))
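For the transient variant mentioned above, a possible sketch looks like this (nil-indices-transient is a name made up here; whether it is measurably faster depends on the input size and is not benchmarked):

```clojure
(defn nil-indices-transient [v]
  (persistent!
    (reduce-kv (fn [acc idx item]
                 ;; conj! returns the transient to keep using
                 (if (nil? item) (conj! acc idx) acc))
               (transient [])
               v)))

(nil-indices-transient [1 nil 3 nil nil 4 3 nil])
;; => [1 3 4 7]
```

The transient never escapes the function, so callers still see an ordinary persistent vector.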

Clojure: index of a value in a list or other collection

How do I get the index of any of the elements on a list of strings as so:
(list "a" "b" "c")
For example, (function "a") would have to return 0, (function "b") 1, (function "c") 2 and so on.
and... will it be better to use any other type of collection if dealing with a very long list of data?
Christian Berg's answer is fine. Also it is possible to just fall back on Java's indexOf method of class String:
(.indexOf (apply str (list "a" "b" "c")) "c")
; => 2
Of course, this will only work with lists (or, more generally, seqs) of strings (of length 1) or characters.
A more general approach would be:
(defn index-of [e coll] (first (keep-indexed #(if (= e %2) %1) coll)))
More idiomatic would be to lazily return all indexes and only ask for the ones you need:
(defn indexes-of [e coll] (keep-indexed #(if (= e %2) %1) coll))
(first (indexes-of "a" (list "a" "a" "b"))) ;; => 0
I'm not sure I understand your question. Do you want the nth letter of each of the strings in a list? That could be accomplished like this:
(map #(nth % 1) (list "abc" "def" "ghi"))
The result is:
(\b \e \h)
Update
After reading your comment on my initial answer, I assume your question is "How do I find the index (position) of a search string in a list?"
One possibility is to search for the string from the beginning of the list and count all the entries you have to skip:
(defn index-of [item coll]
(count (take-while (partial not= item) coll)))
Example: (index-of "b" (list "a" "b" "c")) returns 1.
If you have to do a lot of look-ups, it might be more efficient to construct a hash-map of all strings and their indices:
(def my-list (list "a" "b" "c"))
(def index-map (zipmap my-list (range)))
(index-map "b") ;; returns 1
Note that with the above definitions, when there are duplicate entries in the list index-of will return the first index, while index-map will return the last.
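A small sketch of that difference, plus one way to make the map keep the first index instead (dup-list and first-index-map are names invented for this example):

```clojure
(defn index-of [item coll]
  (count (take-while (partial not= item) coll)))

(def dup-list (list "a" "b" "a"))

(index-of "a" dup-list)            ;; => 0, first occurrence
((zipmap dup-list (range)) "a")    ;; => 2, later entries overwrite earlier ones

;; Insert the pairs in reverse order so the earliest index wins.
(def first-index-map
  (into {} (reverse (map vector dup-list (range)))))

(first-index-map "a")              ;; => 0

;; Also worth knowing: for a missing item this index-of returns
;; the length of the collection, not nil or -1.
(index-of "z" dup-list)            ;; => 3
```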
You can use the Java .indexOf method reliably for strings and vectors, but not for lists. This solution should work for all collections, I think:
(defn index-of
"Clojure doesn't have an index-of function. The Java .indexOf method
works reliably for vectors and strings, but not for lists. This solution
works for all three."
[item coll]
  (let [v (if (or (vector? coll) (string? coll))
            coll
            (vec coll))]
    ;; search the converted collection v, not the original coll
    (.indexOf v item)))
Do you mean, how do you get the nth element of a list?
For example, if you want to get the element at index 2 of the list (zero-based, so the third element):
(nth (list "a" "b" "c") 2)
yields
"c"
Cat-skinning is fun. Here's a low-level approach.
(defn index-of
([item coll]
(index-of item coll 0))
([item coll from-idx]
(loop [idx from-idx coll (seq (drop from-idx coll))]
(if coll
(if (= item (first coll))
idx
(recur (inc idx) (next coll)))
-1))))
This is a Lispy answer, I suspect those expert in Clojure could do it better:
(defn position
"Returns the position of elt in this list, or nil if not present"
([list elt n]
(cond
(empty? list) nil
(= (first list) elt) n
true (position (rest list) elt (inc n))))
([list elt]
(position list elt 0)))
You seem to want to use the nth function.
From the docs for that function:
clojure.core/nth
([coll index] [coll index not-found])
Returns the value at the index. get returns nil if index out of
bounds, nth throws an exception unless not-found is supplied. nth
also works for strings, Java arrays, regex Matchers and Lists, and,
in O(n) time, for sequences.
That last clause means that in practice, nth is slower for elements "farther off" in sequences, with no guarantee to work quicker for collections that in principle support faster access (~ O(1)) to indexed elements. For (clojure) sequences, this makes sense; the clojure seq API is based on the linked-list API and in a linked list, you can only access the nth item by traversing every item before it. Keeping that restriction is what makes concrete list implementations interchangeable with lazy sequences.
Clojure collection access functions are generally designed this way; functions that do have significantly better access times on specific collections have separate names and cannot be used "by accident" on slower collections.
As an example of a collection type that supports fast "random" access to items, clojure vectors are callable; (vector-collection index-number) yields the item at index index-number - and note that clojure seqs are not callable.
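A brief illustration of that last point (the var name v is only for this example):

```clojure
(def v ["a" "b" "c"])

(v 1)              ;; => "b", the vector itself is called as a function
(nth v 1)          ;; => "b"
(get v 10)         ;; => nil, get does not throw when out of bounds
(nth v 10 :none)   ;; => :none, the not-found value is returned

;; Vectors implement the function interface, lists and seqs do not.
(ifn? v)                    ;; => true
(ifn? (list "a" "b" "c"))   ;; => false
(ifn? (seq v))              ;; => false
```

Calling (v 10) without a fallback throws an IndexOutOfBoundsException, just as (nth v 10) does.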
I know this question has been answered a million times, but here is a recursive solution that leverages destructuring.
(defn index-of-coll
([coll elm]
(index-of-coll coll elm 0))
([[first & rest :as coll] elm idx]
(cond (empty? coll) -1
(= first elm) idx
:else (recur rest elm (inc idx)))))
(defn index-of [item items]
(or (last (first (filter (fn [x] (= (first x) item))
(map list items (range (count items))))))
-1))
seems to work - but I only have like three items in my list