Is there an idiomatic way of encoding and decoding a string in Clojure as hexadecimal? Example from Python:
'Clojure'.encode('hex')
# ⇒ '436c6f6a757265'
'436c6f6a757265'.decode('hex')
# ⇒ 'Clojure'
To show some effort on my part:
(defn hexify [s]
(apply str
(map #(format "%02x" (int %)) s)))
(defn unhexify [hex]
(apply str
(map
(fn [[x y]] (char (Integer/parseInt (str x y) 16)))
(partition 2 hex))))
(hexify "Clojure")
;; ⇒ "436c6f6a757265"
(unhexify "436c6f6a757265")
;; ⇒ "Clojure"
Since all posted solutions have some flaws, I'm sharing my own:
(defn hexify "Convert byte sequence to hex string" [coll]
(let [hex [\0 \1 \2 \3 \4 \5 \6 \7 \8 \9 \a \b \c \d \e \f]]
(letfn [(hexify-byte [b]
(let [v (bit-and b 0xFF)]
[(hex (bit-shift-right v 4)) (hex (bit-and v 0x0F))]))]
(apply str (mapcat hexify-byte coll)))))
(defn hexify-str [s]
(hexify (.getBytes s)))
and
(defn unhexify "Convert hex string to byte sequence" [s]
(letfn [(unhexify-2 [c1 c2]
(unchecked-byte
(+ (bit-shift-left (Character/digit c1 16) 4)
(Character/digit c2 16))))]
(map #(apply unhexify-2 %) (partition 2 s))))
(defn unhexify-str [s]
(apply str (map char (unhexify s))))
Pros:
High performance
Generic byte stream <--> string conversions with specialized wrappers
Handling leading zero in hex result
Your implementation(s) don't work for non-ASCII characters:
(defn hexify [s]
(apply str
(map #(format "%02x" (int %)) s)))
(defn unhexify [hex]
(apply str
(map
(fn [[x y]] (char (Integer/parseInt (str x y) 16)))
(partition 2 hex))))
(= "\u2195" (unhexify(hexify "\u2195")))
false ; should be true
To overcome this you need to serialize the bytes of the string using the required character encoding, which can be multi-byte per character.
There are a few wrinkles with this.
Remember that all numeric types on the JVM are signed.
There is no unsigned byte.
In idiomatic Java you would take the low byte of an int and mask it wherever you use it, like this:
int intValue = 0x80;
byte byteValue = (byte)(intValue & 0xff); // use only the low byte
System.out.println("int:\t" + intValue);
System.out.println("byte:\t" + byteValue);
// output:
// int:  128
// byte: -128
Clojure has unchecked-byte, which effectively does the same thing.
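The signed-byte behaviour described above can be sketched in Clojure as follows. The helper names to-signed-byte and to-unsigned-int are made up for this illustration and are not part of any library.

```clojure
;; Hypothetical helper names, for illustration only.
(defn to-signed-byte
  "Keep only the low 8 bits of n, as a signed JVM byte."
  [n]
  (unchecked-byte n))

(defn to-unsigned-int
  "Recover the 0-255 value from a signed byte by masking with 0xFF."
  [b]
  (bit-and b 0xFF))

(to-signed-byte 0x80)                    ;; => -128
(to-unsigned-int (to-signed-byte 0x80))  ;; => 128
```

Masking on the way out and unchecked-byte on the way in are the two halves that any byte-level hex codec on the JVM needs.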
For example, using UTF-8 you can do this:
(defn hexify [s]
(apply str (map #(format "%02x" %) (.getBytes s "UTF-8"))))
(defn unhexify [s]
(let [bytes (into-array Byte/TYPE
(map (fn [[x y]]
(unchecked-byte (Integer/parseInt (str x y) 16)))
(partition 2 s)))]
(String. bytes "UTF-8")))
; with the above implementation:
;=> (hexify "\u2195")
"e28695"
;=> (unhexify "e28695")
"↕"
;=> (= "\u2195" (unhexify (hexify "\u2195")))
true
Sadly the "idiom" appears to be using the Apache Commons Codec, e.g. as done in buddy:
(ns name-of-ns
(:import org.apache.commons.codec.binary.Hex))
(defn str->bytes
"Convert string to byte array."
([^String s]
(str->bytes s "UTF-8"))
([^String s, ^String encoding]
(.getBytes s encoding)))
(defn bytes->str
"Convert byte array to String."
([^bytes data]
(bytes->str data "UTF-8"))
([^bytes data, ^String encoding]
(String. data encoding)))
(defn bytes->hex
"Convert a byte array to hex encoded string."
[^bytes data]
(Hex/encodeHexString data))
(defn hex->bytes
"Convert hexadecimal encoded string to bytes array."
[^String data]
(Hex/decodeHex (.toCharArray data)))
I believe your unhexify function is as idiomatic as it can be. However, hexify can be written in a simpler way:
(defn hexify [s]
(format "%x" (new java.math.BigInteger (.getBytes s))))
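One caveat worth checking before relying on this: BigInteger treats the byte array as a signed two's-complement number, so leading zero bytes disappear and a first byte of 0x80 or above yields a negative value. A small sketch of both cases, using a hypothetical hexify-bigint that takes the bytes directly:

```clojure
;; hexify-bigint is an illustrative name; it mirrors the BigInteger approach.
(defn hexify-bigint [^bytes bs]
  (format "%x" (java.math.BigInteger. bs)))

;; A leading zero byte is dropped.
(hexify-bigint (byte-array [0 1]))            ;; => "1", not "0001"

;; A high first byte makes the number negative.
(hexify-bigint (.getBytes "\u2195" "UTF-8"))  ;; => "-1d796b", not "e28695"
```

For plain ASCII text whose first byte is at least 0x10 the result matches the byte-by-byte version, which is probably why the shortcut looks correct at first.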
I'm looking for something like join but with the delimiter going in front of each string rather than just acting as a separator.
As a simple example, I'm looking for a less ugly version of this:
(def params [1 2 3 4])
(clojure.string/join (for [x params] (str "&param=" x)))
result
"&param=1&param=2&param=3&param=4"
Calling (clojure.string/join coll) with no separator is the same as (apply str coll) which is a tiny win:
(apply str (for [x params] (str "&param=" x)))
Then you might prefer map over for here:
(apply str (map #(str "&param=" %) params))
interleave could work:
(apply str (interleave (repeat "&param=") params))
You could refactor this to separate the prefix and interleave it with the strings:
(apply str
(interleave (repeat \&)
(map #(str "param=" %) params)))
You might like the look of threading:
(->> (map #(str "param=" %) params)
(interleave (repeat \&))
(apply str))
You could extract a function to do this more generally:
(defn prepend-join [separator & cs]
(apply str (apply interleave (repeat separator) cs)))
(prepend-join \& (map #(str "param=" %) params))
In addition, there is a handy function in Clojure's standard library, namely clojure.pprint/cl-format:
user> (clojure.pprint/cl-format nil "~{&param=~a~}" [1 2 3 4 5])
;;=> "&param=1&param=2&param=3&param=4&param=5"
As for its capabilities, this is just the tip of the iceberg.
Prepending an empty string to params looks pretty clean to me:
(clojure.string/join "&param=" (cons "" params))
;;=> "&param=1&param=2&param=3&param=4"
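If this comes up often, the interleave idea can be wrapped in a small function. This is only a sketch; prefix-each is a hypothetical name, not something from clojure.string.

```clojure
;; prefix-each is an illustrative helper, not a library function.
(defn prefix-each
  "Concatenate the items of coll, putting prefix in front of each one."
  [prefix coll]
  (apply str (mapcat (fn [x] [prefix x]) coll)))

(prefix-each "&param=" [1 2 3 4])
;; => "&param=1&param=2&param=3&param=4"

(prefix-each "&param=" [])
;; => ""
```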
Hi, I am learning Clojure and trying to find the indexes of the vowels in a string. Here is what I tried:
(def vowels [\a \e \i \o \u \y])
(let [word-index (interleave "aaded" (range))
indexs (for [ [x i] (vector word-index)
:when (some #{x} vowels)]
[i] )]
(seq indexs))
But this gives me index "0" or nil. What am I doing wrong?
> (def vowels #{\a \e \i \o \u})
> (filter some? (map #(when (vowels %1) %2) "aaded" (range)))
(0 1 3)
You need to form the input correctly for the for comprehension:
(let [word-index (interleave "aaded" (range))
indexs (for [[x i] (partition 2 word-index)
:when (some #{x} vowels)]
i)]
(prn (seq indexs)))
;; => (0 1 3)
interleave returns a flat lazy sequence, and when I wrapped that sequence in a vector for the for binding, I think I lost the indexes. So I changed the implementation as below.
(let [word-index (zipmap (range) "aaded")
indexs (for [ [i x] word-index
:when (some #{x} vowels)]
[i] )
]
(flatten indexs)
)
This works fine. If anyone has a better implementation, please share; it would be helpful for me. Thanks.
On every iteration of the for, the same hash-set is built again, so it's better to define it once in the let block. A hash-set can also be called directly as a function, so some is not needed.
(let [word-index (zipmap (range) "aaded")
vowels-hash (into #{} [\a \e \i \o \u \y])
indexs (for [[i x] word-index
:when (vowels-hash x)]
[i])]
(flatten indexs))
a bit different approach with regex:
for all indices:
user> (let [m (re-matcher #"[aeiou]" "banedif")]
(take-while identity (repeatedly #(when (re-find m) (.start m)))))
;;=> (1 3 5)
for single index:
user> (let [m (re-matcher #"[aeiou]" "bfsendf")]
(when (re-find m) (.start m)))
;;=> 3
user> (let [m (re-matcher #"[aeiou]" "bndf")]
(when (re-find m) (.start m)))
;;=> nil
@jas has got this nailed down already. Adding my own to provide some comments on what happens in the intermediate steps.
Use sets to check for membership. Then the question "is this a vowel?" will be fast.
(def vowels (set "aeiouy"))
vowels
;; => #{\a \e \i \o \u \y}
We can filter out the vowels, then get just the indexes
(defn vowel-indices-1 [word]
(->> (map vector (range) word) ; ([0 \h] [1 \e] [2 \l] ...)
(filter (fn [[_ character]] ; ([1 \e] [4 \o])
(contains? vowels character)))
(map first))) ; (1 4)
(vowel-indices-1 "hello!")
;; => (1 4)
... or we can go slightly fancier with the :when keyword (didn't know about that, thanks!), in the style that you started with!
(defn vowel-indices-2 [word]
(for [[i ch] (map vector (range) word)
:when (contains? vowels ch)]
i))
(vowel-indices-2 "hello!")
;; => (1 4)
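Another way to pair each character with its position is keep-indexed from clojure.core, which drops nil results on its own. A minimal sketch; vowel-set and vowel-indices are illustrative names:

```clojure
;; Illustrative names; keep-indexed itself is part of clojure.core.
(def vowel-set (set "aeiouy"))

(defn vowel-indices
  "Return the indexes of the characters of word that are in vowel-set."
  [word]
  (keep-indexed (fn [i ch] (when (vowel-set ch) i)) word))

(vowel-indices "aaded")   ;; => (0 1 3)
(vowel-indices "hello!")  ;; => (1 4)
```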
I am writing a function that, for any given string, replaces any digits within that String with the same number of '.' characters.
Examples:
AT2X -> AT..X
QW3G45 -> QW...G.........
T3Z1 -> T...Z.
I've written the following Clojure function but I am getting an error I don't quite understand:
java.lang.ClassCastException: clojure.lang.LazySeq (in module: Unnamed Module) cannot be cast to java.lang.CharSequence
I'm interpreting from the error that I need to force an evaluation of a lazy sequence back into a String (or CharSequence) but I can't figure out where to do so or if this is correct.
(defn dotify
;;Replaces digits with the same number of '.'s for use in traditional board formats
[FEN]
(let [values (doall (filter isDigit (seq FEN)))]
(fn [values]
(let [value (first values)]
(str/replace FEN value (fn dots [number]
(fn [s times]
(if (> times 0)
(recur (str s ".") (dec times)))) "" (Character/digit number 10)) value))
(recur (rest values))) values))
There is a standard clojure.string/replace function that may handle that case. Its last argument might be not just a string or a pattern but also a function that turns a found fragment into what you want.
Let's prepare such a function first:
(defn replacer [num-str]
(let [num (read-string num-str)]
(apply str (repeat num \.))))
You may try it in this way:
user> (replacer "2")
..
user> (replacer "9")
.........
user> (replacer "22")
......................
user>
Now pass it into replace as follows:
user> (clojure.string/replace "a2b3c11" #"\d+" replacer)
a..b...c...........
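Note that #"\d+" treats a run such as "45" as the number forty-five, whereas the examples in the question expand each digit on its own (QW3G45 has nine dots after the G). A sketch of the per-digit variant, matching one digit at a time; dots-for-digit and dotify-per-digit are hypothetical names:

```clojure
(require '[clojure.string :as str])

;; Illustrative helper names, not from the answers above.
(defn dots-for-digit
  "Turn a single-digit string such as \"3\" into that many dots."
  [digit-str]
  (apply str (repeat (Integer/parseInt digit-str) \.)))

(defn dotify-per-digit
  "Replace every individual digit in s with that many dots."
  [s]
  (str/replace s #"\d" dots-for-digit))

(dotify-per-digit "AT2X")    ;; => "AT..X"
(dotify-per-digit "QW3G45")  ;; => "QW...G........."
(dotify-per-digit "T3Z1")    ;; => "T...Z."
```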
Here's a way to do this using reduce:
(defn dotify [s]
(->> s
(reduce (fn [acc elem]
(if (Character/isDigit elem)
(let [dots (Integer/parseInt (str elem))]
(apply conj acc (repeat dots \.)))
(conj acc elem)))
[])
(apply str)))
(dotify "zx4g1z2h")
=> "zx....g.z..h"
And another version using mapcat:
(defn dotify-mapcat [s]
(apply str
(mapcat (fn [c]
(if (Character/isDigit c)
(repeat (Integer/parseInt (str c)) \.)
[c]))
s)))
There are some issues in your example:
Many of the internal forms are themselves functions, but it looks like you just want their bodies or implementations instead of wrapping them in functions.
It's hard to tell by the indentation/whitespace, but the entire function is just recur-ing, the fn above it is not being used or returned.
One of the arguments to str/replace is a function that returns a function.
It helps to break the problem down into smaller pieces. For one, you know you'll need to examine each character in a string and decide whether to just return it or expand it into a sequence of dots. So you can start with a function:
(defn expand-char [^Character c]
(if (Character/isDigit c)
(repeat (Integer/parseInt (str c)) \.)
[c]))
Then use that function that operates on one character at a time in a higher-order function that operates on the entire string:
(apply str (mapcat expand-char s))
=> "zx....g.z..h"
Note this is also ~5x faster than the examples above because of the ^Character type-hint in expand-char function.
You can do this with str/replace too:
(defn expand-char [s]
(if (Character/isDigit ^Character (first s))
(apply str (repeat (Integer/parseInt s) \.))
s))
(str/replace "zx4g1z2h" #"." expand-char)
=> "zx....g.z..h"
I'm trying to write a function that counts the number of vowels and consonants in a given string. The return value is a map with two keys, vowels and consonants. The values for each respective key are simply the counts.
The function that I have been able to develop so far is
(defn count-vowels-consenants [s]
(let [m (atom {"vowels" 0 "consenants" 0})
v #{"a" "e" "i" "o" "u"}]
(for [xs s]
(if
(contains? v (str xs))
(swap! m update-in ["vowels"] inc)
(swap! m update-in ["consenants"] inc)
))
@m))
however (count-vowels-consenants "sldkfjlskjwe") returns {"vowels":0 "consenants": 0}
What am I doing wrong?
EDIT: changed my input from str to s as str is a function in Clojure.
I think for is lazy, so nothing actually happens until you try to realize it. I added a first around the for, which realized the list and produced an error caused by shadowing the str function with your str string parameter. Ideally, you would just do this without the atom rigmarole.
(defn count-vowels-consonants [s]
(let [v #{\a \e \i \o \u}
vowels (filter v s)
consonants (remove v s)]
{:consonants (count consonants)
:vowels (count vowels)}))
If the atom is what you want, use doseq instead of for and it will update the atom for every character in the string. Also make sure you don't shadow the str function by using that name in your function's parameters.
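The difference between the two can be seen in isolation with a counter. This is a sketch with made-up function names; the only point is that the for body never runs when its result is discarded.

```clojure
;; Illustrative names. Both functions try to bump a counter once per element.
(defn bump-with-for [coll]
  (let [counter (atom 0)]
    (for [_ coll] (swap! counter inc)) ; lazy seq is created but never consumed
    @counter))

(defn bump-with-doseq [coll]
  (let [counter (atom 0)]
    (doseq [_ coll] (swap! counter inc)) ; eager, runs for its side effects
    @counter))

(bump-with-for "abc")    ;; => 0
(bump-with-doseq "abc")  ;; => 3
```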
If this side-effecting scheme is unavoidable (for some educational reason, I suppose), just replace for with doseq, which is the eager, side-effecting equivalent of for
(by the way, there is a mistake in your initial code: you use str as an input parameter name and then try to call it as a function. That shadows the definition from clojure.core, so avoid naming parameters after core functions):
(defn count-vowels-consenants [input]
(let [m (atom {"vowels" 0 "consenants" 0})
v #{"a" "e" "i" "o" "u"}]
(doseq [s input]
(if (contains? v (str s))
(swap! m update-in ["vowels"] inc)
(swap! m update-in ["consenants"] inc)))
@m))
#'user/count-vowels-consenants
user> (count-vowels-consenants "asdfg")
;; {"vowels" 1, "consenants" 4}
otherwise you could do something like this:
user> (reduce #(update %1
(if (#{\a \e \i \o \u} %2)
"vowels" "consonants")
(fnil inc 0))
{} "qwertyui")
;;{"consonants" 5, "vowels" 3}
or
user> (frequencies (map #(if (#{\a \e \i \o \u} %)
"vowels" "consonants")
"qwertyui"))
;;{"consonants" 5, "vowels" 3}
or this (if you're good with having true/false instead of "vowels/consonants"):
user> (frequencies (map (comp some? #{\a \e \i \o \u}) "qwertyui"))
;;{false 5, true 3}
for is lazy, as mentioned by @Brandon H. You can use loop/recur if you want. Here I replaced for with loop/recur.
(defn count-vowels-consenants [input]
  (let [m (atom {"vowels" 0 "consenants" 0})
        v #{"a" "e" "i" "o" "u"}]
    (loop [s input]
      (when (seq s)
        ;; v holds one-character strings, so convert the char before the lookup
        (if (contains? v (str (first s)))
          (swap! m update-in ["vowels"] inc)
          (swap! m update-in ["consenants"] inc))
        ;; recur inside the when, so the loop stops at the end of the input
        (recur (rest s))))
    @m))
The question, and every extant answer, assumes that every character is a vowel or a consonant: not so. And even in ASCII, there are lower and upper case letters. I'd do it as follows ...
(defn count-vowels-consonants [s]
(let [vowels #{\a \e \i \o \u
\A \E \I \O \U}
classify (fn [c]
(if (Character/isLetter c)
(if (vowels c) :vowel :consonant)))]
(map-v count (dissoc (group-by classify s) nil))))
... where map-v is a function that maps over the values of a map:
(defn map-v [f m] (reduce (fn [a [k v]] (assoc a k (f v))) {} m))
For example,
(count-vowels-consonants "s2a Boo!")
;{:vowel 3, :consonant 2}
This traverses the string just once.
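A variation on the same single-pass idea is to classify each character and let frequencies do the counting, so no helper like map-v is needed. A sketch only; the names below are illustrative:

```clojure
;; Illustrative names; frequencies and dissoc are from clojure.core.
(def vowel-chars (set "aeiouAEIOU"))

(defn classify-char
  "Return :vowel or :consonant for letters, nil for anything else."
  [c]
  (when (Character/isLetter (char c))
    (if (vowel-chars c) :vowel :consonant)))

(defn count-vowels-consonants-freq [s]
  (dissoc (frequencies (map classify-char s)) nil))

(count-vowels-consonants-freq "s2a Boo!")
;; => {:consonant 2, :vowel 3}
```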
Being quite new to Clojure, I am still struggling with its functions. If I have 2 lists, say "1234" and "abcd", I need to make all possible ordered lists of length 4. The output I want for length 4 is:
("1234" "123d" "12c4" "12cd" "1b34" "1b3d" "1bc4" "1bcd"
"a234" "a23d" "a2c4" "a2cd" "ab34" "ab3d" "abc4" "abcd")
which is 2^n in number, depending on the inputs.
I have written the following function to generate a single string/list by random walk.
The argument [par] would be something like ["1234" "abcd"]
(defn make-string [par] (let [c1 (first par) c2 (second par)] ;version 3 0.63 msec
(apply str (for [loc (partition 2 (interleave c1 c2))
:let [ch (if (< (rand) 0.5) (first loc) (second loc))]]
ch))))
The output will be 1 of the 16 ordered lists above. The two input lists will always have equal length, say 2, 3, 4, 5, up to say 2^38 or whatever fits in available RAM. I have tried to modify the above function to generate all ordered lists, but failed. Hopefully someone can help me. Thanks.
Mikera is right that you need to use recursion, but you can do this while being both more concise and more general - why work with two strings, when you can work with N sequences?
(defn choices [colls]
(if (every? seq colls)
(for [item (map first colls)
sub-choice (choices (map rest colls))]
(cons item sub-choice))
'(())))
(defn choose-strings [& strings]
(for [chars (choices strings)]
(apply str chars)))
user> (choose-strings "123" "abc")
("123" "12c" "1b3" "1bc" "a23" "a2c" "ab3" "abc")
This recursive nested-for is a very useful pattern for creating a sequence of paths through a "tree" of choices. Whether there's an actual tree, or the same choice repeated over and over, or (as here) a set of N choices that don't depend on the previous choices, this is a handy tool to have available.
You can also take advantage of the cartesian-product from the clojure.math.combinatorics package, although this requires some pre- and post-transformation of your data:
(ns your-namespace (:require clojure.math.combinatorics))
(defn str-combinations [s1 s2]
(->>
(map vector s1 s2) ; regroup into pairs of characters, indexwise
(apply clojure.math.combinatorics/cartesian-product) ; generate combinations
(map (partial apply str)))) ; glue seqs-of-chars back into strings
> (str-combinations "abc" "123")
("abc" "ab3" "a2c" "a23" "1bc" "1b3" "12c" "123")
>
The trick is to make the function recursive, calling itself on the remainder of the list at each step.
You can do something like:
(defn make-all-strings [string1 string2]
(if (empty? string1)
[""]
(let [char1 (first string1)
char2 (first string2)
following-strings (make-all-strings (next string1) (next string2))]
(concat
(map #(str char1 %) following-strings)
(map #(str char2 %) following-strings)))))
(make-all-strings "abc" "123")
=> ("abc" "ab3" "a2c" "a23" "1bc" "1b3" "12c" "123")
(defn combine-strings [a b]
(if (seq a)
(for [xs (combine-strings (rest a) (rest b))
x [(first a) (first b)]]
(str x xs))
[""]))
Now that I have written it, I realize it's a less generic version of amalloy's.
You could also use the binary digits of the numbers from 0 to 15 to form your combinations:
if a bit is zero select from the first string otherwise the second.
E.g. 6 = 2r0110 => "1bc4", 13 = 2r1101 => "ab3d", etc.
(map (fn [n] (apply str (map #(%1 %2)
(map vector "1234" "abcd")
(map #(if (bit-test n %) 1 0) [3 2 1 0])))); binary digits
(range 0 16))
=> ("1234" "123d" "12c4" "12cd" "1b34" "1b3d" "1bc4" "1bcd" "a234" "a23d" "a2c4" "a2cd" "ab34" "ab3d" "abc4" "abcd")
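The hard-coded 16 and [3 2 1 0] can be derived from the string length. A sketch of that generalisation for two equal-length strings; bit-combinations is a hypothetical name, and the most significant bit is assumed to pick the first position, as above:

```clojure
;; Illustrative generalisation of the bit-selection idea.
(defn bit-combinations
  "All strings formed by taking position i from s1 (bit 0) or s2 (bit 1)."
  [s1 s2]
  (let [n     (count s1)
        pairs (mapv vector s1 s2)]           ; [[\1 \a] [\2 \b] ...]
    (for [k (range (bit-shift-left 1 n))]    ; 0 .. 2^n - 1
      (apply str
             (map-indexed
              (fn [i pair]
                ;; bit (n-1-i) of k decides which string supplies position i
                (pair (if (bit-test k (- n 1 i)) 1 0)))
              pairs)))))

(bit-combinations "12" "ab")
;; => ("12" "1b" "a2" "ab")
```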
The same approach can apply to generating combinations from more than 2 strings.
Say you have 3 strings ("1234" "abcd" "ABCD"), there will be 81 combinations (3^4). Using base-3 ternary digits:
(defn ternary-digits [n]
  (reverse (map #(mod % 3) (take 4 (iterate #(quot % 3) n)))))
(map (fn [n] (apply str (map #(%1 %2)
                             (map vector "1234" "abcd" "ABCD")
                             (ternary-digits n))))
     (range 0 81))
(require '[clojure.math.numeric-tower :as math]) ; assumed source of math/expt
(def c1 "1234")
(def c2 "abcd")
(defn make-string [c1 c2]
(map #(apply str %)
(apply map vector
(map (fn [col rep]
(take (math/expt 2 (count c1))
(cycle (apply concat
(map #(repeat rep %) col)))))
(map vector c1 c2)
(iterate #(* 2 %) 1)))))
(make-string c1 c2)
=> ("1234" "a234" "1b34" "ab34" "12c4" "a2c4" "1bc4" "abc4" "123d" "a23d" "1b3d" "ab3d" "12cd" "a2cd" "1bcd" "abcd")