Different values for MD 5 hash depending on technique - clojure

I'm trying to find a good way to hash a string. This method is working fine, but the results are not consistent with this website:
(defn hash-string
"Use java interop to flexibly hash strings"
[string algo base]
(let [hashed
(doto (java.security.MessageDigest/getInstance algo)
(.reset)
(.update (.getBytes string)))]
(.toString (new java.math.BigInteger 1 (.digest hashed)) base))
)
(defn hash-md5
"Generate a md5 checksum for the given string"
[string]
(hash-string string "MD5" 16)
)
When I use this, I do indeed get hashes. The problem is I'm trying a programming exercise at advent of code and it has its own examples of string hashes which offer a 3rd result different from the above 2!
How can one do an md5 in the "standard" way that is always expected?

Your MD5 operations are correct; you're just not displaying them properly.
Since an MD5 is 32 hexadecimal characters long, you need to format the string to pad it out correctly.
In other words, simply change this expression:
(.toString (new java.math.BigInteger 1 (.digest hashed)) base))
to one that uses format:
(format "%032x" (new java.math.BigInteger 1 (.digest hashed)))))

Related

Meaning of # in clojure

In clojure you can create anonymous functions using #
eg
#(+ % 1)
is a function that takes in a parameter and adds 1 to it.
But we also have to use # for regex
eg
(clojure.string/split "hi, buddy" #",")
Are these two # related?
There are also sets #{}, fully qualified class name constructors #my.klass_or_type_or_record[:a :b :c], instants #inst "yyyy-mm-ddThh:mm:ss.fff+hh:mm" and some others.
They are related in a sence that in these cases # starts a sequence recognisible by clojure reader, which dispatches every such instance to an appropriate reader.There's a guide that expands on this.
I think this convention exists to reduce the number of different syntaxes to just one and thus simplify the reader.
The two uses have no (direct) relationship.
In Clojure, when you see the # symbol, it is a giant clue that you are "talking" to the Clojure Reader, not to the Clojure Compiler. See the full docs on the Reader here: https://clojure.org/reference/reader.
The Reader is responsible for converting plain text from a source file into a collection of data structures. For example, comparing Clojure to Java we have
; Clojure ; Java
"Hello" => new String( "Hello" )
and
[ "Goodbye" "cruel" "world!" ] ; Clojure vector of 3 strings
; Java ArrayList of 3 strings
var msg = new ArrayList<String>();
msg.add( "Goodbye" );
msg.add( "cruel" );
msg.add( "world!" );
Similarly, there are shortcuts that the Reader recognizes even within Clojure source code (before the compiler converts it to Java bytecode), just to save you some typing. These "Reader Macros" get converted from your "short form" source code into "standard Clojure" even before the Clojure compiler gets started. For example:
#my-atom => (deref my-atom) ; not using `#`
#'map => (var map)
#{ 1 2 3 } => (hash-set 1 2 3)
#_(launch-missiles 12.3 45.6) => `` ; i.e. "nothing"
#(+ 1 %) => (fn [x] (+ 1 x))
and so on. As the # or deref operator shows, not all Reader Macros use the # (hash/pound/octothorpe) symbol. Note that, even in the case of a vector literal:
[ "Goodbye" "cruel" "world!" ]
the Reader creates a result as if you had typed:
(vector "Goodbye" "cruel" "world!" )
Are these two # related?
No, they aren't. The # literal is used in different ways. Some of them you've already mentioned: these are an anonymous function and a regex pattern. Here are some more cases:
Prepending an expression with #_ just wipes it from the compiler as it has never been written. For example: #_(/ 0 0) will be ignored on reader level so none of the exception will appear.
Tagging primitives to coerce them to complex types, for example #inst "2019-03-09" will produce an instance of java.util.Date class. There are also #uuid and other built-in tags. You may register your own ones.
Tagging ordinary maps to coerce them to types maps, e.g. #project.models/User {:name "John" :age 42} will produce a map declared as (defrecord User ...).
Other Lisps have proper programmable readers, and consequently read macros. Clojure doesn't really have a programmable reader - users cannot easily add new read macros - but the Clojure system does internally use read macros. The # read macro is the dispatch macro, the character following the # being a key into a further read macro table.
So yes, the # does mean something; but it's so deep and geeky that you do not really need to know this.

Common Lisp Applying Regex-like Patterns to Keys in PLIST

I am wondering if it is possible to apply Regex-like pattern matching to keys in a plist.
That is, suppose we have a list like this (:input1 1 :input2 2 :input3 3 :output1 10 :output2 20 ... :expand "string here")
The code I need to write is something along the lines of:
"If there is :expand and (:input* or :output*) in the list's keys, then do something and also return the :expand and (:output* or :input*)".
Obviously, this can be accomplished via cond but I do not see a clear way to write this elegantly. Hence, I thought of possibly using a Regex-like pattern on keys and basing the return on the results from that pattern search.
Any suggestions are appreciated.
Normalize your input
A possible first step for your algorithm that will simplify the rest of your problem is to normalize your input in a way that keep the same information in a structured way, instead of inside symbol's names. I am converting keys from symbols to either symbols or lists. You could also define your own class which represents inputs and outputs, and write generic functions that works for both.
(defun normalize-key (key)
(or (cl-ppcre:register-groups-bind (symbol number)
("^(\\w+)(\\d+)$" (symbol-name key))
(list (intern symbol "KEYWORD")
(parse-integer number)))
key))
(defun test-normalize ()
(assert (eq (normalize-key :expand) :expand))
(assert (equal (normalize-key :input1) '(:input 1))))
The above normalize-key deconstructs :inputN into a list (:input N), with N parsed as a number. Using the above function, you can normalize the whole list (you could do that recursively too for values, if you need it):
(defun normalize-plist (plist)
(loop
for (key value) on plist by #'cddr
collect (normalize-key key)
collect value))
(normalize-plist
'(:input1 1 :input2 2 :input3 3 :output1 10 :output2 20 :expand "string here"))
=> ((:INPUT 1) 1
(:INPUT 2) 2
(:INPUT 3) 3
(:OUTPUT 1) 10
(:OUTPUT 2) 20
:EXPAND "string here")
From there, you should be able to implement your logic more easily.

How this code translates a number to English?

I'm new to Clojure and found there's a piece of code like following
user=> (def to-english (partial clojure.pprint/cl-format nil
"~#(~#[~R~]~^ ~A.~)"))
#'user/to-english
user=> (to-english 1234567890)
"One billion, two hundred thirty-four million, five hundred sixty-seven
thousand, eight hundred ninety"
at https://clojuredocs.org/clojure.core/partial#example-542692cdc026201cdc326ceb. I know what partial does and I checked clojure.pprint/cl-format doc but still don't understand how it translates an integer to English words. Guess secret is hidden behind "~#(~#[~R~]~^ ~A.~)" but I didn't find a clue to read it.
Any help will be appreciated!
The doc mentions it, but one good resource is A Few FORMAT Recipes from Seibel's Practical Common Lisp.
Also, check §22.3 Formatted Output from the HyperSpec.
In Common Lisp:
CL-USER> (format t "~R" 10)
ten
~#(...~^...) is case conversion, where the # prefix means to capitalize (upcase only the first word). It contains an escape upward operation ~^, which in this context marks the end of what is case-converted. It also exits the current context when there are no more argument available.
~#[...] is conditional format: the inner format is applied on a value only if it is non nil.
The final ~A means that the function should be able to accept one more argument and print it.
In fact, your example looks like the one in §22.3.9.2:
If ~^ appears within a ~[ or ~( construct, then all the commands up to
the ~^ are properly selected or case-converted, the ~[ or ~(
processing is terminated, and the outward search continues for a ~{ or
~< construct to be terminated. For example:
(setq tellstr "~#(~#[~R~]~^ ~A!~)")
=> "~#(~#[~R~]~^ ~A!~)"
(format nil tellstr 23) => "Twenty-three!"
(format nil tellstr nil "losers") => " Losers!"
(format nil tellstr 23 "losers") => "Twenty-three losers!"

Evaluate math string in Clojure

I need to implement a function called eval-math-string in Clojure, which takes a math string as input and evaluates it:
(eval-math-string "7+8/2") => 11
So I've managed to break apart an expression using re-seq, and now I want to evaluate it using Incanter. However, I have an expression like ("7" "+" "8" "/" "2"), but Incanter needs an expression like ($= 7 + 8 / 2), where $= is the incanter keyword. How can I feed the list of one-character strings into a list including $= so that it executes properly. If the arguments are strings, the function won't work, but I can't convert +, *, / etc. to numbers, so I'm a little stuck.
Does anyone know how I can do this, or if there is a better way to do this?
Incanter's $= macro just calls the infix-to-prefix function, so all you need to do is convert your list of strings to a list of symbols and numbers, then call infix-to-prefix.
I'm going to assume that the input is just a flat list of strings, each of which represents either an integer (e.g. "7", "8", "2") or a symbol (e.g. "+", "/"). If that assumption is correct, you could write a conversion function like this:
(defn convert [s]
(try
(Long/parseLong s)
(catch NumberFormatException _
(symbol s))))
Example:
(infix-to-prefix (map convert ["7" "+" "8" "/" "2"]))
Of course, if you're just writing a macro on top of $=, there's no need to call infix-to-prefix, as you would just be assembling a list with $= as the first item. I'm going to assume that you already have a function called math-split that can transform something like "7+8/2" into something like ["7" "+" "8" "/" "2"], in which case you can do this:
(defmacro eval-math-string [s]
`($= ~#(map convert (math-split s))))
Example:
(macroexpand-1 '(eval-math-string "7+8/2"))
;=> (incanter.core/$= 7 + 8 / 2)

Using Clojure do-template to set prepared-statement columns

I have defined the following code to allow me to set column values in a java.sql.PreparedStatement. Is this code reasonable/idiomatic? How could it be improved?
(use '(clojure.template :only [do-template]))
; (import all java types not in java.lang)
(defprotocol SetPreparedStatement
(set-prepared-statement [this prepared-statement index]))
(do-template [type-name set-name]
(extend-type type-name
SetPreparedStatement
(set-prepared-statement [this prepared-statement index]
(set-name prepared-statement index this)))
BigDecimal .setBigDecimal
Boolean .setBoolean
Byte .setByte
Date .setDate
Double .setDouble
Float .setFloat
Integer .setInt
Long .setLong
Object .setObject
Short .setShort
Time .setTime
Timestamp .setTimestamp)
; Sample use
(set-prepared-statement 42 some-prepared-statement 1)
Your example looks as close to idiomatic Clojure as I can tell :)
It could perhaps benefit from abstracting the type mapping out if you have situations where you will be creating more than one template though if your creating just this one then this looks like excellent clojure to me.