Dynamic variables in Clojure libraries

TL;DR: Is the following a good pattern for a library?
(def ^{:dynamic true} *var*)
(defn my-fn [{:keys [var]}]
(do-smth (or var *var*)))
--
Say I want to write a sentiment analysis library.
Is it good design for the get-sentiment fn to accept optional sentiment labels while providing the default ones through a dynamic var?
(def ^:dynamic *sentiment-level-labels*
["Very Negative" "Negative" "Neutral" "Positive" "Very Positive"])
;;...
(defn get-sentiment-scores
"Takes text and gives back a 0 to 4 sentiment score for each sentence."
[text]
;;...
)
(defn get-sentiment
"Gives back a sentiment map with sentences scores,
average score, rounded score and labeled score.
Can accept custom sentiment level labels under :labels opt."
[text & {:keys [labels]}]
(let [scores (get-sentiment-scores text)
average-score (get-average scores)
rounded-score (Math/round average-score)
label (get (or labels *sentiment-level-labels*) rounded-score)]
{:scores scores
:average-score average-score
:rounded-score rounded-score
:label label}))
Clojure library coding standards official page says:
If you present an interface that implicitly passes a parameter via
dynamic binding (e.g. db in sql), also provide an identical interface
but with the parameter passed explicitly.
https://dev.clojure.org/display/community/Library+Coding+Standards
In my example, I provided only one interface, but with an optional argument.
Is this okay? Are there better ways to handle this?
Thank you!

Dynamic vars are full of pitfalls. They push your API code towards implicit environmental coupling, and often force your calling code to add a lot of (binding ...) clauses, which defeats the concision that was the reason for using dynamic vars in the first place. They also lead to tricky edge cases if control is passed from one thread to another.
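The thread-related edge case can be illustrated with a minimal sketch. The var *labels* and the promise seen-in-thread are hypothetical names made up for this illustration. A plain java.lang.Thread starts from the root value of a dynamic var, so the binding established by the caller is not visible inside it (helpers such as future do convey bindings, which is part of what makes the behaviour easy to get wrong).

```clojure
;; Hypothetical dynamic var, used only for this illustration.
(def ^:dynamic *labels* ["default"])

;; A promise lets the calling thread read what the other thread saw.
(def seen-in-thread (promise))

(binding [*labels* ["custom"]]
  ;; A plain java.lang.Thread does not inherit the caller's bindings,
  ;; so *labels* resolves to its root value inside the thread.
  (doto (Thread. ^Runnable (fn [] (deliver seen-in-thread *labels*)))
    (.start)
    (.join)))

;; @seen-in-thread holds the root value, not the value bound by the caller.
```

Code that relies on (binding ...) therefore has to know how every callee schedules its work, which is exactly the implicit coupling described above.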
In your case, I would recommend simply passing the labels in a params map argument:
(def default-sentiment-level-labels
["Very Negative" "Negative" "Neutral" "Positive" "Very Positive"])
(defn get-sentiment
"Gives back a sentiment map with sentences scores,
average score, rounded score and labeled score.
Can accept custom sentiment level labels under :labels opt."
[text {:as params, :keys [labels] :or {labels default-sentiment-level-labels}}]
...)
Note that the usage of a map can be interesting, because a map is opaque to intermediaries: you can have various components of your algorithm read only from the params map the keys that concern them.
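As a rough, runnable sketch of the params-map approach: get-sentiment-scores below is a hypothetical stub that returns fixed scores, and the one-argument arity is an assumption added for convenience, not part of the answer above.

```clojure
(def default-sentiment-level-labels
  ["Very Negative" "Negative" "Neutral" "Positive" "Very Positive"])

;; Hypothetical stub standing in for the real scorer.
(defn get-sentiment-scores [text]
  [3 4 2])

(defn get-average [scores]
  (/ (reduce + scores) (double (count scores))))

(defn get-sentiment
  ;; Convenience arity (an assumption): no params means all defaults.
  ([text] (get-sentiment text {}))
  ([text {:keys [labels] :or {labels default-sentiment-level-labels}}]
   (let [scores        (get-sentiment-scores text)
         average-score (get-average scores)
         rounded-score (Math/round average-score)]
     {:scores        scores
      :average-score average-score
      :rounded-score rounded-score
      :label         (get labels rounded-score)})))

;; Default labels versus caller-supplied labels, with no binding form involved.
(def with-defaults (get-sentiment "some text"))
(def with-custom   (get-sentiment "some text" {:labels ["--" "-" "0" "+" "++"]}))
```

The caller chooses the labels explicitly at the call site, and code that does not care about labels never has to know they exist.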

Related

Associate result of operation to the hashmap value

I have 2 maps:
(def look {"directions" :look
"look" :look
"examine room" :look
})
(def quit {"exit game" :quit
"quit game" :quit
"exit" :quit
"quit" :quit
})
which are merged into one map by:
(defn actions [] (merge look quit))
and then I try to associate its result (which is a hash-map) into value of another map:
(assoc {} :1 actions)
but instead of expected result which should be:
{:1 {"directions" :look, "look" :look ...
I receive
{:1 #object[fun_pro.core$actions 0x20a8027b "fun_pro.core$actions#20a8027b"]}
which is, as I understand it, a reference to the actions function object.
What should I do to get the expected result? I also tried unquote-splicing, but I'm not advanced enough to use macros yet and couldn't make it work.
EDIT1:
OK, it seems I found the solution. Instead of using (defn actions...) I should use (def actions...).
EDIT2:
To clarify why I use such a structure.
As I said in a comment below, I will use these maps to compare against the answer provided by the user and find which command to run. For example, if the user types "show me directions", it will trigger the function that displays directions, based on the keyword "directions". The result will be the same if the user types "I look around the room" or "I examine room to find road", based on the keywords "look" and "examine room".
This will be done by first splitting the user input into a set of strings and checking whether it shares a word with the keys of my map (turned into a set). So the input "show me directions" will be processed into the set #{"show" "me" "directions"}. Then I will use clojure.set/intersection to find whether there is a common element with the set of map keys and trigger a function according to the result (I have already coded the algorithm for that).
Of course I'm open to any suggestions if there is a better solution.
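OK, it seems I found the solution on my own. Because (defn actions ...) defines a function, the bare symbol actions refers to the function object rather than to the merged map. Instead of using (defn actions...) I should use (def actions...), or keep the defn and call it as (actions).
This results in the desired output.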
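A small sketch of both fixes side by side; the name actions-fn is made up here so that the function and the plain value can coexist in one namespace.

```clojure
(def look {"directions" :look, "look" :look, "examine room" :look})
(def quit {"exit game" :quit, "quit game" :quit, "exit" :quit, "quit" :quit})

;; Option 1: keep the function, but call it to obtain the merged map.
(defn actions-fn [] (merge look quit))
(def via-call (assoc {} :1 (actions-fn)))

;; Option 2: define the merged map as a plain value.
(def actions (merge look quit))
(def via-def (assoc {} :1 actions))
```

Both via-call and via-def hold the merged map under :1; the original attempt stored the function itself because it was never invoked.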

Opportunity to use a transducer?

Using Clojure, I'm pulling some data out of a SQLite DB. It will arrive in the form of a list of maps. Here is an abbreviated sample of what the data looks like.
(
{:department-id 1 :employee-firstname "Fred" :employee-lastname "Bloggs"}
{:department-id 1 :employee-firstname "Joe" :employee-lastname "Bloggs"}
{:department-id 2 :employee-firstname "John" :employee-lastname "Doe"}
...
)
I would like to reshape it into something like this:
(
{:department-id 1 :employees [{:employee-firstname "Joe" :employee-lastname "Bloggs"} {:employee-firstname "Fred" :employee-lastname "Bloggs"}]}
{:department-id 2 :employees [{:employee-firstname "John" :employee-lastname "Doe"}]}
...
)
I know I could write a function that dealt with the departments and then the employees and "glued" them back together to achieve the shape I want. In fact I did just that in the REPL.
But I've heard a bit about transducers recently and wondered whether this was an opportunity to use one.
If it is, what would the code look like?
I'll have a go, but like many of us, I'm still wrapping my head around this as well.
From my reading on transducers, it would seem the real benefit is in avoiding the need to create intermediate collections, thereby increasing efficiency. This means that to answer your question, you really need to look at what your code will be doing and how it is structured.
For example, if you had something like
(->>
(map ....)
(filter ..)
(map ..)
(map ..))
the functions are being run in sequence with new collections being created after each to feed into the next. However, with transducers, you would end up with something like
(comp
(map ...)
(map ..)
(filter ..)
(map ...))
where the functions being applied to the data are applied in a pipeline fashion on each item from the original collection and you avoid the need to generate the intermediate collections.
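To make the contrast concrete, here is a minimal sketch with made-up steps (inc, even?, squaring) standing in for the elided ones. The transducer version composes the steps with comp and runs them in a single pass with into.

```clojure
(def data (range 10))

;; Sequence version: each step yields an intermediate lazy sequence.
(def via-seqs
  (->> data
       (map inc)
       (filter even?)
       (map #(* % %))))

;; Transducer version: the steps are composed once with comp and
;; applied item by item, with no intermediate collections.
(def xform
  (comp (map inc)
        (filter even?)
        (map #(* % %))))

(def via-xform (into [] xform data))
```

Note that comp applies transducers in the order written, top to bottom, which matches the reading order of the ->> version.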
In your case, I'm not sure it will help. This is partially because I don't know what other transformations you are applying, but mainly because what you want requires a level of state tracking, i.e. the grouping of the employee data. This is possible, but I believe it makes it a little harder.
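For the reshaping itself, a plain group-by is probably enough. The following is a sketch rather than a transducer solution; by-department is a hypothetical name, and the final sort-by is only there to make the output order predictable.

```clojure
(def rows
  [{:department-id 1 :employee-firstname "Fred" :employee-lastname "Bloggs"}
   {:department-id 1 :employee-firstname "Joe"  :employee-lastname "Bloggs"}
   {:department-id 2 :employee-firstname "John" :employee-lastname "Doe"}])

(defn by-department
  "Groups flat rows into one map per department, each holding its employees."
  [rows]
  (->> rows
       (group-by :department-id)
       (map (fn [[dept-id employees]]
              {:department-id dept-id
               ;; drop the now-redundant key from each employee map
               :employees (mapv #(dissoc % :department-id) employees)}))
       (sort-by :department-id)))

(def departments (by-department rows))
```

group-by keeps the employees of each department in their original row order.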

Dealing with database reads in Clojure

I am trying to 'purify' some of my Clojure functions. I would like to get to a point where all my side-effecting code is explicitly declared in one function. It's easy to get some data at the start and to write it to a db at the end and just have a pure function transforming in between. However, the usual situation is that the transformation function requires another DB read somewhere in the middle of the logic:
(defn transform-users
[users]
(let [ids (map :id users)
profiles (db/read :profiles ids)]
(profiles->something profiles)))
(->> (db/read :users)
(transform-users)
(db/write :something))
Obviously this is a very simple example but the point is, how do I get the side-effecting db/read function out of there, how can I make transform-users pure (and as a benefit, easily testable)?
One thing you could do here would be a dependency-injection-like approach of supplying the (potentially) side-effectful function as an optional parameter, e.g.:
(defn transform-users
[users & {:keys [ids->profiles]
:or {ids->profiles #(db/read :profiles %)}}]
(let [ids (map :id users)
profiles (ids->profiles ids)]
(profiles->something profiles)))
This should be easily testable since you can mock the injected functions without a lot of effort. And as a bonus, by supplying the default value, you're documenting what you're expecting and making the function convenient to call.
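A sketch of what such a test could look like. Everything besides transform-users is a hypothetical stand-in: db-read replaces db/read and simply throws so that the example never needs a database, and profiles->something just collects names.

```clojure
;; Hypothetical stand-in for db/read: fails loudly if it is ever reached.
(defn db-read [table ids]
  (throw (ex-info "no database available in this sketch" {:table table})))

;; Hypothetical pure transformation.
(defn profiles->something [profiles]
  (mapv :name profiles))

(defn transform-users
  [users & {:keys [ids->profiles]
            :or   {ids->profiles #(db-read :profiles %)}}]
  (let [ids      (map :id users)
        profiles (ids->profiles ids)]
    (profiles->something profiles)))

;; In a test, inject a pure lookup instead of the database read.
(def fake-profiles {1 {:name "Ann"} 2 {:name "Bob"}})

(def result
  (transform-users [{:id 1} {:id 2}]
                   :ids->profiles #(map fake-profiles %)))
```

The function is still impure when called with its default, so this makes it testable rather than pure; the purity question is addressed by the restructuring suggested next.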
Why couple the reading of the profiles with transforming profiles?
(->> (db/read :users)
(map :id)
(db/read :profiles)
(profile->something)
(db/write :something))
(This also exposes the fact that you are doing two round trips to the db. Where is db/read :user-profiles ?)
(->> (db/read :user-profiles)
(profile->something)
(db/write :something))
or perhaps:
(->> (read-profiles-from-users)
(profile->something)
(db/write :something))

Splice a collection inside another

For work, I want to describe the format of a standard medical form (used to report drug side effects) in the most concise way. (Roughly, the goal is to render it afterwards through hiccup, but not only that, which is why I don't write it directly as a hiccup structure.)
For instance, part of the description would be:
{"reportertitle" [:one-of "Dr" "Pr" "Mrs" "Mr"] ; the reporter is usually the physician
"reportergivenname" :text
"reporterfamilyname" :text
"reporterorganization" :text
"reporterdepartment" :text
....
"literaturereference" :text
"studyname" :text
....}
The keys are standard names and I cannot change them, but I'd like to be able to factor things out easily. For instance, the prefix "reporter" is used heavily throughout the map, and I would like to factor it out like this:
{ (prefix "reporter"
"title" [:one-of "Dr" "Pr" "Mrs" "Mr"]
"givenname" :text
"familyname" :text
"organization" :text
"department" :text)
.....
"literaturereference" :text
"studyname" :text
....}
But this cannot work, because I think I cannot "integrate" (splice, I believe is the correct term) the result of 'prefix', be it a function or a macro, inside the outer map.
Is there a solution to achieve this while maintaining a high level of declarativity/conciseness? (the whole form is huge and might be read by non-developers)
(As I'm new to Clojure, pretty much every design suggestion is welcome ;) )
Thanks!
You are right in that a macro cannot tell eval to splice its result into the outer expression. A straightforward way around it would be to wrap the whole map definition in a macro that recognizes the prefix expressions and translates them into appropriate key-value sequences inside the resulting map definition.
You can also do it with functions only by just gluing the submaps with merge:
(defn pref-keys [p m] (apply hash-map (apply concat (for [[k v] m] [(str p k) v]))))
(merge
(pref-keys "reporter"
{"title" [...]
"givenname" :text
...})
{"literaturereference" :text
"studyname" :text})
Which might be a bit more verbose but probably also a bit more readable.
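Filled in with the fields from the question, the merge approach could look like the sketch below. The name form-spec is made up, and pref-keys is written with into as an equivalent, slightly shorter variant of the definition above.

```clojure
(defn pref-keys
  "Returns m with the string prefix p prepended to every key."
  [p m]
  (into {} (for [[k v] m] [(str p k) v])))

(def form-spec
  (merge
    (pref-keys "reporter"
               {"title"        [:one-of "Dr" "Pr" "Mrs" "Mr"]
                "givenname"    :text
                "familyname"   :text
                "organization" :text
                "department"   :text})
    {"literaturereference" :text
     "studyname"           :text}))
```

The result is a flat map with the standard key names, so downstream code such as a hiccup renderer does not need to know that prefixes were factored out.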
Edit: There is one more limitation: map literals are created before any macros (inside or outside ones) are evaluated. A macro whose argument is a map literal will get a map, not some form whose evaluation would eventually produce the map. Of course the keys and values in this map are unevaluated forms, but the map itself is a proper map (IPersistentMap).
In particular this means that the literal needs to contain an even number of forms, so this:
(my-smart-macro { (prefix "reporter" ...) } )
will fail before my-smart-macro has a chance to expand the prefix. On the other hand, this will succeed:
(another-macro { (/ 1 0) (/ 1 0) })
... provided the macro filters out the invalid arithmetic expressions from its input map.
This means that you probably do not want to pass a map literal to the macro.
In advance, I should say that this answer may not at all be what you are looking for. It would be a way of doing things that would totally alter your data structure, and you seem to maybe be saying that that's not something you can do. Anyways, I'm suggesting it because I think it would be a good change to your data structure.
So, here's how I propose you re-envision your data:
{:reporter {:title "Dr, Pr, Mrs, or Mr here"
:given-name "text here"
:family-name "text here"
:organization "text here"
:department "text here"
...}
:literature-reference "text here"
:study-name "text here"
...}
There are two changes I'm putting forth here: one is structural and the other is "cosmetic". The structural one is to nest another map in there for the reporter-related stuff. I personally think this makes the data clearer, and it is no less accessible. Instead of doing something like (get *data* "reportertitle") to access it, and (assoc *data* "reportertitle" *new-title*) to make a new version of it, you would instead do (get-in *data* [:reporter :title]) and (assoc-in *data* [:reporter :title] *new-title*).
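A short sketch of nested access and update, using made-up sample values:

```clojure
;; Sample values are invented for illustration.
(def data
  {:reporter   {:title "Dr" :given-name "Jane" :family-name "Doe"}
   :study-name "Example study"})

;; Read a nested value by following a path of keys.
(def title (get-in data [:reporter :title]))

;; assoc-in returns a new map; data itself is unchanged.
(def updated (assoc-in data [:reporter :title] "Pr"))
```

Only the path along [:reporter :title] is replaced in updated; the rest of the structure is shared with data.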
The cosmetic change is to turn those string-based keys into Clojure keywords. My main reasons for suggesting this are that it would be more idiomatic and that it would be potentially clearer to read your code. For a better discussion on why to use keywords see maybe here or here.
Now, I realize everything I've said pre-supposes that you actually can change how your data is structured and how the keywords are named. You said "The keys are standard names, I cannot change them", and this seems to indicate that this type of solution wouldn't work for you. However, maybe you could inter-convert between the two forms. If you are importing this data from somewhere and it already has the format that you give above, you would convert it into the nested-map-with-keywords form, and keep it that way while you did whatever you did with it. Then, when you export the data to actually be outputted or used (or whatever ultimate end it serves), you would convert it back to the form as you have it above.
I should say that I, personally, do not at all like this "inter-conversion" idea. I think it divides the notions of "code" and "data", which seems like such a shame considering it would be done only to have the code "look and feel nicer" than the data. That being said, I'm proposing it in case it sounds good to you.

Can I create mutable state inside Clojure records?

I am considering using Clojure records to map to changing entities in my program. Are they mutable? Or do you need to use extra refs within the records? I am a little confused about this.
It's well worth watching Rich Hickey's fantastic video on identity and state.
Records are designed to be immutable and store the state of something as a value.
To model the state of a changing entity, I'd recommend using a ref that refers to an immutable value that represents the current state. You can use records for the immutable state, but often it's simpler just to use a simple map.
A simple example, where the mutable state is a scoreboard for a game:
; set up map of current scores for each player
(def scores
(ref
{:mary 0
:joe 0 }))
; create a function that increments scores as a side effect
(defn add-score [player amount]
(dosync
(alter scores update-in [player] + amount)))
; add some scores
(add-score :mary (rand-int 10))
(add-score :joe (rand-int 10))
; read the scores
@scores
=> {:mary 6, :joe 1}
I have found that I much more commonly put records in refs than refs in records. mikira's advice to use a simple map sounds very good.
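Putting a record in a ref could look like this minimal sketch. The Player record and add-score! are hypothetical names chosen for illustration.

```clojure
;; An immutable record describing a player's state at one point in time.
(defrecord Player [name score])

;; The ref is the identity; the record it points to is the current value.
(def player (ref (->Player "mary" 0)))

(defn add-score!
  "Replaces the record held by the ref with an updated copy."
  [amount]
  (dosync
    (alter player update :score + amount)))

(add-score! 5)
(add-score! 2)
```

Each call to add-score! swaps in a new Player value; any earlier snapshot obtained with @player is unaffected.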
Start with a map and switch to something less flexible when you have to.