Clojure: What is 'banana? How is that different from "banana"?

Follow along with me at the Clojure REPL:
(def apos 'banana) ; "apostrophized" banana.
=> #'thic.core/apos
apos
=> banana
(def quo "banana") ; "quoted" banana.
=> #'thic.core/quo
quo
=> "banana"
Clearly banana and "banana" are not the same. I GET that "banana" is a String, but what is banana?
When I 'type' them, I get:
type apos
=> #object[clojure.core$type 0x69085742 "clojure.core$type#69085742"]
=> banana
type quo
=> #object[clojure.core$type 0x69085742 "clojure.core$type#69085742"]
=> "banana"
So banana and "banana" are the same TYPE of thing?
To add to my confusion, at the REPL I get this:
(type apos) ; Add a pair of parens.
=> clojure.lang.Symbol ; apos becomes a Symbol.
(type quo) ; and
=> java.lang.String ; quo becomes a String.
What am I seeing here?
banana (without the quotes) is a "symbol" and "banana" (with the quotes) is a "string"?
apos is a Symbol and quo is a String?
After many attempts, I am still hung up on Symbol vs. Symbol Value. SO 20th Century. :-)

You may wish to consult this list of Clojure documentation sources. In particular, see
Clojure for the Brave & True
Getting Clojure
As for your question, 'foo is the symbol composed of the 3 letters foo, while "foo" is a string composed of the 3 letters foo. You normally don't need to use symbols as data in your code, unless you are doing something advanced.
Clojure also has keywords like :foo, composed of the 3 letters foo. The leading colon marks it as a keyword, much as the quote marks a symbol, but a keyword always evaluates to itself.
A string is used just like in other languages. A symbol or keyword is not allowed to have spaces, so they don't need a trailing quote or colon, and Clojure leaves it off for simplicity.
When you see a plain symbol foo in the code (without a leading quote) it acts as a variable just like in Java, etc.
A string is meant to contain user data of any type. A keyword is used to represent program control information like :left or :right, instead of a "magic" number, so these are similar to an enum in Java.
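A short REPL sketch may make the distinction concrete. The var name banana and its value 42 are made up for illustration. It also suggests why typing type apos without parentheses printed two results: the REPL reads two separate forms, the function type itself and then apos, and evaluates each on its own.

```clojure
;; Hypothetical var, defined only so the bare symbol has something to name.
(def banana 42)

(type 'banana)   ; => clojure.lang.Symbol   (quoted: the symbol itself, as data)
(type "banana")  ; => java.lang.String
(type :banana)   ; => clojure.lang.Keyword
(type banana)    ; => java.lang.Long        (unquoted: looked up, yields 42)

;; A symbol and a string with the same letters are different values,
;; but you can convert between them.
(= 'banana "banana")           ; => false
(= 'banana (symbol "banana"))  ; => true
(= "banana" (name 'banana))    ; => true
```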


Mapping a string using a map

(def conversions {"G" "C"
                  "C" "G"
                  "T" "A"
                  "A" "U"})
(defn to-rna [dna]
  (map conversions dna))
(conversions "G") ;; Gives "C"
(to-rna "GC") ;; Gives (nil nil)
I'm attempting to do an exercise where I convert letters. I have a working solution, but I don't like it. I feel like the above ought to work, but evidently I'm wrong, because it doesn't.
Could someone explain to me why this is, and how I might properly achieve this?
When mapping over a string, it will treat the string as a sequence of characters. So, your code ends up looking for a \G and a \C entry in the map, which both return nil.
As dpassen says, you need to put a java.lang.Character in the map, not a length-1 string. Try this:
(def conversions {\G \C
                  \C \G
                  \T \A
                  \A \U})
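To see why the original lookup returned nil, it may help to look at what map actually receives. The sketch below also shows one possible alternative that keeps the string-keyed map and converts each character with str before the lookup; the names string-conversions and to-rna-with-string-keys are hypothetical.

```clojure
(require '[clojure.string :as string])

(seq "GC")   ; => (\G \C)   characters, not one-letter strings
(= \G "G")   ; => false     so a map keyed by "G" has no entry for \G

;; Hypothetical names; the mapping is the one from the question.
(def string-conversions {"G" "C", "C" "G", "T" "A", "A" "U"})

(defn to-rna-with-string-keys [dna]
  ;; str turns each character into a length-1 string before the lookup.
  (string/join (map (comp string-conversions str) dna)))

(to-rna-with-string-keys "GC") ; => "CG"
```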
I'm just starting learning Clojure myself so please take this answer with caution.
In addition to what's already been suggested, I would put the conversions map into a let form to keep your function "isolated". (As is, your function relies on conversions being defined outside of its scope.)
I also read (can't remember where exactly) that a common naming convention for functions that "convert" X to Y is to name them x->y.
Finally I'd use a threading macro for improved readability.
(require '[clojure.string :as string])
(defn dna->rna [dna]
  (let [conversions {\G \C
                     \C \G
                     \T \A
                     \A \U}]
    (->> dna
         (map conversions)
         (string/join ""))))
(dna->rna "GC")
;; "CG"
FYI, Clojure has clojure.string/escape and clojure.string/replace that you might want to look at. escape is probably most similar to what you are doing.
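As a rough sketch of the escape approach: clojure.string/escape takes the string and a map from characters to replacements. The function name dna->rna-via-escape is made up here, and the mapping is the one used in the answers above.

```clojure
(require '[clojure.string :as string])

;; Hypothetical name; escape looks each character up in the map and keeps
;; characters that have no entry unchanged.
(defn dna->rna-via-escape [dna]
  (string/escape dna {\G \C, \C \G, \T \A, \A \U}))

(dna->rna-via-escape "GCTA") ; => "CGAU"
(dna->rna-via-escape "GC-X") ; => "CG-X"
```

Note that characters without an entry pass through unchanged, where the map-based versions would produce nil for them.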

Why do special characters in my variable disappear when doing an lindex in Tcl?

I have a list in my application that I work on. It's basically like this:
$item = {text1 text2 text3}
Then I pick up the first member in the list with:
lindex $item 0
On doing this, text1, which used to be (say) abcdef\12345, becomes abcdef12345.
But it's very important for me not to lose this \. Why is it disappearing? There are other characters like - and > which don't disappear. Please note that I cannot escape the \ in the text beforehand. If there's anything I can do before operating on $item with lindex, please suggest it.
The problem is that \ is a Tcl list metasyntax character, unlike -, > or any alphanumeric. You need to convert your string into a proper Tcl list before using lindex (or any other list-consuming operation) on it. To do that, you need to understand exactly what you mean by “words” in your input data. If your input data is a sequence of non-whitespace characters separated by single whitespace characters, you can use split to do the conversion to a list:
set properList [split $item]
# Now we can use it...
set theFirstWord [lindex $properList 0]
If you've got a different separator, split takes an optional extra character to say what to split by. For example, to split by colons (:) you do:
set properList [split $item ":"]
However, if you have other sorts of splitting rules, this doesn't work so well. For example, if you can split by multiple whitespace characters, it's actually better to use regexp (with the -all -inline options) to do the word-identification:
# Strictly, this *chooses* all sequences of one or more non-whitespace characters
set properList [regexp -all -inline {\S+} $item]
You can also do splitting by multi-character sequences, though in that case it is most easily done by mapping (with string map) the multi-character sequence to a single rare character first. Unicode means that there are lots of such characters to pick…
# NUL, \u0000, is a great character to pick for text, and terrible for binary data
# For binary data, choose something beyond \u00ff
set properList [split [string map {"BOUNDARY" "\u0000"} $item] "\u0000"]
Even more complex options are possible, but that's when you use splitx from Tcllib.
package require textutil::split
# Regular expression to describe the separator; very sophisticated approach
set properList [textutil::split::splitx $item {SPL+I*T}]
In Tcl, lists can be created in several ways:
by setting a variable to be a list of values
set lst {{item 1} {item 2} {item 3}}
with the split command
set lst [split "item 1.item 2.item 3" "."]
with the list command.
set lst [list "item 1" "item 2" "item 3"]
And an individual list member can be accessed with the lindex command.
set x "a b c"
puts "Item 2 of the list {$x} is: [lindex $x 2]\n"
This will give output:
Item 2 of the list {a b c} is: c
With respect to the question asked, you need to define the variable like this: abcdef\\12345
In order to make this clear try to run the following command.
puts "\nI gave $100.00 to my daughter."
and
puts "\nI gave \$100.00 to my daughter."
The second one will give you the proper result.
If you don't have the option to change the text, try to save the text in curly braces, as mentioned in the first example.
set x {abcd\12345}
puts "A simple substitution: $x\n"
Output:
A simple substitution: abcd\12345
set y [set x {abcdef\12345}]
And check for this output:
puts "Remember that set returns the new value of the variable: X: $x Y: $y\n"
Output:
Remember that set returns the new value of the variable: X: abcdef\12345 Y: abcdef\12345

"Can't let qualified name" when using clojure.core.match

I'm using clojure.core.match and seeing the following error:
Can't let qualified name
My code resembles:
(match [msg-type]
[MsgType/TYPE_1] (do-type-1-thing)
[MsgType/TYPE_2] (do-type-2-thing))
Where MsgType/TYPE_1 comes from a Java class:
public class MsgType {
public static final String TYPE_1 = "1";
public static final String TYPE_2 = "2";
}
What does this error mean, and how can I work around it?
The problem seems related to macro name binding, though I don't understand it deeply as I'm quite new to macros.
Originally I hoped using case rather than match would prove a viable workaround:
(case msg-type
MsgType/TYPE_1 (do-type-1-thing)
MsgType/TYPE_2 (do-type-2-thing))
However the above doesn't work. case matches on the symbol MsgType/TYPE_n, not the evaluation of that symbol.
The best I've found so far is to convert the value coming in to a keyword and match that way:
(def type->keyword
{MsgType/TYPE_1 :type-1
MsgType/TYPE_2 :type-2})
(case (type->keyword msg-type)
:type-1 (do-type-1-thing)
:type-2 (do-type-2-thing))
In general, pattern matching is not the right tool for comparing one variable with another. Patterns are supposed to be either literals such as 1 or :a, destructuring expressions or variables to be bound. So, for example, take this expression:
(let [a 1
b 2]
(match [2]
[a] "It matched A"
[b] "It matched B"))
You might expect it to yield "It matched B" since the variable b is equal to 2, but in fact it will bind the value 2 to a new variable named a and yield "It matched A".
I think you're looking for condp =. It's basically what you wish case would be.
(condp = msg-type
MsgType/TYPE_1 (do-type-1-thing)
MsgType/TYPE_2 (do-type-2-thing))
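A self-contained sketch of the difference between condp = and case. The vars type-1 and type-2 are stand-ins for MsgType/TYPE_1 and MsgType/TYPE_2, assumed here so the example runs without the Java class, and the keyword results stand in for the do-type-n-thing calls.

```clojure
;; Stand-ins for the Java constants (assumption for this sketch).
(def type-1 "1")
(def type-2 "2")

(defn handle [msg-type]
  ;; condp evaluates each test expression and calls (= test msg-type).
  (condp = msg-type
    type-1 :did-type-1-thing
    type-2 :did-type-2-thing
    :unknown-type))   ; trailing default; without it condp throws on no match

(handle "2") ; => :did-type-2-thing
(handle "9") ; => :unknown-type

;; case compares against unevaluated constants, so the clause below tests
;; for the symbol type-1 itself, not the string "1" it names.
(case "1"     type-1 :matched :no-match) ; => :no-match
(case 'type-1 type-1 :matched :no-match) ; => :matched
```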

Iterative procedure for unit extraction

Sometimes there is a range of values followed by a unit of measurement. The input will be a string of text containing digits followed by units to be extracted by a function. Given a string of text that contains a number followed by a unit the following can extract the number and unit as a nested vector:
(def aa ["meter" "kilometer"])
(def bb (clojure.string/join "|" aa))
(def cc (str "(\\d+)\\s*(" bb ")"))
(def dd (re-pattern cc))
(defn foostring [strings]
  (into [] (map #(into [] %)
                (map (fn [[_ count unit]] {:count count, :unit unit})
                     (re-seq dd strings)))))
For example let's try the input:
(foostring "Today I sprinted 40 meters.")
The output will be:
[[[:count "40"] [:unit "meter"]]]
However I am unable to extract a range of numbers followed by a unit such as the following example:
(foostring "Today I sprinted between 80-90 meters.")
The function will pick out 90 for count and meter for units. However, I am trying to pick up the range of numbers in front of the unit.
The idea I believe can extract such patterns will look recursively for "near-neighbors." Namely, the function finds units, then looks to the left of the unit for digits. In the process of "looking left" the function searches for possibly a single digit such as in the example above, or a digit followed by punctuation (e.g. a slash or a hyphen) or by a word. Expanding on the last case, let me provide an example:
(foostring "Today I ran between 80 to 90 meters.")
Or, the colloquial
(foostring "There were 80 90 Yeti running through the forest.")
Although the Yeti example is odd, when written, it captures an idea of people's speech being translated to text. An example of when this might happen is in the process of quoting someone for an article.
You wrote: "The idea I believe can extract such patterns will look recursively for 'near-neighbors.'"
If you really mean recursively, then you've surely left the realm of regular expressions. If you don't get too crazy with your expressions you can use context free EBNF.
(require '[instaparse.core :as insta])
(def foostring
(insta/parser
"<S> = Expr+
Expr = <Stuff> Number+ {<[' '] [Preposition] [' ']> Number} <' '> Unit <Stuff>;
Bleh = #'[a-z A-Z.,]+';
Stuff = {Bleh}
Preposition = 'between'|'to'|'-';
Unit = 'meter'|'kilometer'|'Yeti'|'sandwiches';
Number = #'[0-9]+'"))
If you don't have a set list of units/prepositions, define as e.g. any word.
(foostring "Today I sprinted 40 meters while eating 2 3 4 sandwiches, running from 80-90 Yeti.")
=>
([:Expr [:Number "40"] [:Unit "meter"]]
[:Expr [:Number "2"] [:Number "3"] [:Number "4"] [:Unit "sandwiches"]]
[:Expr [:Number "80"] [:Number "90"] [:Unit "Yeti"]])
Try this:
(?i)(?<lowerBound>\d+)(?:\s*(?:-|to)\s*(?<upperBound>\d+))?\s+(?<unit>meters?|kilometers?|...)
Demo
http://fiddle.re/k20ff
(Choose Java, since Clojure shares the same regex flavor as Java.)
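One way this pattern might be used from Clojure is sketched below. It is an adaptation, not the exact pattern above: the named groups are replaced by positional ones because re-seq returns groups by position, and the unit list is limited to meters and kilometers since the rest was elided in the answer. The names range-pattern and extract-ranges are hypothetical.

```clojure
;; Positional groups: 1 = lower bound, 2 = optional upper bound, 3 = unit.
(def range-pattern
  #"(?i)(\d+)(?:\s*(?:-|to)\s*(\d+))?\s+(meters?|kilometers?)")

(defn extract-ranges [text]
  ;; re-seq yields [whole-match group-1 group-2 group-3] per match;
  ;; an unmatched optional group comes back as nil.
  (for [[_ lower upper unit] (re-seq range-pattern text)]
    {:lower lower, :upper upper, :unit unit}))

(extract-ranges "Today I sprinted between 80-90 meters.")
;; => ({:lower "80", :upper "90", :unit "meters"})
(extract-ranges "Today I ran between 80 to 90 meters.")
;; => ({:lower "80", :upper "90", :unit "meters"})
(extract-ranges "Today I sprinted 40 meters.")
;; => ({:lower "40", :upper nil, :unit "meters"})
```

The "80 90 Yeti" style, with only whitespace between the numbers, is not covered by this pattern.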

What is the 'reword' function in Rebol and how do I use it?

I saw someone mention the reword function today, but documentation for it is very brief. It looks like shell script environment variable substitution, or maybe regex substitution, but different. How do I use this function and what kind of gotchas am I going to run into?
Here there be dragons!
The reword function is a bit of an experiment to add shell-style string interpolation to Rebol in a way that works with the way we do things. Unlike a lot of Rebol's series functions, it really is optimized for working on just string types, and the design reflects that. The current version is a design prototype, meant to eventually be redone as a native, but it does work as designed so it makes sense to talk about how it works and how to use it.
What does reword do?
Basically this:
>> reword "$a is $b." [a "This" b "that"]
== "This is that."
It takes a template string, it searches for the escape sequences, and replaces those with the corresponding substitution values. The values are passed to the function as well, as an object, a map, or a block of keys and values. The keys can be pretty much anything, even numbers:
>> reword "$1 is $2." [1 "This" 2 "that"]
== "This is that."
The keys are converted to strings if they aren't strings already. Keys are considered to be the same if they would be converted to the same string, which is what happens when you do something like this:
>> reword "A $a is $a." [a "fox" "a" "brown"]
== "A brown is brown."
It's not positional like regex replacement, it's keyword based. If you have a key that is specified more than once in the values block, the last value for that key is the one that gets used, as we just saw. Any unset or none values are just skipped, since those have no meaning when putting stuff into a string.
You can use other escape flags too, even multi-character ones:
>> reword/escape "A %%a is %%b." [a "fox" b "brown"] "%%"
== "A fox is brown."
Or even have no escape flag at all, and it will replace the key everywhere:
>> reword/escape "I am answering you." [I "Brian" am "is" you "Adrian"] none
== "Brian is answerBrianng Adrian."
Whoops, that didn't work. This is because the keys aren't case-sensitive, and they don't need to be surrounded by spaces or other such delimiters. But, you can put spaces in the keys themselves if you specify them as strings, so this works better:
>> reword/escape "I am answering you." ["I am" "Brian is" you "Adrian"] none
== "Brian is answering Adrian."
Still, doing reword templates without escape characters tends to be tricky and a little bit slower, so it's not done as often.
There's an even better trick though...
Function replacement
Where reword gets really interesting is when you use a function as a replacement value, since that function gets called with every rewording. Say, you wanted to replace with a counter:
>> reword "$x then $x and $x, also $x" object [y: 1 x: does [++ y]]
== "1 then 2 and 3, also 4"
Or maybe even the position, since it can take the string position as a parameter:
>> reword "$x then $x and $x, also $x" object [x: func [s] [index? s]]
== "1 then 9 and 16, also 25"
Wait, that doesn't look right, those numbers seem off. That is because the function is returning the indexes of the template string, not the result string. Good to keep that in mind when writing these functions. The function doesn't even have to just be assigned to one key, it can detect or use it:
>> reword "$x or $y" object [x: y: func [s] [ajoin ["(it's " copy/part s 2 ")"]]]
== "(it's $x) or (it's $y)"
See, template variables, escapes and all. And the function can have side effects, like this line counter:
>> reword/escape "Hello^/There^/nl" use [x] [x: 0 map reduce ["^/" does [++ x "^/"] "nl" does [x]]] ""
== "Hello^/There^/2"
It even comes with the /into option, so you can use it to build strings in stages.
But the big gotcha for someone coming from a language with interpolation built in is...
Why a values block, why not just use variables like a normal language?
Because Rebol just doesn't work that way. Rebol doesn't have lexical binding, it does something else, so in a string there is just no way to know where to get the values of variables from without saying so. In one of those shell languages that has interpolation, it would be the equivalent to having to pass a reference to the environment as a whole to the interpolation function. But hey, we can do just that in Rebol:
>> use [x] [x: func [s] [index? s] reword "$x then $x and $x, also $x" bind? 'x]
== "1 then 9 and 16, also 25"
That bind? method will work in use, binding loops and functions. If you are in an object, you can also use self:
>> o: object [x: func [s] [index? s] y: func [s] [reword s self]]
== make object! [
x: make function! [[s][index? s]]
y: make function! [[s][reword s self]]
]
>> o/y "$x then $x and $x, also $x"
== "1 then 9 and 16, also 25"
But be careful, or you can end up doing something like this:
>> o/y "$x then $x and $x, also $x, finally $y"
** Internal error: stack overflow
Dragons! That's one good reason to keep your variables and your replacement keys separate...