Clojure Parse String - regex

I have the following string
layout: default
title: Envy Labs
What i am trying to do is create map from it
layout->default
title->"envy labs"
Is this possible to do using sequence functions or do i have to loop through each line?
Trying to get a regex to work with and failing using.
(apply hash-map (re-split #": " meta-info))

user> (let [x "layout: default\ntitle: Envy Labs"]
(reduce (fn [h [_ k v]] (assoc h k v))
{}
(re-seq #"([^:]+): (.+)(\n|$)" x)))
{"title" "Envy Labs", "layout" "default"}

The _ is a variable name used to indicate that you don't care about the value of the variable (in this case, the whole matched string).

I'd recommend using clojure-contrib/duck-streams/read-lines to process the lines then split the fields from there. I find this method is usually more robust to errors in the file.

Related

Sum of numbers in a list from user input

Question from a total newbie with Clojure. Task is pretty simple, but I'm having hard time finding the best way to do this - I need to set up input, where user could give me a list (user should determine how long) of natural numbers and the program should just return a sum of these numbers.
Maybe this is totally wrong already:
(defn inputlist[naturallist]
(println "Enter list of natural numbers:")
(let[naturallist(read-line)] ))
Here is one way of doing it:
> lein new app demo
> cd demo
Edit the project.clj and src/demo/core.clj so they look as follows:
> cat project.clj
(defproject demo "0.1.0-SNAPSHOT"
:description "FIXME: write description"
:url "http://example.com/FIXME"
:license {:name "Eclipse Public License"
:url "http://www.eclipse.org/legal/epl-v10.html"}
:dependencies [[org.clojure/clojure "1.9.0"]
[org.clojure/tools.reader "1.1.3.1"] ]
:main ^:skip-aot demo.core
:target-path "target/%s"
:profiles {:uberjar {:aot :all}})
> cat src/demo/core.clj
(ns demo.core
(:require
[clojure.tools.reader.edn :as edn]
[clojure.string :as str] ))
(defn -main []
(println "enter numbers")
(let [input-line (read-line)
num-strs (str/split input-line #"\s+")
nums (mapv edn/read-string num-strs)
result (apply + nums) ]
(println "result=" result)))
with result
> lein run
enter numbers
1 2 3 <= you type this, then <enter>
input-line => "1 2 3"
num-strs => ["1" "2" "3"]
nums => [1 2 3]
result => 6
You may wish to start looking at some beginner books as well:
Brave Clojure
Living Clojure
Programming Clojure
The Clojure CheatSheet
The quickest way to do this I can think of is
#(apply + (map #(Integer/parseInt %) (re-seq #"\d+" (read-line))))
This defines an anonymous function which:
(read-line) - reads a line of text, hopefully containing numbers separated by non-numeric characters. So you type in something like "123 456 789"
(re-seq #"\d+" ...) - uses the regular expression \d+ to search for strings of consecutive digits. Each string of consecutive digits is added to a sequence, which is subsequently returned. So, for example, if you type in 123, 456, 789 the re-seq function will return the sequence '("123" "456" "789").
(map #(Integer/parseInt %) ...) - invokes the anonymous function #(Integer/parseInt %) for each element in the list returned by the re-seq invocation, creating another list of the results. So if the input is '("123" "456" "789") the output will be '(123 456 789).
(apply + ...) - applies the + function to the list of numbers, summing them up, and returns the sum.
Et voila! The end result is, you type in a string of numbers, separated by non-numeric characters, you get back the sum of those numbers. If you want to be a little neater, promote code re-use, and in general make this a bit more useful you could break this up into separate functions:
(defn parse-string-list [s]
(re-seq #"\d+" s))
(defn convert-seq-of-int-strings [ss]
(map #(Integer/parseInt %) ss))
(defn sum-numbers-in-seq [ss]
(apply + ss))
Invoking this in a Lisp-y way would look something like
(sum-numbers-in-seq (convert-seq-of-int-strings (parse-string-list (read-line))))
or, in a more Clojure-y way
(-> (read-line)
(parse-string-list)
(convert-seq-of-int-strings)
(sum-numbers-in-seq))
Best of luck.
Welcome to Clojure—and StackOverflow!
Here's how to do it:
(defn input-numbers-and-sum []
(print "Enter list of natural numbers: ")
(flush)
(->> (clojure.string/split (read-line) #"\s+")
(map #(Integer/parseInt %))
(reduce +)))
Here's how it works:
Calling print rather than println avoids printing a newline character at the end of the line. This way, the user's input will appear on the same line as your prompt.
Since there was no newline, you have to call flush to force the output buffer containing the prompt to be printed.
split splits what the user typed into a sequence of strings, divided where a regular expression matches. You have to say clojure.string/split rather than just split because split is not in Clojure's core library. clojure.string/ specifies the library. #"\s+" is a regular expression that matches any number of consecutive whitespace characters. So, if your user types "  6 82   -15   ", split will return ["6" "82" "-15"].
map calls the standard Java library function Integer.parseInt on each of those strings. Integer/parseInt is Clojure's Java interop syntax for calling a static method of a Java class. The #(...) is terse syntax that defines an anonymous function; the % is the argument passed to that function. So, given the sequence of strings above, this call to map will return a sequence of integers: [6 82 -15].
reduce calls the + function repeatedly on each element of the sequence of integers, passing the sum so far as an argument along with the next integer. map and reduce actually take three arguments; the next paragraph tells how the third paragraph gets filled in.
->> is the "thread-last macro". It rewrites the code inside it, to pass the output of each expression but the last as the last argument of the following expression. The result is:(reduce + (map #(Integer/parseInt %) (clojure.string/split (read-line) #"\s+")))Most people find the version with ->> much easier to read.
That might seem like a lot to do something very simple, but it's actually bread and butter once you're used to Clojure. Clojure is designed to make things easy to combine; map, reduce, and ->> are especially useful tools for hooking other functions together.
I've included links to the documentation. Those are worth a look; many contain typical examples of use.
There are other ways to parse numbers, of course, some of which are shown in the answers to this question. What I've written above is an "idiomatic" way to do it. Learn that, and you'll know a lot of the everyday, must-know techniques for programming in Clojure.
Yep, readline is the correct way to do it.
But each element from readlines is essentially an instance of java.lang.Character , and since you want the sum, you'd prefer to convert them to integer before summing the elements of list.
(defn input-list
[]
(print "Enter list of natural numbers")
(let [nums (read-line)]
(reduce + (map #(Integer/parseInt %) (clojure.string/split nums #"\s+")))
This might not be the most idiomatic way to do it, but feel free to tweak it.
Also, please do clean up on your variable/function names.
Edit : (Integer/parseInt %) might cause an error if used directly since the input is an instance of characters, not string. So we can use clojure.string/split to convert user input to a sequence of strings, and use Integer/parseInt % for conversion.
In fact, a more readable version can be written using thread-first macros :
(defn input-list []
(print "Enter list of natural numbers: ")
(->> (clojure.string/split (read-line) #"\s+")
(map #(Integer/parseInt %))
(reduce +)))
This is a more clojurey way to do it.
If you don't care about negative scenarios (wrong input, syntax issues), the quickest solution would be to evaluate the user's input putting it into parens:
(defn sum-ints []
(let [input (read-line)
ints (read-string (str "(" input ")"))]
(reduce + 0 ints)))
Usage:
user=> (sum-ints)
1 2 3 4 5
15
The read-string function evaluates an text expression (1 2 3 4 5) in that case. Since numeral literals turn into numbers, the result will be just a list of numbers.

map-indexed alternative for reducers

Is there a map-indexed alternative for clojure.core.reducers? I would like something that would work lazily like r/map (without constructing new sequence).
I suspect that what you really want to use is a transducer, since map-indexed has a 1-arity version (as does map, filter, and many other core functions) that returns a transducer. Transducers are composable, and do not create an intermediate sequence. Here is a short example:
(def xf (comp
(map-indexed (fn [i value] [i value]))
(filter (fn [[i value]] (odd? i)))
(map second)))
This says: generate an indexed vector using map-indexed, filter out only the vectors whose index is odd, and get the second element. It's a long-winded way of saying (filter odd? collection) but it's only for example purposes.
You can use this with into:
(into [] xf "ThisIsATest")
=> [\h \s \s \T \s]
or you can use the transduce function and apply str to the result:
(transduce xf str "ThisIsATest")
=> "hssTs"

Clojure: pass value if it passes predicate truth test

Is it possible to remove the let statement / avoid the intermediate 'x' in the following code?:
(let [x (f a)]
(when (pred? x) x))
I bumped into this problem in the following use case:
(let [coll (get-collection-somewhere)]
(when (every? some? coll) ; if the collection doesn't contain nil values
(remove true? coll))) ; remove all true values
So if the collection is free of nil values, only not-true values remain, like numbers, strings, or whatever.
So, I'm looking for something like this:
(defn pass-if-true [x pred?]
(when (pred? x) x))
Assuming that you don't want to define that pass-if-true function, the best you can do is an anonymous function:
(#(when (every? some? %)
(remove true? %))
(get-collection-somewhere))
You could also extract the predicate and transformation into parameters:
(#(when (%1 %3) (%2 %3))
(partial every? some?)
(partial remove true?)
(get-collection-somewhere))
The let form is necessary to prevent your collection-building function from running twice:
(f a) or (get-collection-somewhere)
This is a typical idiom and you are doing it correctly.
Of course, you don't need the let if you already have the collection and are not building inside this expression.
However, you may wish to see when-let:
https://clojuredocs.org/clojure.core/when-let
It can save some keystrokes in some circumstances, but this isn't one of them.

Beginner in clojure: Tokenizing lists of different characters

So I know this isn't the best method of solving this issue, but I'm trying to go through a list of lines from an input file, which end up being expressions. I've got a list of expressions, and each expression has it's own list thanks to the split-the-list function. My next step is to replace characters with id, ints with int, and + or - with addop. I've got the regexes to find whether or not my symbols match any of those, but when I try and replace them, I can only get the last for loop I call to leave any lasting results. I know what it stems down to is the way functional programming works, but I can't wrap my head around the trace of this program, and how to replace each separate type of input and keep the results all in one list.
(def reint #"\d++")
(def reid #"[a-zA-Z]+")
(def readdop #"\+|\-")
(def lines (into () (into () (clojure.string/split-lines (slurp "input.txt")) )))
(defn split-the-line [line] (clojure.string/split line #" " ))
(defn split-the-list [] (for [x (into [] lines)] (split-the-line x)))
(defn tokenize-the-line [line]
(for [x line] (clojure.string/replace x reid "id"))
(for [x line] (clojure.string/replace x reint "int"))
(for [x line] (clojure.string/replace x readdop "addop")))
(defn tokenize-the-list [] (for [x (into [] (split-the-list) )] (tokenize-the-line x)))
And as you can probably tell, I'm pretty new to functional programming, so any advice is welcome!
You're using a do block, which evaluates several expressions (normally for side effects) and then returns the last one. You can't see it because fn (and hence defn) implicitly contain one. As such, the lines
(for [x line] (clojure.string/replace x reid "id"))
(for [x line] (clojure.string/replace x reint "int"))
are evaluated (into two different lazy sequences) and then thrown away.
In order for them to affect the return value, you have to capture their return values and use them in the next round of replacements.
In this case, I think the most natural way to compose your replacements is the threading macro ->:
(for [x line]
(-> x
(clojure.string/replace reid "id")
(clojure.string/replace reint "int")
(clojure.string/replace readdop "addop")))
This creates code which does the reid replace with x as the first argument, then does the reint replace with the result of that as the first argument and so on.
Alternatively you could do this by using comp to compose anonymous functions like (fn [s] (clojure.string/replace s reid "id") (partial application of replace). In the imperative world we get pretty used to running several procedures that "bash the data in place" - in the functional world you more often combine several functions together to do all the operations and then run the result.

use 'for' inside 'let' return a list of hash-map

Sorry for the bad title 'cause I don't know how to describe in 10 words. Here's the detail:
I'd like to loop a file in format like:
a:1 b:2...
I want to loop each line, collect all 'k:v' into a hash-map.
{ a 1, b 2...}
I initialize a hash-map in a 'let' form, then loop all lines with 'for' inside let form.
In each loop step, I use 'assoc' to update the original hash-map.
(let [myhash {}]
(for [line #{"A:1 B:2" "C:3 D:4"}
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc myhash k (Float. v)))))
But in the end I got a lazy-seq of hash-map, like this:
{ {a 1, b 2...} {x 98 y 99 z 100 ...} }
I know how to 'merge' the result now, but still don't understand why 'for' inside 'let' return
a list of result.
What I'm confused is: does the 'myhash' in the inner 'for' refers to the 'myhash' declared in the 'let' form every time? If I do want a list of hash-map like the output, is this the idiomatic way in Clojure ?
Clojure "for" is a list comprehension, so it creates list. It is NOT a for loop.
Also, you seem to be trying to modify the myhash, but Clojure's datastructures are immutable.
The way I would approach the problem is to try to create a list of pair like (["a" 1] ["b" 2] ..) and the use the (into {} the-list-of-pairs)
If the file format is really as simple as you're describing, then something much more simple should suffice:
(apply hash-map (re-seq #"\w+" (slurp "your-file.txt")))
I think it's more readable if you use the ->> threading macro:
(->> "your-file.txt" slurp (re-seq #"\w+") (apply hash-map))
The slurp function reads an entire file into a string. The re-seq function will just return a sequence of all the words in your file (basically the same as splitting on spaces and colons in this case). Now you have a sequence of alternating key-value pairs, which is exactly what hash-map expects...
I know this doesn't really answer your question, but you did ask about more idiomatic solutions.
I think #dAni is right, and you're confused about some fundamental concepts of Clojure (e.g. the immutable collections). I'd recommend working through some of the exercises on 4Clojure as a fun way to get more familiar with the language. Each time you solve a problem, you can compare your own solution to others' solutions and see other (possibly more idomatic) ways to solve the problem.
Sorry, I didn't read your code very thorougly last night when I was posting my answer. I just realized you actually convert the values to Floats. Here are a few options.
1) partition the sequence of inputs into key/val pairs so that you can map over it. Since you now how a sequence of pairs, you can use into to add them all to a map.
(->> "kvs.txt" slurp (re-seq #"\w") (partition 2)
(map (fn [[k v]] [k (Float. v)])) (into {}))
2) Declare an auxiliary map-values function for maps and use that on the result:
(defn map-values [m f]
(into {} (for [[k v] m] [k (f v)])))
(->> "your-file.txt" slurp (re-seq #"\w+")
(apply hash-map) (map-values #(Float. %)))
3) If you don't mind having symbol keys instead of strings, you can safely use the Clojure reader to convert all your keys and values.
(->> "your-file.txt" slurp (re-seq #"\w+")
(map read-string) (apply hash-map))
Note that this is a safe use of read-string because our call to re-seq would filter out any hazardous input. However, this will give you longs instead of floats since numbers like 1 are long integers in Clojure
Does the myhash in the inner for refer to the myhash declared in the let form every time?
Yes.
The let binds myhash to {}, and it is never rebound. myhash is always {}.
assoc returns a modified map, but does not alter myhash.
So the code can be reduced to
(for [line ["A:1 B:2" "C:3 D:4"]
:let [pairs (clojure.string/split line #"\s")]]
(for [[k v] (map #(clojure.string/split %1 #":") pairs)]
(assoc {} k (Float. v))))
... which produces the same result:
(({"A" 1.0} {"B" 2.0}) ({"C" 3.0} {"D" 4.0}))
If I do want a list of hash-map like the output, is this the idiomatic way in Clojure?
No.
See #DaoWen's answer.