Read in expressions - EOF while reading - in Clojure

I'd like to read in a (longer) file in Clojure that contains proper Lisp syntax. I think the best way to do so is to avoid strings, so I tried the following function:
(defn my-reader
  [filename]
  (with-open [r (java.io.PushbackReader. (reader filename))]
    (binding [*read-eval* false]
      (loop [expr (read r :eof :end)]
        (if (= :end expr)
          (print "The End")
          (do
            (println expr)
            (recur (read r :eof :end))))))))
However, this throws an "EOF while reading" error.
As a workaround, I've built:
(defn my-reader
  [filename]
  (with-open [r (java.io.PushbackReader. (reader filename))]
    (binding [*read-eval* false]
      (loop [expr (read r :eof :end)]
        (if (= -1 expr)
          (print "The End")
          (recur (try
                   (print (read r :eof :end))
                   (catch Exception e -1))))))))
But that's pretty ugly. My next attempt, based on "How to use clojure.edn/read to get a sequence of objects in a file?", was:
(defn my-reader
  [filename]
  (with-open [r (java.io.PushbackReader. (reader filename))]
    (binding [*read-eval* false]
      (let [expr (repeatedly (partial read r :eof :theend))]
        (dorun (map println (take-while (partial not= :theend) expr)))))))
That didn't work either (EOF while reading). However, the solution in the other thread works fine (and edn seems to be the preferred way to read in files). Nevertheless, I'd like to know why my attempts do not work.

The semantics of clojure.core/read are a bit different from those of clojure.tools.reader.edn/read.
The arity-3 version takes these arguments:
stream: the input stream
eof-error?: a boolean specifying whether you want read to throw an exception at EOF
eof-value: a sentinel value to return when EOF is reached
So I think you want to replace (read r :eof :end) with (read r false :end). Keywords are truthy, so :eof is taken as eof-error? and read throws at EOF instead of returning :end.
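A minimal sketch of the corrected reader, assuming the forms should be collected rather than printed. The name read-forms and the fresh Object used as the end marker are choices made for this illustration and are not part of the original code; a unique object cannot collide with a form in the file the way :end could.

```clojure
(require '[clojure.java.io :as io])

(defn read-forms
  "Reads every form from source (anything io/reader accepts) and returns
   them as a vector. Forms are read, not evaluated."
  [source]
  (with-open [r (java.io.PushbackReader. (io/reader source))]
    (binding [*read-eval* false]
      (let [eof (Object.)]                      ; unique end-of-file marker
        (loop [forms []]
          ;; false = do not throw at EOF, return the marker instead
          (let [form (read r false eof)]
            (if (identical? eof form)
              forms
              (recur (conj forms form)))))))))
```

On Clojure 1.7 and later, read also accepts an options map, so (read {:eof eof} r) should behave the same way as (read r false eof).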

Related

Understanding core.async merge, in Clojure vs ClojureScript

I'm experimenting with core.async on Clojure and ClojureScript, to try and understand how merge works. In particular, I want to know whether merge makes any values put on the input channels available to take immediately from the merged channel.
I have the following code:
(ns async-merge-example.core
(:require
#?(:clj [clojure.core.async :as async] :cljs [cljs.core.async :as async])
[async-merge-example.exec :as exec]))
(defn async-fn-timeout
[v]
(async/go
(async/<! (async/timeout (rand-int 5000)))
v))
(defn async-fn-exec
[v]
(exec/exec "sh" "-c" (str "sleep " (rand-int 5) "; echo " v ";")))
(defn merge-and-print-results
[seq async-fn]
(let [chans (async/merge (map async-fn seq))]
(async/go
(while (when-let [v (async/<! chans)]
(prn v)
v)))))
When I try async-fn-timeout with a large-ish seq:
(merge-and-print-results (range 20) async-fn-timeout)
For both Clojure and ClojureScript I get the result I expect, as in, results start getting printed pretty much immediately, with the expected delays.
However, when I try async-fn-exec with the same seq:
(merge-and-print-results (range 20) async-fn-exec)
For ClojureScript, I get the result I expect, as in results start getting printed pretty much immediately, with the expected delays. However for Clojure even though the sh processes are executed concurrently (subject to the size of the core.async thread pool), the results appear to be initially delayed, then mostly printed all at once! I can make this difference more obvious by increasing the size of the seq e.g. (range 40)
Since the results for async-fn-timeout are as expected on both Clojure and ClojureScript, the finger points at the differences between the Clojure and ClojureScript implementations of exec.
But I don't know why this difference would cause the issue.
Notes:
These observations were made in WSL on Windows 10
The source code for async-merge-example.exec is below
In exec, the implementation differs for Clojure and ClojureScript due to differences between Clojure/Java and ClojureScript/NodeJS.
(ns async-merge-example.exec
(:require
#?(:clj [clojure.core.async :as async] :cljs [cljs.core.async :as async])))
; cljs implementation based on https://gist.github.com/frankhenderson/d60471e64faec9e2158c
; clj implementation based on https://stackoverflow.com/questions/45292625/how-to-perform-non-blocking-reading-stdout-from-a-subprocess-in-clojure
#?(:cljs (def spawn (.-spawn (js/require "child_process"))))
#?(:cljs
(defn exec-chan
"spawns a child process for cmd with args. routes stdout, stderr, and
the exit code to a channel. returns the channel immediately."
[cmd args]
(let [c (async/chan), p (spawn cmd (if args (clj->js args) (clj->js [])))]
(.on (.-stdout p) "data" #(async/put! c [:out (str %)]))
(.on (.-stderr p) "data" #(async/put! c [:err (str %)]))
(.on p "close" #(async/put! c [:exit (str %)]))
c)))
#?(:clj
(defn exec-chan
"spawns a child process for cmd with args. routes stdout, stderr, and
the exit code to a channel. returns the channel immediately."
[cmd args]
(let [c (async/chan)]
(async/go
(let [builder (ProcessBuilder. (into-array String (cons cmd (map str args))))
process (.start builder)]
(with-open [reader (clojure.java.io/reader (.getInputStream process))
err-reader (clojure.java.io/reader (.getErrorStream process))]
(loop []
(let [line (.readLine ^java.io.BufferedReader reader)
err (.readLine ^java.io.BufferedReader err-reader)]
(if (or line err)
(do (when line (async/>! c [:out line]))
(when err (async/>! c [:err err]))
(recur))
(do
(.waitFor process)
(async/>! c [:exit (.exitValue process)]))))))))
c)))
(defn exec
"executes cmd with args. returns a channel immediately which
will eventually receive a result map of
{:out [stdout-lines] :err [stderr-lines] :exit [exit-code]}"
[cmd & args]
(let [c (exec-chan cmd args)]
(async/go (loop [output (async/<! c) result {}]
(if (= :exit (first output))
(assoc result :exit (second output))
(recur (async/<! c) (update result (first output) #(conj (or % []) (second output)))))))))
Your Clojure implementation uses blocking IO in a single thread. The loop reads first from stdout and then from stderr. Both calls are blocking readLine calls, so each returns only once it has actually read a full line. So unless your process writes the same amount of output to stdout and stderr, one stream will end up blocking the other.
Once the process has finished, readLine no longer blocks and just returns nil once the buffer is empty. The loop then reads the remaining buffered output and completes, which explains the "all at once" messages.
You'll probably want to start a second thread that deals with reading from stderr.
Node does not do blocking IO, so everything happens asynchronously by default and one stream doesn't block the other.
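One way the suggestion could look on the JVM is sketched below, assuming core.async is on the classpath and keeping the exec-chan contract from the question ([:out line], [:err line], then [:exit code]). The helper pump-lines is a hypothetical name introduced here. The sketch uses async/thread rather than go, because blocking calls inside a go block also occupy core.async's fixed-size dispatch pool, which the merging and printing go blocks need as well.

```clojure
(require '[clojure.core.async :as async]
         '[clojure.java.io :as io])

(defn pump-lines
  "Hypothetical helper: reads lines from input-stream on its own thread and
   puts [tag line] on c. Returns a channel that closes when the stream ends."
  [input-stream tag c]
  (async/thread
    (with-open [rdr (io/reader input-stream)]
      (doseq [line (line-seq rdr)]
        (async/>!! c [tag line])))))

(defn exec-chan
  "Spawns cmd with args. Routes stdout and stderr lines to the returned
   channel as they arrive, followed by [:exit code]."
  [cmd args]
  (let [c        (async/chan)
        builder  (ProcessBuilder. (into-array String (cons cmd (map str args))))
        process  (.start builder)
        out-done (pump-lines (.getInputStream process) :out c)
        err-done (pump-lines (.getErrorStream process) :err c)]
    (async/thread
      ;; wait until both streams are drained before reporting the exit code
      (async/<!! out-done)
      (async/<!! err-done)
      (async/>!! c [:exit (.waitFor process)]))
    c))
```

With each stream drained on its own thread, a quiet stderr no longer holds up stdout, and the exec function from the question should be able to consume this channel unchanged.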

Try-with-resources in Clojure

Does Clojure have an equivalent of Java's try-with-resources construct?
If not, what is the normal way to handle this idiom in Clojure code?
The pre-Java-7 idiom for safely opening and closing resources is verbose enough that they actually added support for try-with-resources to the language. It seems strange to me that I can't find a macro for this use case in the standard Clojure library.
A link to a mainstream Clojure-based project repository, showing how this issue is handled in practice, would be very helpful.
You can use with-open to bind a resource to a symbol and make sure the resource is closed once the control flow leaves the block.
The following example is from clojuredocs.
(with-open [r (clojure.java.io/input-stream "myfile.txt")]
(loop [c (.read r)]
(when (not= c -1)
(print (char c))
(recur (.read r)))))
This will be expanded to the following:
(let [r (clojure.java.io/input-stream "myfile.txt")]
(try
(loop [c (.read r)]
(when (not= c -1)
(print (char c))
(recur (.read r))))
(finally (.close r))))
You can see that a let block is created with a try/finally that calls the .close() method.
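Like try-with-resources, with-open accepts several bindings and closes them in reverse order. The sketch below illustrates this with a made-up Closeable; tracked-resource and demo-close-order are hypothetical names used only for this example.

```clojure
(defn tracked-resource
  "Hypothetical helper: returns a Closeable that appends its name to the
   log atom when it is closed."
  [log name]
  (reify java.io.Closeable
    (close [_] (swap! log conj name))))

(defn demo-close-order
  "Opens two resources, runs a body, and returns the recorded event order."
  []
  (let [log (atom [])]
    (with-open [a (tracked-resource log :a)
                b (tracked-resource log :b)]
      (swap! log conj :body))
    @log))

(demo-close-order) ; => [:body :b :a]
```

The last resource opened is the first one closed, because each binding expands into its own nested let and try/finally.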
You can do something closer to Java by writing a macro on top of with-open. It could look like this:
(defmacro with-open+ [[var-name resource & clauses] & body]
  (if (seq clauses)
    `(try (with-open [~var-name ~resource] ~@body)
          ~@clauses)
    `(with-open [~var-name ~resource] ~@body)))
so you can pass additional clauses alongside the binding.
(with-open+ [x 111]
(println "body"))
expands to simple with-open:
(let*
[x 111]
(try (do (println "body")) (finally (. x clojure.core/close))))
while additional clauses wrap it in a try/catch:
(with-open+ [x 111
(catch RuntimeException ex (println ex))
(finally (println "finally!"))]
(println "body"))
expands to
(try
(let*
[x 111]
(try (do (println "body")) (finally (. x clojure.core/close))))
(catch RuntimeException ex (println ex))
(finally (println "finally!")))
Still, it is quite an opinionated solution.

Clojure - Is it possible to increment a variable within a doseq statement?

I am trying to iterate over a list of files in a given directory, and add an incrementing variable i = {1,2,3.....} to their names.
Here is the code I have for iterating through the files and changing each file's name:
(defn addCounterToExtIn [d]
(def i 0)
(doseq [f (.listFiles (file d)) ] ; make a sequence of all files in d
(if (and (not (.isDirectory f)) ; if file is not a directory and
(= '(\. \i \n) (take-last 3 (.getName f))) ) ; if it ends with .in
(fs/rename f (str d '/ i (.getName f)))))) ; add i to start of its name
I don't know how to increment i as doseq iterates through each file. Alternatively, is there a better loop to use to achieve the desired result?
Use file-seq and map-indexed:
(require '[clojure.java.io :as io])
(dorun
(->>
(file-seq (io/file "/home/eduard/Downloads"))
(filter #(re-find #".+\.pdf$" (.getName %)))
(map-indexed (fn [i v] [i v]))))
Change the function passed to map-indexed to one that renames the file and you're done.
Sample output for PDF files (as seen without the dorun, which returns nil):
([0 #<File /home/eduard/Downloads/some.pdf>] ...)
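A sketch of how the renaming step might be filled in for the .in files from the question. It assumes plain java.io.File methods instead of the fs library, and the function name and the sort by file name are choices made for this illustration.

```clojure
(require '[clojure.java.io :as io])

(defn add-counter-to-in-files!
  "Prefixes every regular file in dir whose name ends with .in with a
   running index (0, 1, 2, ...). Returns the number of matching files."
  [dir]
  (let [in-files (->> (.listFiles (io/file dir))
                      (filter #(and (.isFile %)
                                    (.endsWith (.getName %) ".in")))
                      (sort-by #(.getName %)))]   ; stable, predictable order
    ;; doseq rather than map, because renaming is a side effect
    (doseq [[i f] (map-indexed vector in-files)]
      (.renameTo f (io/file (.getParentFile f) (str i (.getName f)))))
    (count in-files)))
```

Filtering before indexing means the counter only advances for files that are actually renamed.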
This is the first approach off the top of my head. It's not ideal, but certainly more idiomatic than what the question proposes.
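;; Assumes `file` from clojure.java.io and the question's `fs` alias are in scope.
(defn rename-one-file!
  "Prefixes the name of f with counter if f is not a directory and ends with .in."
  [f counter]
  (when (and (not (.isDirectory f))
             (.endsWith (.getName f) ".in"))
    (fs/rename f (file (.getParentFile f)
                       (str counter (.getName f))))))
(defn iterate-files-with-counter
  "Calls (rename-fn f counter) for every file in dir, counting up from 0."
  [rename-fn dir]
  (loop [counter 0
         remaining-files (seq (.listFiles (file dir)))]
    ;; stop once the files are exhausted
    (when-let [current-file (first remaining-files)]
      (rename-fn current-file counter)
      (recur (inc counter) (rest remaining-files)))))
(def add-counter-to-ext-in-dir
  (partial iterate-files-with-counter rename-one-file!))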
Note that the work of actually performing the rename was split off from the work of iterating over the files. Having a large number of small functions is generally better than a small number of large functions, and keeping those functions reusable and independent, unless you choose to use them together, is better still.

How can I deserialize a record structure from a file that was saved with print-dup?

I have the following code:
(use 'clojure.java.io)
(defrecord Member [id name salary role])
(defrecord Role [id name])
(def member-records (ref ()))
(defn add-member [member]
(dosync (alter member-records conj member)))
;;Test-data -->
(def dev-r(->Role 1 "Developer"))
(def test-member1(->Member 1 "Kirill" 70000.00 dev-r))
;;Test-data <--
(defn save-data-2-file []
(with-open [wrtr (writer "C:/Platform/Work/test.cdf")]
(print-dup @member-records wrtr)))
(defn process-line [line]
(println line))
;;Test line content
;;#BTC.pcost.Member{:id 1, :name "Kirill", :salary 70000.0, :role #BTC.pcost.Role{:id 1, :name "Developer"}})
(defn load-data-from-file []
(with-open [rdr (reader "C:/Platform/Work/test.cdf")]
(doseq [line (line-seq rdr)]
(process-line line))))
I want to recreate the records after reading the file, but I can't work out how to do it. I know I could parse the text and fill my structures from the parsed fields, but that would be difficult because I have a lot of structures like Member and Role. Can anyone suggest a way to do this?
You can use read-string and slurp to pull the records out of the file. read-string only reads the first form of a string, but judging from your sample you are only storing a single form: a list of records.
(defn load-data-from-file [file]
(read-string (slurp file)))
Lazy Reading
If you need more than the first form, or cannot read the entire stream into memory, you can use read directly to build a lazy reader.
(defn lazy-read
([rdr] (let [eof (Object.)] (lazy-read rdr (read rdr false eof) eof)))
([rdr data eof]
(if (not= eof data)
(cons data (lazy-seq (lazy-read rdr (read rdr false eof) eof))))))
(defn load-all-data [file]
(with-open [rdr (java.io.PushbackReader. (reader file))]
(doall (lazy-read rdr))))
(load-all-data "C:/Platform/Work/test.cdf")
Security
Also, it is good to mention security when loading code with read-string or read. You should only use them with trusted sources because, using #= or a Java constructor, the source can execute arbitrary code inside your application. For a longer explanation, take a look at the documentation for read.
Setting *read-eval* to false would prevent the issue, but it would also prevent the reconstruction of the records in your sample. To avoid the issue altogether, you can use the clojure.edn/read and clojure.edn/read-string functions with a whitelist of readers.
(defn edn-read [eof rdr]
(clojure.edn/read {:eof eof :readers {'BTC.pcost.Role map->Role
'BTC.pcost.Member map->Member}}
rdr))
(defn lazy-edn-read
([rdr] (let [eof (Object.)] (lazy-edn-read rdr (edn-read eof rdr) eof)))
([rdr data eof]
(if (not= eof data)
(cons data (lazy-seq (lazy-edn-read rdr (edn-read eof rdr) eof))))))
(defn load-all-data [file]
(with-open [rdr (java.io.PushbackReader. (reader file))]
(doall (take-while (complement nil?) (lazy-edn-read rdr)))))
(load-all-data "C:/Platform/Work/test.cdf")
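A small round trip may make the :readers option more concrete. This sketch assumes the records are written with pr-str, which prints the #namespace.Record{...} form shown in the question. It derives each tag from the record's class name so that it works in whatever namespace it is evaluated; in the answer above the tags are written out literally as BTC.pcost.Role and BTC.pcost.Member. The names record-tag, record-readers and round-trip are hypothetical.

```clojure
(require '[clojure.edn :as edn])

(defrecord Role [id name])
(defrecord Member [id name salary role])

(defn record-tag
  "Hypothetical helper: the tag symbol that pr-str emits for a record class."
  [^Class record-class]
  (symbol (.getName record-class)))

;; Whitelist: only these two tags are turned into records.
(def record-readers
  {(record-tag Role)   map->Role
   (record-tag Member) map->Member})

(defn round-trip
  "Prints records to a string and reads them back through clojure.edn."
  [records]
  (edn/read-string {:readers record-readers} (pr-str records)))
```

Any tag that is not in the map makes clojure.edn throw instead of constructing an object, which is the point of using a whitelist.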
You can use read.
This function will read one object from a file:
(defn load-data-from-file [filename]
(with-open [rdr (java.io.PushbackReader. (reader filename))]
(read rdr)))
Or this will read all objects from the file:
(defn load-all-data-from-file [filename]
(let [eof (Object.)]
(with-open [rdr (java.io.PushbackReader. (reader filename))]
(doall
(take-while #(not= % eof)
(repeatedly #(read rdr nil eof)))))))
Here's the API documentation for read.
This is a small variation that will read all objects from a string:
(defn load-all-data-from-string [string]
(let [eof (Object.)]
(with-open [rdr (-> string java.io.StringReader. java.io.PushbackReader.)]
(doall
(take-while #(not= % eof)
(repeatedly #(read rdr nil eof)))))))
This is, as far as I know, not possible to do using read-string. Instead we use read with a java.io.StringReader.

How to use clojure.edn/read to get a sequence of objects in a file?

Clojure 1.5 introduced clojure.edn, which includes a read function that requires a PushbackReader.
If I want to read the first five objects, I can do:
(with-open [infile (java.io.PushbackReader. (clojure.java.io/reader "foo.txt"))]
(binding [*in* infile]
(let [edn-seq (repeatedly clojure.edn/read)]
(dorun (take 5 (map println edn-seq))))))
How can I instead print out all of the objects? Considering that some of them may be nils, it seems like I need to check for the EOF, or something similar. I want to have a sequence of objects similar to what I would get from line-seq.
Use the :eof key:
http://clojure.github.com/clojure/clojure.edn-api.html
opts is a map that can include the following keys: :eof - value to
return on end-of-file. When not supplied, eof throws an exception.
edit: sorry, that wasn't enough detail! here y'go:
(with-open [in (java.io.PushbackReader. (clojure.java.io/reader "foo.txt"))]
(let [edn-seq (repeatedly (partial clojure.edn/read {:eof :theend} in))]
(dorun (map println (take-while (partial not= :theend) edn-seq)))))
that should do it
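One caveat with a keyword marker is that a file which happens to contain :theend would be cut short. A variation on the snippet above, sketched here with a fresh Object as the marker and compared with identical?, avoids that and still passes nil values through. The function name read-all-edn is made up for this example.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

(defn read-all-edn
  "Reads every EDN object from source (anything io/reader accepts) and
   returns them as a fully realized sequence."
  [source]
  (with-open [in (java.io.PushbackReader. (io/reader source))]
    (let [eof (Object.)]   ; cannot appear in the data, unlike a keyword
      ;; doall forces the reads before with-open closes the reader
      (doall (take-while #(not (identical? eof %))
                         (repeatedly #(edn/read {:eof eof} in)))))))
```

For example, (read-all-edn "foo.txt") would return all objects in the file, including any nil or :theend values.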
I looked at this again. Here is what I came up with:
(defn edn-seq
"Returns the objects from stream as a lazy sequence."
([]
(edn-seq *in*))
([stream]
(edn-seq {} stream))
([opts stream]
(lazy-seq (cons (clojure.edn/read opts stream) (edn-seq opts stream)))))
(defn swallow-eof
"Ignore an EOF exception raised when consuming seq."
[seq]
(-> (try
(cons (first seq) (swallow-eof (rest seq)))
(catch java.lang.RuntimeException e
(when-not (= (.getMessage e) "EOF while reading")
(throw e))))
lazy-seq))
(with-open [stream (java.io.PushbackReader. (clojure.java.io/reader "foo.txt"))]
(dorun (map println (swallow-eof (edn-seq stream)))))
edn-seq has the same signature as clojure.edn/read, and preserves all of the existing behavior, which I think is important given that people may use the :eof option in different ways. A separate function to contain the EOF exception seemed like a better choice, though I'm not sure how best to capture it since it shows up just as a java.lang.RuntimeException.