I am parsing a big csv file and I am using the first line of it as the keys for the records. So for a csv file like:
header1,header2
foo,bar
zoo,zip
I end up with a lazy seq like:
({:header1 "foo" :header2 "bar"},
{:header1 "zoo" :header2 "zip"})
The code is working fine, but I am not sure whether the following function holds on to the head of "lines".
(defn csv-as-seq [file]
(let [rdr (clojure.java.io/reader file)]
(let [lines (line-seq rdr)
headers (parse-headers (first lines))]
(map (row-mapper headers) (rest lines)))))
Can somebody please clarify?
Yes, this expression syntactically says to hold the head
(let [lines (line-seq rdr)
though in this case you should get away with it, because there are no references to lines or headers after the call to map. Starting with 1.2.x, the Clojure compiler includes a feature called locals clearing: it sets any locals that are not used after a function call to nil in the preamble to that call. Here it will set lines and headers to nil in the local context of the function, so the consumed part of the sequence can be garbage collected as you go. This is one of the rare cases where Clojure produces bytecode that cannot be expressed in Java.
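For reference, here is a minimal runnable sketch of the function from the question. The question does not show `parse-headers` or `row-mapper`, so the definitions below are assumptions made for illustration (a naive comma split with no support for quoted fields). Nothing refers to `lines` after the call to `map`, which is the situation locals clearing handles.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Hypothetical helper: turns "header1,header2" into [:header1 :header2].
(defn parse-headers [line]
  (mapv keyword (str/split line #",")))

;; Hypothetical helper: returns a function that maps one CSV line to a record.
(defn row-mapper [headers]
  (fn [line]
    (zipmap headers (str/split line #","))))

(defn csv-as-seq [source]
  (let [lines   (line-seq (io/reader source))
        headers (parse-headers (first lines))]
    ;; `lines` is not used after this call, so this function itself
    ;; does not keep the head of the line sequence alive.
    (map (row-mapper headers) (rest lines))))

;; A StringReader stands in for a real file here.
(csv-as-seq (java.io.StringReader. "header1,header2\nfoo,bar\nzoo,zip\n"))
;; => ({:header1 "foo", :header2 "bar"} {:header1 "zoo", :header2 "zip"})
```

Note that the caller can still hold the head: binding the returned sequence to a var or a long-lived local keeps every realized record reachable.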
So, this is what I want to do
(def body `(prn sth))
(defn f [sth] body)
(f "hello")
; can it prn hello here?
Is this possible?
If you want to "take a data structure and embed it in code to be executed", then you can do something like this.
You will want to tweak the body to embed like this:
(def body `(prn ~'sth))
that is, prefixing the sth local variable with ~' so that it will not be namespaced. Then you need a macro that will embed the code for you:
(defmacro insert-body [body]
(eval body))
Using this macro inside the f function to embed the body and putting things together, you get this code:
(defmacro insert-body [body]
(eval body))
(def body `(prn ~'sth))
(defn f [sth] (insert-body body))
You can now call f with an argument and it will work as expected:
> (f "hello")
"hello"
nil
The function macroexpand comes in handy to test that the macro does what it is supposed to be doing:
(macroexpand `(insert-body body))
;; => (clojure.core/prn sth)
But it is not clear to me what you are trying to accomplish or what you would gain from writing your code this way. Whatever your eventual goal is, there is most likely a better way to reach it than what I suggest here; I am only answering the specific question you asked. If you clarify your question and give more details, it will be possible to provide a better answer that addresses your actual problem.
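As one possible alternative, and only a guess at what is wanted, the behaviour can be stored as a function value instead of as quoted code. This sketch redefines `body` and `f` from the question and needs neither a macro nor `eval`:

```clojure
;; Keep the behaviour as a function rather than as a quoted form.
(def body (fn [sth] (prn sth)))

;; f simply delegates to whatever function `body` holds.
(defn f [sth] (body sth))

(f "hello")
;; prints "hello" (with the quotes) and returns nil
```

This only fits if the body does not have to be manipulated as data before it runs; if it does, the macro approach above is still the relevant one.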
I have written data like that in a file (kind of)
{:a 25 :b 28}
{:a 2 :b 50}
...
I want to have a lazy sequence of these maps.
There are around 40 million lines. I could also write chunks of 10000, but I do not think that would change the way the functions are written (mapcat instead of map).
To read it, I wrote
(with-open [affectations (io/reader "dev-resources/affectations.edn")]
(map read-string affectations))
The problem is that Clojure reports
Don't know how to create ISeq from : java.io.BufferedReader
To be honest, I understand nothing about the java.io namespace.
I would like to have a lazy sequence of the data in the file but I do not know how to turn the stream into strings and then collections.
Any idea ?
Is this read-line ?
Thanks
You are passing java.io.BufferedReader to map whereas map expects a seq.
You need to use line-seq to produce a (lazy) seq of lines from your file:
(with-open [affectations (io/reader "dev-resources/affectations.edn")]
(map read-string (line-seq affectations)))
Remember that everything you do with data read from a resource opened by with-open must be forced within its scope. A lazy seq that is realized only after with-open has closed the reader will throw an error.
One option is to just force the whole seq of text lines from your files and return it using doall. However, this solution could read all your data into memory which doesn't seem practical.
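A minimal sketch of that `doall` option follows. The function name `read-all-maps` is made up for illustration, and a `StringReader` stands in for the real file:

```clojure
(require '[clojure.java.io :as io])

(defn read-all-maps [source]
  (with-open [rdr (io/reader source)]
    ;; doall realizes every element before with-open closes the reader.
    (doall (map read-string (line-seq rdr)))))

(read-all-maps (java.io.StringReader. "{:a 25 :b 28}\n{:a 2 :b 50}\n"))
;; => ({:a 25, :b 28} {:a 2, :b 50})
```

If the file contents are not fully trusted, `clojure.edn/read-string` is the safer reader to use in place of `read-string`.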
I guess you need to execute some logic for each line of the file and you don't need to keep all the parsed collections in memory. In that case you could pass a function representing that logic into the function that reads your file:
(defn process-file [filename process-fn]
(with-open [reader (io/reader filename)]
(doseq [line (line-seq reader)]
(-> line
(read-string)
(process-fn)))))
This function reads your file line by line, parses each line individually with read-string, and calls your process-fn function on the result. process-file itself returns nil.
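Here is a short usage sketch. The function is repeated so the example runs on its own; the atom `total` and the summing callback are hypothetical, and a `StringReader` stands in for the real file:

```clojure
(require '[clojure.java.io :as io])

;; Same function as above, repeated so the example is self-contained.
(defn process-file [source process-fn]
  (with-open [reader (io/reader source)]
    (doseq [line (line-seq reader)]
      (-> line read-string process-fn))))

;; Hypothetical use: sum the :a values without keeping the parsed maps around.
(def total (atom 0))

(process-file (java.io.StringReader. "{:a 25 :b 28}\n{:a 2 :b 50}\n")
              (fn [m] (swap! total + (:a m))))

@total
;; => 27
```

Because `doseq` does not retain the sequence it walks, only one parsed map needs to be live at a time.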
Background
I've written a hack for Emacs that lets me send a Clojure form from an editor buffer to a REPL buffer. It's working fine, except that if the two buffers are in different namespaces the copied text doesn't usually make sense, or, worse, it might make sense but have a different meaning to that in the editor buffer.
I want to transform the text so that it makes sense in the REPL buffer.
A Solution in Common Lisp
In Common Lisp, I could do this using the following function:
;; Common Lisp
(defun translate-text-between-packages (text from-package to-package)
(let* ((*package* from-package)
(form (read-from-string text))
(*package* to-package))
(with-output-to-string (*standard-output*)
(pprint form))))
And a sample use:
;; Common Lisp
(make-package 'editor-package)
(make-package 'repl-package)
(defvar repl-package::a)
(translate-text-between-packages "(+ repl-package::a b)"
(find-package 'editor-package)
(find-package 'repl-package))
;; => "(+ A EDITOR-PACKAGE::B)"
The package name qualifications in the input string and the output string are different—exactly what's needed to solve the problem of copying and pasting text between packages.
(BTW, there's stuff about how to run the translation code in the Common Lisp process and move stuff between the Emacs world and the Common Lisp world, but I'm ok with that and I don't particularly want to get into it here.)
A Non-Solution in Clojure
Here's a direct translation into Clojure:
;; Clojure
(defn translate-text-between-namespaces [text from-ns to-ns]
(let [*ns* from-ns
form (read-string text)
*ns* to-ns]
(with-out-str
(clojure.pprint/pprint form))))
And a sample use:
;; Clojure
(create-ns 'editor-ns)
(create-ns 'repl-ns)
(translate-text-between-namespaces "(+ repl-ns/a b)"
(find-ns 'editor-ns)
(find-ns 'repl-ns))
;; => "(+ repl-ns/a b)"
So the translation function in Clojure has done nothing. That's because symbols and packages/namespaces in Common Lisp and Clojure work differently.
In Common Lisp symbols belong to a package and the determination of a symbol's package happens at read time.
In Clojure, for good reasons, symbols do not belong to a namespace and the determination of a symbol's namespace happens at evaluation time.
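The difference can be seen at a REPL with a small sketch. The namespace name `empty-ns` is made up for the example; the reader leaves an unqualified symbol unqualified, and a namespace only comes into play when the symbol is resolved:

```clojure
;; Reading does not attach a namespace to an unqualified symbol.
(namespace (read-string "b"))          ;; => nil
(namespace (read-string "repl-ns/a"))  ;; => "repl-ns"

;; Resolution is where a namespace matters: the same symbol resolves
;; differently depending on which namespace it is resolved in.
(ns-resolve 'clojure.core 'map)           ;; => #'clojure.core/map
(ns-resolve (create-ns 'empty-ns) 'map)   ;; => nil
```

This is also why rebinding `*ns*` around `read-string` has no effect on plain symbols such as `b` in the translation function above.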
Can This Be Done in Clojure?
So, finally, my question: Can I convert Clojure code from one namespace to another?
I don't understand your use case, but here is a way to transform symbols from one namespace to another.
(require 'clojure.walk 'clojure.pprint)
(defn ns-trans-form [ns1 ns2 form]
(clojure.walk/prewalk
(fn [f] (if ((every-pred symbol? #(= (namespace %) ns1)) f)
(symbol ns2 (name f))
f))
form))
(defn ns-trans-text [ns1 ns2 text]
(with-out-str
(->> text
read-string
(ns-trans-form ns1 ns2)
clojure.pprint/pprint)))
(print (ns-trans-text "editor-ns" "repl-ns" "(+ editor-ns/a b)" ))
;=> (+ repl-ns/a b)
So, editor-ns/a was transformed to repl-ns/a.
(Answering my own question...)
Given that it's not easy to refer to a namespace's non-public vars from outside the namespace, there's no simple way to do this.
Perhaps a hack is possible, based on the idea at http://christophermaier.name/blog/2011/04/30/not-so-private-clojure-functions. That would involve walking the form and creating new symbols that resolve to new vars that have the same value as vars referred to in the original form. Perhaps I'll investigate this further sometime, but not right now.
I'm learning Clojure and as an exercise I wanted to write something like the unix "comm" command.
To do this, I read the contents of each file into a set, then use difference/intersection to show exclusive/common files.
After a lot of repl-time I came up with something like this for the set creation part:
(def contents (ref #{}))
(doseq [line (read-lines "/tmp/a.txt")]
(dosync (ref-set contents (conj @contents line))))
(I'm using duck-streams/read-lines to seq the contents of the file).
This is my first stab at any kind of functional programming or lisp/Clojure. For instance, I couldn't understand why, when I did a conj on the set, the set was still empty. This led me to learning about refs.
Is there a better Clojure/functional way to do this? By using ref-set, am I just twisting the code to a non-functional mindset or is my code along the lines of how it should be done?
Is there a library that already does this? This seems like a relatively ordinary thing to want to do, but I couldn't find anything like it.
Clojure 1.3:
user> (require '[clojure.java [io :as io]])
nil
user> (line-seq (io/reader "foo.txt"))
("foo" "bar" "baz")
user> (into #{} (line-seq (io/reader "foo.txt")))
#{"foo" "bar" "baz"}
line-seq gives you a lazy sequence where each item in the sequence is a line in the file.
into dumps it all into a set. To do what you were trying to do (add each item one by one into a set), rather than doseq and refs, you could do:
user> (reduce conj #{} (line-seq (io/reader "foo.txt")))
#{"foo" "bar" "baz"}
Note that the Unix comm compares two sorted files, which is likely a more efficient way to compare files than doing set intersection.
Edit: Dave Ray is right, to avoid leaking open file handles it's better to do this:
user> (with-open [f (io/reader "foo.txt")]
(into #{} (line-seq f)))
#{"foo" "bar" "baz"}
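Putting the pieces together, a set-based version of the comparison might look like the sketch below. The names `line-set` and `comm` and the result keys are assumptions made for illustration, and `StringReader`s stand in for the two files:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.set :as set])

;; Reads every line of the source into a set and closes the reader.
(defn line-set [source]
  (with-open [rdr (io/reader source)]
    (into #{} (line-seq rdr))))

;; Returns the lines unique to each input and the lines they share.
(defn comm [source-a source-b]
  (let [a (line-set source-a)
        b (line-set source-b)]
    {:only-a (set/difference a b)
     :only-b (set/difference b a)
     :common (set/intersection a b)}))

(comm (java.io.StringReader. "foo\nbar\nbaz\n")
      (java.io.StringReader. "bar\nbaz\nqux\n"))
;; :only-a is #{"foo"}, :only-b is #{"qux"}, :common holds "bar" and "baz"
```

Unlike the Unix comm, this loads both files fully into memory and discards line order and duplicate lines.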
I always read with slurp and after that split with re-seq due to my needs.
What would be an idiomatic way in Clojure to get a lazy sequence over a file containing float values serialized from Java? (I've toyed with a with-open approach based on line-reading examples but cannot seem to connect the dots to process the stream as floats.)
Thanks.
(defn float-seqs [^java.io.DataInputStream dis]
(lazy-seq
(try
(cons (.readFloat dis) (float-seqs dis))
(catch java.io.EOFException e
(.close dis)))))
(with-open [dis (-> file java.io.FileInputStream. java.io.DataInputStream.)]
(let [s (float-seqs dis)]
(doseq [f s]
(println f))))
You are not required to use with-open if you are sure you are going to consume the whole seq.
If you use with-open, double-check that you're not leaking the seq (or a derived seq) outside of its scope.
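To illustrate that last point, here is a self-contained round trip that uses an in-memory buffer in place of a file. The names `float-seq`, `float-bytes` and `floats-read` are made up for the example. The sequence is realized with `doall` before `with-open` closes the stream, so nothing lazy escapes the scope:

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  DataInputStream DataOutputStream EOFException))

;; Lazy sequence of floats; ends when the stream reports end of file.
(defn float-seq [^DataInputStream dis]
  (lazy-seq
    (try
      (cons (.readFloat dis) (float-seq dis))
      (catch EOFException _
        nil))))

;; Write three floats the way Java's DataOutputStream serializes them.
(def float-bytes
  (let [baos (ByteArrayOutputStream.)]
    (with-open [dos (DataOutputStream. baos)]
      (doseq [x [1.5 2.25 3.0]]
        (.writeFloat dos (float x))))
    (.toByteArray baos)))

;; Realize the whole sequence inside with-open, then return it.
(def floats-read
  (with-open [dis (DataInputStream. (ByteArrayInputStream. float-bytes))]
    (doall (float-seq dis))))

floats-read
;; => (1.5 2.25 3.0)
```

For a file too large to realize at once, the processing would have to happen inside the `with-open` body, for example with `doseq` as in the answer above.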