writing zip file to file in Clojure - clojure

I have a method for zipping:
(defn zip-project [project-id notebooks files]
(with-open [out (ByteArrayOutputStream.)
zip (ZipOutputStream. out)]
(doseq [nb notebooks]
(.putNextEntry zip (ZipEntry. (str "Notebooks/" (:notebook/name nb) ".bkr")))
(let [nb-json (:notebook/contents nb)
bytes (.getBytes nb-json)]
(.write zip bytes))
(.closeEntry zip))
(doseq [{:keys [name content]} files]
(.putNextEntry zip (ZipEntry. (str "Files/" name)))
(io/copy content zip)
(.closeEntry zip))
(.finish zip)
(.toByteArray out)))
After I make a zip, I want to save it to a file such as /tmp/sample/sample.zip, but I cannot get it to work. Here is what I am doing:
(defn create-file! [path zip]
(let [f (io/file path)]
(io/make-parents f)
(io/copy zip f)
true))
The problem is that when I run unzip from the terminal, it says the zip file is empty, and if I unzip it using Archive Utility, it extracts a file with a .cpgz extension.
What am I doing wrong here?

You will need essentially four things:
Import everything (normally you would use (ns ...), but you can run this in the REPL):
(import 'java.io.FileOutputStream)
(import 'java.io.BufferedOutputStream)
(import 'java.util.zip.ZipOutputStream)
(import 'java.util.zip.ZipEntry)
You need a way to initialize the stream. This can be done nicely with the -> macro:
(defn zip-stream
"Opens a ZipOutputStream over the given file (as a string)"
[file]
(-> (FileOutputStream. file)
BufferedOutputStream.
ZipOutputStream.))
You need a way to create/close entries in the ZipOutputStream
(defn create-zip-entry
"Create a zip entry with the given name. That will be the name of the file inside the zipped file."
[stream entry-name]
(.putNextEntry stream (ZipEntry. entry-name)))
Finally you need a way to write your content.
(defn write-to-zip
"Writes a string to a zip stream as created by zip-stream"
[stream str]
(.write stream (.getBytes str)))
Putting it all together:
(with-open [stream (zip-stream "coolio.zip")]
(create-zip-entry stream "foo1.txt")
(write-to-zip stream "Hello Foo1")
(.closeEntry stream) ;; don't forget to close entries
(create-zip-entry stream "foo2.txt")
(write-to-zip stream "Hello Foo 2")
(.closeEntry stream))
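For the case in the question, where the archive is first built in memory and then saved, a minimal sketch could look like the following. The names zip-bytes and write-zip! are hypothetical and not part of the original code; the key assumption is that the ZipOutputStream is closed before the bytes are read out of the ByteArrayOutputStream.

```clojure
(require '[clojure.java.io :as io])
(import '[java.io ByteArrayOutputStream]
        '[java.util.zip ZipEntry ZipOutputStream])

(defn zip-bytes
  "Builds a zip archive in memory from a map of entry name -> string content
   and returns it as a byte array. (Hypothetical helper.)"
  [entries]
  (let [out (ByteArrayOutputStream.)]
    ;; Closing the ZipOutputStream writes the central directory into `out`.
    (with-open [zip (ZipOutputStream. out)]
      (doseq [[entry-name content] entries]
        (.putNextEntry zip (ZipEntry. ^String entry-name))
        (.write zip (.getBytes ^String content "UTF-8"))
        (.closeEntry zip)))
    ;; Read the bytes only after the zip stream has been closed.
    (.toByteArray out)))

(defn write-zip!
  "Writes the byte array to path, creating parent directories as needed.
   (Hypothetical helper.)"
  [path ^bytes data]
  (let [f (io/file path)]
    (io/make-parents f)
    (with-open [os (io/output-stream f)]
      (.write os data))
    f))
```

Usage would be something like (write-zip! "/tmp/sample/sample.zip" (zip-bytes {"foo1.txt" "Hello Foo1"})).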

Related

Generate and stream a zip-file in a Ring web app in Clojure

I have a Ring handler that needs to:
Zip a few files
Stream the Zip to the client.
Now I have it sort of working, but only the first zipped entry gets streamed, and after that it stalls/stops. I feel it has something to do with flushing/streaming that is wrong.
Here is my (compojure) handler:
(GET "/zip" {:as request}
:query-params [order-id :- s/Any]
(stream-lessons-zip (read-string order-id) (:db request) (:auth-user request)))
Here is the stream-lessons-zip function:
(defn stream-lessons-zip
[]
(let [lessons ...];... not shown
{:status 200
:headers {"Content-Type" "application/zip, application/octet-stream"
"Content-Disposition" (str "attachment; filename=\"files.zip\"")}
:body (futil/zip-lessons lessons)}))
And I use a piped-input-stream to do the streaming, like so:
(defn zip-lessons
"Returns an inputstream (piped-input-stream) to be used directly in Ring HTTP responses"
[lessons]
(let [paths (map #(select-keys % [:file_path :file_name]) lessons)]
(ring-io/piped-input-stream
(fn [output-stream]
; build a zip-output-stream from a normal output-stream
(with-open [zip-output-stream (ZipOutputStream. output-stream)]
(doseq [{:keys [file_path file_name] :as p} paths]
(let [f (cio/file file_path)]
(.putNextEntry zip-output-stream (ZipEntry. file_name))
(cio/copy f zip-output-stream)
(.closeEntry zip-output-stream))))))))
So I have confirmed that the 'lessons' vector contains like 4 entries, but the zip file only contains 1 entry. Furthermore, Chrome doesn't seem to 'finalize' the download, ie. it thinks it is still downloading.
How can I fix this?
It sounds like producing a stateful stream using blocking IO is not supported by http-kit. Non-stateful streams can be done this way:
http://www.http-kit.org/server.html#async
A PR to introduce stateful streams using blocking IO was not accepted:
https://github.com/http-kit/http-kit/pull/181
It sounds like the option to explore is to use a ByteArrayOutputStream to render the zip file fully in memory, and then return the resulting buffer. If this endpoint isn't highly trafficked and the zip file it produces is not large (< 1 GB), this might work.
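A rough sketch of that in-memory approach, assuming the entries are already available as byte arrays. The function name zip-response and the [name bytes] input shape are made up for illustration; the result is a plain Ring-style response map, so no server-specific API is involved.

```clojure
(import '[java.io ByteArrayInputStream ByteArrayOutputStream]
        '[java.util.zip ZipEntry ZipOutputStream])

(defn zip-response
  "Renders all entries (a seq of [name byte-array] pairs) into memory and
   returns a Ring-style response map. (Hypothetical helper.)"
  [entries]
  (let [baos (ByteArrayOutputStream.)]
    ;; with-open closes the zip stream, which finishes the archive.
    (with-open [zip (ZipOutputStream. baos)]
      (doseq [[entry-name data] entries]
        (.putNextEntry zip (ZipEntry. ^String entry-name))
        (.write zip ^bytes data)
        (.closeEntry zip)))
    (let [zipped (.toByteArray baos)]
      {:status  200
       :headers {"Content-Type"        "application/zip"
                 "Content-Length"      (str (alength zipped))
                 "Content-Disposition" "attachment; filename=\"files.zip\""}
       ;; The whole archive is already complete, so nothing is left to stall.
       :body    (ByteArrayInputStream. zipped)})))
```

The trade-off is memory: the entire archive lives on the heap for the duration of the request.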
So, it's been a few years, but that code still runs in production (i.e. it works). I got it working back then but forgot to mention it here (and, to be honest, forgot WHY it works; it was very much trial and error).
This is the code now:
(defn zip-lessons
"Returns an inputstream (piped-input-stream) to be used directly in Ring HTTP responses"
[lessons {:keys [firstname surname order_favorite_name company_name] :as annotation
:or {order_favorite_name ""
company_name ""
firstname ""
surname ""}}]
(debug "zipping lessons" (count lessons))
(let [paths (map #(select-keys % [:file_path :file_name :folder_number]) lessons)]
(ring-io/piped-input-stream
(fn [output-stream]
; build a zip-output-stream from a normal output-stream
(with-open [zip-output-stream (ZipOutputStream. output-stream)]
(doseq [{:keys [file_path file_name folder_number] :as p} paths]
(let [f (cio/as-file file_path)
baos (ByteArrayOutputStream.)]
(if (.exists f)
(do
(debug "Adding entry to zip:" file_name "at" file_path)
(let [zip-entry (ZipEntry. (str (if folder_number (str folder_number "/") "") file_name))]
(.putNextEntry zip-output-stream zip-entry)
(cio/copy f baos) ; read the file's bytes into the buffer
(.close baos)
(.writeTo baos zip-output-stream)
(.closeEntry zip-output-stream)
(.flush zip-output-stream)
(debug "flushed")))
(warn "File '" file_name "' at '" file_path "' does not exist, not adding to zip file!"))))
(.flush zip-output-stream)
(.flush output-stream)
(.finish zip-output-stream)
(.close zip-output-stream))))))

How to download a file and unzip it from memory in clojure?

I'm making a GET request using clj-http and the response is a zip file. The contents of this zip is always one CSV file. I want to save the CSV file to disk, but I can't figure out how.
If I have the file on disk, (fs/unzip filename destination) from the Raynes/fs library works great, but I can't figure out how I can coerce the response from clj-http into something this can read. If possible, I'd like to unzip the file directly without
The closest I've gotten (if this is even close) gets me to a BufferedInputStream, but I'm lost from there.
(require '[clj-http.client :as client])
(require '[clojure.java.io :as io])
(->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream))
You can use plain Java's java.util.zip.ZipInputStream or java.util.zip.GZIPInputStream, depending on how the content is compressed. This code saves your file using java.util.zip.GZIPInputStream:
(->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream)
(java.util.zip.GZIPInputStream.)
(clojure.java.io/copy (clojure.java.io/file "/path/to/output/file")))
Using java.util.zip.ZipInputStream makes it only a bit more complicated :
(let [stream (->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream)
(java.util.zip.ZipInputStream.))]
(.getNextEntry stream)
(clojure.java.io/copy stream (clojure.java.io/file "/path/to/output/file")))
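If it is unclear which of the two formats the server sends, one option is to look at the first two bytes of the body: zip archives start with the "PK" signature (0x50 0x4B) and gzip streams with 0x1F 0x8B. The sketch below assumes the body was fetched as a byte array, as in the snippets above; archive-type and open-archive are hypothetical names.

```clojure
(import '[java.io ByteArrayInputStream]
        '[java.util.zip GZIPInputStream ZipInputStream])

(defn archive-type
  "Guesses the container format from the first two bytes of the body."
  [^bytes body]
  (let [long-enough? (>= (alength body) 2)
        b0 (when long-enough? (bit-and (aget body 0) 0xff))
        b1 (when long-enough? (bit-and (aget body 1) 0xff))]
    (cond
      (and (= b0 0x50) (= b1 0x4b)) :zip    ; "PK" signature
      (and (= b0 0x1f) (= b1 0x8b)) :gzip   ; gzip magic number
      :else :unknown)))

(defn open-archive
  "Returns an input stream over the decompressed content. For a zip, the
   stream is positioned at the first entry."
  [^bytes body]
  (case (archive-type body)
    :zip  (doto (ZipInputStream. (ByteArrayInputStream. body))
            (.getNextEntry))
    :gzip (GZIPInputStream. (ByteArrayInputStream. body))
    (throw (ex-info "Unrecognized archive format" {}))))
```

The returned stream can then be passed to clojure.java.io/copy exactly as in the examples above.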
(require '[clj-http.client :as httpc])
(import '[java.io File])
(defn download-unzip [url dir]
(let [saveDir (File. dir)]
(with-open [stream (-> (httpc/get url {:as :stream})
(:body)
(java.util.zip.ZipInputStream.))]
(loop [entry (.getNextEntry stream)]
(if entry
(let [savePath (str dir File/separatorChar (.getName entry))
saveFile (File. savePath)]
(if (.isDirectory entry)
(if-not (.exists saveFile)
(.mkdirs saveFile))
(let [parentDir (File. (.substring savePath 0 (.lastIndexOf savePath (int File/separatorChar))))]
(if-not (.exists parentDir) (.mkdirs parentDir))
(clojure.java.io/copy stream saveFile)))
(recur (.getNextEntry stream))))))))
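When extracting archives from an untrusted source, entry names such as "../evil.txt" can point outside the target directory. A possible guard, sketched here under the assumption that each entry is resolved against the target directory before anything is written (safe-target is a hypothetical helper, not part of the answer above):

```clojure
(require '[clojure.java.io :as io])

(defn safe-target
  "Resolves entry-name under dir and returns the File, or throws if the
   resolved path would land outside dir."
  [dir entry-name]
  (let [root   (.getCanonicalFile (io/file dir))
        target (.getCanonicalFile (io/file root entry-name))]
    ;; Path.startsWith compares whole path elements, not raw string prefixes.
    (if (.startsWith (.toPath target) (.toPath root))
      target
      (throw (ex-info "Zip entry escapes target directory"
                      {:entry entry-name :dir (str root)})))))
```

In download-unzip, the File built from savePath could be replaced by (safe-target dir (.getName entry)).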

Read from a file and return output in a vector

I'm just learning clojure and trying to read in a file and do something with the returned vector of results. In this instance I'm just trying to print it out.
Below is the code in question:
(defn read_file
"Read in a file from the resources directory"
[input]
(with-open [rdr (reader input)]
(doseq [line (line-seq rdr)])))
(defn -main []
(println (read_file "resources/input.txt") ))
The println prints "nil". What do I need to do to return "line"?
If the file is not very big, you can use slurp to read the file content as a string, then split it with a specific delimiter (in this case \n).
(defn read-file [f]
(-> (slurp f)
(clojure.string/split-lines)))
doseq returns nil. It's supposed to be used when you're doing stuff in a do fashion on the elements of a sequence, so mostly side-effect stuff.
Try this:
(defn file->vec
"Read in a file from the resources directory"
[input]
(with-open [rdr (reader input)]
(into [] (line-seq rdr))))
But you shouldn't do this for big files; in those cases you don't want the whole file to sit in memory. For this reason, slurp is equally bad.
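For big files, one way to avoid holding everything in memory is to do the work inside with-open and return only a small result. The sketch below counts matching lines with reduce; count-matching-lines is a made-up name, and the same shape works for any other aggregation.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn count-matching-lines
  "Counts the lines of file f that satisfy pred. The reduce consumes the
   lazy line-seq one line at a time and finishes before the reader closes."
  [f pred]
  (with-open [rdr (io/reader f)]
    (reduce (fn [n line] (if (pred line) (inc n) n))
            0
            (line-seq rdr))))
```

For example, (count-matching-lines "resources/input.txt" (complement str/blank?)) would count the non-blank lines.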

Zip a file in clojure

I want to zip a file in clojure and I can't find any libraries to do it.
Do you know a good way to zip a file or a folder in Clojure?
Must I use a java library?
There is a stock ZipOutputStream in Java which can be used from Clojure. I don't know whether there is a library somewhere. I use the plain Java functions with a small helper macro:
(defmacro ^:private with-entry
[zip entry-name & body]
`(let [^ZipOutputStream zip# ~zip]
(.putNextEntry zip# (ZipEntry. ~entry-name))
~@body
(flush)
(.closeEntry zip#)))
Obviously every ZIP entry describes a file.
(require '[clojure.java.io :as io])
(with-open [file (io/output-stream "foo.zip")
zip (ZipOutputStream. file)
wrt (io/writer zip)]
(binding [*out* wrt]
(doto zip
(with-entry "foo.txt"
(println "foo"))
(with-entry "bar/baz.txt"
(println "baz")))))
To zip a file you might want to do something like this:
(with-open [output (ZipOutputStream. (io/output-stream "foo.zip"))
input (io/input-stream "foo")]
(with-entry output "foo"
(io/copy input output)))
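Since the question also asks about folders, here is a sketch of zipping a directory tree with file-seq. It does not use the with-entry macro so that it stands alone; zip-dir! is a hypothetical name, and it assumes the target zip file lives outside the directory being zipped.

```clojure
(require '[clojure.java.io :as io])
(import '[java.util.zip ZipEntry ZipOutputStream])

(defn zip-dir!
  "Zips every regular file under src-dir into zip-path, using paths
   relative to src-dir (with forward slashes) as entry names."
  [src-dir zip-path]
  (let [root (.toPath (io/file src-dir))]
    (with-open [zip (ZipOutputStream. (io/output-stream zip-path))]
      (doseq [f (file-seq (io/file src-dir))
              :when (.isFile f)]
        (let [rel (-> (.relativize root (.toPath f))
                      str
                      (.replace "\\" "/"))]   ; zip entry names use "/"
          (.putNextEntry zip (ZipEntry. ^String rel))
          (io/copy f zip)
          (.closeEntry zip))))))
```

Directories themselves get no entries here; they are implied by the paths of the files inside them.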
Compression and decompression can also be done with shell commands, which are accessible through clojure.java.shell (this assumes zip and unzip are installed on the host).
The same approach works for any other archive tool you would normally run from your terminal.
(use '[clojure.java.shell :only [sh]])
(defn unpack-resources [in out]
(clojure.java.shell/sh
"sh" "-c"
(str " unzip " in " -d " out)))
(defn pack-resources [in out]
(clojure.java.shell/sh
"sh" "-c"
(str " zip " in " -r " out)))
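Building the command line by string concatenation breaks on paths that contain spaces or shell metacharacters. A variant worth considering passes each argument to sh as its own string, so no shell parsing is involved. This is only a sketch: unzip-command and unpack! are hypothetical names, and it still assumes an unzip binary on the PATH.

```clojure
(require '[clojure.java.shell :refer [sh]])

(defn unzip-command
  "Builds the argument vector for unzip; every argument is a separate string."
  [archive dest]
  ["unzip" archive "-d" dest])

(defn unpack!
  "Runs unzip and throws if it exits with a non-zero status."
  [archive dest]
  (let [{:keys [exit err]} (apply sh (unzip-command archive dest))]
    (when-not (zero? exit)
      (throw (ex-info "unzip failed" {:exit exit :err err})))
    dest))
```

Checking :exit also surfaces failures that the sh -c versions silently return as data.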
(unpack-resources "/path/to/my/zip/foo.zip"
"/path/to/store/unzipped/files")
(pack-resources "/path/to/store/archive/myZipArchiveName.zip"
"/path/to/my/file/myTextFile.csv")
You can use this gzip gist: https://gist.github.com/bpsm/1858654
It's quite interesting.
Or, more precisely, you can use this:
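(import 'java.util.zip.GZIPOutputStream)
(defn gzip
"Compresses input into output; opts are passed through to clojure.java.io/copy."
[input output & opts]
(with-open [output (-> output clojure.java.io/output-stream GZIPOutputStream.)
;; read raw bytes (not characters) so binary input is not corrupted
input (clojure.java.io/input-stream input)]
(apply clojure.java.io/copy input output opts)))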
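The reverse direction can be sketched the same way with GZIPInputStream. The name gunzip is an assumption for illustration and is not taken from the gist.

```clojure
(require '[clojure.java.io :as io])
(import '[java.util.zip GZIPInputStream])

(defn gunzip
  "Decompresses the gzip file at input into output."
  [input output]
  (with-open [in  (GZIPInputStream. (io/input-stream input))
              out (io/output-stream output)]
    ;; Byte-for-byte copy, so this works for text and binary content alike.
    (io/copy in out)))
```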
You can use rtcritical/clj-ant-tasks library that wraps Apache Ant, and zip with a single command.
Add library dependency [rtcritical/clj-ant-tasks "1.0.1"]
(require '[rtcritical.clj-ant-tasks :refer [run-ant-task]])
To zip a file:
(run-ant-task :zip {:destfile "/tmp/file-zipped.zip"
:basedir "/tmp"
:includes "file-to-zip"})
Note: run-ant-task(s) functions in this library namespace can be used to run any other Apache Ant task(s) as well.
For more information, see https://github.com/rtcritical/clj-ant-tasks

Read csv into a list in clojure

I know there are a lot of related questions, I have read them but still have not gained some fundamental understanding of how to read-process-write. Take the following function for example which uses clojure-csv library to parse a line
(defn take-csv
"Takes file name and reads data."
[fname]
(with-open [file (reader fname)]
(doseq [line (line-seq file)]
(let [record (parse-csv line)]))))
What I would like to obtain is data read into some collection as a result of (def data (take-csv "file.csv")) and later to process it. So basically my question is how do I return record or rather a list of records.
"doseq" is often used for operations with side effect. In your case to create collection of records you can use "map":
(defn take-csv
"Takes file name and reads data."
[fname]
(with-open [file (reader fname)]
(doall (map (comp first csv/parse-csv) (line-seq file)))))
It is better to parse the whole file at once, which reduces the code:
(defn take-csv
"Takes file name and reads data."
[fname]
(with-open [file (reader fname)]
(csv/parse-csv (slurp file))))
You can also use clojure.data.csv instead of clojure-csv.core; you only need to rename parse-csv to read-csv in the previous function.
(defn put-csv [fname table]
(with-open [file (writer fname)]
(csv/write-csv file table)))
With all the things you can do with .csv files, I suggest using clojure-csv or clojure.data.csv. I mostly use clojure-csv to read in a .csv file.
Here are some code snippets from a utility library I use with most of my Clojure programs.
from util.core
(ns util.core
^{:author "Charles M. Norton",
:doc "util is a Clojure utilities directory"}
(:require [clojure.string :as cstr])
(:import java.util.Date)
(:import java.io.File)
(:use clojure-csv.core))
(defn open-file
"Attempts to open a file and complains if the file is not present."
[file-name]
(let [file-data (try
(slurp file-name)
(catch Exception e (println (.getMessage e))))]
file-data))
(defn ret-csv-data
"Returns a lazy sequence generated by parse-csv.
Uses open-file which will return a nil, if
there is an exception in opening fnam.
parse-csv called on non-nil file, and that
data is returned."
[fnam]
(let [csv-file (open-file fnam)
inter-csv-data (if-not (nil? csv-file)
(parse-csv csv-file)
nil)
csv-data
(vec (filter #(and (pos? (count %))
(not (nil? (rest %)))) inter-csv-data))]
(if-not (empty? csv-data)
(pop csv-data)
nil)))
(defn fetch-csv-data
"This function accepts a csv file name, and returns parsed csv data,
or returns nil if file is not present."
[csv-file]
(let [csv-data (ret-csv-data csv-file)]
csv-data))
Once you've read in a .csv file, then what you do with its contents is another matter. Usually, I am taking .csv "reports" from one financial system, like property assessments, and formatting the data to be uploaded into a database of another financial system, like billing.
I will often either zipmap each .csv row so I can extract data by column name (having read in the column names), or even make a sequence of zipmap'ped .csv rows.
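The zipmap idea mentioned above can be sketched in a few lines. It assumes the parsed data is a sequence of rows with the column names in the first row; rows->maps and the sample column names are made up for illustration.

```clojure
(defn rows->maps
  "Treats the first row as column names and returns one map per data row."
  [rows]
  (let [header (map keyword (first rows))]
    (mapv #(zipmap header %) (rest rows))))

;; Values stay strings; convert them as needed for the target system.
(def sample
  (rows->maps [["parcel" "assessed"]
               ["001" "125000"]
               ["002" "98000"]]))
```

With that in place, (map :assessed sample) extracts a single column by name.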
Just to add to these good answers, here is a full example.
First, add clojure-csv into your dependencies
(ns scripts.csvreader
(:require [clojure-csv.core :as csv]
[clojure.java.io :as io]))
(defn take-csv
"Takes file name and reads data."
[fname]
(with-open [file (io/reader fname)]
(-> file
(slurp)
(csv/parse-csv))))
usage
(take-csv "/path/youfile.csv")