Clojure. Http stream file - clojure

I want to stream large binary files (exe, jpg, ..., all types of files). It seems that the aleph client can do it.
I looked at the official sample and understood that if we pass the lazy sequence to the body, the response can be passed a stream.
(defn streaming-numbers-handler
"Returns a streamed HTTP response, consisting of newline-delimited numbers every 100
milliseconds. While this would typically be represented by a lazy sequence, instead we use
a Manifold stream. Similar to the use of the deferred above, this means we don't need
to allocate a thread per-request.
In this handler we're assuming the string value for `count` is a valid number. If not,
`Integer.parseInt()` will throw an exception, and we'll return a `500` status response
with the stack trace. If we wanted to be more precise in our status, we'd wrap the parsing
code with a try/catch that returns a `400` status when the `count` is malformed.
`manifold.stream/periodically` is similar to Clojure's `repeatedly`, except that it emits
the value returned by the function at a fixed interval."
[{:keys [params]}]
(let [cnt (Integer/parseInt (get params "count" "0"))]
{:status 200
:headers {"content-type" "text/plain"}
:body (let [sent (atom 0)]
(->> (s/periodically 100 #(str (swap! sent inc) "\n"))
(s/transform (take cnt))))}))
I have the following code:
(ns core
(:require [aleph.http :as http]
[byte-streams :as bs]
[cheshire.core :refer [parse-string generate-string]]
[clojure.core.async :as a]
[clojure.java.io :as io]
[clojure.string :as str]
[clojure.tools.logging :as log]
[clojure.walk :as w]
[compojure.core :as compojure :refer [ANY GET defroutes]]
[compojure.route :as route]
[me.raynes.fs :as fs]
[ring.util.response :refer [response redirect content-type]]
[ring.middleware.params :as params]
[ring.util.codec :as cod])
(:import java.io.File)
(:import java.io.FileInputStream)
(:import java.io.InputStream)
(:gen-class))
(def share-dir "\\\\my-pc-network-path")
(def content-types
{".png" "image/png"
".GIF" "image/gif"
".jpeg" "image/jpeg"
".svg" "image/svg+xml"
".tiff" "image/tiff"
".ico" "image/vnd.microsoft.icon"
".bmp" "image/vnd.wap.wbmp"
".css" "text/css"
".csv" "text/csv"
".html" "text/html"
".txt" "text/plain"
".xml" "text/xml"})
(defn byte-seq [^InputStream is size]
(let [ib (byte-array size)]
((fn step []
(lazy-seq
(let [n (.read is ib)]
(when (not= -1 n)
(let [cb (chunk-buffer size)]
(dotimes [i size] (chunk-append cb (aget ib i)))
(chunk-cons (chunk cb) (step))))))))))
(defn get-file [req]
(let [uri (:uri req)
file (str/replace (w/keywordize-keys (cod/form-decode uri)) #"/file/" "")
full-dir (io/file share-dir file)
_ (log/debug "FULL DIR: "full-dir)
filename (.getName (File. (.getParent full-dir)))
extension (fs/extension filename)
_ (log/debug "EXTENSION: " extension)
resp {:status 200
:headers {"Content-type" (get content-types (.toLowerCase extension) "application/octet-stream")
"Content-Disposition" (str "inline; filename=\"" filename "\"")}
:body (with-open [is (FileInputStream. full-dir)]
(let [bs (byte-seq is 4096)]
(byte-array bs)))}
]
resp))
(def handler
(params/wrap-params
(compojure/routes
(GET "/file/*" [] get-file)
(route/not-found "No such file."))))
(defn -main [& args]
(println "Starting...")
(http/start-server handler {:host "0.0.0.0" :port 5555}))
I get the uri and try to read file with chunks. I wanted to do it because files can be about 3 GB. So, what I expected is the application would not not use more memory than the size of a chunk.
But when I've set 1GB (-Xmx option) for the application, it has taken all memory (1 GB).
Why is 1GB taken? Does JVM work this way?
When I have 100 simultaneous connections (e.g., each file is 3GB), I get OutOfMemoryError.
The task is to "stream" a file with chunks to avoid OutOfMemoryError.

In the get-file function you are calling byte-array on the result of byte-seq. byte-array will realize the LazySeq returned by byte-seq meaning all of it will be in memory.
As far as I know (at least for TCP server), aleph accepts any lazy sequence of ByteBuffer as a body and will handle it optimally for you, so you could just return the result of calling byte-seq
(defn get-file [req]
(let [uri (:uri req)
file (str/replace (w/keywordize-keys (cod/form-decode uri)) #"/file/" "")
full-dir (io/file share-dir file)
_ (log/debug "FULL DIR: "full-dir)
filename (.getName (File. (.getParent full-dir)))
extension (fs/extension filename)
_ (log/debug "EXTENSION: " extension)
resp {:status 200
:headers {"Content-type" (get content-types (.toLowerCase extension) "application/octet-stream")
"Content-Disposition" (str "inline; filename=\"" filename "\"")}
:body (with-open [is (FileInputStream. full-dir)]
(byte-seq is 4096))}
]
resp))
Alternative
Aleph is in compliance with the ring specification and :body accepts File or InputStream. So there is no need to return the bytes yourself.
(defn get-file [req]
(let [uri (:uri req)
file (str/replace (w/keywordize-keys (cod/form-decode uri)) #"/file/" "")
full-dir (io/file share-dir file)
_ (log/debug "FULL DIR: "full-dir)
filename (.getName (File. (.getParent full-dir)))
extension (fs/extension filename)
_ (log/debug "EXTENSION: " extension)
resp {:status 200
:headers {"Content-type" (get content-types (.toLowerCase extension) "application/octet-stream")
"Content-Disposition" (str "inline; filename=\"" filename "\"")}
:body full-dir}
]
resp))

Related

Get Clojure map from POST request

Here is my code:
(ns cowl.server
(:use compojure.core)
(:require [ring.adapter.jetty :as jetty]
[ring.middleware.params :as params]
[ring.middleware.json :refer [wrap-json-response]]
[ring.util.response :refer [response]]
[clojure.data.json :as json]
[cowl.db :as db]))
(defroutes main-routes
(POST "/api/news/as-read" { body :body }
(str (json/read-str (slurp body))))))
(def app
(-> main-routes
wrap-json-response))
(defn serve []
(jetty/run-jetty app {:port 3000}))
If I post this JSON: { "name": "demas" } I get {"name" "demas"}. But this is not a Clojure map.
I need something like (:name (json/read-str (slurp body))). How can I get it ?
Instead of handling body JSON parsing by yourself, you can use ring.middleware.json/wrap-json-body. Just modify your middleware setup:
(def app
(-> main-routes
wrap-json-response
(wrap-json-body {:keywords? true})))
and your request :body will become JSON parsed to Clojure data.
You may wish to use the keywordize-keys function:
http://clojuredocs.org/clojure.walk/keywordize-keys
(ns xyz.core
(:require [clojure.walk :as walk]))
(walk/keywordize-keys {"a" 1 "b" 2})
;;=> {:a 1 :b 2}
You will probably also find that the Cheshire lib is the best way to process JSON in Clojure: https://github.com/dakrone/cheshire#decoding
;; parse some json
(parse-string "{\"foo\":\"bar\"}")
;; => {"foo" "bar"}
;; parse some json and get keywords back
(parse-string "{\"foo\":\"bar\"}" true) ; true -> want keyword keys
;; => {:foo "bar"}
;; parse some json and munge keywords with a custom function
(parse-string "{\"foo\":\"bar\"}" (fn [k] (keyword (.toUpperCase k))))
;; => {:FOO "bar"}
You can use :key-fn function as well:
(json/read-str (return :body)
:key-fn keyword)
Doing this you will parse your JSON to default map syntax.

http-kit websocket resolve symbol

Problem: I get a null pointer exception when I call the caller function in the below code if I pass it data via json from a javascript socket client. I don't get the error when I run this same code from the repl and I pass it the same data i.e.
In repl:
(def data (json/write-str {:controller "hql" :function "something"}))
(caller data)
This works fine
From javascript: this gives me s null pointer exception when calling the function caller from the handler.
<script>
var msg = {'controller': 'hql','function':'something'};
socket = new WebSocket('ws://192.168.0.7:9090');
socket.onopen = function() {
socket.send(JSON.stringify(msg));
}
socket.onmessage = function(s) {
console.log(s);
socket.close();
}
socket.onclose = function() {
console.log('Connection Closed');
}
</script>
(ns hqlserver.core
(:use [org.httpkit.dbcp :only [use-database! close-database! insert-record update-values query delete-rows]]
[org.httpkit.server])
(:require [clojure.data.json :as json]))
(defn hql [data]
(use-database! "jdbc:mysql://localhost/xxxx" "user" "xxxxx")
(let [rows (query ("select username,firstname,lastname from users order by lastname,firstname")]
(close-database!)
(json/write-str rows)))
(defn get-controller [data]
(if data
(let [data (json/read-str data :key-fn keyword)]
(data :controller))))
(defn caller [data]
((ns-resolve *ns* (symbol (get-controller data))) data))
(defn handler [request]
(with-channel request channel
(on-close channel (fn [status] (println "channel closed: " status)))
(on-receive channel (fn [data] (send! channel (caller data))))))
(defn -main [& args]
(run-server handler {:port 9090}))
Don't know if it's the best way but this is the solution:
Thanks to noisesmith for helping and tips....
(ns hqlserver.core
(:use [org.httpkit.dbcp :only [use-database! close-database! insert- record update-values query delete-rows]]
[org.httpkit.server]
[hqlserver.s300.login :only [login]])
(:require [clojure.data.json :as json]
[hqlserver.s300.site :refer [site-index site-update]]
[clojure.string :as str]))
(defn get-site [data]
(if data
(let [data (json/read-str data :key-fn keyword)]
(-> data :site))))
(defn get-controller [data]
(if data
(let [data (json/read-str data :key-fn keyword)]
(-> data :controller))))
(defn get-function [data]
(if data
(let [data (json/read-str data :key-fn keyword)]
(-> data :function))))
(defn caller [data]
(let [site (get-site data)]
(let [controller (get-controller data)]
(let [function (get-function data)]
(cond
(and (= site "s300")(= controller "login")(= function "login")) (login data)
(and (= site "s300")(= controller "site")(= function "site-index")) (site-index data)
(and (= site "s300")(= controller "site")(= function "site-update")) (site-update data)
:else (json/write-str {:error "No records found!"}))))))
(defn async-handler [ring-request]
(with-channel ring-request channel
(if (websocket? channel)
(on-receive channel (fn [data]
(send! channel (caller data))))
(send! channel {:status 200
:headers {"Content-Type" "text/plain"}
:body "Long polling?"}))))
(defn -main [& args]
(run-server async-handler {:port 9090}));

How to download a file and unzip it from memory in clojure?

I'm making a GET request using clj-http and the response is a zip file. The contents of this zip is always one CSV file. I want to save the CSV file to disk, but I can't figure out how.
If I have the file on disk, (fs/unzip filename destination) from the Raynes/fs library works great, but I can't figure out how I can coerce the response from clj-http into something this can read. If possible, I'd like to unzip the file directly without
The closest I've gotten (if this is even close) gets me to a BufferedInputStream, but I'm lost from there.
(require '[clj-http.client :as client])
(require '[clojure.java.io :as io])
(->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream))
You can use the pure java java.util.zip.ZipInputStream or java.util.zip.GZIPInputStream. Depends how the content is zipped. This is the code that saves your file using java.util.zip.GZIPInputStream :
(->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream)
(java.util.zip.GZIPInputStream.)
(clojure.java.io/copy (clojure.java.io/file "/path/to/output/file")))
Using java.util.zip.ZipInputStream makes it only a bit more complicated :
(let [stream (->
(client/get "http://localhost:8000/blah.zip" {:as :byte-array})
(:body)
(io/input-stream)
(java.util.zip.ZipInputStream.))]
(.getNextEntry stream)
(clojure.java.io/copy stream (clojure.java.io/file "/path/to/output/file")))
(require '[clj-http.client :as httpc])
(import '[java.io File])
(defn download-unzip [url dir]
(let [saveDir (File. dir)]
(with-open [stream (-> (httpc/get url {:as :stream})
(:body)
(java.util.zip.ZipInputStream.))]
(loop [entry (.getNextEntry stream)]
(if entry
(let [savePath (str dir File/separatorChar (.getName entry))
saveFile (File. savePath)]
(if (.isDirectory entry)
(if-not (.exists saveFile)
(.mkdirs saveFile))
(let [parentDir (File. (.substring savePath 0 (.lastIndexOf savePath (int File/separatorChar))))]
(if-not (.exists parentDir) (.mkdirs parentDir))
(clojure.java.io/copy stream saveFile)))
(recur (.getNextEntry stream))))))))

RabbitMQ Clojure Langohr doesn't throw error when rabbitmq server restarted

I have the following code to monitoring database table to create MQ messsage every several seconds. And I found that when I restarted the RabbitMQ server the app still runs without throwing an exception, and it still print message was created. So why it won't throw an exception? And another question is how to close the connect when I kill the app? Since it's a service, I have no place to write the code to close the RabbitMQ connection.
(require '[clojure.java.jdbc :as j])
(require '[langohr.core :as rmq])
(require '[langohr.channel :as lch])
(require '[langohr.queue :as lq])
(require '[langohr.exchange :as le])
(require '[langohr.consumers :as lc])
(require '[langohr.basic :as lb])
(require '[clojure.data.json :as json])
(require '[clojure.java.io :as io])
(defn load-props
[file-name]
(with-open [^java.io.Reader reader (io/reader file-name)]
(let [props (java.util.Properties.)]
(.load props reader)
(into {} (for [[k v] props] [(keyword k) (read-string v)])))))
(def ^{:const true}
default-exchange-name "")
(defn create-message-from-database
([channel qname db-spec]
(let [select-sql "select a.file_id from file_store_metadata
where processed=false "
results (j/query db-spec select-sql)
]
(doseq [row results]
(let [file-id (:file_id row)]
(lb/publish channel default-exchange-name qname (json/write-str row) {:content-type "text/plain" :type "greetings.hi" :persistent true})
(j/update! db-spec :file_store_metadata {:processed true} ["file_id = ?" (:file_id row)])
(println "message created for a new file id" (:file_id row))))
))
)
(defn create-message-from-database-loop
([x channel qname db] (
while true
(Thread/sleep (* x 1000))
(create-message-from-database channel qname db)
))
)
(defn -main
[& args]
(let [
properties (load-props "resource.properties")
postgres-db (:database properties)
rabbitmq-url (:rabbitmq properties)
wake-up-interval (:interval properties)
rmq-conn (rmq/connect {:uri rabbitmq-url})
ch (lch/open rmq-conn)
qname "etl scheduler"]
(println "monitoring file store meta data on " postgres-db " every " wake-up-interval " seconds")
(println "message will be created on rabbitmq server" rabbitmq-url)
(lq/declare ch qname {:exclusive false :auto-delete false :persistent true})
(create-message-from-database-loop wake-up-interval ch qname postgres-db)
)
)

Is there some way to let compojure support type conversion automatically?

Now can use compojure this way:
(GET ["/uri"] [para1 para2]
)
Para1 and para2 are all of type String.
I would like to let it know the type correcttly,like this:
(GET ["/uri"] [^String para1 ^Integer para2]
)
It can convert para1 to be Sting and para2 to Integer.
Is there some library or good way to do this?
This is possible as of Compojure 1.4.0 using the syntax [x :<< as-int]
This is not currently possible with only Compojure.
You could use Prismatic schema coercion.
(require '[schema.core :as s])
(require '[schema.coerce :as c])
(require '[compojure.core :refer :all])
(require '[ring.middleware.params :as rparams])
(def data {:para1 s/Str :para2 s/Int s/Any s/Any})
(def data-coercer (c/coercer data c/string-coercion-matcher ))
(def get-uri
(GET "/uri" r
(let [{:keys [para1 para2]} (data-coercer (:params r))]
(pr-str {:k1 para1 :k2 (inc para2)}))))
(def get-uri-wrapped
(let [keywordizer (fn [h]
(fn [r]
(h (update-in r [:params] #(clojure.walk/keywordize-keys %)))))]
(-> get-uri keywordizer rparams/wrap-params)))
Here is a sample run:
(get-uri-wrapped {:uri "/uri" :query-string "para1=a&para2=3" :request-method :get})
{:status 200,
:headers {"Content-Type" "text/html; charset=utf-8"},
:body "{:k1 \"a\", :k2 4}"}