Reuse ftp connection in Clojure using clj-ftp - clojure

I am trying to fetch files from an FTP server using Clojure. I would like to download all files using one connection. I am using clj-ftp (https://github.com/miner/clj-ftp/blob/master/src/miner/ftp.clj) for this. Unfortunately, I am unable to achieve it with one connection. I have two functions:
(defn one-session [files]
  (ftp/with-ftp [client ftp-url]
    (map #(ftp/client-get client %1)
         files)))
(defn get-all [files]
  (map #(ftp/with-ftp [client ftp-url]
          (ftp/client-get client %1))
       files))
When I call get-all, everything works fine. When I call one-session, I get an exception: NullPointerException org.apache.commons.net.SocketClient.getRemoteAddress (SocketClient.java:658)
I noticed that clj-ftp uses a lot of type hints. Does that have an impact on this?
Whole stack trace:
Exception in thread "main" java.lang.NullPointerException, compiling:(/private/var/folders/4d/77tz4xfj7b1dkqtd3h4j10v40000gn/T/form-init2973639134882885374.clj:1:125)
at clojure.lang.Compiler.load(Compiler.java:7391)
at clojure.lang.Compiler.loadFile(Compiler.java:7317)
at clojure.main$load_script.invokeStatic(main.clj:275)
at clojure.main$init_opt.invokeStatic(main.clj:277)
at clojure.main$init_opt.invoke(main.clj:277)
at clojure.main$initialize.invokeStatic(main.clj:308)
at clojure.main$null_opt.invokeStatic(main.clj:342)
at clojure.main$null_opt.invoke(main.clj:339)
at clojure.main$main.invokeStatic(main.clj:421)
at clojure.main$main.doInvoke(main.clj:384)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at clojure.lang.Var.invoke(Var.java:383)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.Var.applyTo(Var.java:700)
at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
at org.apache.commons.net.SocketClient.getRemoteAddress(SocketClient.java:658)
at org.apache.commons.net.ftp.FTPClient._openDataConnection_(FTPClient.java:789)
at org.apache.commons.net.ftp.FTPClient._retrieveFile(FTPClient.java:1854)
at org.apache.commons.net.ftp.FTPClient.retrieveFile(FTPClient.java:1845)
at miner.ftp$client_get.invokeStatic(ftp.clj:144)
at miner.ftp$client_get.invoke(ftp.clj:138)
at miner.ftp$client_get.invokeStatic(ftp.clj:140)
at miner.ftp$client_get.invoke(ftp.clj:138)
at zephyr.fetch$one_session$fn__1296.invoke(fetch.clj:30)
at clojure.core$map$fn__4785.invoke(core.clj:2644)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:521)
at clojure.core$seq__4357.invokeStatic(core.clj:137)
at clojure.core$print_sequential.invokeStatic(core_print.clj:46)
at clojure.core$fn__6072.invokeStatic(core_print.clj:153)
at clojure.core$fn__6072.invoke(core_print.clj:153)
at clojure.lang.MultiFn.invoke(MultiFn.java:233)
at clojure.core$pr_on.invokeStatic(core.clj:3572)
at clojure.core$pr.invokeStatic(core.clj:3575)
at clojure.core$pr.invoke(core.clj:3575)

I've checked the source of the FTP library, and it seems that you have to realise the lazy sequence created by map. Otherwise, the calls to ftp/client-get are executed only when the elements are fetched from the result sequence, which happens after leaving the with-ftp block. By that time the connection has already been closed.
To fix the problem you need to force the realisation of the sequence using doall:
(defn one-session [files]
  (ftp/with-ftp [client ftp-url]
    (doall
      (map #(ftp/client-get client %1)
           files))))
This will force all ftp/client-get calls to happen within your with-ftp scope.
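The difference between the lazy and the eager version can be reproduced without an FTP server. The following is only an illustrative sketch: open?, with-session and fetch are hypothetical stand-ins for the connection, with-ftp and ftp/client-get, and are not part of clj-ftp.

```clojure
;; Hypothetical stand-in for a connection: true only while the "session" is open.
(def open? (atom false))

(defn with-session
  "Opens the fake connection, calls the thunk f, and always closes it afterwards."
  [f]
  (reset! open? true)
  (try
    (f)
    (finally
      (reset! open? false))))

(defn fetch
  "Pretends to download a file; throws if the connection is already closed."
  [file]
  (if @open?
    (str "fetched " file)
    (throw (IllegalStateException. "connection closed"))))

;; Lazy: map returns immediately, so fetch only runs when the caller
;; walks the seq, which is after the session has ended.
(defn lazy-session [files]
  (with-session #(map fetch files)))

;; Eager: doall realises every element while the session is still open.
(defn eager-session [files]
  (with-session #(doall (map fetch files))))
```

Calling (eager-session ["a" "b"]) returns the fetched values, whereas realising the result of (lazy-session ["a"]) throws, because fetch runs after the connection flag has been reset.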
On the other hand, realising the whole sequence at once might not be desirable, as it can have dangerous consequences (e.g. memory utilisation). You can read more about mixing Clojure lazy seqs with side effects in Stuart Sierra's blog post.
In your particular case, ftp/client-get returns a boolean indicating whether the download to a local file was successful, so it is not a big issue. In other cases you might redesign your API so that your function accepts not only a seq of files but also a function which encapsulates what you want to do with each file, and applies that function to each value as it is consumed, one by one, without keeping the whole sequence in memory. @Frank Henard made a valid point that you could use doseq for that.
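A minimal sketch of that redesign, under stated assumptions: process-files! and session-open? are hypothetical names, and the atom merely simulates a connection so that the example runs without an FTP server. With clj-ftp, the doseq would sit inside ftp/with-ftp instead.

```clojure
;; Hypothetical stand-in for a connection: records whether the session is open.
(def session-open? (atom false))

(defn process-files!
  "Calls (handle! file) for each file, one at a time, while the session is open.
  doseq is eager and returns nil, so no result sequence is retained."
  [files handle!]
  (reset! session-open? true)
  (try
    (doseq [file files]
      (handle! file))
    (finally
      (reset! session-open? false))))
```

Because the caller passes in handle!, each file is dealt with while the session is still open, and nothing lazy escapes the scope.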

Related

Connection Pooling in Clojure

I am unable to understand the use of pool-db and connection function
in this connection pooling guide.
(defn- get-pool
  "Creates Database connection pool to be used in queries"
  [{:keys [host-port db-name username password]}]
  (let [pool (doto (ComboPooledDataSource.)
               (.setDriverClass "com.mysql.cj.jdbc.Driver")
               (.setJdbcUrl (str "jdbc:mysql://"
                                 host-port
                                 "/" db-name))
               (.setUser username)
               (.setPassword password)
               ;; expire excess connections after 30 minutes of inactivity:
               (.setMaxIdleTimeExcessConnections (* 30 60))
               ;; expire connections after 3 hours of inactivity:
               (.setMaxIdleTime (* 3 60 60)))]
    {:datasource pool}))
(def pool-db (delay (get-pool db-spec)))
(defn connection [] @pool-db)
; usage in code
(jdbc/query (connection) ["Select SUM(1, 2, 3)"])
Why can't we simply do?
(def connection (get-pool db-spec))
; usage in code
(jdbc/query connection ["SELECT SUM(1, 2, 3)"])
The delay ensures that you create the connection pool the first time you try to use it, rather than when the namespace is loaded.
This is a good idea because your connection pool may fail to be created for any one of a number of reasons, and if it fails during namespace load you will get some odd behaviour - any defs after your failing connection pool creation will not be evaluated, for example.
In general, top level var definitions should be constructed so they cannot fail at runtime.
Bear in mind they may also be evaluated during the AOT compile process, as amalloy notes below.
In your application, you want to create the pool just once and reuse it. For this reason, delay wraps the (get-pool db-spec) call so that it is invoked only the first time the delay is forced with deref/@. The resulting pool is cached and returned on subsequent force calls.
The difference is that in the delay version a pool will be created only if it is called (which might not be the case if everything was cached), but the non-delay version will instantiate a pool no matter what, i.e. always, even if a database connection is not used.
delay runs only if deref is called and does nothing otherwise.
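The behaviour described in these answers can be checked with a small self-contained sketch. It makes some assumptions: make-pool and created are hypothetical, and the map stands in for a real pool such as ComboPooledDataSource, so no database is needed.

```clojure
;; Counts how many times the (fake) pool has been constructed.
(def created (atom 0))

(defn make-pool
  "Hypothetical stand-in for get-pool: records the call and returns a fake pool."
  []
  (swap! created inc)
  {:datasource :fake-pool})

;; Nothing is constructed when the namespace loads; the body runs on first deref.
(def pool (delay (make-pool)))

(defn connection
  "Forces the delay; the first call builds the pool, later calls reuse it."
  []
  @pool)
```

Before the first call to (connection) the counter stays at zero. After any number of calls it is one, since a delay caches its result.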
I would suggest you use an existing library to handle connection pooling, something like hikari-cp, which is highly configurable and works across many implementations of SQL.

Persisting State from a DRPC Spout in Trident

I'm experimenting with Storm and Trident for this project, and I'm using Clojure and Marceline to do so. I'm trying to expand the wordcount example given on the Marceline page, such that the sentence spout comes from a DRPC call rather than from a local spout. I'm having problems which I think stem from the fact that the DRPC stream needs to have a result to return to the client, but I would like the DRPC call to effectively return null, and simply update the persisted data.
(defn build-topology
  ([]
   (let [trident-topology (TridentTopology.)]
     (let [
           ;; ### Two alternatives here ###
           ;collect-stream (t/new-stream trident-topology "words" (mk-fixed-batch-spout 3))
           collect-stream (t/drpc-stream trident-topology "words")
           ]
       (-> collect-stream
           (t/group-by ["args"])
           (t/persistent-aggregate (MemoryMapState$Factory.)
                                   ["args"]
                                   count-words
                                   ["count"]))
       (.build trident-topology)))))
There are two alternatives in the code - the one using a fixed batch spout loads with no problem, but when I try to load the code using a DRPC stream instead, I get this error:
InvalidTopologyException(msg:Component: [b-2] subscribes from non-existent component [$mastercoord-bg0])
I believe this error comes from the fact that the DRPC stream must be trying to subscribe to an output in order to have something to return to the client - but persistent-aggregate doesn't offer any such outputs to subscribe to.
So how can I set up my topology so that a DRPC stream leads to my persisted data being updated?
Minor update: Looks like this might not be possible :( https://issues.apache.org/jira/browse/STORM-38

Cassaforte client/prepared with multi.cql

Is the client/prepared macro only applicable to the .cql namespace and not .multi.cql?
I use multi.cql to control my cluster and session construction, and executing normal queries is fine. However, if I attempt something along the lines of:
(client/prepared
  (insert session :some_table {:id "some-id"
                               :value "some-value"}))
I get an error:
java.lang.ClassCastException: clojure.lang.Var$Unbound cannot be cast to com.datastax.driver.core.Session
at clojurewerkz.cassaforte.client$prepare.invoke(client.clj:174) ~[classes/:na]
at clojurewerkz.cassaforte.client$execute.doInvoke(client.clj:278) ~[classes/:na]
at clojure.lang.RestFn.invoke(RestFn.java:457) ~[clojure-1.5.1.jar:na]
at clojurewerkz.cassaforte.multi.cql$execute_.invoke(cql.clj:17) ~[classes/:na]
at clojurewerkz.cassaforte.multi.cql$insert.doInvoke(cql.clj:132) ~[classes/:na]
at clojure.lang.RestFn.invoke(RestFn.java:439) ~[clojure-1.5.1.jar:na]
My session is constructed fine, I can use it to execute normal queries.
I'm relatively new to Clojure, so it's possible I'm doing something stupid.
If client/prepared is not applicable to .multi.cql - how can I use multi and prepared statements? I see there is an option to force-prepared-queries when creating the cluster, that's a little brute force but probably acceptable.
It was a bug that has been fixed with the latest release (1.1.0) of Cassaforte:
https://groups.google.com/forum/#!topic/clojure-cassandra/JFUtS4k70w8

How can I debug clj-apache-http?

I'm trying to get an OAuth application up, but I fail because the API servers won't talk to me. Unfortunately, clj-apache-http, which I'm using, won't tell me what the problem was. I only get this warning:
WARNUNG: Authentication error: Unable to respond to any of these challenges: {oauth=WWW-Authenticate: OAuth realm="http%3A%2F%2FSERVERNAME"}
Exception in thread "Thread-1" java.lang.RuntimeException: java.lang.Exception: JSON error (unexpected character): I (example.clj:1)
at clojure.lang.AFn.run(AFn.java:28)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.Exception: JSON error (unexpected character): I (example.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:5440)
at clojure.lang.Compiler.load(Compiler.java:5857)
at clojure.lang.RT.loadResourceScript(RT.java:340)
at clojure.lang.RT.loadResourceScript(RT.java:331)
at clojure.lang.RT.load(RT.java:409)
at clojure.lang.RT.load(RT.java:381)
at clojure.core$load$fn__4511.invoke(core.clj:4905)
at clojure.core$load.doInvoke(core.clj:4904)
at clojure.lang.RestFn.invoke(RestFn.java:409)
at clojure.core$load_one.invoke(core.clj:4729)
OK, that isn't helping me. The response starts with a strange character, I, so it is clearly not JSON.
I want to raise the log level of Apache's HttpClient to DEBUG now, but I don't know how to set system properties via Leiningen. Any tips?
This is a way to route all clj-apache-http requests through a local debugging proxy:
(require ['com.twinql.clojure.http :as 'http])
(:content
 (http/get (java.net.URI. "http://yourhost.com")
           :parameters (http/map->params
                        {:default-proxy (http/http-host
                                         :host "127.0.0.1"
                                         :port 8765)})
           :as :string))
Alternatively, you could add this to your log4j.properties:
log4j.logger.httpclient.wire.header=DEBUG
log4j.logger.httpclient.wire.content=DEBUG
log4j.logger.org.apache.commons.httpclient=DEBUG
log4j.logger.org.apache.http=DEBUG
log4j.logger.org.apache.http.wire=DEBUG
This will dump all communication to your log file. It is useful if you don't want to change the code that makes the request.

Wrong number of args passed to: repl$repl

I have trouble with a Compojure "Getting started" example that I do not understand. When I run the example from http://weavejester.github.com/compojure/docs/getting-started.html
...I get the following error at the lein repl step:
~/hello-www> lein repl src/hello_www/core.clj
Exception in thread "main" java.lang.IllegalArgumentException: Wrong number of args passed to: repl$repl (NO_SOURCE_FILE:0)
at clojure.lang.Compiler.eval(Compiler.java:5359)
at clojure.lang.Compiler.eval(Compiler.java:5311)
at clojure.core$eval__4350.invoke(core.clj:2364)
at clojure.main$eval_opt__6502.invoke(main.clj:228)
at clojure.main$initialize__6506.invoke(main.clj:247)
at clojure.main$script_opt__6526.invoke(main.clj:263)
at clojure.main$main__6544.doInvoke(main.clj:347)
at clojure.lang.RestFn.invoke(RestFn.java:483)
at clojure.lang.Var.invoke(Var.java:381)
at clojure.lang.AFn.applyToHelper(AFn.java:180)
at clojure.lang.Var.applyTo(Var.java:482)
at clojure.main.main(main.java:37)
Caused by: java.lang.IllegalArgumentException: Wrong number of args passed to: repl$repl
at clojure.lang.AFn.throwArity(AFn.java:439)
at clojure.lang.AFn.invoke(AFn.java:43)
at clojure.lang.Var.invoke(Var.java:369)
at clojure.lang.AFn.applyToHelper(AFn.java:165)
at clojure.lang.Var.applyTo(Var.java:482)
at clojure.core$apply__3776.invoke(core.clj:535)
at leiningen.core$_main__59$fn__61.invoke(core.clj:94)
at leiningen.core$_main__59.doInvoke(core.clj:91)
at clojure.lang.RestFn.applyTo(RestFn.java:138)
at clojure.core$apply__3776.invoke(core.clj:535)
at leiningen.core$_main__59.invoke(core.clj:97)
at user$eval__67.invoke(NO_SOURCE_FILE:1)
at clojure.lang.Compiler.eval(Compiler.java:5343)
... 11 more
I have tried both the stable and the developer version of lein without any success. Any ideas on what I could look for next? I get the same result both on linux and cygwin.
When I run it manually, it seems to work fine on linux:
java -cp "lib/*" clojure.main src/hello_www/core.clj
2010-05-17 19:34:17.280::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2010-05-17 19:34:17.281::INFO: jetty-6.1.14
2010-05-17 19:34:17.382::INFO: Started SocketConnector@0.0.0.0:8080
Taking into account your comment on the question -- the relevant part is "With lein-stable it works, but not with master from git." -- I'd say that you're being hit by Leiningen's new handling of the repl task introduced in commit 44b6369aec1e23bcda1db1b6570a03ca524464e5 from 16th April 2010.
Leiningen 1.1 was released on 16th February and does things the old way, which means the repl task is handled specially by the lein script; post-44b6369aec Leiningen handles the repl task the same way as all the others, i.e. through the leiningen.repl/repl function. The latter simply doesn't accept additional arguments, hence the arity-related IllegalArgumentException that you're seeing. Before you ask, I'm not sure if that is likely to change in the future.
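The arity error itself is easy to reproduce in isolation. This is a simplified sketch, not Leiningen's actual code: repl and run-task are hypothetical, and the real leiningen.repl/repl may take different parameters. It only shows what happens when a dispatcher applies a fixed-arity task function to extra command-line arguments.

```clojure
(defn repl
  "Hypothetical task function that accepts no extra arguments."
  []
  :started)

(defn run-task
  "Applies a task function to whatever arguments followed the task name."
  [task-fn & args]
  (apply task-fn args))
```

(run-task repl) returns :started, while (run-task repl "src/hello_www/core.clj") throws an arity-related IllegalArgumentException, which is the kind of failure shown in the stack trace in the question.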
What should work is lein repl followed by (require 'hello-www.core); regrettably, however, there seems to be an issue with Leiningen's HEAD which prevents that from working (at least on my box). It's a safe bet to expect that it's going to get fixed eventually, but for the time being, just use lein-stable. That Compojure tutorial uses Clojure 1.1 and not the bleeding edge... It might save you some time to treat Leiningen the same way.