I have a few threads on the go, each of which makes a blocking call to HTTP Kit. My code has been working but has recently taken to freezing after about 30 minutes. All of my threads are stuck at the following point:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
clojure.core$promise$reify__7005.deref(core.clj:6823)
clojure.core$deref.invokeStatic(core.clj:2228)
clojure.core$deref.invoke(core.clj:2214)
my_project.web$fetch.invokeStatic(web.clj:35)
Line my_project.web.clj:35 is something like:
(let [result @(org.httpkit.client/get "http://example.com")]
(I'm using plain Java threads rather than core.async because I'm running in the context of a set of concurrent Apache Kafka clients, each in its own thread. The Kafka client does spin up a lot of its own threads, especially as I'm running several of them, e.g. 5 in parallel.)
The fact that all of my threads end up parked like this in HTTP Kit suggests a resource leak, or some code in HTTP Kit dying before it has a chance to deliver, or perhaps resource starvation.
Another thread seems to be stuck here. It's possible that it's blocking all of the promise deliveries.
sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:850)
sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
org.httpkit.client.HttpsRequest.unwrapRead(HttpsRequest.java:35)
org.httpkit.client.HttpClient.doRead(HttpClient.java:131)
org.httpkit.client.HttpClient.run(HttpClient.java:377)
java.lang.Thread.run(Thread.java:748)
Any ideas what the problem could be, or pointers for how to diagnose it?
A common thing to do is to set up a DefaultUncaughtExceptionHandler.
This will at least give you an indication if there are exceptions in your threads.
(defn init-jvm-uncaught-exception-logging []
(Thread/setDefaultUncaughtExceptionHandler
(reify Thread$UncaughtExceptionHandler
(uncaughtException [_ thread ex]
(log/error ex "Uncaught exception on" (.getName thread))))))
Stuart Sierra has written nicely on this: https://stuartsierra.com/2015/05/27/clojure-uncaught-exceptions
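Alongside the exception handler, a timed deref can keep a worker from parking forever on a promise that is never delivered, which is the state shown in the stack trace above. The following is only a minimal sketch: `fetch-with-timeout` is a hypothetical helper, and a plain Clojure promise stands in for the promise returned by `org.httpkit.client/get`.

```clojure
;; Hypothetical helper: wait at most timeout-ms for a promise,
;; returning an error map instead of blocking indefinitely.
(defn fetch-with-timeout
  [response-promise timeout-ms]
  (let [result (deref response-promise timeout-ms ::timed-out)]
    (if (= result ::timed-out)
      {:error :timeout}
      result)))

;; A promise that is never delivered stands in for a stuck request.
(def never-delivered (promise))

;; Returns after roughly 100 ms instead of hanging.
(fetch-with-timeout never-delivered 100)
```

Logging the timeout case would at least show which requests never complete, even if it does not explain why.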
I am making a Messenger bot and I am using Ring as my http framework.
Sometimes I want to apply delays between messages sent by the bot. My expectation is that it is safe to use Thread/sleep because this will make only the active thread sleep and not the entire server. Is that so, or should I resort to clojure/core.async?
This is the code I would be writing without async:
(match [reply]
; The bot wants to send a message (text, images, videos etc.) after n milliseconds
[{:message message :delay delay}]
(do
(Thread/sleep delay)
(facebook/send-message sender-id message))
; More code would follow...
A link to the Ring code where its behaviour in this respect is clear would be appreciated, as would any other explanation of the matter.
Ring is the wrong thing to ask this question about: Ring is not an HTTP server, but rather an abstraction over HTTP servers. Ring itself does not have a fixed threading model: all it really cares about is that you have a function from request to response.
What really makes this decision is which Ring adapter you use. By far the most common is ring-jetty-adapter, which is a Jetty HTTP handler that delegates to your function through Ring. And Jetty does indeed have a single thread for each request, so that you can sleep in one thread without impacting others (but as noted in another answer, threads are not free, so you don't want to do a ton of this regularly).
But there are other Ring adapters with different threading models. For example, Aleph includes a Ring adapter based on Netty, which uses java.nio for non-blocking IO in a small, limited thread pool; in that case, sleeping on a "request thread" is very disruptive.
Assuming you're talking about code in a handler, Thread/sleep in Ring does make the thread for the request sleep. If you have multiple requests you are burning up expensive server threads.
Ring blocks because its (non-async) model is based on function composition, where the result of one function is the input to another, so each function has to wait for the previous one. I can't pinpoint exactly where this happens in the code.
Putting it in a go-block is better, because then you are not blocking server threads. It can return the response while you send the message. Do note that you cannot use results from the go block.
If you also want a response asynchronously (without blocking a server thread) you can for example use Pedestal.
For most servers synchronous handlers are sufficient, but if you are using Thread/sleeps AND want a response I would recommend using asynchronous Ring handlers or Pedestal or another framework.
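As an illustration of returning the response without sleeping on the request thread, here is a minimal sketch that swaps the go block for a plain JDK ScheduledExecutorService, so it needs no extra libraries. The names `scheduler`, `send-later`, and `handler` are hypothetical, and the `println` stands in for the real `facebook/send-message` call.

```clojure
(import '(java.util.concurrent Callable Executors
                               ScheduledExecutorService TimeUnit))

;; One background thread that runs delayed tasks.
(def ^ScheduledExecutorService scheduler
  (Executors/newScheduledThreadPool 1))

(defn send-later
  "Schedules the zero-argument function f to run after delay-ms
  milliseconds and returns a future for its result."
  [delay-ms f]
  (.schedule scheduler ^Callable f (long delay-ms) TimeUnit/MILLISECONDS))

(defn handler
  "Ring-style handler: schedules the delayed message and returns
  immediately, so the request thread is not held for the delay."
  [request]
  (send-later 50 #(println "sending delayed message"))
  {:status 200 :body "ok"})
```

As with the go block, the delayed result is not available to the response. The scheduler should be shut down with `(.shutdown scheduler)` when the application stops.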
I'm trying to use LWJGL with Clojure for game development.
My first step is trying to display something on an OpenGL screen from the REPL. After launching the REPL with lein repl, this is what I have done so far:
(import '(org.lwjgl.opengl GL11 Display DisplayMode))
(Display/setDisplayMode (DisplayMode. 800 600))
(Display/create) ; This shows a black 800x600 window as expected
(GL11/glClearColor 1.0 0.0 0.0 1.0)
(GL11/glClear (bit-or GL11/GL_COLOR_BUFFER_BIT GL11/GL_DEPTH_BUFFER_BIT))
(Display/update)
Note that this works if done quickly enough. But after a while (even if I just wait) I start getting errors about the current OpenGL context not being bound to the current thread.
(Display/update)
IllegalStateException No context is current org.lwjgl.opengl.LinuxContextImplementation.swapBuffers (LinuxContextImplementation.java:72)
(GL11/glClear ...)
RuntimeException No OpenGL context found in the current thread. org.lwjgl.opengl.GLContext.getCapabilities (GLContext.java:124)
But maybe the most intriguing of the errors happens when I try to call Display/destroy
(Display/destroy)
IllegalStateException From thread Thread[nREPL-worker-4,5,main]: Thread[nREPL-worker-0,5,] already has the context current org.lwjgl.opengl.ContextGL.checkAccess (ContextGL.java:184)
It looks as if the REPL randomly spawned another thread after some time of inactivity. From what I've read, LWJGL only lets you make OpenGL calls from the thread on which the context was originally created, so I bet this is causing those errors.
But how could the REPL be randomly switching threads? Especially when I'm not doing anything, just waiting.
It's a known issue, already reported against the nREPL project (and discussed on the Clojure Google Group). It seems that nREPL uses a thread pool that terminates idle threads (probably according to a keepalive setting).
Until it's fixed you can use a workaround for this (a bit awkward, I admit):
(import '(java.util.concurrent Executors))
(def opengl-executor (Executors/newSingleThreadExecutor))
(defmacro on-executor [executor & body]
  `(.submit ~executor (fn [] ~@body)))
(on-executor opengl-executor
  (println (.getId (Thread/currentThread))))
By using your own executor, all code wrapped in on-executor runs on that executor's thread. newSingleThreadExecutor creates a single thread which, according to the docs, is replaced only if it dies because of an exception. When you evaluate the last expression repeatedly, even with long delays in between, the printed thread ID should stay the same.
Remember that you should shut down the executor when stopping your application.
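The single-thread behaviour and the shutdown step can be checked with a function-based variant that also returns the result to the REPL. This is only a sketch: `gl-executor` and `run-on` are hypothetical names, and the thread-ID lookups stand in for real OpenGL calls.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService))

(def ^ExecutorService gl-executor
  (Executors/newSingleThreadExecutor))

(defn run-on
  "Runs the zero-argument function f on the executor's thread and
  blocks until its result is available."
  [^ExecutorService executor f]
  (.get (.submit executor ^Callable f)))

;; Both calls run on the executor's single worker thread,
;; whichever REPL thread happens to evaluate them.
(def first-id  (run-on gl-executor #(.getId (Thread/currentThread))))
(def second-id (run-on gl-executor #(.getId (Thread/currentThread))))

;; Release the worker thread when the application stops.
(.shutdown gl-executor)
```

Blocking on `.get` means an exception thrown inside `f` surfaces at the REPL wrapped in an ExecutionException rather than being lost in the future.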
I have a Clojure processing app that is a pipeline of channels. Each processing step does its computation asynchronously (e.g. makes an HTTP request using http-kit) and puts its result on the output channel. This way the next step can read from that channel and do its own computation.
My main function looks like this
(defn -main [args]
(-> file/tmp-dir
(schedule/scheduler)
(search/searcher)
(process/resultprocessor)
(buy/buyer)
(report/reporter)))
Currently, the scheduler step drives the pipeline (it hasn't got an input channel), and provides the chain with workload.
When I run this in the REPL:
(-main "some args")
It basically runs forever because the scheduler never finishes. What is the best way to change this architecture so that I can shut down the whole system from the REPL? Does closing each channel mean the system terminates?
Would some broadcast channel help?
You could have your scheduler alts! / alts!! on a kill channel and the input channel of your pipeline:
(def kill-channel (async/chan))
(defn scheduler [input output-ch kill-ch]
  (loop []
    ;; With :priority true, kill-ch is checked before the put on output-ch.
    (let [[v p] (async/alts!! [kill-ch [output-ch (preprocess input)]]
                              :priority true)]
      (if-not (= p kill-ch)
        (recur)))))
Putting a value on kill-channel will then terminate the loop.
Technically you could also use output-ch to control the process (puts to closed channels return false), but I normally find explicit kill channels cleaner, at least for top-level pipelines.
To make things simultaneously more elegant and more convenient to use (both at the REPL and in production), you could use Stuart Sierra's component, start the scheduler loop (on a separate thread) and assoc the kill channel on to your component in the component's start method and then close! the kill channel (and thereby terminate the loop) in the component's stop method.
I would suggest using something like https://github.com/stuartsierra/component to handle system setup. It ensures that you could easily start and stop your system in the REPL. Using that library, you would set it up so that each processing step would be a component, and each component would handle setup and teardown of channels in their start and stop protocols. Also, you could probably create an IStream protocol for each component to implement and have each component depend on components implementing that protocol. It buys you some very easy modularity.
You'd end up with a system that looks like the following:
(component/system-map
:scheduler (schedule/new-scheduler file/tmp-dir)
:searcher (component/using (search/searcher)
{:in :scheduler})
:processor (component/using (process/resultprocessor)
{:in :searcher})
:buyer (component/using (buy/buyer)
{:in :processor})
:report (component/using (report/reporter)
{:in :buyer}))
One nice thing about this sort of approach is that you can easily add components that rely on a channel as well. For example, if each component creates its out channel using a tap on an internal mult, you could add a logger for the processor just by adding a logging component that takes the processor as a dependency.
:processor (component/using (process/resultprocessor)
{:in :searcher})
:processor-logger (component/using (log/logger)
{:in :processor})
I'd recommend watching his talk as well to get an idea of how it works.
You should consider using Stuart Sierra's reloaded workflow, which depends on modelling your 'pipeline' elements as components, that way you can model your logical singletons as 'classes', meaning you can control the construction and destruction (start/stop) logic for each one of them.
I am running a Clojure app that reads from a Kafka stream. I am using the shovel GitHub project https://github.com/l1x/shovel to read from the stream. When I profiled my application with VisualVM looking for hotspots, I noticed that most of the CPU time, about 70%, is being spent in the function clojure.core$promise$reify__6310.deref.
The shovel API consumer is a thin wrapper around the Kafka consumer group API. It reads from a Kafka topic and publishes to a core.async channel. Should I be concerned that my application's latencies will be affected if I continue using this API? Is there any explanation for why the reify on the promise is taking this much CPU time?
In Clojure, $ is used in the printed representation of a class to represent an inner class. clojure.core$promise$reify__6310.deref means calling the method deref on a class that is created via reify as an inner class of clojure.core/promise. As it turns out, if you look at the class of a promise, it will show up as an inner reified class inside clojure.core$promise.
A promise in Clojure represents data that may not yet be available. You can see its behavior in a repl:
user> (def p (promise))
#'user/p
user> (class p)
clojure.core$promise$reify__6363
user> (deref p)
This will hang and give no result, and not give the next repl prompt, until you deliver to the promise from another repl connection, or interrupt the deref call. The fact that time is being spent on deref of a promise simply means that the program logic is waiting on values that are not yet computed (or have not yet come in via the network, etc.).
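The same behaviour can be reproduced in a single REPL session by delivering from a background thread. This is a minimal sketch in which the future stands in for whatever thread eventually produces the value (for example, a Kafka consumer thread).

```clojure
(def p (promise))

;; A background thread delivers the value after 100 ms.
(future
  (Thread/sleep 100)
  (deliver p :done))

;; deref parks the calling thread until the value is delivered;
;; a profiler attributes that waiting time to the promise's deref.
(def result (deref p))
```

The time the profiler charges to deref here is the 100 ms of waiting, not computation done by the promise itself.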
The scenario I am trying to resolve is as follows: I have a test program that makes a web request to an endpoint on a system.
This test program runs a Jetty web server on which it expects a callback from the external system; the callback completes a successful test cycle. If the callback is not received within a specific time range (timeout), the test fails.
To achieve this, I want the test runner to wait on an "event" that the jetty handler will set upon callback.
I thought about using Java's CyclicBarrier, but I wonder if there is an idiomatic way to solve this in Clojure.
Thanks
You can use the promise you asked about recently :) Something like this:
(def completion (promise))
; In test runner.
; Wait 5 seconds then fail.
(let [result (deref completion 5000 :fail)]
(if (= result :success)
(println "Great!")
(println "Failed :(")))
; In jetty on callback
(deliver completion :success)
In straight Clojure, using an agent that tracks outstanding callbacks would make sense, though in practice I would recommend using Aleph, a library for asynchronous web programming that makes event-driven handlers rather easy. It produces Ring handlers, which sounds like it would fit nicely with your existing code.