Detect non-empty STDIN in Clojure

How do you detect non-empty standard input (*in*) in Clojure, without blocking and without consuming it?
At first, I thought the java.io.Reader#ready() method would do, but (.ready *in*) returns false even when standard input is provided.

Is this what you are looking for? InputStream#available():
(defn -main [& args]
  (if (> (.available System/in) 0)
    (println "STDIN: " (slurp *in*))
    (println "No Input")))
$ echo "hello" | lein run
STDIN: hello
$ lein run
No Input
Update: checking STDIN with .available does seem to be a race condition. An alternative is to use a fixed timeout for STDIN to become available, and otherwise assume no data is coming from STDIN.
Here is an example that uses core.async to attempt to read the first byte from STDIN and push it back before slurping the rest, or time out.
(ns stdin.core
  (:require
   [clojure.core.async :as async :refer [go >! timeout chan alt!!]])
  (:gen-class))

(defn -main [& args]
  (let [c (chan)]
    (go (>! c (.read *in*)))
    (if-let [ch (alt!! (timeout 500) nil
                       c ([ch] (if-not (< ch 0) ch)))]
      (do
        (.unread *in* ch)
        (println (slurp *in*)))
      (println "No STDIN"))))

Have you looked at PushbackReader? You can use it like this:
1. Read a byte (blocking); .read returns the char read, or -1 if the stream is closed.
2. When it returns, you know a byte is ready.
3. If the byte is something you're not ready for, put it back with .unread.
4. If the stream is closed (-1 return value), exit.
5. Repeat.
https://docs.oracle.com/javase/8/docs/api/index.html?java/io/PushbackReader.html
If you need it to be non-blocking, stick it into a future, a core.async channel, or similar.
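For example, here is a minimal sketch of that read/unread approach wrapped in a future (the names are mine; it assumes *in* supports .unread, which Clojure's default LineNumberingPushbackReader does):

;; Sketch: assumes *in* is a PushbackReader (Clojure binds it to a
;; LineNumberingPushbackReader by default).
(defn read-stdin-async
  "Returns a future that yields the full STDIN contents, or nil if the stream closes with no data."
  []
  (future
    (let [first-byte (.read *in*)]   ; blocks inside the future, not the caller
      (when-not (neg? first-byte)    ; -1 means the stream is closed
        (.unread *in* first-byte)    ; push the byte back so slurp sees it
        (slurp *in*)))))

;; usage: wait up to 500 ms for data, otherwise assume there is none
(if-let [s (deref (read-stdin-async) 500 nil)]
  (println "STDIN:" s)
  (println "No Input"))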


How to execute a program from Clojure without any external libraries and show its output in real time?

My attempt:
(import 'java.lang.Runtime)
(. (Runtime/getRuntime) exec (into-array ["youtube-dl" "--no-playlist" "some youtube video link"]))
I also tried sh. But neither approach does what I want: running a program the way a shell does (sh waits until the program exits, exec launches it and does not wait for it to exit; neither prints anything to standard output). I want to see the process output live, e.g. when I run youtube-dl I want to see the progress of the video download.
How do I do this simple task in Clojure?
You must start the process and listen to its output stream. One solution is:
(:require [clojure.java.shell :as sh]
          [clojure.java.io :as io])

(let [cmd ["yes" "1"]
      proc (.exec (Runtime/getRuntime) (into-array cmd))]
  (with-open [rdr (io/reader (.getInputStream proc))]
    (doseq [line (line-seq rdr)]
      (println line))))

Should clojure.core.async channel be drained to release parked puts

The problem: I have a channel that a consumer reads from, and the consumer might stop reading once it has enough data. When the reader stops, it closes the channel with clojure.core.async/close!.
The documentation says that from that moment on, all puts to the channel should return false and do nothing. But the documentation also says that
Logically closing happens after all puts have been delivered. Therefore, any blocked or parked puts will remain blocked/parked until a taker releases them.
Does this mean that, to release producers that were already blocked in parked puts at the moment the channel was closed, I should always also drain the channel (read all remaining items) on the consumer side? The following code shows that the go block never finishes:
(require '[clojure.core.async :as a])

(let [c (a/chan)]
  (a/go
    (prn "Go")
    (prn "Put" (a/>! c 333)))
  (Thread/sleep 300) ;; Let the go block get scheduled
  (a/close! c))
If this is true and I do not want to read all events, should I implement e.g. timeouts on the producer side to detect that no more data is needed?
Is there a simpler way for the consumer to say "enough", so that the producer also stops gracefully?
I found out that clojure.core.async/put! does not block, which avoids the unnecessary blocking. Are there disadvantages to using it instead of clojure.core.async/>!?
Closing a chan frees everyone who is reading from it, and leaves writers blocked.
Here is the reading case (where it works nicely):
user> (def a-chan (async/chan))
#'user/a-chan
user> (future (async/<!! a-chan)
              (println "continuing after take"))
#future[{:status :pending, :val nil} 0x5fb5a025]
user> (async/close! a-chan)
nil
user> continuing after take
And here is a test of the writing case where, as you say, draining it may be a good idea:
user> (def b-chan (async/chan))
#'user/b-chan
user> (future (try (async/>!! b-chan 4)
                   (println "continuing after put")
                   (catch Exception e
                     (println "got exception" e))
                   (finally
                     (println "finished in finally"))))
#future[{:status :pending, :val nil} 0x17be0f7b]
user> (async/close! b-chan)
nil
I don't find any evidence of the stuck writer unblocking here when the chan is closed.
This behavior is intended, since they explicitly state it in the docs!
In your case, do (while (async/poll! c)) after closing channel c to release all blocked/parked (message sending) threads/go-blocks.
If you want to do anything with the content, you can do:
(->> (repeatedly #(async/poll! c))
     (take-while identity))
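As a concrete illustration (continuing the b-chan transcript above; this sketch is mine, not part of the original answer), draining the closed channel is what finally releases the stuck writer:

;; drain b-chan after close!; the parked put is delivered to poll!,
;; so the blocked future above resumes and prints "continuing after put"
;; followed by "finished in finally"
(while (async/poll! b-chan))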

Clojure: printing to the REPL while the script is still running

How do you print output to the REPL while the script is still running? I've noticed that it buffers everything and then prints once the code completes.
(defn -main
  [x]
  (when (pos? x)
    (println x)
    (Thread/sleep 10000)
    (recur (dec x))))

(-main 10)
Java (and thus Clojure) uses buffered output. If you are trying to print in the middle of a tight loop, you need to flush:
(println x)
(flush)
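For example, a minimal sketch of the -main loop from the question with an explicit flush added after each print:

(defn -main
  [x]
  (when (pos? x)
    (println x)
    (flush)               ;; force the buffered output to appear now, not at exit
    (Thread/sleep 10000)
    (recur (dec x))))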

Easiest way to use an I/O callback within concurrent http-kit/get instances

I am launching a few hundred concurrent http-kit.client/get requests, each provided with a callback that writes results to a single file.
What would be a good way to deal with thread safety? Using chan and <!! from core.async?
Here's the code I would consider:
(defn launch-async [channel url]
  (http/get url {:timeout 5000
                 :user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:10.0) Gecko/20100101 Firefox/10.0"}
            (fn [{:keys [status headers body error]}]
              (if error
                (put! channel (json/generate-string {:url url :headers headers :status status}))
                (put! channel (json/generate-string body))))))

(defn process-async [channel func]
  (when-let [response (<!! channel)]
    (func response)))

(defn http-gets-async [func urls]
  (let [channel (chan)]
    (doall (map #(launch-async channel %) urls))
    (process-async channel func)))
Thanks for your insights.
Since you are already using core.async in your example, I thought I'd point out a few issues and how you can address them. The other answer mentions using a more basic approach, and I agree wholeheartedly that a simpler approach is just fine. However, with channels, you have a simple way of consuming the data which does not involve mapping over a vector, which will also grow large over time if you have many responses. Consider the following issues and how we can fix them:
(1) Your current version will crash if your url list has more than 1024 elements. There's an internal buffer for puts and takes that are asynchronous (i.e., put! and take! don't block but always return immediately), and the limit is 1024. This is in place to prevent unbounded asynchronous usage of the channel. To see for yourself, call (http-gets-async println (repeat 1025 "http://blah-blah-asdf-fakedomain.com")).
What you want to do is to only put something on the channel when there's room to do so. This is called back-pressure. Taking a page from the excellent wiki on go block best practices, one clever way to do this from your http-kit callback is to use the put! callback option to launch your next http get; this will only happen when the put! immediately succeeds, so you will never have a situation where you can go beyond the channel's buffer:
(defn launch-async
  [channel [url & urls]]
  (when url
    (http/get url {:timeout 5000
                   :user-agent "Mozilla"}
              (fn [{:keys [status headers body error]}]
                (let [put-on-chan (if error
                                    (json/generate-string {:url url :headers headers :status status})
                                    (json/generate-string body))]
                  (put! channel put-on-chan (fn [_] (launch-async channel urls))))))))
(2) Next, you seem to be only processing one response. Instead, use a go-loop:
(defn process-async
  [channel func]
  (go-loop []
    (when-let [response (<! channel)]
      (func response)
      (recur))))
(3) Here's your http-gets-async function. I see no harm in adding a buffer here, as it should help you fire off a nice burst of requests at the beginning:
(defn http-gets-async
  [func urls]
  (let [channel (chan 1000)]
    (launch-async channel urls)
    (process-async channel func)))
Now you have the ability to process an infinite number of urls, with back-pressure. To test this, define a counter, and then make your processing function increment this counter to see your progress. Use a localhost URL that is easy to bang on (I wouldn't recommend firing off hundreds of thousands of requests to, say, Google):
(def responses (atom 0))

(http-gets-async (fn [_] (swap! responses inc))
                 (repeat 1000000 "http://localhost:8000"))
As this is all asynchronous, your function will return immediately and you can watch @responses grow.
One other interesting thing you can do: instead of running your processing function in process-async, you could apply it as a transducer on the channel itself.
(defn process-async
  [channel]
  (go-loop []
    (when-let [_ (<! channel)]
      (recur))))

(defn http-gets-async
  [func urls]
  (let [channel (chan 10000 (map func))] ;; <-- transducer on the channel
    (launch-async channel urls)
    (process-async channel)))
There are many ways to do this, including constructing it so that the channel closes (note that above, it stays open). You have java.util.concurrent primitives to help in this regard if you like, and they are quite easy to use. The possibilities are very numerous.
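For instance, one way (my sketch, not part of the original answer) to have the channel close itself once the url list is exhausted, so the consuming go-loop's when-let terminates on its own; it assumes close! is referred from core.async alongside put! and chan:

(defn launch-async
  [channel [url & urls]]
  (if url
    (http/get url {:timeout 5000
                   :user-agent "Mozilla"}
              (fn [{:keys [status headers body error]}]
                (let [put-on-chan (if error
                                    (json/generate-string {:url url :headers headers :status status})
                                    (json/generate-string body))]
                  (put! channel put-on-chan (fn [_] (launch-async channel urls))))))
    ;; no more urls: close the channel so takers see nil and stop
    (close! channel)))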
This is simple enough that I wouldn't use core.async for it. You can do this with an atom storing a vector of the responses, then have a separate thread read the contents of the atom until it has seen all of the responses. Then, in your http-kit callback, you would just swap! the response into the atom directly.
If you do want to use core.async, I'd recommend a buffered channel to keep from blocking your http-kit thread pool.
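A minimal sketch of that atom-based approach (the require aliases, the polling wait, and the output file name are my own illustrative choices, not from the original answer):

(require '[org.httpkit.client :as http]
         '[clojure.string :as str])

(def responses (atom []))   ;; all callbacks swap! into this one atom; swap! is thread-safe

(defn fetch-all [urls]
  (doseq [url urls]
    (http/get url {:timeout 5000}
              (fn [{:keys [body error]}]
                (swap! responses conj (or error body)))))
  ;; sketch: a separate thread could poll instead; here we just wait in place
  (while (< (count @responses) (count urls))
    (Thread/sleep 100))
  (spit "results.txt" (str/join "\n" (map str @responses))))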

Can't seem to require >!! or <!! in ClojureScript?

I must be missing something very obvious here, but I'm trying to set up a very basic program that puts an item onto a channel, then blocks until I can take it off again. The entire program is below:
(ns shopping-2.core
  (:require [cljs.core.async :as async :refer [>!! <!! put! chan <! close! <!!]]))

(let [c (chan)]
  (>!! c "hello")
  (.write js/document (<!! c))
  (close! c))
The JavaScript error I'm getting is:
Uncaught TypeError: Cannot call method 'call' of undefined
I had that error before when I forgot to :refer chan (if I just open the channel and then close it again, the program runs fine).
However, this code seems to choke when I want to use the <!! or >!! macros.
There are some differences between what's available in the ClojureScript and Clojure versions of core.async.
Because Clojure on the JVM has real threads, it makes available both concurrency patterns, with real threads and with go blocks:
- Real threads use the thread macro to enclose the core.async magic, and their concurrency macros and functions end with two bangs, like <!!, >!!, alt!! and alts!!.
- Inversion-of-control (fake) threads use the go macro to enclose the core.async magic, and use the functions with one bang at the end, like <!, >!, alt! and alts!.
In ClojureScript (which runs in JS) there are no real threads, so only the IoC (inversion-of-control) threads are available, which means you have to use the second variant of the concurrency constructs.
Your example would be like:
(ns shopping-2.core
  (:require-macros [cljs.core.async.macros :refer [go]])
  (:require [cljs.core.async :as async :refer [put! chan <! >! close!]]))

(go
  (let [c (chan)]
    (>! c "hello")
    (.write js/document (<! c))
    (close! c)))
Anyway, that example has a concurrency problem: since the <! and >! functions are blocking and you are putting onto an unbuffered chan in the same routine, the routine will block on the (>! c "hello") instruction and never reach the read, starving your program.
You could fix this by using the put! fn, which puts without blocking, or by running those instructions in different routines, which I think demonstrates better what you intended to do.
(ns shopping-2.core
  (:require-macros [cljs.core.async.macros :refer [go]])
  (:require [cljs.core.async :as async :refer [put! chan <! >! close!]]))

;; put! version
(go
  (let [c (chan)]
    (put! c "hello")
    (.write js/document (<! c))
    (close! c)))

;; Concurrent version
;; 2 *threads* running concurrently. One is the putter and the other is the
;; reader.
(let [c (chan)]
  (go
    (.write js/document (<! c))
    (close! c))
  (go
    (>! c "hello")))
In the concurrent version, you will see that even though the code that runs first is the read, it is effectively another routine, so the code that runs later (the >!) runs and unblocks the first routine.
You can think of the go macro as spawning a new thread that will eventually start executing concurrently, while control returns immediately to the code after it.
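To illustrate (a sketch of mine, with assumed names, reusing the ns above): the go macro returns a channel immediately, and the value of its body is delivered on that channel later.

;; sketch: assumes go, <! and the async alias are available as in the ns above
(def result-chan
  (go
    (<! (async/timeout 1000))   ;; park for a second without blocking the JS thread
    :done))                     ;; the body's value is put on result-chan

;; control reaches here immediately; read the result in another go block
(go (.write js/document (str "go block finished: " (<! result-chan))))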
I suggest reading the code walk-through, ignoring the Clojure-specific parts (>!!, <!!, etc.), and some of swannodette's tutorials, which are great (like ClojureScript 101 and Communicating Sequential Processes).
The ClojureScript version of core.async doesn't include <!! or >!!.
I couldn't find a source for this besides the actual source: https://github.com/clojure/core.async/blob/56ded53243e1ef32aec71715b1bfb2b85fdbdb6e/src/main/clojure/cljs/core/async.cljs