I have the following code for automation. The function accepts a unique number and kicks off a Firefox instance. I could start multiple threads, each passing a unique x to the function, so the function would execute concurrently. Will the local atom current-page then be visible to other threads? If it is, a reset! from another thread could set the atom to an unexpected value.
(defn consumer-scanning-pages [x]
  (while true
    (let [driver (get-firefox x)
          current-page (atom 0)]
      ....
      (reset! current-page ..)
      )))
The atom will be visible to those threads you explicitly pass it to, to any further threads that those threads pass it to etc. It is no different in this respect to any other value that you may or may not pass around.
"Passing the atom to a thread" can be as simple as referring to an in-scope local it is stored in within the body of a Clojure thread-launching form:
(let [a (atom :foo)]
  ;; dereferencing the future object representing an off-thread computation
  @(future
     ;; dereferencing the atom on another thread
     @a))
;;= :foo
Merely creating an atom doesn't make it available to code that it is not explicitly made available to, and this is also true of code that happens to run on the thread that originally created the atom. (Consider a function that creates an atom, but never stores it in any externally visible data structures and ultimately returns an unrelated value. The atom it creates will become eligible for GC when the function returns at the latest; it will not be visible to any other code, on the same or any other thread.) Again, this is also the case with all other values.
It will not. You are creating a new atom each time that you call the function.
If you want a shared atom, just pass the atom as a param to consumer-scanning-pages
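As a minimal sketch of the difference (the names scan-with-local-atom, scan-with-shared-atom, and shared-page are made up for illustration, and the Firefox driver is left out), an atom created inside the function is private to each call, while an atom passed in as a parameter is shared by every thread that receives it:

```clojure
;; Hypothetical stand-ins for consumer-scanning-pages; no browser involved.

(defn scan-with-local-atom
  "Creates its own atom on every call, so no other thread can reach it."
  []
  (let [current-page (atom 0)]
    (swap! current-page inc)
    @current-page))

(defn scan-with-shared-atom
  "Updates an atom supplied by the caller; all callers see the same state."
  [current-page]
  (swap! current-page inc))

;; Four threads, each with a private atom: every call starts from 0.
(def local-results
  (mapv deref (mapv (fn [_] (future (scan-with-local-atom))) (range 4))))

;; Four threads sharing one atom: the increments accumulate.
(def shared-page (atom 0))
(run! deref (mapv (fn [_] (future (scan-with-shared-atom shared-page))) (range 4)))

(println local-results) ; [1 1 1 1]
(println @shared-page)  ; 4
```

With the shared atom, prefer swap! over reset! when the new value depends on the old one, since reset! from one thread simply overwrites whatever another thread wrote.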
In Clojure, promise objects implement clojure.lang.IFn, and invoking the promise with a single argument fulfills the promise. That's how deliver is implemented:
(defn deliver
  "Delivers the supplied value to the promise, releasing any pending
  derefs. A subsequent call to deliver on a promise will have no effect."
  {:added "1.1"
   :static true}
  [promise val] (promise val))
If (deliver x y) is just a level of indirection over (x y), why use deliver at all?
I'm assuming this is supposed to help disambiguate promises from functions in some way—but the same argument could apply to using some promise-specific function to read from a promise rather than using the general deref function for that.
It's syntactic sugar to make code like this look nice:
(->> url
     download
     extract-value
     (deliver consumer))
The deliver function used to ensure that if you were the second caller, an exception would be thrown. It was changed in 2011, and later calls are now simply ignored.
Promises always behaved the same whether called as a function or through deliver; the deliver function only filled the role of making something a little different look a little different. These days I would still use it to communicate with my future self.
deref is a lot less general than the function-call mechanism. When you see something deref'd, you know it is fetching some value from somewhere. When you see (f x) you really have no idea what is happening if you don't already know what f is: it could do anything at all. deliver gives you more context.
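A small sketch of the two spellings side by side, assuming a reasonably recent Clojure (the var names via-deliver and via-call are illustrative only). It also shows that a second delivery is silently ignored:

```clojure
;; Fulfilling a promise through deliver.
(def via-deliver (promise))
(deliver via-deliver :first)
(deliver via-deliver :second) ; ignored: the promise is already fulfilled

;; Fulfilling a promise by calling it as a function.
(def via-call (promise))
(via-call :direct)            ; same effect as (deliver via-call :direct)

(println @via-deliver) ; :first
(println @via-call)    ; :direct
```

Both forms do the same work; the deliver spelling just tells the reader that the thing being called is a promise.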
How to convey all of the current thread's bindings to another thread? To be specific, I need the following snippet to print 2 (not 1) to stdout:
(defvar *foo* 1)
(let ((*foo* 2))
  (bordeaux-threads:make-thread (lambda () (print *foo*)))) ;; prints 1
Of course I could copy *foo*'s value by hand, like this:
(let ((*foo* 2))
  (bordeaux-threads:make-thread
   (let ((foo-binding *foo*))
     (lambda ()
       (let ((*foo* foo-binding))
         (print *foo*)))))) ;; prints 2
but is there anything that allows copying all of them at once?
The API is explicit regarding variable sharing:
The interaction between threads and dynamic variables is in some
cases complex, and depends on whether the variable has only a global
binding (as established by e.g. DEFVAR/DEFPARAMETER/top-level SETQ) or
has been bound locally (e.g. with LET or LET*) in the calling thread.
1. Global bindings are shared between threads: the initial value of a global variable in the new thread will be the same as in the parent, and an assignment to such a variable in any thread will be visible to all threads in which the global binding is visible.
2. Local bindings are local to the thread they are introduced in, except that
3. Local bindings in the caller of MAKE-THREAD may or may not be shared with the new thread that it creates: this is implementation-defined. Portable code should not depend on particular behaviour in this case, nor should it assign to such variables without first rebinding them in the new thread.
So making the binding global rather than local seems to be the easiest (implementation-independent) route.
@coredump also suggests checking out the *default-special-bindings* list as a possible way to share bindings:
This variable holds an alist associating special variable symbols with
forms to evaluate for binding values. Special variables named in this
list will be locally bound in the new thread before it begins
executing user code.
This variable may be rebound around calls to MAKE-THREAD to add/alter
default bindings. The effect of mutating this list is undefined, but
earlier forms take precedence over later forms for the same symbol, so
defaults may be overridden by consing to the head of the list.
Forms are evaluated in the new thread or in the calling thread?
Standard contents of this list: print/reader control, etc. Can borrow
the Franz equivalent?
I have the following piece of code
(def number (ref 0))
(dosync (future (alter number inc))) ; A
(future (dosync (alter number inc))) ; B
The second one succeeds, but the first one fails with "No transaction running". But it is wrapped inside a dosync, right?
Does Clojure remember which thread a transaction was opened in?
You are correct. The whole purpose of dosync is to begin a transaction in the current thread. The future runs its code in a new thread, so the alter in case A is not inside of a dosync for its thread.
For case B, the alter and dosync are both in the same (new) thread, so there is no problem.
There are multiple reasons this doesn't work. As Alan Thompson writes, transactions are homed to a single thread, and so when you create a new thread you lose your transaction.
Another problem is the dynamic scope of dosync. The same problem would arise if you wrote
((dosync #(alter number inc)))
Here we create a function inside of the dosync scope, and let that function be the result of the dosync. Then we call the function from outside of the dosync block, but of course the transaction is no longer running.
That's very similar to what you're doing with future: future creates a function and then executes it on a new thread, returning a handle you can use to inspect the progress of that thread. Even if cross-thread transactions were allowed, you would have a race condition here: does the dosync block close its transaction before or after the alter call in the future is executed?
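The two cases can be reproduced in a few lines. This is only a sketch; the ref name counter and the var failure are chosen for illustration. Dereferencing the failed future rethrows its exception wrapped in an ExecutionException, so the sketch unwraps the cause:

```clojure
(def counter (ref 0))

;; Case B: dosync runs on the future's thread, so alter sees a transaction.
@(future (dosync (alter counter inc)))

;; Case A: the transaction belongs to the calling thread. The future's
;; thread has no transaction, so alter throws there, and dereferencing
;; the future surfaces that failure.
(def failure
  (try
    @(dosync (future (alter counter inc)))
    (catch java.util.concurrent.ExecutionException e
      (.getCause e))))

(println @counter)        ; 1
(println (class failure)) ; java.lang.IllegalStateException
```

Only the increment from case B is applied; the alter in case A never runs inside a transaction, so the ref is left untouched.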
I'm using
(def f
  (future
    (while (not (Thread/interrupted))
      (function-to-run))))
(Thread/sleep 100)
(future-cancel f)
to cancel my code after a specified amount of time (100ms).
The problem is, I need to cancel the already running function 'function-to-run' as well, it is important that it really stops executing that function after 100ms.
Can I somehow propagate the interrupted signal to the function?
The function is not third-party, I wrote it myself.
The basic thing to note here is: you cannot safely kill a thread without its own cooperation. Since you are the owner of the function you wish to be able to kill prematurely, it makes sense to allow the function to cooperate and die gracefully and safely.
(defn function-to-run
  []
  ;; work-not-done is a placeholder for your own loop condition
  (while work-not-done
    ;; check the interrupt flag on every iteration and bail out if it is set
    (when (Thread/interrupted)
      (throw (InterruptedException. "Function interrupted...")))
    ;; ... do your work
    ))
(def t (Thread. (fn []
                  (try
                    (while true
                      (function-to-run))
                    (catch InterruptedException e
                      (println (.getMessage e)))))))
To begin the thread
(.start t)
To interrupt it:
(.interrupt t)
Your approach was not sufficient for your use case because the while condition was checked only after control flow returned from function-to-run, but you wanted to stop function-to-run during its execution. The approach here is only different in that the condition is checked more frequently, namely, every time through the loop in function-to-run. Note that instead of throwing an exception from function-to-run, you could also return some value indicating an error, and as long as your loop in the main thread checks for this value, you don't have to involve exceptions at all.
If your function-to-run doesn't feature a loop where you can perform the interrupted check, then it likely is performing some blocking I/O. You may not be able to interrupt this, though many APIs will allow you to specify a timeout on the operation. In the worst case, you can still perform intermittent checks for interrupted in the function around your calls. But the bottom line still applies: you cannot safely forcibly stop execution of code running in the function; it should yield control cooperatively.
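The same cooperative idea also works with the future and future-cancel from the question, since future-cancel interrupts the thread running the future. The following is a rough sketch rather than a drop-in solution: the busy loop stands in for real work, and the promise named stopped exists only so the outcome can be observed from the main thread.

```clojure
;; Used only to observe from outside that the function really stopped.
(def stopped (promise))

(defn function-to-run
  "Stand-in for real work: spins until the current thread is interrupted."
  []
  (loop [n 0]
    (if (.isInterrupted (Thread/currentThread))
      (throw (InterruptedException. "function-to-run interrupted"))
      (recur (inc n)))))

(def f
  (future
    (try
      (function-to-run)
      (catch InterruptedException e
        (deliver stopped (.getMessage e))))))

(Thread/sleep 100)
(future-cancel f) ; interrupts the thread that is running the future

;; Wait up to one second for the function to notice the interrupt.
(println (deref stopped 1000 :timed-out)) ; function-to-run interrupted
```

How quickly the function stops depends entirely on how often it checks the interrupt flag, so the check belongs in the innermost loop that does the long-running work.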
Note:
My original answer here involved presenting an example in which java's Thread.stop() was used (though strongly discouraged). Based on feedback in the comments, I revised the answer to the one above.
I'm building a wrapper around OrientDB in Clojure. One of the biggest limitations (IMHO) of OrientDB is that the ODatabaseDocumentTx is not thread-safe, and yet the lifetime of this thing from .open() to .close() is supposed to represent a single transaction, effectively forcing transactions to occur in a single thread. Indeed, thread-local refs to these hybrid database/transaction objects are provided by default. But what if I want to log in the same thread as I want to persist "real" state? If I hit an error, the log entries get rolled back too! That use case alone puts me off of virtually all DBMSs since most do not allow named transaction scope management. /soapbox
Anyways, OrientDB is the way it is, and it's not going to change for me. I'm using Clojure and I want an elegant way to construct a with-tx macro such that all imperative database calls within the with-tx body are serialized.
Obviously, I can brute-force it by creating a sentinel at the top level of the with-tx generated body and deconstructing every form to the lowest level and wrapping them in a synchronized block. That's terrible, and I'm not sure how that would interact with something like pmap.
I can search the macro body for calls to the ODatabaseDocumentTx object and wrap those in synchronized blocks.
I can create some sort of dispatching system with an agent, I guess.
Or I can subclass ODatabaseDocumentTx with synchronized method calls.
I'm scratching my head trying to come up with other approaches. Thoughts? In general the agent approach seems more appealing simply because if a block of code has database method calls interspersed, I would rather do all the computation up front, queue the calls, and just fire a whole bunch of stuff to the DB at the end. That assumes, however, that the computation doesn't need to ensure consistency of reads. IDK.
Sounds like a job for Lamina.
One option would be to use Executor with 1 thread in thread pool. Something like shown below. You can create a nice macro around this concept.
(import 'java.util.concurrent.Executors)
(import 'java.util.concurrent.Callable)
;; note: defining sync here shadows clojure.core/sync in this namespace
(defmacro sync [executor & body]
  `(.get (.submit ~executor (proxy [Callable] []
                              (call []
                                (do ~@body))))))
(let [exe (Executors/newFixedThreadPool (int 1))
      dbtx (sync exe (DatabaseTx.))]
  (do
    (sync exe (readfrom dbtx))
    (sync exe (writeto dbtx))))
The sync macro makes sure that the body expression is executed in the executor (which has only one thread), and it waits for the operation to complete, so all operations execute one by one.
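To check that the approach really funnels everything onto one thread, here is a self-contained sketch with no database involved. The names run-on, on-executor, exe, and thread-names are hypothetical, and the macro is a variant of the one above that passes a plain Clojure function (which already implements Callable) instead of a proxy:

```clojure
(import '(java.util.concurrent Executors ExecutorService Callable))

(defn run-on
  "Submits f to the executor and blocks until its result is available."
  [^ExecutorService executor ^Callable f]
  (.get (.submit executor f)))

(defmacro on-executor
  "Runs body on the executor's thread and returns its value."
  [executor & body]
  `(run-on ~executor (fn [] ~@body)))

(def ^ExecutorService exe (Executors/newSingleThreadExecutor))

;; Callers arrive from several pmap threads, but every body runs on the
;; executor's single worker thread, one at a time.
(def thread-names
  (doall
    (pmap (fn [_] (on-executor exe (.getName (Thread/currentThread))))
          (range 8))))

(.shutdown exe)

(println (count (set thread-names))) ; 1
```

Because each call blocks on .get, callers are serialized as well, which gives the one-by-one behavior at the cost of any parallelism for the wrapped calls.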