Readers-writers using STM in Clojure

Consider the following version of the readers-writers problem: there are multiple readers and writers; 2 or more readers can read simultaneously; while a writer is writing, no one else can read or write; and preferably all writers get an equal chance to write (for example, in 100 rounds 5 writers should write about 20 times each). What is the proper way to implement this in Clojure using STM? I'm not looking for complete code, just some general directions.

Clojure's built-in STM can't satisfy all the constraints you list, because its readers never wait for writers and your requirements need readers to wait.
If you can accept non-blocking readers, then you can go ahead and write:
(.start (Thread. #(dosync (write stuff))))
(.start (Thread. #(dosync (read stuff))))
If you need readers to block, then you will need a different STM; the world has lots of them.

Clojure's STM gives you much nicer guarantees than that. Writers wait for each other, but a reader can still read while a writer is writing; the reader just sees the most recent consistent state. If a writer isn't done writing yet, the reader doesn't see its changes at all.

As the other answers mention, readers don't block while reading. If you want a reader to block, you could implement it as a "writer" that writes back the same value it receives in its update function. I know this is a weird solution, but it may help you out or give you some further directions.

Related

What is the difference between locking and atom/reset!/swap! in Clojure

I was reading some source code and came across a use of locking in Clojure. It made me think about an atom-based version. What are the differences between these 2 code snippets? I think that they do the same thing.
(def lock (Object.))
(locking lock
  ...some operation)
(def state (atom true))
(when @state
  (reset! state false)
  ...some operation
  (reset! state true))
Locking (aka synchronization) is only ever needed when multiple threads are changing a piece of mutable state.
The locking macro is a low-level feature that one almost never needs to use in Clojure. It is similar in some ways to a synchronized block in Java.
In Clojure, one normally just uses an atom for this purpose. On rare occasions an agent or ref is called for. In even rarer situations, you can use a dynamic Var to get thread-local mutable state.
Internally, a Clojure atom delegates all concurrency operations to the class java.util.concurrent.atomic.AtomicReference.
Your code snippet shows a misunderstanding of an atom's purpose and operation. The check (the deref) and the reset! are separate steps, so two concurrent threads could both read true and run your atom code snippet at the same time. The attempt therefore does not provide thread safety and would result in bugs and corrupted data.
If you want to explore the really primitive (i.e. Java 1.2) synchronization primitives, see:
The Oracle docs
The book Effective Java (1st edition shows most details, 3rd Ed. defers to higher-level classes)
The book Java Concurrency in Practice. This book is so scary it'll turn your hair white. After reading this one, you'll be so afraid of roll-your-own concurrency that you'll never even think about doing it again.
Double-Checked Locking is Broken: An excellent example of how hard it is to get locking right (also here).

An imaginary lock mechanism: non-blocking write, read and invalidate

Here is the scenario. Bob is a writer and Alice is a reader. Bob writes things and Alice reads them. The rules are:
1) Bob can write whether Alice is reading or not (reading does not block writes).
2) When Bob is writing, Alice cannot read (writing does block reads).
3) When Alice finishes reading, she can know if Bob wrote during her read (readers can detect if the data they just read is not valid).
2) and 3) are really one combined rule, but I list them separately for the sake of discussion. The problem can be solved with one mutex and one counter (a version number), but what I do not know is: is this a well-known scenario with a commonly used name? Has any research been done on it?
What I do not know is: is the problem a well-known scenario with a name?
Yes, it is called Seqlock:
https://en.wikipedia.org/wiki/Seqlock
Has anyone studied it, or am I just reinventing the wheel?
AFAIK there are a variety of implementations (such as the one in the Linux kernel) and papers.
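To make the "one counter (version number)" idea concrete, here is a minimal seqlock sketch in C++. It is an illustration under assumptions, not the Linux kernel's implementation: the class name SeqLock, the two-integer payload, and the try_read/read split are all hypothetical, it assumes exactly one writer thread (several writers would additionally need a mutex among themselves), and it uses the default sequentially consistent atomics for simplicity rather than speed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical single-writer seqlock guarding a pair of ints.
// All atomics use the default sequentially consistent ordering.
class SeqLock {
    std::atomic<unsigned> seq_{0};  // even: stable, odd: write in progress
    std::atomic<int> a_{0};
    std::atomic<int> b_{0};

public:
    // Bob: never waits for readers (rule 1). Assumes one writer thread.
    void write(int a, int b) {
        unsigned s = seq_.load();
        seq_.store(s + 1);  // odd: readers must not trust the data
        a_.store(a);
        b_.store(b);
        seq_.store(s + 2);  // even again: data is consistent
    }

    // Alice: returns false if a write was in progress or happened during
    // the read (rules 2 and 3); the outputs must then be discarded.
    bool try_read(int& a, int& b) const {
        unsigned before = seq_.load();
        if (before & 1u) return false;
        a = a_.load();
        b = b_.load();
        return seq_.load() == before;
    }

    // Convenience wrapper: retry until a consistent snapshot is obtained.
    void read(int& a, int& b) const {
        while (!try_read(a, b)) { /* spin and retry */ }
    }
};
```

A reader that only needs to detect invalidation calls try_read once and checks the result; a reader that must end up with valid data calls read, which trades blocking for retries.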

Real time data streaming with 1 writer and N concurrent readers

A server controls 1 writer that continuously produces data frames in real time, and N possible concurrent read requests. Whenever a reader makes a request to the server, the reader should get the most recently produced frame, or wait for it if none is available yet. N different readers may concurrently "consume" the same frame, but each individual reader must not read the same frame more than once.
Is there any well-known algorithm or a strategy for the above problem which does not waste too many resources and gives the readers a good throughput?
For now my idea is to use so-called "triple buffering" (one buffer per frame), where two buffers are filled by the writer alternately and one buffer is shared by the concurrent readers. If the number of concurrent reads is 0 once a frame has been produced, the corresponding buffer can be swapped with the buffer dedicated to the readers. It seems an easy model, although all the concurrent readers might be affected by the timing of the slowest reader in the group. The problem of making sure that one reader cannot get the same frame twice still has to be solved with some sort of synchronisation, in a clean way that fits the above model.
If you have any other ideas, code (modern C++ preferred), or a C++ library to suggest, I'd appreciate it.
Martin Thompson, the lead of the Disruptor project, has a new project, Aeron, and it's super fast. What's more, it already supports a C++ API. Check out the introduction video and the article from highscalability:
https://www.youtube.com/watch?v=tM4YskS94b0
http://highscalability.com/blog/2014/11/17/aeron-do-we-really-need-another-messaging-system.html
If I understood your question correctly, you can use the disruptor pattern here. It uses ring buffers to pass data between threads efficiently. See the multicast events section here. The LMAX Disruptor was originally written in Java, though some implementations exist for C++. See the pure C version, the C++11 version and another C++ version. Also, have you seen the Intel Threading Building Blocks library? It has some useful and highly efficient concurrent data structures, a scheduler, and synchronization primitives for C++. Hope this helps...
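For comparison, the requirement that each reader sees a given frame at most once can also be met without a ring buffer. The sketch below is neither the Disruptor nor Aeron; it is a simpler mutex and condition-variable alternative in which every published frame gets a sequence number and each reader remembers the last number it consumed. The names Frame, LatestFrame, publish and next are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

using Frame = std::vector<std::uint8_t>;  // hypothetical frame type

// Holds only the most recent frame; slow readers simply skip frames.
class LatestFrame {
    mutable std::mutex m_;
    mutable std::condition_variable cv_;
    std::shared_ptr<const Frame> frame_;
    std::uint64_t seq_ = 0;  // number of frames published so far

public:
    // Writer: replace the current frame and wake all waiting readers.
    void publish(Frame f) {
        std::shared_ptr<const Frame> p = std::make_shared<Frame>(std::move(f));
        {
            std::lock_guard<std::mutex> lock(m_);
            frame_ = std::move(p);
            ++seq_;
        }
        cv_.notify_all();
    }

    // Reader: last_seen is per-reader state, starting at 0. Blocks until
    // a frame newer than last_seen exists, then returns it and updates
    // last_seen, so one reader never gets the same frame twice.
    std::shared_ptr<const Frame> next(std::uint64_t& last_seen) const {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return seq_ > last_seen; });
        last_seen = seq_;
        return frame_;
    }
};
```

Because readers receive a shared_ptr to an immutable frame, the lock is held only long enough to copy a pointer, and a slow reader does not delay the writer or the other readers; the frame is freed when the last reader drops it.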

how to synchronize three dependent threads

If I have
1. mainThread: writes data A;
2. Thread_1: reads A and writes it into a Buffer;
3. Thread_2: reads from the Buffer.
How can I synchronize these three threads safely, without much performance loss? Is there any existing solution to use? I use C/C++ on Linux.
IMPORTANT: the goal is to know the synchronization mechanism or algorithms for this particular case, not how mutex or semaphore works.
First, I'd consider the possibility of building this as three separate processes, using pipes to connect them. A pipe is (in essence) a small buffer with locking handled automatically by the kernel. If you do end up using threads for this, most of your time/effort will be spent on creating nearly an exact duplicate of the pipes that are already built into the kernel.
Second, if you decide to build this all on your own, I'd give serious consideration to following a similar model anyway. You don't need to be slavish about it, but I'd still think primarily in terms of a data structure to which one thread writes data, and from which another reads the data. By strong preference, all the necessary thread locking would be built into that data structure, so most of the code in the thread is quite simple: reading, processing, and writing data. The main difference from using normal Unix pipes would be that in this case you can maintain the data in a more convenient format, instead of all the reading and writing being in text.
As such, what I think you're looking for is basically a thread-safe queue. With that, nearly everything else involved borders on trivial (at least the threading part of it does -- the processing involved may not be, but at least building it with multiple threads isn't adding much to the complexity).
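A thread-safe queue of the kind described above might look like the following minimal sketch. It assumes C++11 and an unbounded queue; the class name BlockingQueue and the close() method used to signal end of data are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Unbounded blocking queue: all locking lives inside the data structure.
template <typename T>
class BlockingQueue {
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> items_;
    bool closed_ = false;

public:
    // Producer side: enqueue one item and wake one waiting consumer.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            items_.push(std::move(value));
        }
        cv_.notify_one();
    }

    // Producer side: signal that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Consumer side: blocks until an item is available.
    // Returns false once the queue is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
};
```

With two such queues, one between mainThread and Thread_1 and one between Thread_1 and Thread_2, each thread body reduces to a loop of pop, process, push, much like processes connected by pipes.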
It's hard to say how much experience with C/C++ threads you have. I hate to just point to a link but have you read up on pthreads?
https://computing.llnl.gov/tutorials/pthreads/
And for a shorter example with code and simple mutexes (the lock objects you need to sync data):
http://students.cs.byu.edu/~cs460ta/cs460/labs/pthreads.html
I would suggest Boost.Thread for this purpose. This is quite a good framework with mutexes and semaphores, and it is multiplatform. Here you can find a very good tutorial about this.
Exactly how to synchronize these threads is another problem and needs more information about your problem.
Edit: The simplest solution would be to use two mutexes, one on A and a second on Buffer. You don't have to worry about deadlocks in this particular case. Just:
MainThread enters mutex_A; Thread1 waits for the mutex to be released.
MainThread leaves mutex_A; Thread1 enters mutex_A and mutex_Buffer, reads from A and writes it to Buffer.
Thread1 releases both mutexes. MainThread can enter mutex_A and write data, and Thread2 can enter mutex_Buffer and safely read data from Buffer.
This is obviously the simplest solution, and probably can be improved, but without more knowledge about the problem, this is the best I can come up with.
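The two-mutex scheme above can be sketched in C++11 as follows. The struct Shared, the function names, and the use of an int for A and a deque for Buffer are assumptions made for the example; std::lock is used so that Thread1 acquires both mutexes without relying on a fixed locking order.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical shared state: one mutex per protected object.
struct Shared {
    std::mutex mutex_a;
    std::mutex mutex_buffer;
    int a = 0;               // "A", written by the main thread
    std::deque<int> buffer;  // "Buffer", filled by Thread1, drained by Thread2
};

// MainThread: holds only mutex_a while writing A.
void main_thread_write(Shared& s, int value) {
    std::lock_guard<std::mutex> lock(s.mutex_a);
    s.a = value;
}

// Thread1: holds both mutexes while copying A into Buffer.
void thread1_copy(Shared& s) {
    std::lock(s.mutex_a, s.mutex_buffer);  // acquires both, deadlock-free
    std::lock_guard<std::mutex> la(s.mutex_a, std::adopt_lock);
    std::lock_guard<std::mutex> lb(s.mutex_buffer, std::adopt_lock);
    s.buffer.push_back(s.a);
}

// Thread2: holds only mutex_buffer; returns false if Buffer is empty.
bool thread2_read(Shared& s, int& out) {
    std::lock_guard<std::mutex> lock(s.mutex_buffer);
    if (s.buffer.empty()) return false;
    out = s.buffer.front();
    s.buffer.pop_front();
    return true;
}
```

Note that the mutexes only make each access safe; they do not tell Thread1 when A has changed, so without an extra signal (a flag or condition variable) it may copy the same value twice or miss an update.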

Read write mutex in C++

This is an interview question. How do you implement a read/write mutex? There will be multiple threads reading and writing to a resource. I'm not sure how to go about it. If there's any information needed, please let me know.
Update: I'm not sure if my statement above is valid/understandable. But what I really want to know is how do you implement multiple read and multiple writes on a single object in terms of mutex and other synchronization objects needed?
Check out Dekker's algorithm.
Dekker's algorithm is the first known correct solution to the mutual exclusion problem in concurrent programming. The solution is attributed to Dutch mathematician Th. J. Dekker by Edsger W. Dijkstra in his manuscript on cooperating sequential processes. It allows two threads to share a single-use resource without conflict, using only shared memory for communication.
Note that Dekker's algorithm relies on busy waiting (spinning) rather than on blocking the waiting thread.
(Th. J. Dekker's solution, mentioned by E. W. Dijkstra in his EWD1303 paper)
The short answer is that it is surprisingly difficult to roll your own read/write lock. It's very easy to miss a very subtle timing problem that could result in deadlock, two threads both thinking they have an "exclusive" lock, etc.
In a nutshell, you need to keep a count of how many readers are active at any particular time. Only when the number of active readers is zero, should you grant a thread write access. There are some design choices as to whether readers or writers are given priority. (Often, you want to give writers the priority, on the assumption that writing is done less frequently.) The (surprisingly) tricky part is to ensure that no writer is given access when there are readers, or vice versa.
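The reader-counting approach described above can be sketched with one mutex and one condition variable. This is a minimal illustration assuming C++11 and writer priority, not a production lock; the class name RWLock and its member names are hypothetical, and in real code std::shared_mutex (C++17) or an equivalent library type is the safer choice.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Writer-preferring read/write lock built from a mutex and a counter.
class RWLock {
    std::mutex m_;
    std::condition_variable cv_;
    int active_readers_ = 0;
    int waiting_writers_ = 0;
    bool writer_active_ = false;

public:
    // Readers wait while a writer holds the lock or is queued for it.
    void lock_shared() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !writer_active_ && waiting_writers_ == 0; });
        ++active_readers_;
    }
    bool try_lock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        if (writer_active_ || waiting_writers_ > 0) return false;
        ++active_readers_;
        return true;
    }
    void unlock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        if (--active_readers_ == 0) cv_.notify_all();  // a writer may proceed
    }

    // Writers get access only when no reader and no other writer is active.
    void lock() {
        std::unique_lock<std::mutex> lk(m_);
        ++waiting_writers_;
        cv_.wait(lk, [this] { return !writer_active_ && active_readers_ == 0; });
        --waiting_writers_;
        writer_active_ = true;
    }
    bool try_lock() {
        std::lock_guard<std::mutex> lk(m_);
        if (writer_active_ || active_readers_ > 0) return false;
        writer_active_ = true;
        return true;
    }
    void unlock() {
        {
            std::lock_guard<std::mutex> lk(m_);
            writer_active_ = false;
        }
        cv_.notify_all();  // wake both waiting readers and writers
    }
};
```

Because every state change happens under the single internal mutex, the check "no active readers" and the grant of write access cannot be separated by another thread, which is the subtle part. The cost of writer priority is that a steady stream of writers can starve readers.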
There is an excellent MSDN article, "Compound Win32 Synchronization Objects" that takes you through the creation of a reader/writer lock. It starts simple, then grows more complicated to handle all the corner cases. One thing that stood out was that they showed a sample that looked perfectly good-- then they would explain why it wouldn't actually work. Had they not pointed out the problems, you might have never noticed. Well worth a read.
Hope this is helpful.
This sounds like a rather difficult question for an interview; I would not "implement" a read/write mutex, in the sense of writing one from scratch--there are much better off-the-shelf solutions available. The sensible real world thing would be to use an existing mutex type. Perhaps what they really wanted to know was how you would use such a type?
AFAIK you need either an atomic compare-and-swap instruction, or you need to be able to disable interrupts. See Compare-and-swap on Wikipedia. At least, that's how an OS would implement it. If you have an operating system, stand on its shoulders, and use an existing library (Boost, for example).
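As an illustration of what compare-and-swap provides, here is a minimal spinlock sketch built on std::atomic's compare_exchange_strong. The class name SpinLock is hypothetical, and the busy-wait loop is only reasonable for very short critical sections; it shows the primitive, not a recommended replacement for a library mutex.

```cpp
#include <atomic>
#include <cassert>

// Minimal mutual exclusion built directly on compare-and-swap.
class SpinLock {
    std::atomic<bool> locked_{false};

public:
    // Atomically change false -> true; fails if someone else holds the lock.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire);
    }

    // Busy-wait until the compare-and-swap succeeds.
    void lock() {
        while (!try_lock()) { /* spin */ }
    }

    void unlock() {
        locked_.store(false, std::memory_order_release);
    }
};
```

The test and the update happen as one indivisible step, so two threads can never both observe "unlocked" and proceed, which is exactly the guarantee a plain check followed by a separate write cannot give.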