What are tradeoffs of representing state using a single atom and a hashmap vs multiple refs?
For example:
(def start (atom {:location "Chicago" :employer "John"}))
vs
(def location (ref "Chicago"))
(def employer (ref "John"))
Many thanks
The single-atom version is better and has fewer tradeoffs. Since you don't want the employer and the location to change in an uncoordinated way, the win is that you don't need a dosync block to change location, employer, or both. With the atom, you can simply (swap! start assoc :location "baz").
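A minimal sketch of both styles, reusing the definitions from the question (the "NYC"/"ACME" values are made up for illustration):

```clojure
;; Single atom holding a map: one swap! updates either key, or both,
;; atomically -- no transaction needed.
(def start (atom {:location "Chicago" :employer "John"}))

(swap! start assoc :location "baz")                  ; change one key
(swap! start assoc :location "NYC" :employer "ACME") ; change both at once

;; Two refs: any coordinated change needs a dosync transaction.
(def location (ref "Chicago"))
(def employer (ref "John"))

(dosync
  (ref-set location "NYC")
  (ref-set employer "ACME"))
```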
A big tradeoff of using multiple refs is that transactions run optimistically in parallel: the first one to commit wins, and the others are restarted. While retries also happen with atoms, having a separate ref for every entry means more monitoring and grouping (for dosync blocks) behind the scenes. To get fewer restarts, it makes sense to group the information in a hash map; then, depending on whether coordinated change is required, put it in a ref or an atom.
Multiple Refs allow for more concurrency, since all writes to an Atom are linearized. The STM allows many parallel transactions to commit when there are no conflicting writes or ensures (and additionally it provides commute, which lets certain writes that would normally cause a conflict avoid doing so).
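For instance, commute re-applies its function at commit time, so concurrent transactions that only increment a shared counter need not retry one another the way alter would force them to (a small sketch; the hits counter is made up):

```clojure
;; A counter ref incremented from several threads.
(def hits (ref 0))

;; With commute, the increment is re-run at commit time, so these
;; transactions don't conflict with each other the way (alter hits inc)
;; transactions can.
(let [futures (doall (repeatedly 10 #(future (dosync (commute hits inc)))))]
  (run! deref futures))   ; wait for all 10 transactions to commit

@hits ; => 10
```

This is only safe when the order of the updates doesn't matter, which is exactly the contract commute asks you to promise.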
Additionally, the STM cooperates with Agents -- actions sent to Agents from within a transaction will be performed if and only if the transaction commits. This allows one to cause side effects from inside a transaction safely. Atoms offer no similar facility.
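A sketch of that cooperation, with a made-up balance ref and logging agent: the send below is held until the surrounding transaction commits, so a retried transaction won't log twice.

```clojure
(def log-agent (agent []))   ; collects log entries
(def balance   (ref 100))

;; The send is queued by the STM and dispatched only if (and when)
;; this transaction commits -- retries don't duplicate the side effect.
(dosync
  (alter balance - 25)
  (send log-agent conj {:withdrew 25}))

(await log-agent)
@log-agent ; => [{:withdrew 25}]
```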
The trade-off is the STM's overhead is larger than an Atom's, plus there is the possibility of certain anomalies occurring (write skew, see the Wikipedia page on snapshot isolation). Additionally, it's possible to achieve great concurrency with the STM while having serious problems with obtaining a snapshot of the entire system; in this connection, see Christophe Grand's excellent blog post and his megaref library.
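Write skew can arise because plain reads inside a transaction don't participate in conflict detection; ensure makes them do so. A sketch with the classic two-account setup (account names and the combined-balance invariant are made up for illustration):

```clojure
;; Invariant we want: (+ @acct-a @acct-b) stays >= 0, though either
;; account alone may go negative. Checking the other account with a
;; plain deref is vulnerable to write skew: two transactions can each
;; read the other's untouched ref and both commit. Using ensure makes
;; the read conflict with concurrent writes to that ref.
(def acct-a (ref 50))
(def acct-b (ref 50))

(defn withdraw-from-a! [n]
  (dosync
    (when (>= (- (+ @acct-a (ensure acct-b)) n) 0)
      (alter acct-a - n))))

(withdraw-from-a! 60)   ; allowed: combined balance stays at 40
```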
In many scenarios people find that just storing all state in a single Atom is enough and that's definitely a simpler, more lightweight approach.
I don't think you should be thinking about trade-offs between atoms vs. refs, since they're used for different situations.
You should use an atom when you want to change a single thing atomically.
Refs use STM and are for changing many different things simultaneously, in a transaction.
In your particular situation you should be answering the question about the thing you're changing.
Is it a single thing you want/can change in one step?
Are there different things you need to change transactionally?
If, in order to change a single field, you swap the whole old database for a new one and change everything together, and then say your database is an atom, you're abusing the mechanism.
Hope the distinction helps; for your example I would use an atom nonetheless.
There's a good summary here with motivations behind each strategy.
Related
As I understand it, Clojure's STM can be used for transacting values across refs.
I understand this property is useful in datastores, where two or more locations have to be mutated in a single transaction (atomicity).
However, in what cases is this useful in software applications? I could just store all my state in one map and use a clojure.core/atom if I want shared mutable state.
In what types of applications/scenarios does using refs make sense over atoms or other state primitives in Clojure?
Using an atom is indeed what's suggested in the Elements of Clojure book by Zach Tellman (chapter 2 - If you have mutable state, use an atom).
They say that up to roughly 60% utilization of the stateful container, an atom is probably the better choice.
The advice is summarized at the end of the section as:
If you have mutable state, make sure it belongs inside your process.
If it does, try to represent it as a single atom.
If that causes performance issues, try spreading the work across more processes.
If that isn’t possible, see if the atom can be split into smaller atoms that don’t require shared consistency.
Finally, if that doesn’t help, you should start looking into Clojure’s STM primitives.
I have big C++/STL data structures (myStructType) with nested lists and maps. I have many objects of this type that I want to LRU-cache by key. I can reload objects from disk when needed. Moreover, the cache has to be shared in a multiprocess, high-performance application running on a BSD platform.
I can see several solutions:
I can keep a list of pair<size_t lifeTime, myStructType v> sorted by lifetime, plus a map for O(1) access from a key to the index of the desired object in the list; I can use shm and mmap to store everything, and a lock to manage access (cf. here).
I can use a redis server configured for LRU, and redesign my data structures to redis key/value and key/lists pairs.
I can use a redis server configured for LRU, and serialise my data structures (myStructType) to have a simple key/value to manage with redis.
There may be other solutions, of course. How would you do it, or better, how have you successfully done it, keeping high performance in mind?
In addition, I would like to avoid heavy dependencies like Boost.
I actually built caches (not only LRU) recently.
Options 2 and 3 are quite likely not faster than re-reading from disk, which makes them effectively no cache at all. Also, Redis would be a far heavier dependency than Boost.
Option 1 can be challenging. For instance, you suggest "a lock". That would be a heavily contended lock, as it must protect every lifetime update plus all LRU operations. Since your objects are already heavy, it may be worthwhile to have a unique lock per object. There are intermediate variants of this solution, with more than one lock but also more than one object per lock. (You still need a lock to protect the whole map, but that's for replacement only.)
You can also consider whether you really need strict LRU. That strategy assumes the chance of an object being reused decreases over time. If that's not actually true, random replacement is just as good. You can also consider evicting more than one element at a time. One of the challenges is that when an element needs removing, every thread will see that it does, but it's sufficient for one thread to remove it. That's why batch removal helps: if a thread tries to take the lock for batch removal and fails, it can continue under the assumption that the cache will have free space soon.
One quick win is to not update the LRU time of the most recently used element. It was already the newest; making it any newer won't help. This of course only has an effect if you often reuse that element quickly, but (as noted above) otherwise you'd just use random eviction.
I know that Clojure works well on a multicore machine, but I was wondering if it would make sense for it to work over a JVM cluster spread out over many machines?
Runa looked into using Terracotta for this and ended up publishing swarmiji as a distributed agent library.
One of the real differences between an SMP system and a Cluster is shared memory. In a Cluster, code has to ask for data, whereas in SMP it can just read it directly. This has some nice advantages and some (scaling) disadvantages.
Clojure's STM, which differs quite significantly from many other STM systems out there, is built upon the notion of relative time as measured by a per-transaction generation counter. Without common access to this generation counter it cannot give events an order and can't do its job (please forgive this overly simple explanation).
One of the STM's main motivations was to create a system that really took advantage of shared-memory concurrency by ensuring, for instance, that readers never wait for writers and readers always see valid data. Because it was built to exploit shared memory, it loses a lot of its appeal without it.
The actor model (ala Erlang) is a better fit for distributed computing.
Or, in other words: perhaps we should not try to put a square peg in a distributed concurrent hole.
Not really. I mean, it could be made to work; things like Terracotta claim to be able to distribute a logical JVM over multiple nodes, but Clojure's STM / collection semantics rely pretty strongly on inter-thread memory sharing to be efficient with respect to space and time.
You're probably better off taking care of the multi-node parts of your system using a message-passing or batch-oriented architecture.
It could be made to work, but it's not a good idea. There's a reason NoSQL is big now: transactions don't work well over a network.
The Avout project allows you to distribute STM state over multiple machines:
http://avout.io/
https://github.com/liebke/avout
I am developing a Clojure application which makes heavy use of STM. Is it better to use one global ref or many smaller refs in general? Each ref will be storing data like a relational database table does, so there will be several refs.
A possible benefit of using fewer refs is that it will be easier to comprehend what is happening in your presumably multi-threaded app.
A possible benefit of using more refs is that you will be locking less code at a time and increasing speed.
If you have a ref per table and you need to maintain data integrity between two tables, then you are going to be responsible for implementing that logic since the STM has no knowledge of how the data relates between the tables.
Your answer might lie in how often a change in one table will affect another, and whether breaking your single ref out into one ref per table even gives you a notable performance increase at all.
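For example, with one ref per "table", cross-table integrity is your job: the STM guarantees the refs change together inside one dosync, but nothing forces you to check one table before writing the other. A sketch with made-up accounts/orders tables:

```clojure
;; Two "tables", one ref each.
(def accounts (ref {1 {:name "Ann"}}))
(def orders   (ref {}))

;; The STM makes the read of accounts and the write to orders part of
;; one transaction, but the foreign-key-style check is code *we* must
;; remember to write -- the STM knows nothing about the relationship.
(defn place-order! [account-id order-id item]
  (dosync
    (when (contains? @accounts account-id)
      (alter orders assoc order-id {:account account-id :item item}))))

(place-order! 1 100 "book")  ; known account: accepted
(place-order! 2 101 "pen")   ; unknown account: rejected
```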
I've usually found it is better to minimise the number of refs.
Key reasons:
It's often helpful to treat large portions of application state as single immutable blocks (e.g. to enable easy snapshots for analysis, or passing to test functions)
There will be more overhead from using lots of small refs
It keeps your top-level namespace simpler
Of course, this will ultimately depend on your workload and concurrency demands, but my general advice would be to go with as few refs as possible until you are sure that you need more.
It was hard for me to come up with a real-world example for concurrency:
Imagine the above situation, where there are many lanes, many junctions, and a great number of cars. Besides, there is a human factor.
The problem is a hard research area for traffic engineers. When I investigated it some time ago, I noticed that many models failed on it. When people talk about functional programming, this problem tends to pop into my mind.
Can you simulate it in Haskell? Is Haskell really so concurrent? What are the limits to parallelise such concurrent events in Haskell?
I'm not sure what the question is exactly. Haskell 98 doesn't specify anything for concurrency. Specific implementations, like GHC, provide extensions that implement parallelism and concurrency.
To simulate traffic, it would depend on what you needed out of the simulation, e.g. if you wanted to track individual cars or do it in a general statistical way, whether you wanted to use ticks or a continuous model for time, etc. From there, you could come up with a representation of your data that lent itself to parallel or concurrent evaluation.
GHC provides several methods to leverage multiple hardware execution units, ranging from traditional semaphores and mutexes, to channels with lightweight threads (which could be used to implement an actor model like Erlang), to software transactional memory, to pure functional parallel expression evaluation, with strategies, and experimental nested data parallelism.
So yes, Haskell has many approaches to parallel execution that could certainly be used in traffic simulations, but you need to have a clear idea of what you're trying to do before you can choose the best digital representation for your concurrent simulation. Each approach has its own advantages and limits, including learning curve. You may even learn that concurrency is overkill for the scale of your simulations.
It sounds to me like you are trying to do a simulation, rather than real-world concurrency. This kind of thing is usually tackled using discrete event simulation. I did something similar in Haskell a few years ago, and rolled my own discrete event simulation library based on the continuation monad transformer. I'm afraid it's owned by my employer, so I can't post it, but it wasn't too difficult. A continuation is effectively a suspended thread, so define something like this (from memory):
type Sim r a = ContT r (StateT (ThreadQueue r) IO) a
newtype ThreadQueue r = TQ [() -> Sim r ()]
The ThreadQueue inside the state holds the queue of currently scheduled threads. You can also have other types of thread queue to hold threads that are not scheduled, for instance in a semaphore (based on "IORef (Int, ThreadQueue)"). Once you have semaphores you can build the equivalent of MVars and MQueues.
To schedule a thread use "callCC". The argument to "callCC" is a function "f1" that itself takes a function "c" as an argument. This inner argument "c" is the continuation: calling it resumes the thread. When you do this, from that thread's point of view "callCC" just returned the value you gave as an argument to "c". In practice you don't need to pass values back to the suspended threads, so the parameter type is ().
So your argument to "callCC" is a lambda function that takes "c" and puts it on the end of whatever queue is appropriate for the action you are doing. Then it takes the head of the ThreadQueue from inside the state and calls that. You don't need to worry about this function returning: it never does.
If you need a concurrent programming language with a functional sequential subset, consider Erlang.
I imagine you're asking if you could have one thread for each object in the system?
The GHC runtime scales nicely to millions of threads, and multiplexes those threads onto the available hardware, via the abstractions Chris Smith mentioned. So it certainly is possible to have thousands of threads in your system, if you're using Haskell/GHC.
Performance-wise, it tends to be a good deal faster than Erlang, but places less emphasis on distribution of processes across multiple nodes. GHC in particular is more targeted towards fast concurrency on shared-memory multicore systems.
Erlang, Scala, Clojure are languages that might suit you.
But I think what you need more is to find a suitable Multi-Agents simulation library or toolkit, with bindings to your favourite language.
I can tell you about MASON, Swarm and Repast. But these are Java and C libraries...
I've done one answer on this, but now I'd like to add another from a broader perspective.
It sounds like the thing that makes this a hard problem is that each driver bases their actions on mental predictions of what other drivers are going to do. For instance, when I am driving I can tell when a car is likely to pull in front of me, even before he indicates, based on the way he is lining himself up with the gap between me and the car in front. He in turn can tell that I have seen him from the fact that I'm backing off to make room for him, so it's OK to pull in. A good driver picks up lots of these subtle clues, and it's very hard to model.
So the first step is to find out what aspects of real driving are not included in the failed models, and work out how to put them in.
(Clue: all models are wrong, but some models are useful).
I suspect that the answer is going to involve giving each simulated driver one or more mental models of what each other driver is going to do. This involves running the planning algorithm for Driver 2 using several different assumptions that Driver 1 might make about the intentions of Driver 2. Meanwhile Driver 2 is doing the same about Driver 1.
This is the kind of thing that can be very difficult to add to an existing simulator, especially if it was written in a conventional language, because the planning algorithm may well have side effects, even if it's only in the way it traverses a data structure. But a functional language may well be able to do better.
Also, the interdependence between drivers probably means there is a fixpoint somewhere in there, which lazy languages tend to do better with.