As I understand it, Clojure's STM can be used for transacting values across refs.
I understand this property is useful in datastores, where two or more locations have to be mutated in a single transaction - atomicity.
However, in what cases is this useful in ordinary software applications? I could just store all my state in one map and use a clojure.core/atom if I want shared mutable state.
In what types of applications / scenarios does using refs make sense over atoms or other state primitives in Clojure?
Using an atom is indeed what's suggested in the Elements of Clojure book by Zach Tellman (chapter 2 - If you have mutable state, use an atom).
They say that until ~60% utilization of the stateful container, an atom is probably a better choice.
The advice is summarized at the end of the section as:
If you have mutable state, make sure it belongs inside your process.
If it does, try to represent it as a single atom.
If that causes performance issues, try spreading the work across more processes.
If that isn’t possible, see if the atom can be split into smaller atoms that don’t require shared consistency.
Finally, if that doesn’t help, you should start looking into Clojure’s STM primitives.
Related
What are tradeoffs of representing state using a single atom and a hashmap vs multiple refs?
For example:
(def start (atom {:location "Chicago" :employer "John"}))
vs
(def location (ref "Chicago"))
(def employer (ref "John"))
Many thanks
The single-atom version is better and has fewer tradeoffs. Assuming you never want to change the employer and the location in an uncoordinated way, your win is that you don't need a dosync block to change location, employer, or both. With the atom, you can simply (swap! start assoc :location "baz").
A big tradeoff of using multiple refs is that all transactions touching refs are tried in parallel, and the first one to commit wins; the others are restarted. While that is also true for atoms, having a separate ref for every entry requires more monitoring, grouping (for dosync blocks), etc. behind the scenes. To get fewer restarts, it makes sense to group the information in a hash map and, depending on whether coordinated change is required, put it in a ref or an atom.
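To make the contrast concrete, here is a minimal sketch of the two approaches from the question (the var names follow the example above; the new values are arbitrary):

```clojure
;; Single atom: one swap! updates both entries atomically,
;; no transaction machinery needed.
(def start (atom {:location "Chicago" :employer "John"}))

(swap! start assoc :location "New York" :employer "Acme")

;; Multiple refs: any coordinated change needs a dosync transaction,
;; which the STM may retry if another transaction commits first.
(def location (ref "Chicago"))
(def employer (ref "John"))

(dosync
  (ref-set location "New York")
  (ref-set employer "Acme"))
```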
Multiple Refs allow for more concurrency, since all writes to an Atom are linearized. The STM allows for many parallel transactions to commit when there are no conflicting writes / ensures (and additionally it provides commute which allows one to make certain writes which would normally cause a conflict to not do so).
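As a sketch of the commute point above (a hypothetical counter var; assuming a plain increment, which is order-independent):

```clojure
(def counter (ref 0))

;; alter conflicts: if another transaction wrote the ref first,
;; this transaction is retried.
(dosync (alter counter inc))

;; commute does not conflict for order-independent updates: many
;; incrementing transactions can commit in parallel, and the STM
;; re-runs the commute function at commit time.
(dosync (commute counter inc))
```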
Additionally, the STM cooperates with Agents -- actions sent to Agents from within a transaction will be performed if and only if the transaction commits. This allows one to cause side effects from inside a transaction safely. Atoms offer no similar facility.
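A minimal sketch of the Agent cooperation described above (the `balance` and `audit-log` vars are hypothetical):

```clojure
(def balance   (ref 100))
(def audit-log (agent []))   ; side-effect target

(dosync
  (alter balance - 25)
  ;; This send is held by the STM and dispatched only if the
  ;; transaction commits, so a retried transaction cannot log twice.
  (send audit-log conj {:withdrew 25}))
```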
The trade-off is the STM's overhead is larger than an Atom's, plus there is the possibility of certain anomalies occurring (write skew, see the Wikipedia page on snapshot isolation). Additionally, it's possible to achieve great concurrency with the STM while having serious problems with obtaining a snapshot of the entire system; in this connection, see Christophe Grand's excellent blog post and his megaref library.
In many scenarios people find that just storing all state in a single Atom is enough and that's definitely a simpler, more lightweight approach.
I don't think you should be thinking about trade-offs between atoms vs. refs, since they're used for different situations.
You should use an atom when you want to change a single thing atomically.
refs use STM and involve many different things being changed simultaneously, in a transaction.
In your particular situation you should be answering the question about the thing you're changing.
Is it a single thing you want/can change in one step
Are different things you want/need to change transactionally
If, in order to change a single field, you swap the entire old database for a new one, changing everything together, and then say your database is an atom, you're abusing the mechanism.
Hope the distinction helps; for your example I would nonetheless use an atom.
There's a good summary here with motivations behind each strategy.
I know that Clojure works well on a multicore machine, but I was wondering if it would make sense for it to work over a JVM cluster spread out over many machines?
Runa looked into using Terracotta for this and ended up publishing swarmiji as a distributed agent library.
One of the real differences between an SMP system and a Cluster is shared memory. In a Cluster, code has to ask for data, whereas in SMP it can just read it directly. This has some nice advantages and some (scaling) disadvantages.
Clojure's STM, which differs quite significantly from many of the other STM systems out there, is built on a notion of relative time as measured by a per-transaction generation counter. Without common access to this generation counter it cannot give events an order and can't do its job (please forgive this overly simple explanation).
One of the STM's main motivations was to create a system that really took advantage of shared-memory concurrency by ensuring, for instance, that readers never wait for writers and readers always see valid data. Because it was built to take advantage of shared memory, it loses much of its appeal without shared memory.
The actor model (ala Erlang) is a better fit for distributed computing.
Or, in other words: perhaps we should not try to put a square peg in a distributed concurrent hole.
Not really. I mean, it could be made to work; things like Terracotta claim to be able to distribute a logical JVM over multiple nodes, but clojure's STM / collection semantics rely pretty strongly on inter-thread memory sharing to be efficient wrt space and time.
You're probably better off taking care of the multi-node parts of your system using a message-passing or batch-oriented architecture.
It could be done, but it's not a good idea. There is a reason NoSQL is big now: transactions don't work well over a network.
The Avout project allows you to distribute STM state over multiple machines:
http://avout.io/
https://github.com/liebke/avout
I am developing a clojure application which makes heavy use of STM. Is it better to use one global ref or many smaller refs in general. Each ref will be storing data like in a relational database table, so there will be several refs.
A possible benefit of using fewer refs is that it will be easier to comprehend what is happening in your presumably multi-threaded app.
A possible benefit of using more refs is that you will be locking less code at a time and increasing speed.
If you have a ref per table and you need to maintain data integrity between two tables, then you are going to be responsible for implementing that logic since the STM has no knowledge of how the data relates between the tables.
Your answer might lie in how often a change in one table will affect another, and whether breaking your single ref out into one ref per table even gives you a notable performance increase at all.
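A sketch of what that hand-written integrity logic might look like with one ref per "table" (the `users`/`orders` vars and the `add-order!` helper are hypothetical):

```clojure
(def users  (ref {1 {:name "Ann"}}))
(def orders (ref {}))

(defn add-order! [user-id order-id item]
  (dosync
    ;; ensure protects the read: no other transaction can change
    ;; users underneath us before this transaction commits.
    (when-not (contains? (ensure users) user-id)
      (throw (ex-info "unknown user" {:user-id user-id})))
    (alter orders assoc order-id {:user user-id :item item})))
```

Note that the STM only coordinates the writes; the rule that every order must reference an existing user is entirely your own code.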
I've usually found it is better to minimise the number of refs.
Key reasons:
It's often helpful to treat large portions of application state as single immutable blocks (e.g. to enable easy snapshots for analysis, or passing to test functions)
There will be more overhead from using lots of small refs
It keeps your top-level namespace simpler
Of course, this will ultimately depend on your workload and concurrency demands, but my general advice would be to go with as few refs as possible until you are sure that you need more.
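The snapshot point can be sketched with a single hypothetical `app-state` atom:

```clojure
(def app-state (atom {:sessions {} :metrics {:requests 0}}))

;; deref returns an immutable value: a consistent snapshot of all
;; application state, safe to log, inspect, or pass to test functions.
(defn snapshot [] @app-state)

(swap! app-state update-in [:metrics :requests] inc)
(let [s (snapshot)]
  ;; s is a plain map, unaffected by any later swap!s
  (get-in s [:metrics :requests]))
```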
There is a lot of buzz these days about not using locks and instead using message-passing approaches like Erlang's, or about using immutable data structures as in functional programming vs. C++/Java.
But what I am concerned with is the following:
AFAIK, Erlang does not guarantee message delivery. Messages might be lost. Won't the algorithm and code bloat and become complicated again if you have to worry about lost messages? Whatever distributed algorithm you use must not depend on guaranteed delivery of messages.
What if the message is a complicated object? Isn't there a huge performance penalty in copying and sending messages versus, say, keeping the object in a shared location (like a DB that both processes can access)?
Can you really do away with shared state entirely? I don't think so. For example, in a DB you have to access and modify the same record. You cannot use message passing there. You need locking, or you assume an optimistic concurrency control mechanism and roll back on errors. How does Mnesia work?
Also, it is not the case that you always need to worry about concurrency. Any project will have a large piece of code that has nothing to do with concurrency or transactions at all (but does have performance and speed as a concern). A lot of these algorithms depend on shared state (that's why pass-by-reference and pointers are so useful).
Given this, writing programs in Erlang etc. is a pain because you are prevented from doing any of these things. Maybe it makes programs robust, but for things like solving a linear programming problem or computing the convex hull, performance is more important, and forcing immutability on an algorithm that has nothing to do with concurrency/transactions is a poor decision. Isn't it?
That's real life : you need to account for this possibility regardless of the language / platform. In a distributed world (the real world), things fail: live with it.
Of course there is a cost: nothing is free in our universe. But shouldn't you use another medium (e.g. file, db) instead of shuttling "big objects" in communication pipes? You can always use "message" to refer to "big objects" stored somewhere.
Of course not: the idea behind functional programming / Erlang OTP is to "isolate" as much as possible the areas were "shared state" is manipulated. Futhermore, having clearly marked places where shared state is mutated helps testability & traceability.
I believe you are missing the point: there is no such thing as a silver bullet. If your application cannot be successfully built using Erlang, then don't use it. You can always build some other part of the overall system in another fashion, i.e. use a different language / platform. Erlang is no different from any other language in this respect: use the right tool for the right job.
Remember: Erlang was designed to help solve concurrent, asynchronous and distributed problems. It isn't optimized for working efficiently on a shared block of memory for example... unless you count interfacing with nif functions working on shared blocks part of the game :-)
Real-world systems are always hybrids anyway: I don't believe the modern paradigms try, in practice, to get rid of mutable data and shared state.
The objective, however, is not to need concurrent access to this shared state. Programs can be divided into the concurrent and the sequential, and use message-passing and the new paradigms for the concurrent parts.
Not every code will get the same investment: There is concern that threads are fundamentally "considered harmful". Something like Apache may need traditional concurrent threads and a key piece of technology like that may be carefully refined over a period of years so it can blast away with fully concurrent shared state. Operating system kernels are another example where "solve the problem no matter how expensive it is" may make sense.
There is no benefit to fast-but-broken: But for new code, or code that doesn't get so much attention, it may be the case that it simply isn't thread-safe, and it will not handle true concurrency, and so the relative "efficiency" is irrelevant. One way works, and one way doesn't.
Don't forget testability: Also, what value can you place on testing? Thread-based shared-memory concurrency is simply not testable. Message-passing concurrency is. So now you have the situation where you can test one paradigm but not the other. So, what is the value in knowing that the code has been tested? The danger in not even knowing if the other code will work in every situation?
A few comments on the misunderstanding you have of Erlang:
Erlang guarantees that messages will not be lost, and that they will arrive in the order sent. A basic error situation is that machine A can not speak to machine B. When that happens process monitors and links will trigger, and system node-down messages will be sent to the processes that registered for it. Nothing will be silently dropped. Processes will "crash" and supervisors (if any) tries to restart them.
Objects can not be mutated, so they are always copied. One way to secure immutability is by copying values to other erlang process' heaps. Another way is to allocate objects in a shared heap, message references to them and simply not have any operations that mutate them. Erlang does the first for performance! Realtime suffers if you need to stop all processes to garbage collect a shared heap. Ask Java.
There is shared state in Erlang. Erlang is not proud of it, but it is pragmatic about it. One example is the local process registry which is a global map that maps a name to a process so that system processes can be restarted and claim their old name. Erlang just tries to avoid shared state if it possibly can. ETS tables that are public are another example.
Yes, sometimes Erlang is too slow. This happens all languages. Sometimes Java is too slow. Sometimes C++ is too slow. Just because a tight loop in a game had to drop down to assembly to kick off some serious SIMD-based vector mathematics you can't deduce that everything should be written in assembly because it is the only language that is fast when it matters. What matters is being able to write systems that have good performance, and Erlang manages quite well. See benchmarks on yaws or rabbitmq.
Your facts are not facts about Erlang. Even if you think Erlang programming is a pain, you will find other people creating some awesome software thanks to it. You should attempt to write an IRC server in Erlang, or something else very concurrent. Even if you never use Erlang again, you will have learned to think about concurrency in another way. But of course you will, because Erlang is awesomely easy.
Those that do not understand Erlang are doomed to re-implement it badly.
Okay, the original was about Lisp, but... its true!
There are some implicit assumptions in your questions - you assume that all the data can fit on one machine and that the application is intrinsically localised to one place.
What happens if the application is so large it cannot fit on one machine? What happens if the application outgrows one machine?
You don't want one way of programming an application when it fits on one machine and a completely different way of programming it as soon as it outgrows that machine.
What happens if you want make a fault-tolerant application? To make something fault-tolerant you need at least two physically separated machines and no sharing.
When you talk about sharing and databases, you omit to mention that things like MySQL Cluster achieve fault-tolerance precisely by maintaining synchronised copies of the data on physically separated machines - there is a lot of message passing and copying that you don't see on the surface. Erlang just exposes this.
The way you program should not suddenly change to accommodate fault-tolerance and scalability.
Erlang was designed primarily for building fault-tolerant applications.
Shared data on a multi-core has its own set of problems - when you access shared data you need to acquire a lock, and if you use a global lock (the easiest approach) you can end up stopping all the cores while you access it. Shared-data access on a multicore can also be problematic because of caching: if the cores have local data caches, then accessing "far away" data (in some other processor's cache) can be very expensive.
Many problems are intrinsically distributed, and the data is never available in one place at the same time - these kinds of problems fit well with the Erlang way of thinking.
In a distributed setting, "guaranteeing message delivery" is impossible - the destination machine might have crashed. Erlang thus cannot guarantee message delivery; it takes a different approach - the system will tell you if it failed to deliver a message (but only if you have used the link mechanism), and then you can write your own custom error recovery.
For pure number crunching Erlang is not appropriate - but in a hybrid system Erlang is good at managing how computations get distributed to the available processors. So we see a lot of systems where Erlang manages the distribution and fault-tolerant aspects of the problem while the problem itself is solved in another language.
For e.g. in a DB, you have to access and modify the same record
But that is handled by the DB. As a user of the database, you simply execute your query, and the database ensures it is executed in isolation.
As for performance, one of the most important things about eliminating shared state is that it enables new optimizations. Shared state is not particularly efficient. You get cores fighting over the same cache lines, and data has to be written through to memory where it could otherwise stay in a register or in CPU cache.
Many compiler optimizations rely on absence of side effects and shared state as well.
You could say that a stricter language guaranteeing these things requires more optimizations to be performant than something like C, but it also makes these optimizations much much easier for the compiler to implement.
Many concerns similar to concurrency issues arise in single-threaded code. Modern CPUs are pipelined, execute instructions out of order, and can run 3-4 of them per cycle. So even in a single-threaded program, it is vital that the compiler and the CPU are able to determine which instructions can be interleaved and executed in parallel.
For correctness, shared is the way to go, and keep the data as normalized as possible. For immediacy, send messages to inform of changes, but always back them up with polling. Messages get dropped, duplicated, re-ordered, delayed - don't rely on them.
If speed is what you're worried about, first do it single-thread and tune the daylights out of it. Then if you've got multiple cores and know how to split up the work, use parallelism.
Erlang provides supervisors and gen_server callbacks for synchronous calls, so you will know about it if a message isn't delivered: either the gen_server call returns a timeout, or your whole node will be brought down and up if the supervisor is triggered.
Usually, if the processes are on the same node, message-passing languages optimise away the data copying, so it's almost like shared memory, except that an object used by both processes can no longer be mutated afterwards - which you could not safely do with shared memory either.
Some state is kept by processes passing it to themselves in recursive tail calls, and some state can of course be passed around in messages. I don't use Mnesia much, but it is a transactional database, so once you have handed the operation to Mnesia (and it has returned) you are pretty much guaranteed it will go through.
Which is why it is easy to tie such applications into Erlang with ports or drivers. Ports are the easiest: a port is much like a Unix pipe, though I think the performance isn't great. And, as said, message passing usually ends up just being pointer passing anyway, as the VM/compiler optimises the memory copy away.
It was hard for me to come up with a real-world example for concurrency:
Imagine the above situation, where there are many lanes, many junctions, and a great number of cars. Besides, there is a human factor.
The problem is a hard research area for traffic engineers. When I investigated it some time ago, I noticed that many models failed on it. When people talk about functional programming, the above problem tends to pop into my mind.
Can you simulate it in Haskell? Is Haskell really so concurrent? What are the limits to parallelise such concurrent events in Haskell?
I'm not sure what the question is exactly. Haskell 98 doesn't specify anything for concurrency. Specific implementations, like GHC, provide extensions that implement parallelism and concurrency.
To simulate traffic, it would depend on what you needed out of the simulation, e.g. if you wanted to track individual cars or do it in a general statistical way, whether you wanted to use ticks or a continuous model for time, etc. From there, you could come up with a representation of your data that lent itself to parallel or concurrent evaluation.
GHC provides several methods to leverage multiple hardware execution units, ranging from traditional semaphores and mutexes, to channels with lightweight threads (which could be used to implement an actor model like Erlang), to software transactional memory, to pure functional parallel expression evaluation, with strategies, and experimental nested data parallelism.
So yes, Haskell has many approaches to parallel execution that could certainly be used in traffic simulations, but you need to have a clear idea of what you're trying to do before you can choose the best digital representation for your concurrent simulation. Each approach has its own advantages and limits, including learning curve. You may even learn that concurrency is overkill for the scale of your simulations.
It sounds to me like you are trying to do a simulation rather than real-world concurrency. This kind of thing is usually tackled using discrete event simulation. I did something similar in Haskell a few years ago, and rolled my own discrete event simulation library based on the continuation monad transformer. I'm afraid it's owned by my employer, so I can't post it, but it wasn't too difficult. A continuation is effectively a suspended thread, so define something like this (from memory):
type Sim r a = ContT r (StateT (ThreadQueue r) IO) a
newtype ThreadQueue r = TQ [() -> Sim r ()]
The ThreadQueue inside the state holds the queue of currently scheduled threads. You can also have other types of thread queue to hold threads that are not scheduled, for instance in a semaphore (based on "IORef (Int, ThreadQueue)"). Once you have semaphores you can build the equivalent of MVars and MQueues.
To schedule a thread use "callCC". The argument to "callCC" is a function "f1" that itself takes a function "c" as an argument. This inner argument "c" is the continuation: calling it resumes the thread. When you do this, from that thread's point of view "callCC" just returned the value you gave as an argument to "c". In practice you don't need to pass values back to the suspended threads, so the parameter type is the unit type "()".
So your argument to "callCC" is a lambda function that takes "c" and puts it on the end of whatever queue is appropriate for the action you are doing. Then it takes the head of the ThreadQueue from inside the state and calls that. You don't need to worry about this function returning: it never does.
If you need a concurrent programming language with a functional sequential subset, consider Erlang.
More about Erlang
I imagine you're asking if you could have one thread for each object in the system?
The GHC runtime scales nicely to millions of threads, and multiplexes those threads onto the available hardware, via the abstractions Chris Smith mentioned. So it certainly is possible to have thousands of threads in your system, if you're using Haskell/GHC.
Performance-wise, it tends to be a good deal faster than Erlang, but it places less emphasis on distributing processes across multiple nodes. GHC in particular is targeted more towards fast concurrency on shared-memory multicore systems.
Erlang, Scala, Clojure are languages that might suit you.
But I think what you need more is to find a suitable Multi-Agents simulation library or toolkit, with bindings to your favourite language.
I can tell you about MASON, Swarm and Repast. But these are Java and C libraries...
I've done one answer on this, but now I'd like to add another from a broader perspective.
It sounds like the thing that makes this a hard problem is that each driver bases their actions on mental predictions of what other drivers are going to do. For instance, when I am driving I can tell when a car is likely to pull in front of me, even before the driver indicates, based on the way he is lining himself up with the gap between me and the car in front. He in turn can tell that I have seen him from the fact that I'm backing off to make room, so it's OK to pull in. A good driver picks up lots of these subtle clues, and they are very hard to model.
So the first step is to find out what aspects of real driving are not included in the failed models, and work out how to put them in.
(Clue: all models are wrong, but some models are useful).
I suspect that the answer is going to involve giving each simulated driver one or more mental models of what each other driver is going to do. This involves running the planning algorithm for Driver 2 using several different assumptions that Driver 1 might make about the intentions of Driver 2. Meanwhile Driver 2 is doing the same about Driver 1.
This is the kind of thing that can be very difficult to add to an existing simulator, especially one written in a conventional language, because the planning algorithm may well have side effects, even if it's only in the way it traverses a data structure. But a functional language may well be able to do better.
Also, the interdependence between drivers probably means there is a fixpoint somewhere in there, which lazy languages tend to do better with.