Relation between Paxos and Snapshot Isolation for concurrency control

I was wondering what the actual relationship is between Paxos-based replication schemes and concurrency models like snapshot isolation. Could anybody explain how these two relate, with a few practical examples?

Snapshot isolation is a strong consistency criterion in the sense that it totally orders conflicting updates. Total order is equivalent to consensus in distributed systems, and Paxos is a solution to consensus in distributed systems with process faults. So the answer to your question is: the relationship between snapshot isolation and Paxos-based replication schemes is that they are equally hard to implement in a theoretical sense (i.e., each is possible under the same assumptions).
Other examples of strongly consistent criteria are serializability, linearizability, and sequential consistency. In contrast, weakly consistent criteria such as causal consistency or eventual consistency don't need consensus and are thus fundamentally different from the Paxos-based replication schemes that you might have seen.
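To make "totally orders conflicting updates" concrete, here is a minimal sketch of snapshot isolation's first-committer-wins rule. The class names (`SIStore`, `Txn`) are invented for illustration and are not from any real database library:

```python
# Hypothetical sketch of snapshot isolation: each transaction reads from a
# fixed snapshot, and of two conflicting concurrent updates, only the first
# committer wins -- the conflicting writes end up totally ordered.

class SIStore:
    def __init__(self):
        self.data = {}       # key -> (value, commit_version)
        self.version = 0     # global commit counter (the total order)

    def begin(self):
        return Txn(self, self.version)

class Txn:
    def __init__(self, store, snapshot_version):
        self.store = store
        self.snapshot = snapshot_version   # reads see this snapshot only
        self.writes = {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value, _ = self.store.data.get(key, (None, 0))
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if any written key was committed after our snapshot:
        # of two conflicting updates, only the first committer wins.
        for key in self.writes:
            _, ver = self.store.data.get(key, (None, 0))
            if ver > self.snapshot:
                return False
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)
        return True

store = SIStore()
t1, t2 = store.begin(), store.begin()   # two concurrent transactions
t1.write("x", 1)
t2.write("x", 2)
assert t1.commit() is True    # first committer wins
assert t2.commit() is False   # conflicting concurrent update aborts
```

In a Paxos-replicated system, agreeing on which transaction committed "first" is exactly the consensus problem the answer refers to.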

Related

Multi-threaded synchronization of shared model in project

I am currently writing a multi-threaded application (a game, to be precise) as a hobby/research project, and I have run into a deceptively "simple" problem: synchronization between threads (if it matters, it's in C++).
My main issue is this: I am trying to learn good design, and mutexing my whole model everywhere I can is (in my opinion) wasteful of resources and just plainly asking for problems in further development. I have thought about making the whole synchronization process transaction-based, but I feel it doesn't fit the performance and extensibility a game requires. I am new to concurrent programming, and I am here to learn about patterns specific to it.
Some words about current design:
MVC approach
Online synchronization is handled by a separate agent, which is identical on the slave client and the master server and runs separately from any server logic.
A database-like structure is synced independently of the server logic and has a minor subscription/observer pattern built in to notify controllers about changes.
Notes
I am not looking for documentation-specific pieces of information (unless they are directly connected to performance or design); I know my cppreference.
I am looking for extensive blog posts/websites that can teach me more about concurrent design patterns.
I do want to know if I am just plainly doing things wrong (not in the wrong order, though).
EDIT
As Mike mentioned, I did not actually ask the questions:
1) What are the best design patterns/norms that can be used in concurrent programming (mostly those applicable to my case)?
2) What are the biggest no-gos for concurrent programming performance?
You are starting from a bit of a mistaken idea: parallelism is about performance; concurrency is about correctness. A concurrent system isn't necessarily the fastest solution. A good concurrent system minimizes and explicitly defines dependencies, enabling a robust, reactive system with minimal latency. In contrast, a parallel system seeks to minimize its execution time by maximizing its utilization of resources; in doing so, it might maximize latency. There is overlap, but the mindset is quite different.
There are many good concurrent languages. C++ isn't one of them. That said, you can write good concurrent systems in any language. Most concurrent languages have a strong message passing bias, but good message passing libraries are available to most languages.
Message passing is distinct from low-level synchronization mechanisms in that it is a model, a way of thinking, in and of itself. Mutexes, semaphores, etc. are not; they are tools, and should probably be ignored until the design is reasonably complete.
The design phase should be more abstract than synchronization mechanisms. Ideally, it should flesh out the operations (or transactions, if you prefer) and the necessary interactions between them. From that schema, choices about how to arrange data and code for concurrent access should follow naturally. If they don't, your schema is incomplete.
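To illustrate the message-passing style the answer recommends, here is a minimal sketch (plain Python standard library; the worker protocol and the doubling "handler" are invented for illustration). Threads never touch shared mutable state directly; they exchange messages through queues:

```python
# Message-passing sketch: the worker thread owns its state and communicates
# only via its inbox/outbox queues, so no mutexes on the model are needed.
import queue
import threading

def worker(inbox, outbox):
    while True:
        msg = inbox.get()
        if msg is None:                    # sentinel message: shut down
            break
        outbox.put(("done", msg * 2))      # reply with a new message

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

inbox.put(21)
result = outbox.get()                      # ('done', 42)
inbox.put(None)                            # ask the worker to stop
t.join()
```

The same shape scales to a game: each subsystem (physics, AI, networking) owns its data and reacts to messages, which matches the observer/subscription design the question already describes.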

What's the difference between a single Paxos state machine and multiple Paxos state machines in the Spanner paper?

The Spanner paper says:
"To support replication, each spanserver implements a single Paxos state machine on top of each tablet. (An early Spanner incarnation supported multiple Paxos state machines per tablet, which allowed for more flexible replication configurations. The complexity of that design led us to abandon it.)"
So can anyone explain what a single Paxos state machine means, and what multiple Paxos state machines mean?
My guess is that multiple Paxos state machines per tablet means there are multiple independent single Paxos state machines in a single tablet, so the leader and followers in a single tablet can replicate data in parallel, since these single Paxos state machines are independent.
Is that right? If I am misunderstanding something, please correct me. Thank you.
What does the single Paxos state machine mean?
By "Paxos state machine" the paper's authors mean a replicated state machine (RSM) based on the Paxos consensus algorithm.
The idea of an RSM was introduced in this paper and is explained in this Wikipedia article. In short, the idea is to build a replicated log of (deterministic) state machine operations and execute those operations on each replica in the same order, which will necessarily lead to the same state on each replica.
Paxos, which is covered well in this paper and explained in this Wikipedia article, can be used to create that replicated log.
A single vs. multiple Paxos state machines means literally that: a single or multiple Paxos-based RSMs. (Note that this is not about basic Paxos vs. Multi-Paxos; that's a different distinction.) Think of it almost like a single database vs. multiple databases.
What is the practical difference? It's unclear from the paper, which only says that multiple RSMs "allowed for more flexible replication configurations." One possibility is that multiple RSMs allowed some form of sharding; in multi-tenant architectures it is common to have multiple RSM replicas per server, one for each entity (e.g., for each database). Another possibility is that it let them overcome some performance bottleneck. There are probably other possibilities as well.
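The core RSM idea described above can be sketched in a few lines. Assuming consensus away (i.e., pretending Paxos has already agreed on the log contents and order), each replica applies the same deterministic operations in the same order and necessarily reaches the same state:

```python
# RSM sketch: deterministic operations + agreed log order => identical state
# on every replica. The op format here is invented for illustration; in a
# real system, Paxos is what guarantees all replicas see this same log.

def apply_log(log):
    state = {}
    for op, key, value in log:      # apply ops strictly in log order
        if op == "set":
            state[key] = value
        elif op == "incr":
            state[key] = state.get(key, 0) + value
    return state

log = [("set", "x", 1), ("incr", "x", 4), ("set", "y", 7)]
replica_a = apply_log(log)          # every replica runs the same log...
replica_b = apply_log(log)
assert replica_a == replica_b == {"x": 5, "y": 7}   # ...and converges
```

A "single Paxos state machine per tablet" means one such log (and one `apply_log` loop) for the whole tablet; "multiple" would mean several independent logs, each replicated separately.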

Are actor based solutions to classic concurrency problems easier to prove correct?

I'm trying to grasp whether Actor Model, especially popular frameworks like Akka provides measurable benefits to software designs.
While I'm not a CS theorist, I would feel more confident in this model if it were true that it allows one to construct simpler correctness proofs and better specifications. Is reliability a strong point of the actor model? Would it be a good fit for mission-critical software (health care, avionics, etc.)?
My worry is that without some strong and possibly unattainable "global" guarantees like fairness (as I believe actors may exhaust the threads available to others) or delivery in finite time, using this model can lead to a distributed system that is an order of magnitude more complex to design, describe, and debug than the alternatives.
As with any other model, Akka's reliability depends on the developer. Provided that you use Akka correctly (following the recommended best practices), it is very reliable. And using the actor model correctly is easy and intuitive once you learn the basics.
For example, an actor will hold a thread for too long only if you send it messages that are too large (large == long to process). If you use more, smaller messages, the dispatcher will guarantee pretty good fairness. Or, if you need absolute fairness, just use a round-robin router.
Delivery in finite time depends highly on how you choose the mailbox. For example, you can prioritize some important messages over others.
Another Akka advantage is easy scalability: if something works for 5 actors, it will work for 10,000 actors. So I believe distributed systems in Akka are easier to design and manage than with the traditional models.
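The mailbox-per-actor and round-robin ideas above can be sketched without Akka. This is plain Python, not Akka's API; the `Actor` class here is invented for illustration, and each actor simply processes its mailbox one message at a time while a round-robin router spreads work evenly:

```python
# Toy actor sketch: one mailbox per actor, one message processed at a time,
# round-robin routing so no single mailbox monopolizes the workers.
import itertools
import queue
import threading

class Actor:
    def __init__(self, handler):
        self.mailbox = queue.Queue()
        self.handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            msg = self.mailbox.get()       # block until a message arrives
            if msg is None:
                break
            self.handler(msg)              # one message at a time: no locks

    def tell(self, msg):
        self.mailbox.put(msg)              # fire-and-forget send

results = queue.Queue()
actors = [Actor(lambda m: results.put(m * m)) for _ in range(3)]
router = itertools.cycle(actors)           # round-robin routing

for n in range(6):
    next(router).tell(n)                   # messages spread across actors

squares = sorted(results.get() for _ in range(6))
assert squares == [0, 1, 4, 9, 16, 25]
```

Note how fairness falls out of the routing policy rather than the actors themselves, which is the point the answer makes about Akka's routers and dispatchers.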

Could Clojure's STM model be made to work over multiple JVMs?

I know that Clojure works well on a multicore machine, but I was wondering if it would make sense for it to work over a JVM cluster spread out over many machines?
Runa looked into using Terracotta for this and ended up publishing swarmiji as a distributed agent library.
One of the real differences between an SMP system and a Cluster is shared memory. In a Cluster, code has to ask for data, whereas in SMP it can just read it directly. This has some nice advantages and some (scaling) disadvantages.
Clojure's STM, which differs quite significantly from the many other STM systems out there, is built upon the notion of relative time as measured by a generation counter per transaction. Without common access to this generation counter it cannot give events an order and can't do its job (please forgive this overly simple explanation).
One of the STM's main motivations was to create a system that really took advantage of shared-memory concurrency by ensuring, for instance, that readers never wait for writers and readers always see valid data. Because it was built to take advantage of shared memory, it loses a lot of its appeal without shared memory.
The actor model (ala Erlang) is a better fit for distributed computing.
Or, in other words: perhaps we should not try to put a square peg in a distributed concurrent hole.
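A rough sketch of the generation-counter idea (greatly simplified, and not Clojure's actual implementation): every committed value is stamped with a global generation, and a reader sees the newest value no newer than its start generation, so readers never block writers. The catch, as the answer says, is that this only works because every thread shares the counter and the value history:

```python
# Simplified MVCC-style sketch of the generation counter behind Clojure's STM.
# All names here are invented for illustration.
import itertools

generation = itertools.count(1)   # shared generation counter (needs shared memory!)
history = {}                      # ref -> list of (generation, value) versions

def commit(ref, value):
    history.setdefault(ref, []).append((next(generation), value))

def read(ref, start_gen):
    # Newest committed value at or before the reader's start generation.
    return max((g, v) for g, v in history[ref] if g <= start_gen)[1]

commit("r", "old")                  # committed at generation 1
start = next(generation)            # a read transaction starts here (gen 2)
commit("r", "new")                  # a later writer commits (gen 3) without
                                    # blocking the reader...
assert read("r", start) == "old"    # ...who still sees its consistent snapshot
```

On a cluster there is no cheap shared `generation` counter or `history` map: every read would become a network round-trip, which is exactly why the shared-memory design loses its appeal.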
Not really. I mean, it could be made to work; things like Terracotta claim to be able to distribute a logical JVM over multiple nodes, but Clojure's STM and collection semantics rely pretty strongly on inter-thread memory sharing to be efficient with respect to space and time.
You're probably better off taking care of the multi-node parts of your system using a message-passing or batch-oriented architecture.
It could be made to work, but it's not a good idea. There is a reason NoSQL is big now: transactions don't work well over a network.
The Avout project allows you to distribute STM state over multiple machines:
http://avout.io/
https://github.com/liebke/avout

How does db4o support concurrency and transactions?

We are looking at db4o for a high-volume e-commerce website using Java on the server side. Concurrency and transactional support are very important to us. When a customer purchases an item we need to lock the product and customer objects to update inventory and customer order history, respectively, in a single transaction. Is this possible with db4o? I want to make sure it supports multi-object transactions.
There are already similar questions here, like this one. And my answer is more or less the same.
About the high-volume e-commerce website: db4o was never built as a high-volume, big database, but rather for embedded use cases like desktop and mobile apps. Well, it depends what 'high volume' means; I assume it means hundreds of concurrent transactions. That's certainly out of scope for db4o.
Concurrency and transactional support: the db4o core is still inherently single-threaded and therefore can only serve a small number of concurrent operations. db4o supports transactions with read-committed isolation, which means that a transaction can only see the committed state of other transactions. In practice that's a very weak guarantee.
To your example: you can update the purchase with the product and customer in one transaction. However, another transaction could update any of these objects and commit. Then a running transaction which has already read some objects might do calculations with the old values and store the result. So the weak isolation 'taints' your state.
You could use locks to prevent that, but db4o doesn't have any nice object-locking mechanism, and it would decrease performance further.
All in all, I think you probably need a 'larger' database with better support for concurrency and transaction handling.
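The lost-update anomaly the answer describes, and the lock-based fix it hints at, can be sketched generically (plain Python threads standing in for db4o transactions; the function names are invented for illustration). An unguarded read-modify-write loses sales, while a coarse lock, roughly what db4o semaphores provide, serializes it:

```python
# Lost-update sketch: under weak isolation, two "transactions" both read the
# same inventory count and both write count-1, losing one sale. Holding a
# lock across the whole read-modify-write prevents it.
import threading

inventory = {"widget": 10}
lock = threading.Lock()

def purchase_unsafe():
    count = inventory["widget"]        # two threads may both read 10...
    inventory["widget"] = count - 1    # ...and both write 9: one sale lost

def purchase_safe():
    with lock:                         # serialize the read-modify-write
        inventory["widget"] -= 1

threads = [threading.Thread(target=purchase_safe) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert inventory["widget"] == 5        # all five purchases accounted for
```

A database with proper multi-object transactions does this bookkeeping for you; with db4o's read-committed isolation you would have to build it yourself, which is the answer's point.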
It sounds like you need to use db4o semaphores.