Concurrency issue: reducing credit for an account

Hi, we are trying to implement a process where, when a user performs an action, their company's credit is deducted accordingly.
But there is a concurrency issue when multiple users in one company participate in the process: the credit gets deducted incorrectly.
Can anyone point me in the right direction for this kind of issue?
Thanks very much.

This is a classic problem that is entirely independent of the implementation language(s).
You have a shared resource: a persistent data store (typically a database, likely an RDBMS).
You also have a (business) process that uses and/or modifies the information maintained in the shared data store.
When this process can be performed concurrently by multiple actors, the issue of informational integrity arises.
The most common way to address this is to serialize access to the shared resource, so that operations against it occur in sequence.
This serialization can happen at the actor level or at the shared resource itself, and it can take many forms, such as queuing actions, using messaging, or using transactions at the shared resource. It's here that considerations such as system type, application, and the platforms and systems in use become important and determine the design of the overall system.
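As a minimal illustration of that serialization in code (a sketch with made-up names; in a real system the same effect usually comes from a database transaction, e.g. SELECT ... FOR UPDATE, or an atomic conditional UPDATE ... SET credit = credit - :n WHERE credit >= :n):

    // Sketch: serialize all credit deductions for one company behind a
    // mutex so the balance check and the update happen as one atomic step.
    #include <iostream>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct CompanyAccount {
        std::mutex lock;     // serializes access to this company's credit
        long credit = 100;

        bool deduct(long amount) {
            std::lock_guard<std::mutex> guard(lock);
            if (credit < amount) return false;   // check...
            credit -= amount;                    // ...and update, atomically
            return true;
        }
    };

    int main() {
        CompanyAccount acct;
        std::vector<std::thread> users;
        for (int i = 0; i < 10; ++i)             // ten concurrent users
            users.emplace_back([&] { acct.deduct(10); });
        for (auto& t : users) t.join();
        std::cout << acct.credit << '\n';        // always 0, never negative
    }

Without the lock, two users could both read the same balance, both pass the check, and both write back, so the credit "got deducted wrong" exactly as described.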
Take a look at the Wikipedia article on database transactions, and then google your way to more technical content on this topic. You may also wish to look at messaging systems and, if you are feeling adventurous, read up on software transactional memory.

Related

Is it valid to use AWS SQS as a write queue to Aurora database to increase system performance

I am developing a web application server on AWS that needs to support high read and write throughput. My boss gave me a high-level design like this.
I am stuck on the "Write Queue". The team told me that we need it to increase write performance because we have only one master replica to which we can write. I have some basic knowledge about message queues such as SQS and RabbitMQ, but I don't know anything about using one as a database write queue.
At the current stage, I have 3 questions:
Using this architecture, is it really possible to increase the performance of writing to the database (as opposed to writing directly to the master replica)?
How do we handle transactions, especially rollback, when errors occur during the write? Normally, we would control the transaction in application code such that when an error occurs, the whole transaction is rolled back and the app server responds to the client with some error code.
As I mentioned, I have researched using a message queue as a write queue, but I am not sure if I am looking in the right direction. Maybe there is some other technology that is already suited to being a write queue for a database?
In addition to these questions, I believe this is a big topic, and I would like to know where I can research it in detail.
In similar cases, queues are used as a means of de-coupling two systems. There are several advantages and disadvantages when implementing such architectural patterns. I will try to list what I believe are the main ones.
Advantages
Improved response time
As queues do not require complex transactions, they are usually fast and, if correctly configured, safe storage. This means the perceived response latency on the client side will decrease, giving the feeling that the service is "faster".
Separation of concerns
Correctly de-coupling services increases their resilience to errors. For example, if the DB cannot accept more write requests, the clients will be unaffected and their requests will still not be lost as they will be in the queue. This gives Operators more time to react to problems while the service value is only partially affected.
Improved scalability
When operations become complex, it's usually a good idea to separate them into microcomponents. It is way easier to scale up microcomponents than monolithic services. Job queues enable such design patterns.
Disadvantages
Recovering from errors becomes more complex
As said above, if the DB stops accepting requests, jobs will pile up in the queue. Now you have two problems to deal with: a full DB and a full job queue. System problems start propagating across your architecture like ripples, causing several side effects and making it hard to understand what the root cause is.
Identifying bottlenecks requires more time
If the DB writes are slow, putting a queue in front of it won't make things faster. Jobs will still pile up in the queue, and your next task will be figuring out why. When dealing with complex ETL pipelines, improving performance becomes quite a tedious whack-a-mole operation where the bottlenecks just shift from system to system.
Cost per operation increases
The more stages a job needs to traverse for its completion, the more time and money that job will require.
De-coupling components is usually seen as a silver bullet for dealing with performance issues. The correct separation of concerns and responsibilities is a very beneficial practice, but it requires a lot of experience and care. Nowadays monolithic services are seen as the root of all evil. Personally, I prefer to deal with a monolithic bunch of spaghetti rather than a distributed one.
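To make the pattern concrete, here is a minimal in-process sketch of such a write queue (illustrative names, not the SQS or Aurora API): producers enqueue jobs and return immediately, while a single consumer drains the queue and performs the writes, standing in for the lone writable master. Note how it exposes question 2 above: the producer is long gone by the time a write fails, so rollback needs a separate error path.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    class WriteQueue {
        std::queue<std::string> jobs_;
        std::mutex m_;
        std::condition_variable cv_;
        bool closed_ = false;
    public:
        void push(std::string job) {             // called by the "app servers"
            { std::lock_guard<std::mutex> g(m_); jobs_.push(std::move(job)); }
            cv_.notify_one();
        }
        void close() {
            { std::lock_guard<std::mutex> g(m_); closed_ = true; }
            cv_.notify_all();
        }
        bool pop(std::string& job) {             // called by the single writer
            std::unique_lock<std::mutex> g(m_);
            cv_.wait(g, [&] { return !jobs_.empty() || closed_; });
            if (jobs_.empty()) return false;     // closed and drained
            job = std::move(jobs_.front());
            jobs_.pop();
            return true;
        }
    };

    int main() {
        WriteQueue q;
        std::thread writer([&] {                 // stands in for the DB writer
            std::string job;
            while (q.pop(job)) std::cout << "writing: " << job << '\n';
        });
        q.push("INSERT order 1");                // producers return immediately
        q.push("INSERT order 2");
        q.close();
        writer.join();
    }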

What's the difference between the message passing and shared memory models?

This question duplicates the topic of the question asked here.
I would like to ask for some additional clarification with respect to a different point of view.
In distributed computing, memory coherency is in the end implemented using message passing over network channels, with distributed locking and so forth. Message passing, IIUC, does not always eliminate concurrency, except at the very low level, because the processes still usually affect each other's state. And they do so in what they believe to be a consistent way.
For example, a simple command interpreter can be implemented on top of message passing, and the commands can be sent as part of several remote transactions executed in parallel by multiple conversant processes. So, high-level interactions would require design for concurrency in most cases. That is, IMO, it is very unlikely that processes have no transaction semantics for long-term operations.
Additionally, sending a message with a value in a consistent state does not guarantee correctness. It matters how this value is produced and what happens between the messages that provided the input data and the messages that publish the transformed result.
On the other hand, low level interactions with physical memory are always essentially some kind of message passing over buses. So, at the lowest level, sharing memory and message passing are identical.
And at the per-instruction level, the atomicity of aligned loads and stores is usually guaranteed. So, the distinction is still blurry to me.
In prose, how does the choice of shared memory vs. message passing relate to concurrency? Is it just a matter of choosing a technical pattern for solving concurrency and a mathematical model for defining and analyzing interactions of parallel processes, or are these techniques also architectural patterns that, when applied systematically, fundamentally affect the concurrency issues in the system?
Thanks.
Edit (some additional remarks):
Apparently, the two methods are distinguished by correctness and performance. But I have the following problems with this distinction.
I know that messages can act like transfers of big scattered virtual datums. However, the "by-value" approach does not guarantee consistency without synchronization beyond the atomic read of a non-unitary (or procedurally generated) logical datum. By consistency, I mean something like causality or sequential ordering of the changes, etc. With message passing, indeed, every process only mutates its own memory. A process acts just like a controller for its private memory. This is like sharing on top of message passing, serialized by the process owning the datum, but on a MESSAGE-BY-MESSAGE basis (similar to how memory serializes on a word-by-word or cache-line-by-cache-line basis). It remains the responsibility of the application programmer to guarantee synchronization of the transactions involved in sending the messages - namely, that messages to one process, from multiple conversant processes, are sent in a consistent order corresponding to the semantics of the operations those processes are executing. Maybe with control messages to the owning process, or through coordination directly between the contenders, but some restriction on the concurrency of the messages is most likely necessary.
Sharing memory can indeed be faster for local inter-process communication (ignoring contention), but why would this be the case for cross-machine communication? Shared memory for distributed computing is implemented on top of network communication. So shared memory, aside from caching benefits, cannot be faster.
The techniques are obviously different. What I can't seem to understand is how they can be broadly compared to each other when there is nothing intrinsically beneficial to either one. One must assume what the platform supplies and what the software tries to accomplish, and such assumptions cannot be universally true.
If you are architecting a distributed and/or multi-threading application, you would want to ensure that it performs better than a single process single thread application.
With distributed applications, i.e. multiple processes on potentially multiple systems, latency between communicating nodes is a prime concern. With the advent of microservers, latency as well as power consumption go down significantly, to the point where it behooves software developers to start thinking about how to design, develop, debug, deploy, etc. multi-core/microserver applications.
When developing multi-process applications, it usually boils down to using two sets of OS calls at the lowest layer to implement inter-process communication: shared memory, e.g. by using shmget, shmat, shmctl, etc., and, message passing, e.g. by using socket, accept, send, recv, etc.
With shared memory, latency is negligible. Once a reference to a shared memory buffer is obtained, an application can go to any part of the shared memory and modify it. Of course, processes have to cooperate using locks, mutexes, etc. to ensure that the integrity of the data structures is maintained and that the application works correctly. The problem with this solution is: how do you test, in all situations, that integrity is maintained when you have no control over when a context switch may occur?
With message passing, no data is shared. All communication is by means of exchanging buffers. This eliminates having to be concerned with locks, mutexes, etc., but now one has to ensure that the application can handle issues such as network timeouts, bandwidth, latency, etc.
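A bare-bones sketch of the shared-memory flavour (assumes a POSIX system; error handling elided): parent and child increment a counter in a segment obtained with shmget/shmat, serialized by a process-shared mutex placed inside the segment.

    #include <cstdio>
    #include <pthread.h>
    #include <sys/shm.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct Shared { pthread_mutex_t lock; int counter; };

    int main() {
        // Create and attach a System V shared memory segment.
        int id = shmget(IPC_PRIVATE, sizeof(Shared), IPC_CREAT | 0600);
        Shared* s = static_cast<Shared*>(shmat(id, nullptr, 0));

        // The mutex must be marked process-shared to work across fork().
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&s->lock, &attr);
        s->counter = 0;

        pid_t pid = fork();                        // two cooperating processes
        for (int i = 0; i < 100000; ++i) {
            pthread_mutex_lock(&s->lock);          // without this, updates are lost
            ++s->counter;
            pthread_mutex_unlock(&s->lock);
        }
        if (pid == 0) { shmdt(s); return 0; }      // child detaches and exits
        wait(nullptr);
        std::printf("counter = %d\n", s->counter); // 200000 with the lock held
        shmdt(s);
        shmctl(id, IPC_RMID, nullptr);
    }

The message-passing equivalent would replace the segment and mutex with socket/send/recv calls to a single owning process, trading the locking problem for the timeout/bandwidth concerns mentioned above.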
In order to develop apps that can scale beyond a single system, the most common method is to use message passing. If the communicating processes happen to be on the same host, it still works.
Irrespective of whether it is shared memory or message passing, concurrency in the end is essentially about ensuring the integrity of data structures: with locks/mutexes in the case of shared memory, and by serializing requests/responses in the case of message passing.

How does db4o support concurrency and transactions?

We are looking at db4o for a high-volume e-commerce website using Java on the server side. Concurrency and transactional support are very important to us. When a customer purchases an item, we need to lock the product and customer objects to update inventory and customer order history, respectively, in a single transaction. Is this possible with db4o? I want to make sure it supports multi-object transactions.
There are already similar questions here, like this one. And my answer is more or less the same.
About the high-volume e-commerce website: db4o was never built as a high-volume, big database, but rather for embedded use cases like desktop and mobile apps. Well, it depends on what 'high volume' means. I assume it means hundreds of concurrent transactions. That's certainly out of scope for db4o.
Concurrency and transactional support: the db4o core is still inherently single-threaded and therefore can only serve a small number of concurrent operations. db4o supports transactions with read-committed isolation. That means a transaction can only see the committed state of other transactions. In practice that's a very weak guarantee.
To your example: you can update the purchase with the product and customer in one transaction. However, another transaction could update any of these objects and commit. Then a running transaction which has already read some objects might do calculations with the old values and store them. So the weak isolation 'taints' your state.
You could use locks to prevent that, but db4o doesn't have any nice object-locking mechanism, and it would decrease performance further.
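A generic version-check scheme can substitute for object locks here (a sketch of optimistic concurrency control in plain C++, not the db4o API): keep a version counter on each stored object and re-check it in a short critical section at commit time; if another transaction committed in between, re-read and retry instead of storing a stale value.

    #include <iostream>
    #include <mutex>

    struct Product {
        std::mutex lock;     // guards only the short read/commit steps
        long version = 0;
        int inventory = 5;
    };

    struct Snapshot { long version; int inventory; };

    Snapshot readSnapshot(Product& p) {
        std::lock_guard<std::mutex> g(p.lock);
        return {p.version, p.inventory};
    }

    // Succeeds only if nobody committed since our read; otherwise the
    // caller re-reads and retries instead of losing the update.
    bool commit(Product& p, const Snapshot& snap, int newInventory) {
        std::lock_guard<std::mutex> g(p.lock);
        if (p.version != snap.version) return false;
        p.inventory = newInventory;
        ++p.version;
        return true;
    }

    int main() {
        Product p;
        Snapshot s = readSnapshot(p);
        while (!commit(p, s, s.inventory - 1))   // lost the race? retry
            s = readSnapshot(p);
        std::cout << p.inventory << '\n';        // 4
    }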
All in all, I think you probably need a 'larger' database with better support for concurrency and transaction handling.
It sounds like you need to use db4o semaphores.

How to get a debug flow of execution in C++

I work on a global trading system which supports many users. Each user can book, amend, edit, and delete trades. The system is regulated by a central deal capture service. The deal capture service informs all the users of any updates that occur.
The problem comes when we have crashes. As the production environment is impossible to re-create on a test system, I have to rely on crash dumps and log files.
However this doesn't tell me what the user has been doing.
I'd like a system that would (at the time of crashing) dump out a history of what the user has been doing. Anything that I add has to go into the live environment so it can't impact performance too much.
Ideas-wise, I was thinking of a macro at the top of each function which acted like a stack trace (only I could supply additional user information, like trade IDs, user dialog choices, etc.). The system would record stack traces (on a per-thread basis) and keep a history in a cyclic buffer (varying in size, depending on how much history you wanted to capture). Then on a crash, I could dump this history stack.
I'd really like to hear if anyone has a better solution, or if anyone knows of an existing framework?
Thanks
Rich
Your solution sounds pretty reasonable, though perhaps rather than relying on viewing your audit trail in the debugger, you can trigger it being printed with atexit() handlers. Something as simple as a stack of strings that have __FILE__, __LINE__, and pthread_self() in them might be good enough.
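A rough sketch of that idea (all names illustrative): a macro records file/line/function into a fixed-size, per-thread ring buffer of plain char arrays (trivially destructible, so it is still readable from an atexit() handler); in practice you would dump it from your crash handler instead.

    #include <cstdio>
    #include <cstdlib>

    struct TraceRing {
        static const unsigned kDepth = 64;   // history kept per thread
        char entries[kDepth][128];
        unsigned next;

        void push(const char* file, int line, const char* func) {
            std::snprintf(entries[next % kDepth], sizeof entries[0],
                          "%s:%d %s", file, line, func);
            ++next;
        }
        void dump() const {                  // oldest to newest
            unsigned n = next < kDepth ? next : kDepth;
            for (unsigned i = next - n; i != next; ++i)
                std::fprintf(stderr, "  %s\n", entries[i % kDepth]);
        }
    };

    thread_local TraceRing g_trace;          // zero-initialized, one per thread

    #define TRACE_SCOPE() g_trace.push(__FILE__, __LINE__, __func__)

    void amendTrade() { TRACE_SCOPE(); /* ... real work ... */ }

    int main() {
        std::atexit([] { g_trace.dump(); }); // prints main thread's history
        amendTrade();
    }

Extra context (trade IDs, dialog choices) would just be more arguments formatted into the entry; since each push is a bounded snprintf into preallocated storage, the per-call overhead stays small.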
You could possibly use some existing undo framework, as it's similar to an audit trail, but it's going to be more heavyweight than you want. It will likely be based on the command pattern and expect you to implement execute() methods, though I suppose you could just leave them blank.
Trading systems usually don't suffer the performance hit of instrumentation of that level. C++ based systems, in particular, tend to sacrifice the ease of debugging for performance. Otherwise, more companies would be developing such systems in Java/C#.
I would avoid an attempt to introduce stack traces into C++. I am also not confident that you could introduce such a system in a way that would not affect the behavior of the program in some way (e.g., affect threading behavior).
It might, IMHO, be preferable to log the external inputs (e.g., user GUI actions and message traffic) rather than attempt to capture things internally in the program. In that case, you might have a better chance of replicating the failure and debugging it.
Are you currently logging all network traffic to/from the client? Many FIX based systems record this for regulatory purposes. Can you easily log your I/O?
I suggest creating another (circular) log file that contains your detailed information. Beware that this file will grow much faster than your other log files.
Another method is to save the last N transactions. Write a program that reads the transaction log and feeds the data into your virtual application. This may help recreate the cause. I've used this technique with embedded systems before.

Performance penalty of message passing as opposed to shared data

There is a lot of buzz these days about not using locks and instead using message-passing approaches like Erlang's, or about using immutable data structures, as in functional programming, vs. C++/Java.
But what I am concerned with is the following:
AFAIK, Erlang does not guarantee message delivery. Messages might be lost. Won't the algorithm and code become bloated and complicated again if you have to worry about loss of messages? Whatever distributed algorithm you use must not depend on guaranteed delivery of messages.
What if the message is a complicated object? Isn't there a huge performance penalty in copying and sending the messages vs., say, keeping them in a shared location (like a DB that both processes can access)?
Can you really do away with shared state entirely? I don't think so. For example, in a DB, you have to access and modify the same record. You cannot use message passing there. You need locking, or you assume optimistic concurrency control mechanisms and then do rollbacks on errors. How does Mnesia work?
Also, it is not the case that you always need to worry about concurrency. Any project will also have a large piece of code that doesn't have anything to do with concurrency or transactions at all (but does have performance and speed as concerns). A lot of these algorithms depend on shared state (that's why pass-by-reference and pointers are so useful).
Given this fact, writing programs in Erlang etc. is a pain, because you are prevented from doing any of these things. Maybe it makes programs robust, but for things like solving a linear programming problem or computing the convex hull, performance is more important, and forcing immutability etc. on an algorithm when it has nothing to do with concurrency/transactions is a poor decision. Isn't it?
That's real life: you need to account for this possibility regardless of the language/platform. In a distributed world (the real world), things fail: live with it.
Of course there is a cost: nothing is free in our universe. But shouldn't you use another medium (e.g. a file, a DB) instead of shuttling "big objects" through communication pipes? You can always use a "message" to refer to a "big object" stored somewhere.
Of course not: the idea behind functional programming / Erlang OTP is to "isolate" as much as possible the areas where "shared state" is manipulated. Furthermore, having clearly marked places where shared state is mutated helps testability and traceability.
I believe you are missing the point: there is no such thing as a silver bullet. If your application cannot be successfully built using Erlang, then don't do it. You can always build some other part of the overall system in another fashion, i.e. use a different language/platform. Erlang is no different from any other language in this respect: use the right tool for the right job.
Remember: Erlang was designed to help solve concurrent, asynchronous and distributed problems. It isn't optimized for working efficiently on a shared block of memory, for example... unless you count interfacing with NIF functions working on shared blocks as part of the game :-)
Real-world systems are always hybrids anyway: I don't believe the modern paradigms try, in practice, to get rid of mutable data and shared state.
The objective, however, is not to need concurrent access to this shared state. Programs can be divided into the concurrent and the sequential, and use message-passing and the new paradigms for the concurrent parts.
Not every code will get the same investment: There is concern that threads are fundamentally "considered harmful". Something like Apache may need traditional concurrent threads and a key piece of technology like that may be carefully refined over a period of years so it can blast away with fully concurrent shared state. Operating system kernels are another example where "solve the problem no matter how expensive it is" may make sense.
There is no benefit to fast-but-broken: But for new code, or code that doesn't get so much attention, it may be the case that it simply isn't thread-safe, and it will not handle true concurrency, and so the relative "efficiency" is irrelevant. One way works, and one way doesn't.
Don't forget testability: Also, what value can you place on testing? Thread-based shared-memory concurrency is simply not testable; message-passing concurrency is. So now you have a situation where you can test one paradigm but not the other. What is the value in knowing that the code has been tested? And the danger in not even knowing whether the other code will work in every situation?
A few comments on the misunderstanding you have of Erlang:
Erlang guarantees that messages will not be lost and that they will arrive in the order sent. A basic error situation is that machine A cannot speak to machine B. When that happens, process monitors and links will trigger, and system node-down messages will be sent to the processes that registered for them. Nothing will be silently dropped. Processes will "crash" and supervisors (if any) try to restart them.
Objects cannot be mutated, so they are always copied. One way to secure immutability is by copying values to other Erlang processes' heaps. Another way is to allocate objects on a shared heap, send references to them in messages, and simply not have any operations that mutate them. Erlang does the first for performance! Real-time behaviour suffers if you need to stop all processes to garbage-collect a shared heap. Ask Java.
There is shared state in Erlang. Erlang is not proud of it, but it is pragmatic about it. One example is the local process registry which is a global map that maps a name to a process so that system processes can be restarted and claim their old name. Erlang just tries to avoid shared state if it possibly can. ETS tables that are public are another example.
Yes, sometimes Erlang is too slow. This happens in all languages. Sometimes Java is too slow. Sometimes C++ is too slow. Just because a tight loop in a game had to drop down to assembly to kick off some serious SIMD-based vector mathematics, you can't deduce that everything should be written in assembly, as if it were the only language that is fast when it matters. What matters is being able to write systems that have good performance, and Erlang manages quite well. See the benchmarks on Yaws or RabbitMQ.
Your facts are not facts about Erlang. Even if you think Erlang programming is a pain, you will find other people creating some awesome software thanks to it. You should attempt writing an IRC server in Erlang, or something else very concurrent. Even if you're never going to use Erlang again, you will have learned to think about concurrency in another way. But of course you will, because Erlang is awesomely easy.
Those that do not understand Erlang are doomed to re-implement it badly.
Okay, the original was about Lisp, but... it's true!
There are some implicit assumptions in your questions: you assume that all the data can fit on one machine and that the application is intrinsically localised to one place.
What happens if the application is so large it cannot fit on one machine? What happens if the application outgrows one machine?
You don't want one way of programming an application when it fits on one machine and a completely different way of programming it as soon as it outgrows one machine.
What happens if you want make a fault-tolerant application? To make something fault-tolerant you need at least two physically separated machines and no sharing.
When you talk about sharing and databases, you omit to mention that things like MySQL Cluster achieve fault-tolerance precisely by maintaining synchronised copies of the data on physically separated machines - there is a lot of message passing and copying that you don't see on the surface. Erlang just exposes this.
The way you program should not suddenly change to accommodate fault-tolerance and scalability.
Erlang was designed primarily for building fault-tolerant applications.
Shared data on a multi-core has its own set of problems - when you access shared data you need to acquire a lock. If you use a global lock (the easiest approach), you can end up stopping all the cores while you access the shared data. Shared data access on a multi-core can also be problematic due to caching: if the cores have local data caches, then accessing "far away" data (in some other processor's cache) can be very expensive.
Many problems are intrinsically distributed, and the data is never available in one place at the same time - these kinds of problems fit well with the Erlang way of thinking.
In a distributed setting, "guaranteeing message delivery" is impossible - the destination machine might have crashed. Erlang thus cannot guarantee message delivery; it takes a different approach: the system will tell you if it failed to deliver a message (but only if you have used the link mechanism), and then you can write your own custom error recovery.
For pure number crunching Erlang is not appropriate - but in a hybrid system, Erlang is good at managing how computations get distributed to available processors. So we see a lot of systems where Erlang manages the distribution and fault-tolerant aspects of the problem, while the problem itself is solved in another language.
"For example, in a DB, you have to access and modify the same record."
But that is handled by the DB. As a user of the database, you simply execute your query, and the database ensures it is executed in isolation.
As for performance, one of the most important things about eliminating shared state is that it enables new optimizations. Shared state is not particularly efficient. You get cores fighting over the same cache lines, and data has to be written through to memory where it could otherwise stay in a register or in CPU cache.
Many compiler optimizations rely on absence of side effects and shared state as well.
You could say that a stricter language guaranteeing these things requires more optimizations to be performant than something like C, but it also makes these optimizations much much easier for the compiler to implement.
Many concerns similar to concurrency issues arise in single-threaded code. Modern CPUs are pipelined, execute instructions out of order, and can run 3-4 of them per cycle. So even in a single-threaded program, it is vital that the compiler and CPU are able to determine which instructions can be interleaved and executed in parallel.
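The cache-line point above is easy to demonstrate (a self-contained sketch; sizes and timings are machine-dependent): two counters packed into the same cache line make two threads invalidate each other's caches on every write, while padding each counter to its own line removes the contention.

    #include <atomic>
    #include <chrono>
    #include <iostream>
    #include <thread>

    struct Packed {                          // both counters share one line
        std::atomic<long> a{0}, b{0};
    };
    struct Padded {                          // one 64-byte line each
        alignas(64) std::atomic<long> a{0};
        alignas(64) std::atomic<long> b{0};
    };

    template <typename Counters>
    long long timeIt() {
        Counters c;
        auto work = [](std::atomic<long>& x) {
            for (int i = 0; i < 10000000; ++i)
                x.fetch_add(1, std::memory_order_relaxed);
        };
        auto start = std::chrono::steady_clock::now();
        std::thread t1(work, std::ref(c.a));
        std::thread t2(work, std::ref(c.b));
        t1.join(); t2.join();
        return std::chrono::duration_cast<std::chrono::milliseconds>(
                   std::chrono::steady_clock::now() - start).count();
    }

    int main() {
        std::cout << "packed: " << timeIt<Packed>() << " ms\n"  // typically slower
                  << "padded: " << timeIt<Padded>() << " ms\n";
    }

The threads never touch each other's counter, yet the packed layout is usually measurably slower: the coherency protocol bounces the shared line between cores on every write.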
For correctness, shared is the way to go, and keep the data as normalized as possible. For immediacy, send messages to inform of changes, but always back them up with polling. Messages get dropped, duplicated, re-ordered, delayed - don't rely on them.
If speed is what you're worried about, first do it single-thread and tune the daylights out of it. Then if you've got multiple cores and know how to split up the work, use parallelism.
Erlang provides supervisors and gen_server callbacks for synchronous calls, so you will know about it if a message isn't delivered: either the gen_server call returns a timeout, or your whole node will be brought down and back up if the supervisor is triggered.
Usually, if the processes are on the same node, message-passing languages optimise away the data copying, so it's almost like shared memory - except that an object cannot be changed and then used by both afterward, which couldn't be done safely using shared memory either anyway.
There is some state which is kept by processes by passing it around to themselves in recursive tail-calls, and some state can of course be passed through messages. I don't use Mnesia much, but it is a transactional database, so once you have passed the operation to Mnesia (and it has returned) you are pretty much guaranteed it will go through.
Which is why it is easy to tie such applications into Erlang with the use of ports or drivers. The easiest are ports; a port is much like a Unix pipe, though I think the performance isn't that great... and, as said, message passing usually ends up just being pointer passing anyway, as the VM/compiler optimises the memory copy out.