How does two-phase commit work in an atomic transaction? - web-services

I have some questions about the two-phase commit protocol:
1. What happens if the second (commit) phase fails?
2. How does it maintain fault tolerance?
Thanks in advance.

The answers below are taken from the Two-phase commit protocol article.
1. What happens if the second (commit) phase fails?
Failure of the commit phase is handled as follows. If any cohort votes No during the commit-request phase (or the coordinator's timeout expires):
1. The coordinator sends a rollback message to all the cohorts.
2. Each cohort undoes the transaction using its undo log, and releases the resources and locks held during the transaction.
3. Each cohort sends an acknowledgement to the coordinator.
4. The coordinator undoes the transaction when all acknowledgements have been received.
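For illustration, here is a minimal sketch of that abort path in Java, with hypothetical Cohort and Coordinator types: a single No vote (or a vote that times out) makes the coordinator roll everyone back, and the transaction is only considered undone once every acknowledgement has arrived.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

enum Vote { YES, NO }

interface Cohort {
  Optional<Vote> prepare(long timeout, TimeUnit unit); // empty => vote timed out
  void commit();
  void rollback();   // undo via the undo log, release resources and locks
  void awaitAck();   // block until this cohort has acknowledged
}

class Coordinator {
  void run(List<Cohort> cohorts) {
    // Phase 1 (commit-request): collect votes; a timeout counts as a No vote.
    boolean allYes = cohorts.stream()
        .map(c -> c.prepare(5, TimeUnit.SECONDS))
        .allMatch(v -> v.orElse(Vote.NO) == Vote.YES);

    // Phase 2: unanimous Yes => commit; otherwise send rollback to all.
    for (Cohort c : cohorts) {
      if (allYes) c.commit(); else c.rollback();
    }

    // The coordinator undoes/forgets the transaction only after all acks.
    cohorts.forEach(Cohort::awaitAck);
  }
}
```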
2. How does it maintain fault tolerance?
In many cases it does not, and may need human intervention:
it is not resilient to all possible failure configurations, and in rare cases user intervention (e.g., by a system administrator) is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases), the protocol's participants log the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that differ primarily in their logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures make up a substantial portion of the protocol, due to the many possible failure scenarios that must be considered and supported.
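As a rough illustration of that logging (a sketch with a hypothetical ProtocolLog class, not any particular implementation): each participant forces a record such as "tx42 PREPARED" or "tx42 COMMIT" to stable storage before sending the message it covers, and the recovery procedure re-reads the log on restart to decide how to resume.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Append-only log of protocol states; writing is slow because every record
// is forced to disk, but the records survive a crash.
class ProtocolLog {
  private final FileOutputStream out;

  ProtocolLog(String path) throws IOException {
    this.out = new FileOutputStream(path, true); // append mode
  }

  // Force the record to stable storage BEFORE sending the message it covers.
  void append(String txId, String state) throws IOException {
    out.write((txId + " " + state + "\n").getBytes(StandardCharsets.UTF_8));
    out.getFD().sync(); // the slow-but-durable step
  }
}
// On restart: a coordinator that logged COMMIT re-sends commit; a cohort
// that logged only PREPARED asks the coordinator for the outcome; a
// transaction with no decision record is rolled back.
```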

Related

Akka: Persistence failure when replaying events

We are working on an event-sourced application with akka-persistence, using an Oracle database as the event store. The application has been running in production for some time now. Lately we have been seeing the following error in the application for some of the persistent actors.
Persistence failure when replaying events for persistenceId [some-persistence-id]. Last known sequence number [0]
Can someone who faced a similar issue in their application share their experience of why this happens?
Also, going through the Akka documentation at https://doc.akka.io/docs/akka/current/persistence.html, onRecoveryFailure is responsible for handling such failures. Is there a way we can override this method to ignore the persisted events in case we see failures while replaying events? In our scenario replaying the events is not very critical and we can serve the users even by ignoring them.
That log is typically a manifestation of something else. Since the failure is from sequence number zero, that points to the actual query to the DB failing (e.g., a timeout). There should be other logs around the time of that one which will provide further information.
Akka Persistence has a fairly strong assumption that the persisted state is important (otherwise why would you be persisting?). Off the top of my head, I would consider separating the parts of the actor which are affected by persistence from the parts which aren't: the non-persistent actor can spawn a persistent child and interact with it (it can do tricks with stashing, for instance, to present an illusion that it and its child are the same actor).
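On the question of ignoring persisted events: overriding onRecoveryFailure only lets you react to the failure; per the docs the actor is still stopped after the handler runs. If replay genuinely isn't critical, the classic Java API lets you disable it via Recovery.none() instead. A minimal sketch with a hypothetical actor (with the caveat that skipping recovery restarts sequence numbering, so persisting new events on top of an existing journal can conflict):

```java
import akka.persistence.AbstractPersistentActor;
import akka.persistence.Recovery;

public class CounterActor extends AbstractPersistentActor {
  private int count = 0; // hypothetical state normally rebuilt from events

  @Override
  public String persistenceId() {
    return "some-persistence-id";
  }

  @Override
  public Recovery recovery() {
    // Skip replay (and snapshot loading) entirely: the actor starts empty
    // instead of crashing when the journal query fails.
    return Recovery.none();
  }

  @Override
  public Receive createReceiveRecover() {
    // Replayed events would be handled here; with Recovery.none() nothing
    // is replayed.
    return receiveBuilder().match(String.class, evt -> count++).build();
  }

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .match(String.class, cmd -> persist(cmd, evt -> count++))
        .build();
  }
}
```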

How could denial-of-state attacks be caused in Corda?

Corda doc says "If a transaction is not checked for validity (non-validating notary), it creates the risk of “denial of state” attacks, where a node knowingly builds an invalid transaction consuming some set of existing states and sends it to the notary cluster, causing the states to be marked as consumed"
In this case, does "invalid transaction" also include simple mistakes, such as passing too high an input value because of a typo, which would be caught and invalidated by a flow step?
How denial-of-state attacks could be caused:
A "denial of state" attack could be caused if a rogue node builds a transaction that consumes a state shared between multiple parties. This is a trade-off of using a non-validating notary, as it doesn't validate the transaction against the contract code. The other parties would not be able to use the state anymore, as it would already be marked as consumed.

Will anything terrible happen if chaincode state is changed in invokeChaincode?

Let's say I have two chaincodes in Hyperledger Fabric, ChaincodeA and ChaincodeB.
Some events in ChaincodeA will have to change state in ChaincodeB, for example, change its balance. If invokeChaincode() is used in ChaincodeA to invoke some logic in ChaincodeB, which calls putState() to change ChaincodeB's state, could any race condition happen when reaching consensus? What are the best practices for handling this?
When invoking a chaincode you do not change the state; you only simulate transaction execution against the current state. Only once the transaction has been placed into a block by the ordering service and reaches the peers, where it has to pass the VSCC and MVCC checks, is it eventually committed. MVCC will take care of possible race conditions. Transaction execution works as follows:
1. The client sends a transaction proposal to the peer.
2. The peer simulates the transaction, signs the results, and returns them in a signed proposal response.
3. The client repeats step 2 against enough peers to satisfy the expected endorsement policies.
4. Once the client has collected enough endorsements, it sends them to the ordering service.
5. The ordering service cuts a block and orders all transactions.
6. The block is delivered to the peers.
7. Each peer validates and eventually commits the block.
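To make that concrete, here is a minimal sketch using the Fabric Java chaincode shim (the chaincode name and the debit arguments are hypothetical). On the same channel, stub.invokeChaincode folds ChaincodeB's reads and writes into the caller's simulated read/write set, which is exactly what the MVCC check validates at commit time; across channels only the response comes back and any writes are discarded.

```java
import org.hyperledger.fabric.shim.ChaincodeBase;
import org.hyperledger.fabric.shim.ChaincodeStub;
import java.util.Arrays;

public class ChaincodeA extends ChaincodeBase {
  @Override
  public Response init(ChaincodeStub stub) {
    return newSuccessResponse();
  }

  @Override
  public Response invoke(ChaincodeStub stub) {
    // Hypothetical: delegate a balance change to ChaincodeB on the same
    // channel (cross-channel calls are effectively read-only).
    Response r = stub.invokeChaincode(
        "chaincodeB",
        Arrays.asList("debit".getBytes(), "account1".getBytes(), "10".getBytes()),
        stub.getChannelId()); // same channel, so writes are allowed
    if (r.getStatus() != Response.Status.SUCCESS) {
      return newErrorResponse("ChaincodeB call failed");
    }
    return newSuccessResponse();
  }
}
```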
As I understand it, the two chaincodes are deployed on two different channels, and ChaincodeA wants to call a method of ChaincodeB. As per the specification this is possible, but only for read operations.
https://godoc.org/github.com/hyperledger/fabric/core/chaincode/shim#ChaincodeStub.InvokeChaincode
Can you please share the code showing how you are calling ChaincodeB from ChaincodeA?

Time period between duplicate messages

According to the documentation for SQS (emphasis mine):
Amazon SQS stores copies of your messages on multiple servers for redundancy and high availability. On rare occasions, one of the servers storing a copy of a message might be unavailable when you receive or delete the message. If that occurs, the copy of the message will not be deleted on that unavailable server, and you might get that message copy again when you receive messages. Because of this, you must design your application to be idempotent (i.e., it must not be adversely affected if it processes the same message more than once).
What time period can reasonably occur between the original and duplicate messages being received? (seconds? hours? months?)
I have no specific proof or link to show you, but in my experience working with SQS the range of time is under a few minutes in most cases. A duplicate message arises from activity that took place on the message during the very small lag while it is replicated, over very high speed connections, to redundant queues within the AWS infrastructure; in other words, very quickly. It is also likely to be affected by the visibility timeouts you have specified.
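Whatever the window turns out to be, the safe design is the idempotent consumer the documentation calls for. A minimal sketch with the AWS SDK for Java v2 (the queue URL and the handle method are hypothetical; a real implementation would keep the seen IDs in a durable store rather than in memory):

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.*;
import java.util.HashSet;
import java.util.Set;

public class IdempotentConsumer {
  private final Set<String> processed = new HashSet<>(); // use a durable store in practice

  public void poll(SqsClient sqs, String queueUrl) {
    ReceiveMessageResponse resp = sqs.receiveMessage(
        ReceiveMessageRequest.builder().queueUrl(queueUrl).build());
    for (Message m : resp.messages()) {
      // A redelivered copy keeps its messageId; dedupe on a business key
      // in the body instead if producers may also send duplicates.
      if (processed.add(m.messageId())) {
        handle(m.body()); // hypothetical business logic
      }
      sqs.deleteMessage(DeleteMessageRequest.builder()
          .queueUrl(queueUrl)
          .receiptHandle(m.receiptHandle())
          .build());
    }
  }

  private void handle(String body) { /* ... */ }
}
```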

Architecture for robust payment processing

Imagine 3 system components:
1. External ecommerce web service to process credit card transactions
2. Local Database to store processing results
3. Local UI (or win service) to perform payment processing of the customer order document
The external web service is obviously not transactional, so how do we guarantee:
1. that results are eventually persisted to the database when received from the web service, even if the database is not accessible at that moment (network issue, DB timeout)
2. that clients are prevented from processing the customer order while a payment initiated by another client has not yet been successfully persisted to the database (and is waiting in some kind of recovery queue)
The aim is to do the processing with non-transactional system components and guarantee that the transaction won't be repeated by another process in case of failure.
(Please look at this in the context of post-sale payment processing, where multiple operators might attempt manual payment processing; it is not a web checkout application.)
Ask the payment processor whether they can detect duplicate transactions based on an order ID you supply. Then if you are unable to store the response due to a database failure, you can safely resubmit the request without fear of double-charging (at least one PSP I've used returned the same response/auth code in this scenario, along with a flag to say that this was a duplicate).
Alternatively, just set a flag on your order immediately before attempting payment, and don't attempt payment if the flag was already set. If an error then occurs during payment, you can investigate and fix the data at your leisure.
I'd be reluctant to go down the route of trying to automatically cancel the order and resubmitting, as this just gets confusing (e.g. what if cancelling fails - should you retry or not?). Best to keep the logic simple so when something goes wrong you know exactly where you stand.
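A minimal sketch combining those two ideas, with a hypothetical schema and PSP client: the conditional UPDATE acts as the claim flag (a second operator's attempt matches zero rows), and passing the order ID to the processor makes resubmitting after a database failure safe:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PaymentService {
  private final Connection db;  // assumed to be configured elsewhere
  private final PspClient psp;  // hypothetical gateway client

  public PaymentService(Connection db, PspClient psp) {
    this.db = db;
    this.psp = psp;
  }

  public void pay(long orderId) throws SQLException {
    // Atomically claim the order: a concurrent operator's UPDATE matches 0 rows.
    try (PreparedStatement ps = db.prepareStatement(
        "UPDATE orders SET payment_state = 'IN_PROGRESS' "
            + "WHERE id = ? AND payment_state = 'NEW'")) {
      ps.setLong(1, orderId);
      if (ps.executeUpdate() == 0) {
        return; // payment already initiated by someone else
      }
    }

    // The PSP is assumed to dedupe on orderId, so a retry cannot double-charge.
    PspResult result = psp.charge(orderId);

    // If this write fails (network issue, DB timeout), resubmit later:
    // the PSP returns the original response flagged as a duplicate.
    try (PreparedStatement ps = db.prepareStatement(
        "UPDATE orders SET payment_state = ?, auth_code = ? WHERE id = ?")) {
      ps.setString(1, result.approved() ? "PAID" : "DECLINED");
      ps.setString(2, result.authCode());
      ps.setLong(3, orderId);
      ps.executeUpdate();
    }
  }
}

// Hypothetical PSP client types.
interface PspClient { PspResult charge(long orderId); }
record PspResult(boolean approved, String authCode) {}
```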
In any system like this, you need robust error handling and error reporting. This is doubly true when it comes to dealing with payments, where you absolutely do not want to accidentally take someone's money and not deliver the goods.
Because you're outsourcing your payment handling to a 3rd party, you're ultimately very reliant on the gateway having robust error handling and reporting systems.
In general then, you hand off control to the payment gateway and start a task that waits for a response from the gateway, which is either 'payment accepted' or 'payment declined'. When you get that response you move onto the next step in your process and everything is good.
When you don't get a response at all (time out), or the response is invalid, then how you proceed very much depends on the payment gateway:
If the gateway supports it send a 'cancel payment' style request. If the payment cancels successfully then you probably want to send the user to a 'sorry, please try again' style page.
If the gateway doesn't support cancelling, or you have no communication with the gateway, then you will need to manually contact the third party (in person or by telephone) to discover what went wrong and how to proceed. To aid this you need to dump as much detail as you have to error logs, such as date/time, customer ID, transaction value, product IDs, etc.
Once you're back on your site (and payment is accepted) then you're much more in control of errors, but in brief, if you can't complete the order you should either dump the details to disk (such as a CSV file for manual handling) or contact the gateway to cancel the payment.
It's also worth having a system in place to track errors as they occur, and if an excessive number occur then consider what should happen. If it's a high-traffic site, for example, you may want to temporarily prevent further customers from placing orders whilst the issue is investigated.
Distributed messaging.
When your payment gateway returns, submit a message to a durable queue that guarantees a handler will eventually get it and process it. The handler would update the database. Should a failure occur at that point, the handler can leave the message in the queue, repost it to the queue, or post an alternate message.
Should something occur later that invalidates the transaction, another message could be queued to "undo" the change.
There's a fair amount of buzz lately about eventual consistency and distributed messaging. NServiceBus is the new component hotness. I suggest looking into this; I know we are.
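For illustration, a minimal sketch of that handler pattern in Java (hypothetical types; an in-memory BlockingQueue stands in for the durable queue, and NServiceBus itself is a .NET framework): the handler reposts the message when the database update fails, so the result is eventually persisted rather than lost.

```java
import java.util.concurrent.BlockingQueue;

// Hypothetical message and store types.
record PaymentResult(long orderId, boolean approved) {}

interface ResultStore {
  void save(PaymentResult r) throws Exception; // e.g. a database UPDATE
}

class PaymentResultHandler implements Runnable {
  private final BlockingQueue<PaymentResult> queue; // stand-in for a durable queue
  private final ResultStore store;

  PaymentResultHandler(BlockingQueue<PaymentResult> queue, ResultStore store) {
    this.queue = queue;
    this.store = store;
  }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        PaymentResult r = queue.take();
        try {
          store.save(r); // update the database with the gateway's result
        } catch (Exception dbUnavailable) {
          queue.put(r); // repost so the result is eventually persisted
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // shut down cleanly
      }
    }
  }
}
```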