I am trying to understand how PBFT (Practical Byzantine Fault Tolerance) is applied in blockchain. After reading the paper, I found that the process by which PBFT reaches consensus is as follows:
1. A client sends a request to invoke a service operation to the primary.
2. The primary multicasts the request to the backups.
3. Replicas execute the request and send a reply to the client.
4. The client waits for f + 1 replies from different replicas with the same result; this is the result of the operation.
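The client-side rule in the last step can be sketched as follows. This is a minimal illustration, not code from the paper; the function name `accepted_result` and the dictionary-based reply format are assumptions made for the example.

```python
from collections import Counter


def accepted_result(replies, f):
    """Return the result reported by at least f + 1 distinct replicas, else None.

    `replies` maps a replica id to the result it sent, so each replica
    is counted at most once (hypothetical format for this sketch).
    """
    counts = Counter(replies.values())
    for result, votes in counts.most_common():
        if votes >= f + 1:
            return result
    return None
```

With f = 1, two matching replies are enough: at most one of them can come from a faulty replica, so at least one correct replica vouches for the result.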
This is how I understand it is applied in blockchain:
1. First, the elected primary node wants to write transaction A to the chain, so it broadcasts transaction A to the other nodes.
2. Any node that receives the transaction checks whether it is legal. If the transaction is considered legal, the node broadcasts a "legal" signal to all nodes in this round of consensus.
3. Any node that receives f + 1 or more responses writes the transaction to its own chain.
Here are my questions:
1. Malfunctioning nodes that keep failing to write blocks to their chain will hold a chain that differs from that of the healthy nodes. In the next round of consensus, the existing chain is picked up first. How do nodes know which one is the correct chain?
2. In step 1, the elected node sends the transaction to other nodes. Does "other nodes" mean all nodes in the network? How can we make sure that all nodes are included in the consensus, given that there is no centralized agency?
How do nodes know which one is the correct chain?
To tolerate f Byzantine faulty nodes, the network needs at least 3f + 1 nodes. PBFT is one of the algorithms that can tolerate Byzantine failures, so a PBFT network of 3f + 1 nodes can tolerate up to f Byzantine nodes.
If f malicious nodes keep failing to write blocks to their chains and so diverge from the correct nodes, one can conclude that the identical chains held by the remaining 2f + 1 nodes are correct. (Correct nodes always produce exactly the same output for the same requests in the same order.)
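The arithmetic behind these bounds can be sketched as below. The helper names (`max_faulty`, `quorum`, `agreed_chain`) and the representation of a chain as a list of block hashes are assumptions for illustration only.

```python
from collections import Counter


def max_faulty(n):
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n):
    """Number of matching nodes (2f + 1) needed to trust a value."""
    return 2 * max_faulty(n) + 1


def agreed_chain(chains, f):
    """Return the chain held identically by at least 2f + 1 nodes, else None.

    `chains` maps a node id to its chain, modelled here as a list of
    block hashes (a simplification for this sketch).
    """
    if not chains:
        return None
    counts = Counter(tuple(chain) for chain in chains.values())
    chain, votes = counts.most_common(1)[0]
    return list(chain) if votes >= 2 * f + 1 else None
```

For example, a 4-node network tolerates one faulty node, and any chain held identically by 3 of the 4 nodes is taken as correct.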
Does "other nodes" mean all nodes in the network? How can we make sure that all nodes are included in the consensus, given that there is no centralized agency?
In a PBFT setup, the identities of all nodes must be established. To do that, there should be a central authority that determines whether a node can join the network. (Important: the central authority only takes part in identity management, not in the algorithm itself.)
Why is this needed? Because PBFT works by voting, and voting is not secure when anyone (including a malicious node) can join the network. For example, a value proposed by the primary is recorded on all nodes by state machine replication, which means that at least 2f + 1 matching messages agreeing on the value are needed before correct nodes accept it.
Without trusted identity management, a Sybil attack is possible. This is the main reason why PBFT is not suited to open blockchains, which allow any node to freely join or leave the network.
I am running a test node in GCP, using Docker 9.9.4, Ubuntu, a Postgres db, and Infura. I had issues with public/private IP, but once I cleared that up my node was up and running. The node now throws the error below repeatedly, potentially due to the blockchain connection. How do I fix this?
[ERROR] HeadTracker: dropping head 26085153 with hash 0xf50e19099b7e343829935d70dd7d86c5bc0398286b7a4e4f32ac033ac60c3733 because queue is full. WARNING: Your node is overloaded and may start missing jobs. logger/default.go:155 stacktrace=github.com/smartcontractkit/chainlink/core/logger.Errorf
This log output is related to an overload of your blockchain connection.
This notification is usually related to the use of public WebSocket connections and/or a free third-party NaaS provider. To fix this connection issue you can either run your own full node or change the tier or the third-party NaaS provider. It is also recommended to use Chainlink version 0.10.8 or higher, as the HeadTracker has been revised and performs more efficiently.
Regarding the question, let me give a small technical overview, which may clarify the load a Chainlink node places on its remote full node:
Your Chainlink node establishes a connection to a full node. There the Chainlink node initiates various subscriptions, which are a special feature of the WebSocket protocol that enables bidirectional communication. More precisely, this means that the Chainlink node is informed whenever a certain "state" of the subscription changes. Basically, the node interacts with the full node through JSON-RPC and uses the following methods to initiate and process various functions internally:
eth_getBlockByNumber, eth_getBalance, eth_getTransactionReceipt, eth_getTransactionCount, eth_getLogs, eth_subscribe, eth_unsubscribe, eth_sendRawTransaction and eth_call
https://ethereum.org/uk/developers/docs/apis/json-rpc/
Most of the Chainlink node's interactions are executed during the syncing process via the internal HeadTracker service. This service initiates a "head" subscription in order to react to every single incoming new block header.
During this syncing process it uses the JSON-RPC methods eth_getBlockByNumber and eth_getBalance to get all the necessary information from the block. So these two methods are executed for every block. The number of requests therefore depends on the average block time of the network the Chainlink node is connected to.
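To make the per-block traffic concrete, here is a hedged sketch of what such JSON-RPC messages look like on the wire. The method names and parameter shapes follow the public Ethereum JSON-RPC API; the helper names, the placeholder address, and the exact parameters Chainlink passes internally are assumptions.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, incremented per request


def rpc_request(method, params):
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def head_subscription():
    """Subscription that notifies the client of every new block header."""
    return rpc_request("eth_subscribe", ["newHeads"])


def per_block_requests(block_number, address):
    """The two calls described above, issued once for each new head."""
    tag = hex(block_number)
    return [
        # False: return transaction hashes only, not full transaction objects
        rpc_request("eth_getBlockByNumber", [tag, False]),
        rpc_request("eth_getBalance", [address, tag]),
    ]


def encode(request):
    """Serialize a request for sending over the WebSocket."""
    return json.dumps(request)
```

Every new head therefore costs at least two extra round trips on top of the subscription notification, which is why a rate-limited endpoint fills up quickly on chains with short block times.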
An example would be the Kovan Testnet:
The average block time here is 6.7 s, which means a daily request count of approx. 21,000.
While fulfilling job requests, the node also uses the following methods: eth_getTransactionReceipt, eth_sendRawTransaction, eth_getLogs, eth_subscribe, eth_unsubscribe, eth_getTransactionCount and eth_call, which increases the total number significantly, depending on the number of job requests.
It should also be noted that, especially with faster blockchains (e.g. Polygon), the load on the WebSocket is very high and you need to pay close attention to having a good full node connection, as many full nodes do not sustain such a high number of requests permanently.
I am new to blockchain technology and have a basic question.
I understand that in any blockchain network, if a node tries to commit something that is not in sync with the other nodes, it gets rejected. Then how is a new transaction committed and validated? Who has the authority to do it?
That's the thing about blockchain. There is no authority that determines which block will get added to the chain. And by blockchain, I mean public blockchain.
Blockchains are typically either public or permissioned.
Public
Public blockchains, such as Bitcoin and Ethereum, work on the principle of proof of work. In layman's terms, if any participant wants their transaction to be processed, i.e. added to the chain, they submit it to the network. This transaction is then processed by independent entities called miners, who have to solve a computational puzzle in order to produce a valid block. If the block is accepted, the miner is compensated for the work in the form of the said digital currency. Also, the longest chain is always accepted as the valid chain.
There is absolutely no criterion or organisation that oversees mining, meaning anybody can become a miner and start contributing. So the network is for the people, by the people: anybody can join and both submit and process transactions.
If the transaction is valid, that is, you own the coin and are not double spending it, it will be processed by a miner. And if the block produced by the miner is accepted, so is your transaction.
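The "computational puzzle" can be illustrated with a deliberately simplified sketch: find a nonce such that the hash of the block data starts with a given number of zero hex digits. This is an assumption-laden toy, not Bitcoin's actual scheme (which double-hashes a binary block header and compares it against a numeric target); the function names are hypothetical.

```python
import hashlib


def block_hash(block_data, nonce):
    """SHA-256 of the block data concatenated with the nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data, difficulty):
    """Try nonces from 0 upward until the hash has `difficulty` leading zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def is_valid(block_data, nonce, difficulty):
    """Verification is one hash, whereas mining takes many attempts."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)
```

The asymmetry is the point: producing a block is expensive, while any node can check it cheaply, so no authority is needed to decide whether a miner's work counts.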
Private/Permissioned
On the other hand, in private/permissioned blockchains such as Hyperledger Fabric, participation and block processing are decided by one or more organisations. Hence, in this case, a block is processed only if it is produced by a valid member and endorsed by nodes of all participating organizations.
By "if any node tries to commit something which is not in sync with other nodes", I take it you are asking about a block that one node produces but the blockchain rejects. This happens when two nodes try to find the proof of work at the same time: one node finds it first and broadcasts its block to the network, but because of network delay (there can be other reasons too) the other node has not yet received that block and produces its own. This is how stale/uncle blocks are created. Bitcoin considers the longest chain valid and discards the other.
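The fork-resolution rule described here can be sketched in a few lines. This follows the "longest chain" wording of the answer; real Bitcoin clients compare accumulated proof of work rather than plain block count, and the function name and list-based chain model are assumptions.

```python
def choose_chain(candidates):
    """Pick the longest of the competing chains (first one wins a tie).

    Each chain is modelled as a list of block identifiers; blocks that
    are only on a losing chain become stale.
    """
    return max(candidates, key=len)
```

For instance, if two miners extend the same parent and a third block is then mined on top of one of them, every node switches to that three-block branch and the competing block is discarded.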
I still have some difficulty understanding how the PBFT consensus algorithm works in Hyperledger Fabric 0.6. Is there any paper that describes the PBFT algorithm in a blockchain environment?
Thank you very much for your answers!
While Hyperledger Fabric v0.6 has been deprecated for quite some time (we are working towards release of v1.1 shortly, as I write this) we have preserved the archived repository, and the protocol specification contains all that you might want to know about how the system works.
It is really too long a description to add here.
Practical Byzantine Fault Tolerance (PBFT) is a protocol developed to
provide consensus in the presence of Byzantine faults.
The PBFT algorithm has three phases. These phases run in sequence to
achieve consensus: pre-prepare, prepare, and commit.
The protocol runs in rounds where, in each round, an elected leader
node, called the primary node, handles the communication with the
client. In each round, the protocol progresses through the three
previously mentioned phases. The participants in the PBFT protocol are
called replicas, where one of the replicas becomes primary as a
leader in each round, and the rest of the nodes act as backups.
Pre-prepare:
This is the first phase in the protocol, where the primary node, or
primary, receives a request from the client. The primary node assigns
a sequence number to the request. It then sends the pre-prepare
message with the request to all backup replicas. When a backup replica
receives the pre-prepare message, it checks a number of
things to ensure the validity of the message:
First, whether the digital signature is valid.
After this, whether the current view number is valid.
Then, that the sequence number of the operation's request message is valid.
Finally, if the digest/hash of the operation's request message is valid.
If all of these elements are valid, then the backup replica accepts
the message. After accepting the message, it updates its local state
and progresses toward the preparation phase.
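The four checks can be sketched as a single validation function. This is a simplified model under stated assumptions: the `PrePrepare` record, the `signature_ok` flag standing in for real signature verification, and the `low`/`high` sequence-number bounds are all hypothetical names chosen for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class PrePrepare:
    view: int
    seq: int
    digest: str
    request: bytes
    signature_ok: bool  # stand-in for verifying the primary's signature


def accept_pre_prepare(msg, current_view, low, high, seen):
    """Run the backup's checks in the order listed above.

    `seen` maps (view, seq) to the digest already accepted for that
    slot, so a primary cannot reuse a sequence number for another request.
    """
    if not msg.signature_ok:
        return False
    if msg.view != current_view:
        return False
    if not (low < msg.seq <= high):
        return False
    if seen.get((msg.view, msg.seq), msg.digest) != msg.digest:
        return False
    if hashlib.sha256(msg.request).hexdigest() != msg.digest:
        return False
    seen[(msg.view, msg.seq)] = msg.digest  # update local state
    return True
```

Only after all checks pass does the backup record the slot and move on to broadcasting its prepare message.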
Prepare:
A prepare message is sent by each backup to all other replicas in the
system. Each backup waits for at least 2F + 1 prepare messages (F is the
number of faulty nodes) to be received from other replicas.
They also check whether the prepare messages contain the same view
number, sequence number, and message digest values. If all these
checks pass, then the replica updates its local state and progresses
toward the commit phase.
Commit:
Each replica sends a commit message to all other replicas in the
network. The same as the prepare phase, replicas wait for 2F + 1
commit messages to arrive from other replicas. The replicas also check
the view number, sequence number, and message digest values. If they
are valid for 2F + 1 commit messages received from other replicas, then
the replica executes the request, produces a result, and finally,
updates its state to reflect a commit. If there are already some
messages queued up, the replica will execute those requests first
before processing the latest sequence numbers. Finally, the replica
sends the result to the client in a reply message. The client accepts
the result only after receiving 2F + 1 reply messages containing the
same result.
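The quorum counting used in the prepare and commit phases can be sketched with one small tracker. The class name `QuorumTracker` and its interface are assumptions for illustration; a real replica would also verify signatures and handle view changes.

```python
from collections import defaultdict


class QuorumTracker:
    """Counts matching prepare/commit messages for one replica."""

    def __init__(self, f):
        self.f = f
        # (phase, view, seq, digest) -> ids of replicas that sent a matching message
        self.votes = defaultdict(set)

    def record(self, phase, view, seq, digest, sender):
        """Store a message; return True once 2F + 1 distinct senders match."""
        key = (phase, view, seq, digest)
        self.votes[key].add(sender)  # a set ignores duplicate messages from one sender
        return len(self.votes[key]) >= 2 * self.f + 1
```

Messages only count toward the same quorum when view number, sequence number, and digest all agree, which is why those three values form part of the key.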
1. I have N nodes (i.e. distinct JREs) in my infrastructure running Akka (not clustered yet).
2. Nodes have no particular "role"; they are just processors of data. The "processors" of this data will be Actors. All sorts of non-Akka/Actor callers (other Java code) can invoke specific types of processors by sending them messages containing data to work on. Eventually the callers need the result back.
3. A "processor" Actor is pretty simple and supports a method like "process(data)". Processors are stateless; they mutate data and send it to an external system. They can vary in execution time, so they are a good fit for wrapping up in an Actor.
4. There are numerous different types of these "processors", and the configuration for each unique one is stored in a database. Each node in my system, when it starts up, needs to create a router Actor that fronts N instances of each of these unique processor Actor types. I cannot statically define/name/create these Actors hardwired in code or in Akka configuration.
5. It is important to note that the configuration for any Actor processor can be changed in the database at any time, and periodically the creator of the routers for these Actors needs to terminate and recreate them dynamically based on the new configuration.
6. A key point is that some of these "processors" can only have a very limited number of Actor instances across all of my nodes. I.e. processorType-A can have an unlimited number of instances, while processorType-B can only have 2 instances running across the entire cluster. Hence callers on NODE1 who want to invoke processorType-B would need to have their message routed to NODE2, because that node is the only node running processorType-B actor instances.
With that context in mind here is my question that I'm looking for some design help with:
For points 1, 2, 3, and 4 above, I have a good understanding and an implementation.
For points 5 and 6, however, I am not sure how to properly implement this with Akka clustering, given that my "nodes" are not aware of each other AND each of them runs the same code to dynamically create these router actors based on that database configuration.
Issues that come to mind are:
How do I properly deal with the "names" of these router Actors across the cluster? Take "processorType-A", which can have an unlimited number of Actor instances. Each node would have these instances available locally, yet if they are all terminated on a single node, I would still want messages for their "processor type" to be routed on to another node that still has viable instances available.
How do I enforce/coordinate the "processor" instance limitation across the cluster (i.e. "processorType-B" can only have 2 instances globally, while processorType-A can have a much higher number)? It's as if nodes need some way to check with each other as to who has created these instances across the cluster. I'm not sure if Akka has a facility to do this on its own.
Perhaps ClusterRouterPool with ClusterRouterPoolSettings?
Any thoughts and/or design tip/ideas are much appreciated! Thanks
Suppose I have the following two Actors:
Store
Product
Every Store can have multiple Products, and under high traffic I want to dynamically split the Store into StoreA and StoreB across multiple machines. Splitting the Store will also split the Products evenly between StoreA and StoreB.
My question is: what are the best practices for knowing where to send all future BuyProduct requests (StoreA or StoreB) after the split? I'm asking because when a request to buy ProductA is received, I want to send it to the right store, the one which already has that Product's state in memory.
Solution: The only solution I can think of is to store the path of each Product as a Map[productId:Long, storePath:String] in a ProductsPathActor every time a new Product is created. For every BuyProduct request I would query the ProductsPathActor, which returns the correct Store's path, and then send the BuyProduct request to that Store.
Is there another way of managing this in Akka, or is my solution correct?
One good way to do this is with Akka Cluster Sharding. From the docs:
Cluster sharding is useful when you need to distribute actors across
several nodes in the cluster and want to be able to interact with them
using their logical identifier, but without having to care about their
physical location in the cluster, which might also change over time.
There is an Activator Template that demonstrates it here.
Applied to your problem, StoreA and StoreB would each be a ShardRegion, mapping 1:1 to your cluster nodes. The ShardCoordinator manages distribution between these nodes and acts as the conduit between regions.
For its part, your Request Handler talks to a ShardRegion, which routes the message, if necessary, in conjunction with the coordinator. Presumably, there is a JVM-local ShardRegion for each Request Handler to talk to, but there's no reason that it could not be a remote actor.
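The routing idea can be modelled in plain Python to show why no per-product lookup table is needed. This is a conceptual sketch only, not the Akka API: the names `shard_id`, `Coordinator`, and `route`, the CRC32 hash, and the "least loaded region" allocation are all assumptions made for illustration.

```python
import zlib

NUM_SHARDS = 10  # assumed fixed number of shards for this sketch


def shard_id(entity_id):
    """Derive the shard deterministically from the logical entity id."""
    return zlib.crc32(entity_id.encode()) % NUM_SHARDS


class Coordinator:
    """Toy stand-in for a coordinator that assigns shards to regions."""

    def __init__(self, regions):
        self.regions = list(regions)
        self.allocation = {}  # shard id -> region name

    def region_for(self, shard):
        if shard not in self.allocation:
            # Allocate a new shard to the region that currently hosts the fewest.
            load = {region: 0 for region in self.regions}
            for region in self.allocation.values():
                load[region] += 1
            self.allocation[shard] = min(self.regions, key=lambda r: load[r])
        return self.allocation[shard]


def route(coordinator, product_id):
    """Find the region that owns the entity, using only its logical id."""
    return coordinator.region_for(shard_id(product_id))
```

Because the shard is computed from the product id and only the much smaller shard-to-region table is stored, a sender never needs to know the physical location of an individual Product.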
When the number of nodes changes, the ShardCoordinator needs to move shards (i.e. the collections of entities managed by a ShardRegion) off the regions that are going to shut down, in a process called "rebalancing". During that period, the entities within those shards are unavailable, but the messages to those entities will be buffered until they are available again. Here, "being available" means that the new ShardRegion responds to a directed message for that entity.
It's up to you to bring that entity back to life on the new node. Akka Persistence makes this very easy, but requires you to use the Event Sourcing pattern in the process. This isn't a bad thing, as it can lead to web-scale performance much more easily. This is especially true when the database in use is something like Apache Cassandra. You will see that entities are "passivated", which is essentially just caching off to disk so they can be restored on request, and Akka Persistence works with that passivation to transparently restore the entities under the control of the new ShardRegion, which is essentially a "move".