Who runs the Notary in Corda?

In the case of a consortium of banks in a Corda network with a single notary, who is responsible for running that notary? It is clear that a notary is a network service, but can any bank run the notary as and when it wishes, to check for double-spends, or is there one "main" bank/node that takes charge of triggering the notary service? Thank you.

It's completely up to the network operator. The network can have anything from a single one-node validating notary to hundreds of pooled non-validating notaries.
You mention bank nodes joining and leaving the notary service at will. This is a property of Proof of Work consensus algorithms (miners can come and go as they please without undermining the consensus result), but it's not a property of consensus algorithms in general. If you wanted to adopt this mode, you'd have to use a suitable consensus algorithm for your notary pool. In practice, therefore, you'd have nodes contributing to the notary pool full-time.
Depending on your use-case, it may also be preferable for paid third-parties to provide the notary pool, rather than having the banks that are transacting also provide the notarisation service.

Related

Does a private blockchain have to follow the client-server model when considering BFT?

I'm a newbie, currently interested in data security & integrity.
I'm quite new to blockchain and distributed-systems theory, and I have some unresolved doubts/questions about fault-tolerant consensus.
May I ask for your kind advice on my thoughts regarding the blockchain's true objective?
It would be a great help in taking a step forward towards a better understanding of consensus.
Here's a summary of what I understand (please correct me if I'm wrong):
In a synchronous network model, it is assumed that one can guarantee the message being delivered within a certain amount of time.
In an asynchronous network model, there is no certain guarantee on the message delivery.
From a design perspective, it is easier & more efficient to design a system based on the synchronous model.
Even the quorum size requirement can be reduced - the synchronous model needs only f+1 votes, while the asynchronous model needs 2f+1 votes for consensus.
(This is because the synchronous model can eliminate the possibility of message dropout (up to f), while the async model needs to consider both message dropout & possibly malicious messages.)
But in a distributed system based on multiple nodes, it is normally impossible to guarantee message delivery, since there is no central manager that can monitor whether each node has received a message or not.
That is why most blockchain-oriented distributed ledgers (non-currency) are based on asynchronous consensus schemes such as PBFT (Castro & Liskov, 1999).
In a private blockchain scenario, I believe that the main & final purpose of consensus is to let all nodes hold a chain of blocks, where each block has a certain number of agreements (i.e. more than f signatures).
So, based on the facts above, I got curious whether the fault-tolerance model only holds in the standard "client-server" environment.
What happens if we give up the client-server model, and let the client supersede the peers' broadcast communication? (For a scenario where the client has enough computational power but is short on storage, so it wants to manage multiple (but few, e.g. 3) replicas to store data via transactions.)
To be more specific, with a simple example (an authenticated environment with PKI):
Each replica only performs "block creation (+ signing)" and "block finalization (verifying signatures)" logic, and there may exist some malicious replicas (up to f) which try to spread different outputs that did not originate from the client's transaction.
Let the client (a "data owner/manager" would be a more suitable term now...) visit all replicas, and handle each replica as below (a sketch in code follows these steps):
Enforce that all replicas work as a synchronized state machine; ensure all 3 replicas are synced up (check the latest block of each, and bring the lagging ones up to date)
Set a callback (with a timeout) for each replica to guarantee message delivery
Send a transaction to all replicas, and receive the block (with a signature) generated from the corresponding transaction (assume that the block contains only one transaction)
If more than f signatures (from replicas) for the same block message are collected, deliver the collected signatures to all replicas and order them to finalize the block.
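A minimal sketch of the client-driven flow in these steps, under the question's own assumptions (3 replicas, at most f faulty); the names and the "signatures" are hypothetical stand-ins for real PKI-signed messages:

object DominantClientSketch extends App {
  val f = 1                            // tolerated faulty replicas
  val replicas = Seq("r1", "r2", "r3") // the 3 replicas from the example

  // Step 3: each replica builds and "signs" the block for the client's tx.
  def signBlock(replica: String, tx: String): (String, String) =
    (s"block($tx)", s"sig($replica)")  // (the block it signed, its signature)

  val tx = "transfer:alice->bob:10"
  val responses = replicas.map(r => signBlock(r, tx))

  // Step 4: group signatures by block; finalize only a block with > f of them.
  val (block, sigs) = responses.groupBy(_._1).maxBy(_._2.size)
  if (sigs.size > f)
    println(s"order replicas to finalize $block with ${sigs.size} signatures")
  else
    println("no block gathered more than f matching signatures")
}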
If we do as above, instead of the replicas reaching consensus on their own (i.e. no view-changes), can we still say that the system satisfies the BFT model?
It seems that Byzantine replicas cannot breach safety, since the client (and honest replicas) will only order/accept block finalization when the block comes with more than f signatures.
The concern is liveness - can we say that liveness still holds, since the system can continue (not a client-server model anymore, but the "system" goes on) because the client will ignore the faulty replicas' responses and order the honest replicas to finalize?
This curiosity is just stuck in my head, and it's blocking me from clearly understanding why private blockchain systems need an asynchronous consensus process among the peers themselves.
(since they already trust the client's transactions, and participants are designated by PKI & signatures)
Could anyone kindly tell me whether my dominant-client example can still satisfy BFT, or whether something would go wrong?
Thank you so much!

In a mining pool service, does the client execute the entire PoW algorithm?

I know that a peer in a cryptocurrency network can contribute to deciding the next block to be added to the blockchain. To do that and gain a reward, such a peer has to be the first one to solve some PoW puzzle.
From what I have understood, mining pools use the computational power of client machines in order to solve the PoW as fast as possible.
I guess that the mining pool server is the only peer that directly participates in the network, and that it performs the entire algorithm using the computational power of the clients, which perform only some secondary tasks.
How can this computational task be split across many clients?
The pool server receives a "task" from a coin node via the getblocktemplate request. Based on the received task, the server prepares subtasks for the participating miners, providing each of them with its own getblocktemplate structure with a reduced difficulty parameter. When a miner solves a subtask (at the reduced difficulty), it sends the solution to the pool; this partial solution is called a share. The pool computes each participant's contribution from the number of submitted shares and their difficulty.
The difficulty of some shares can be high enough to satisfy the coin network's difficulty. Such a share is called a solving share, and it is a block solution. As a result, this solving share is added to the blockchain as a block, and the pool receives the block reward.
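As a rough illustration of shares versus solving shares, here is a sketch with made-up targets and a fake header (real pools hash the exact 80-byte serialized block header); the same double-SHA-256 hash can meet the easier share target without meeting the network target:

import java.security.MessageDigest

object ShareCheck extends App {
  // Bitcoin-style double SHA-256.
  def sha256d(data: Array[Byte]): Array[Byte] = {
    val md = MessageDigest.getInstance("SHA-256")
    md.digest(md.digest(data))
  }

  // A hash meets a target when, read as an unsigned number, it is <= target.
  def meetsTarget(hash: Array[Byte], target: BigInt): Boolean =
    BigInt(1, hash) <= target

  val networkTarget = BigInt(2).pow(224) // hypothetical full-difficulty target
  val shareTarget   = BigInt(2).pow(240) // easier target handed to pool miners

  val hash = sha256d("example-block-header".getBytes("UTF-8"))
  if (meetsTarget(hash, shareTarget))
    println("valid share: counts toward the miner's contribution")
  if (meetsTarget(hash, networkTarget))
    println("solving share: also a valid block for the whole network")
}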
Technically, a miner can work directly with a wallet, without a pool. This mode is called solo mining.
See spec for getblocktemplate: https://github.com/bitcoin/bips/blob/master/bip-0022.mediawiki

How is multiparty computation (MPC) possible using Blockchain?

MPC involves running calculations on numbers, perhaps from different parties, and sharing the result without anyone seeing the underlying data. Even the person operating the computer cannot access the information.
How is this possible on blockchains like ethereum/corda/hyperledger etc?
Corda is a permissioned blockchain platform which focuses on peer-to-peer communication that is direct, private, and secure.
If there is any multiparty computation happening within a Corda network, it will happen only among the participating parties.
It will happen strictly according to the Corda contract code, which is developed from the pre-agreed business rules.
Every transaction collects digital signatures from the participating parties, and the transaction is distributed to these participating parties when it is done.
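As a toy illustration of that signature-collection rule (this is not Corda's actual API; all names are hypothetical), a transaction would only be distributed once every required participant has signed:

object SignatureCollection extends App {
  // Stand-in for a real digital signature verified against a PKI.
  final case class Signature(signer: String)

  // A transaction is fully signed when every required participant has signed.
  def fullySigned(required: Set[String], sigs: Seq[Signature]): Boolean =
    required.subsetOf(sigs.map(_.signer).toSet)

  val participants = Set("BankA", "BankB")
  val collected = Seq(Signature("BankA"), Signature("BankB"))
  println(s"distribute transaction to parties: ${fullySigned(participants, collected)}")
}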
Please see more information at: https://docs.corda.net/docs/corda-os/4.4/key-concepts-ecosystem.html

Redundancy without central control point?

Is it possible to provide a service to multiple clients whereby, if the server providing this service goes down, another one takes its place - without some sort of centralized "control" which detects whether the main server has gone down and redirects the clients to the new server?
Is it possible to do this without having a centralized interface/gateway?
In other words, it's a bit like asking: can you design a load balancer without having a centralized control to direct clients?
Well, you are not giving much information about the "service" you are asking about, so I'll answer in a generic way.
For the first part of my answer, I'll assume you are talking about a "centralized interface/gateway" involving IP addresses. For this, there's CARP (Common Address Redundancy Protocol), quoted from the wiki:
The Common Address Redundancy Protocol or CARP is a protocol which allows multiple hosts on the same local network to share a set of IP addresses. Its primary purpose is to provide failover redundancy, especially when used with firewalls and routers. In some configurations CARP can also provide load balancing functionality. It is a free, non patent-encumbered alternative to Cisco's HSRP. CARP is mostly implemented in BSD operating systems.
Quoting the netbsd's "Introduction to CARP":
CARP works by allowing a group of hosts on the same network segment to share an IP address. This group of hosts is referred to as a "redundancy group". The redundancy group is assigned an IP address that is shared amongst the group members. Within the group, one host is designated the "master" and the rest as "backups". The master host is the one that currently "holds" the shared IP; it responds to any traffic or ARP requests directed towards it. Each host may belong to more than one redundancy group at a time.
This might solve your question at the network level, by having the backups take over the IP address in order, without a single point of failure.
Now, for the second part of the answer (the application level): with distributed Erlang, you can have several nodes (a cluster) that will give you fault tolerance and redundancy (so you would not use IP addresses here, but "distributed Erlang" - a cluster of Erlang nodes - instead).
You would have lots of nodes lying around with your distributed application started, and your application resource file would contain an (ordered) list of nodes where the application can be run.
Distributed Erlang will control which of the nodes is "the master" and will automagically start and stop your application on the different nodes, as they go up and down.
Quoting (as little as possible) from http://www.erlang.org/doc/design_principles/distributed_applications.html:
In a distributed system with several Erlang nodes, there may be a need to control applications in a distributed manner. If the node, where a certain application is running, goes down, the application should be restarted at another node.
The application will be started at the first node, specified by the distributed configuration parameter, which is up and running. The application is started as usual.
For distribution of application control to work properly, the nodes where a distributed application may run must contact each other and negotiate where to start the application.
When started, the node will wait for all nodes specified by sync_nodes_mandatory and sync_nodes_optional to come up. When all nodes have come up, or when all mandatory nodes have come up and the time specified by sync_nodes_timeout has elapsed, all applications will be started. If not all mandatory nodes have come up, the node will terminate.
If the node where the application is running goes down, the application is restarted (after the specified timeout) at the first node, specified by the distributed configuration parameter, which is up and running. This is called a failover.
distributed = [{Application, [Timeout,] NodeDesc}]
If a node is started, which has higher priority according to distributed, than the node where a distributed application is currently running, the application will be restarted at the new node and stopped at the old node. This is called a takeover.
Ok, that was meant as a general overview, since it can be a long topic :)
For the specific details, it is highly recommended to read the Distributed OTP Applications chapter of Learn You Some Erlang (and of course the previous link: http://www.erlang.org/doc/design_principles/distributed_applications.html).
Also, your "service" might depend on other external systems like databases, so you should consider fault tolerance and redundancy there, too. The whole architecture needs to be fault-tolerant and distributed for "the service" to work in this way.
Hope it helps!
This answer is a general overview of high availability for networked applications, not specific to Erlang. I don't know too much about what is available in the OTP framework yet, because I am new to the language.
There are a few different problems here:
Client connection must be moved to the backup machine
The session may contain state data
How to detect a crash
Problem 1 - Moving client connection
This may be solved in many different ways and on different layers of the network architecture. The easiest thing is to code it right into the client, so that when a connection is lost it reconnects to another machine.
If you need network transparency, you may use some technology to sync TCP states between different machines and then reroute all traffic to the new machine, which may be entirely invisible to the client. This is much harder to do than the first suggestion.
I'm sure there are lots of things to do in-between these two.
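As a minimal sketch of the first suggestion (hostnames and ports below are made up), the client can simply walk an ordered list of servers and connect to the first one that answers:

import java.net.Socket
import scala.util.Try

object FailoverClient extends App {
  val servers = Seq(("primary.example", 9000), ("backup.example", 9000))

  // Try servers in order; the first successful connection wins.
  def connect(): Option[Socket] =
    servers.view.flatMap { case (host, port) =>
      Try(new Socket(host, port)).toOption
    }.headOption

  connect() match {
    case Some(sock) =>
      println(s"connected to ${sock.getInetAddress}")
      sock.close()
    case None =>
      println("no server reachable; retry with backoff")
  }
}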
Problem 2 - State data
You obviously need to transfer the session state from the crashed machine to the backup machine. This is really hard to do in a reliable way, and you may lose the last few transactions because the crashed machine may not be able to send the last state before the crash. You can use a synchronized call like this to be really sure about not losing state:
Transaction/message comes from the client into the main machine.
Main machine updates some state.
New state is sent to backup machine.
Backup machine confirms arrival of the new state.
Main machine confirms success to the client.
This may potentially be expensive (or at least not responsive enough) in some scenarios, since you depend on the backup machine and the connection to it, including latency, before even confirming anything to the client. To make it perform better, you can let the client check with the backup machine upon connection which transactions it received, and then resend the lost ones, making it the client's responsibility to queue the work.
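Here is an in-process sketch of those five steps, with made-up names and no real networking; the point is only that the primary confirms to the client strictly after the backup's acknowledgement:

object SyncReplication extends App {
  class Backup {
    var state = Map.empty[String, String]
    // Steps 3-4: receive the new state and confirm its arrival.
    def replicate(key: String, value: String): Boolean = {
      state += key -> value
      true
    }
  }

  class Primary(backup: Backup) {
    var state = Map.empty[String, String]
    def handle(key: String, value: String): Boolean = {
      state += key -> value                    // step 2: update local state
      val acked = backup.replicate(key, value) // steps 3-4: sync to backup, await ack
      acked                                    // step 5: only now confirm to the client
    }
  }

  val ok = new Primary(new Backup).handle("balance:alice", "90") // step 1: client message
  println(s"client sees success: $ok")
}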
Problem 3 - Detecting a crash
This is an interesting problem, because a crash is not always well-defined. Did something really crash? Consider a network program that closes the connection between the client and server, but where both are still up and connected to the network. Or worse, one that makes the client disconnect from the server without the server noticing. Here are some questions to think about:
Should the client connect to the backup machine?
What if the main server updates some state and sends it to the backup machine while the backup has the real client connected - will there be a data race?
Can both the main and backup machine be up at the same time or do you need to shut down work on one of them and move all sessions?
Do you need some sort of authority on this matter, some protocol to decide which one is master and which one is slave? Who is that authority? How do you decentralise it?
What if your nodes lose their connection between them but both continue to work as expected (called a network partition)?
See Google's paper "Chubby lock server" (PDF) and "Paxos made live" (PDF) to get an idea.
Briefly, this solution involves using a consensus protocol to elect a master among a group of servers; the master then handles all the requests. If the master fails, the protocol is used again to elect the next master.
Also, see gen_leader for an example in leader election which works with detecting failures and transferring service ownership.
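For the detection part specifically, a common building block under all of the above is a heartbeat with a timeout. A minimal sketch (the timeout value is made up), which can only ever suspect a crash, never prove one:

object HeartbeatDetector extends App {
  val timeoutMillis = 3000L
  @volatile private var lastHeartbeat = System.currentTimeMillis()

  // Called whenever a heartbeat message arrives from the peer.
  def onHeartbeat(): Unit = lastHeartbeat = System.currentTimeMillis()

  // A timeout only *suspects* a crash: the peer may just be slow or partitioned.
  def suspectedDown(): Boolean =
    System.currentTimeMillis() - lastHeartbeat > timeoutMillis

  onHeartbeat()
  println(s"suspected down right after a beat: ${suspectedDown()}")
}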

When to use local vs remote actors?

When should I use Actors vs. Remote Actors in Akka?
I understand that both can scale a machine up, but only remote actors can scale out, so is there any practical production use of the normal Actor?
If a remote actor only has a minor initial setup overhead and does not have any other major overhead compared to a normal actor, then I would think that using a remote actor would be the standard, since it can scale up and out with ease. Even if there is never a need to scale production code out, it would be nice to have the option (if it doesn't come with baggage).
Any insight on when to use an Actor vs. Remote Actor would be much appreciated.
Remote actors cannot scale up; they are only remote references to a local actor on another machine.
For Akka 2.0 we will introduce clustered actors, which will allow you to write an Akka application and scale it up using only configuration.
Regular actors can be used to send messages within a local project.
Remote actors can be used to send messages to dependent projects that are connected to the project sending the message.
Please refer here for remote Akka actors:
http://doc.akka.io/docs/akka/snapshot/scala/remoting.html
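For a concrete feel of the difference, here is a minimal classic-Akka sketch; the remote address in the comment is made up, and remoting would additionally require the akka-remote module plus configuration in application.conf:

import akka.actor.{Actor, ActorSystem, Props}

class Greeter extends Actor {
  def receive = {
    case msg: String => println(s"greeter received: $msg")
  }
}

object Main extends App {
  val system = ActorSystem("demo")
  val greeter = system.actorOf(Props[Greeter], "greeter")
  greeter ! "hello" // plain local send, no network involved

  // Hypothetical lookup of the same actor from another machine once
  // remoting is enabled:
  // system.actorSelection("akka.tcp://demo@10.0.0.2:2552/user/greeter") ! "hello"

  system.terminate()
}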
The question asks: "If a remote actor only has a minor initial setup overhead and does not have any other major overhead then I would think that using a Remote Actor would be the standard". Yet the Fallacies of Distributed Computing make the point that it is a design error to assume that remoting with any technology has no overhead. You have the overhead of copying the messages to bytes and transmitting them across the network interface. You also have all the complexity of different processes being up, down, stalled or unreachable, and of the network having hiccups, leading to lost, duplicated or reordered messages.
This great article has real-world examples of weird network errors which make remoting hard to make bulletproof. The Akka project lead Roland Kuhn, in his free video course about Akka, says that in his experience, for every 1 TB of network messages sent, he sees a corruption. Notes on Distributed Systems for Young Bloods says "distributed systems tend to need actual, not simulated, distribution to flush out their bugs", so even good unit tests won't make for a perfect system. There is lots of advice that remoting is not "free" but hard work to get right.
If you need to use remoting for availability, or to move to huge scale, then note that Akka's remoting gives at-most-once delivery by default; if you layer at-least-once delivery on top of it, messages may be duplicated. So you must ensure that duplicated messages don't create bad results.
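One simple defence, sketched here with a hypothetical Deposit message, is to remember processed message ids so that a redelivered message is applied at most once:

import akka.actor.Actor

final case class Deposit(id: Long, amount: BigDecimal)

class Account extends Actor {
  private var balance = BigDecimal(0)
  private var seen = Set.empty[Long]

  def receive = {
    case Deposit(id, amount) =>
      if (!seen(id)) { // a duplicate of an already-seen id changes nothing
        seen += id
        balance += amount
      }
  }
}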
The moment you start to use remoting you have a distributed system, which creates challenges that are discussed in Distributed Systems for Fun and Profit. Unless you are doing very simple things, like stateless calculators that are idempotent to duplicated messages, things get tricky. One of the assignments in that Akka video course linked above is to make a replicated key-value store which can deal with lost messages by writing the logic yourself. It's far from being an easy assignment. State distributed across different processes gets very hard; actors encapsulate state, therefore distributing actors can get very hard, depending on the consistency and availability requirements of the system you are building.
This all implies that if you can avoid remoting and still achieve what you need, you would be wise to avoid it. If you do need remoting, then Akka makes it easy due to its location transparency. So whilst it's a great toolbox to take with you on the job, you should double-check whether the job needs all the tools or only the simplest ones in the box.