Store data off-blockchain using a smart contract - blockchain

I found a paper that describes a way to store data off-chain using the blockchain. The data is sent to the blockchain in a transaction, which subsequently routes it to an off-blockchain store while retaining only a pointer to the data on the public ledger.
In particular the paper says:
Consider the following example: a user installs an application that uses our platform for preserving her privacy. As the user signs up for the first time, a new shared (user, service) identity is generated and sent, along with the associated permissions, to the blockchain in a Taccess transaction. Data collected on the phone (e.g., sensor data such as location) is encrypted using a shared encryption key and sent to the blockchain in a Tdata transaction, which subsequently routes it to an off-blockchain key-value store, while retaining only a pointer to the data on the public ledger (the pointer is the SHA-256 hash of the data).
What I cannot understand is how they do it! If all the nodes on the blockchain have to execute that very transaction, then they all have to save that information off-blockchain, which duplicates the content. Did I get it wrong?

After a quick glance at the paper in question, I see that it makes no mention of storage replication. The use case described here is to use blockchain transactions as references to physical data that is stored somewhere else. The data can be accessed by anyone who has the reference to it, i.e. anyone with access to that particular blockchain system. However, the data is encrypted so that only parties with the encryption key can actually decipher it. This approach allows for quick validation of data integrity while maintaining privacy.
From the perspective of a blockchain node, all it sees is a transaction that will be added to its local ledger; the node doesn't actually save the data itself.
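The pointer scheme quoted from the paper can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `OffChainStore`, `Ledger`, `submit_data` and `fetch_and_verify` are hypothetical names, the store is an in-memory dict, and the ledger is a plain list; none of this comes from the paper itself. The only detail taken from the quote is that the pointer is the SHA-256 hash of the (already encrypted) data.

```python
import hashlib


class OffChainStore:
    """Hypothetical key-value store that lives outside the blockchain."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        # The key (pointer) is the SHA-256 hash of the stored bytes.
        pointer = hashlib.sha256(blob).hexdigest()
        self._blobs[pointer] = blob
        return pointer

    def get(self, pointer: str) -> bytes:
        return self._blobs[pointer]


class Ledger:
    """Toy stand-in for the public ledger: it records pointers, never data."""

    def __init__(self):
        self.transactions = []

    def record(self, sender: str, pointer: str) -> None:
        self.transactions.append({"sender": sender, "pointer": pointer})


def submit_data(ledger: Ledger, store: OffChainStore, sender: str, encrypted_blob: bytes) -> str:
    """Store the blob off-chain once and put only its hash on the ledger."""
    pointer = store.put(encrypted_blob)
    ledger.record(sender, pointer)
    return pointer


def fetch_and_verify(store: OffChainStore, pointer: str) -> bytes:
    """Fetch the blob and check it against the on-chain pointer."""
    blob = store.get(pointer)
    if hashlib.sha256(blob).hexdigest() != pointer:
        raise ValueError("off-chain data does not match the on-chain pointer")
    return blob
```

Every node that replicates `ledger.transactions` only replicates the 32-byte hash, while the blob exists once in the store; anyone holding the pointer can detect a modified blob by re-hashing it.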

Related

What would the data schema of bitcoin look like?

Since bitcoin is a blockchain and a blockchain has been described as a kind of database, what would the data schema of bitcoin look like? Is it a single-table database? If yes, which columns are in this table?
The data is stored in an application-specific format optimized for compact storage, and wasn't really intended to be easily parsed by other applications.
See https://bitcoin.stackexchange.com/q/10814
For this custom format, see https://en.bitcoin.it/wiki/Protocol_documentation#block
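To give a feel for that custom format, here is a minimal sketch that packs and parses the fixed 80-byte block header described in the protocol documentation (version, previous block hash, Merkle root, timestamp, bits, nonce, all little-endian). The names `BlockHeader`, `parse_header`, `serialize_header` and `block_hash` are my own, and the sketch covers only the header, not the transaction list that follows it in a full block.

```python
import hashlib
import struct
from typing import NamedTuple

# int32 version, 32-byte prev hash, 32-byte merkle root,
# uint32 timestamp, uint32 bits, uint32 nonce; "<" = little-endian, no padding.
HEADER_FORMAT = "<i32s32sIII"


class BlockHeader(NamedTuple):
    version: int
    prev_block: bytes
    merkle_root: bytes
    timestamp: int
    bits: int
    nonce: int


def parse_header(raw: bytes) -> BlockHeader:
    """Decode an 80-byte serialized block header into its six fields."""
    if len(raw) != struct.calcsize(HEADER_FORMAT):
        raise ValueError("a block header is exactly 80 bytes")
    return BlockHeader(*struct.unpack(HEADER_FORMAT, raw))


def serialize_header(header: BlockHeader) -> bytes:
    """Encode the six fields back into the 80-byte wire format."""
    return struct.pack(HEADER_FORMAT, *header)


def block_hash(raw: bytes) -> str:
    """Double SHA-256 of the header, shown in the usual reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    return digest[::-1].hex()
```

So in database terms, the "columns" of a block header would be these six fields, with the transactions stored as a variable-length list after them.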
There are various databases for various uses. As a reference client I would take bitcoin-core and describe the standard structure that the client stores. It actually uses LevelDB and Berkeley DB 4.8 for storing all kinds of data.
Wallet database
Saves your transactions and your generated public/private keys. It is usually encrypted ;)
Source: Wallets
Index Database
It's OPTIONAL, but when present it stores a list of all transactions and the block in which each one occurred.
Block Database
It's the most important db, which is stored locally and shared via the network to communicate about newly created blocks and verify them. Every client has a copy of it.
It usually stores all blocks that ever occurred, including forked-off blocks and obsolete blocks.
Source: Blockchain / Transactions
Peers Database
Obviously there also is a database for all peers you have seen in the past. It rates each peer by giving it a ban-score, stores their IP addresses, ports and last seen status.
Conclusion:
That would be all the databases. They mostly have "one table", which holds exactly the data structures described above.
More information about the p2p network structure can be found right here.

Corda: What is the difference between Off-ledger and on-ledger data

After going through the Vault documentation of Corda, it is still not clear to me how on-ledger and off-ledger data work in a Vault. A good explanation would be appreciated.
On-ledger data consists of states that result from Corda transactions; the notary attests that the transaction inputs haven't been used before, and registers the transaction outputs as unconsumed states. The on-ledger data is cryptographically secured; any tampering with the node's tables will lead to a faulty ledger. All states are final and cannot be updated or modified. Every transaction has a set of required signers, so there's always an audit trail showing who approved the transaction and when.
Off-ledger data is data that is not tracked by your distributed ledger, meaning it's not finalized by the notary or cryptographically secured. There's nothing stopping you from using the node's database schema to add your own tables, and you can even insert data into those tables from inside your flows, but that data is not protected. It's like the data of any non-blockchain app: anyone with database access can change it, and there's no audit trail showing that the data was tampered with, or by whom.
Have a look at my article here; it shows an example of on-ledger and off-ledger data. The on-ledger data is the tokens (e.g. FungibleToken states) that result from using the Tokens SDK flows (issue and move tokens), while the custom table that I created for reporting purposes is the off-ledger data. Even though I insert data into it from inside my flows, that data is not the result of a transaction that is finalized by the notary and signed by a quorum of parties, so anyone can log in to the database and modify it.
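The difference in tamper evidence can be illustrated with a toy model. To be clear, this is not how Corda implements its ledger (Corda relies on signed transactions and notary checks, and it is written in Kotlin/Java); the sketch below just uses a simple hash chain, with the hypothetical names `ToyLedger` and `tx_hash`, to contrast data whose integrity can be checked with a plain table whose integrity cannot.

```python
import hashlib
import json


def tx_hash(payload: dict, prev_hash: str) -> str:
    """Hash a payload together with the hash of the previous entry."""
    body = json.dumps(payload, sort_keys=True) + prev_hash
    return hashlib.sha256(body.encode()).hexdigest()


class ToyLedger:
    """Toy 'on-ledger' data: every entry is bound to its predecessor by a hash."""

    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else ""
        self.entries.append(
            {"payload": payload, "prev": prev, "hash": tx_hash(payload, prev)}
        )

    def is_intact(self) -> bool:
        """Recompute every hash; any edited payload breaks the chain."""
        prev = ""
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != tx_hash(entry["payload"], prev):
                return False
            prev = entry["hash"]
        return True


ledger = ToyLedger()
ledger.append({"action": "issue", "amount": 100, "holder": "PartyA"})
ledger.append({"action": "move", "amount": 40, "to": "PartyB"})

# Toy 'off-ledger' data: a reporting table with no integrity protection.
report_table = {"PartyA_balance": 60}
report_table["PartyA_balance"] = 1_000_000  # silently accepted, no trace

ledger.entries[0]["payload"]["amount"] = 1_000_000  # tampering with on-ledger data
print(ledger.is_intact())  # the recomputed hashes no longer match
```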

BlockChain for Storing personal Data

I want to store personal data on a blockchain for a company. We want to prove that the data is unchangeable. A customer on the blockchain must not be able to access or see any other customer's data.
But the company will access all customer data, can perform any operation, and can also follow any operation and any access log.
The company will store a new form type (personal data) and flag it as a personal data card.
Is it possible with Blockchain?
The best method would be to encrypt the data, but it really depends upon what you are doing with it. If you need to do operations on it, then you will have to use zk-SNARKs, but these are a new field and you would have to do a lot of research to get it working. If you aren't using the data for anything and it's just metadata, then why would you need it to be on a public ledger and validated?
Plus, there is one big problem about storing sensitive data on the blockchain: the blockchain is immutable and once something is on the blockchain, it is stored forever. So what if there comes a time when quantum computers become so powerful that they can break all encryption we have today? Then all your users' personal data will be public on the blockchain.

Data storage on Corda

I am working on a use case. The requirement is that I need to create an order. The order has some customer parameters and can be modified multiple times. Initially, I thought of implementing it in Ethereum, capturing the customer details from a UI and storing them in the smart contract. However, the issue is that once the contract is deployed I cannot change it, since it is immutable. This drawback prevents me from using Ethereum. I am now considering Corda: can I store the customer data as a single record and make modifications to it, such that the modifications are stored on the ledger and can be queried? For instance, I want to store customer ID, customer name, customer purchase order number, order type, and order status.
How would I do that in the Corda data model?
Will that data be modifiable based on the rules coded in the smart contract?
Can the data be queried?
Yes.
In Corda, you store information using States. In your case, you might create a CustomerDataState class. You'd then create instances of this class on the ledger to represent your various customers.
This data can then be updated using transactions. But not all transactions are allowed. Which transactions are valid is based on the rules in the associated CustomerDataContract. The CustomerDataContract will be stateless, and simply exists to say how CustomerDataStates can evolve over time.
You can then easily query this data from your node's vault or transaction storage using an SQL-style syntax.
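The state/contract split described above can be modelled roughly as follows. This is a toy illustration in Python, not Corda code (real CorDapps are written in Kotlin or Java against Corda's own state and contract interfaces). The `ToyVault` class, the `verify_update` function, and the allowed status transitions are all hypothetical and chosen only to show the idea that states are immutable, an update produces a new state, and a stateless rule set decides which updates are valid.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)  # frozen: a state is never edited in place
class CustomerDataState:
    customer_id: str
    customer_name: str
    purchase_order_number: str
    order_type: str
    order_status: str


# Hypothetical contract rules: which status transitions are allowed.
ALLOWED_STATUS_CHANGES = {("CREATED", "CONFIRMED"), ("CONFIRMED", "SHIPPED")}


def verify_update(old: CustomerDataState, new: CustomerDataState) -> None:
    """Stateless rule check, playing the role of the contract."""
    if old.customer_id != new.customer_id:
        raise ValueError("customer_id must not change")
    if (old.order_status != new.order_status
            and (old.order_status, new.order_status) not in ALLOWED_STATUS_CHANGES):
        raise ValueError("status transition not allowed")


class ToyVault:
    def __init__(self):
        self.unconsumed = {}  # customer_id -> latest state
        self.history = []     # every state ever created, in order

    def issue(self, state: CustomerDataState) -> None:
        self.unconsumed[state.customer_id] = state
        self.history.append(state)

    def update(self, customer_id: str, **changes) -> CustomerDataState:
        old = self.unconsumed[customer_id]
        new = replace(old, **changes)       # build a new state from the old one
        verify_update(old, new)             # reject invalid evolutions
        self.unconsumed[customer_id] = new  # old state is now "consumed"
        self.history.append(new)
        return new

    def query(self, **criteria):
        """Return current states whose fields match all given criteria."""
        return [s for s in self.unconsumed.values()
                if all(getattr(s, k) == v for k, v in criteria.items())]
```

For example, `vault.update("C1", order_status="CONFIRMED")` succeeds for an order in status `CREATED`, while jumping straight to `SHIPPED` is rejected; `vault.history` keeps every earlier version for later inspection.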

How does delState work in Fabric?

I'm new to IBM Hyperledger Fabric.
While going through the documentation, I see there are a couple of state functions:
getState, putState, delState, etc.
https://github.com/hyperledger/fabric/blob/master/core/chaincode/shim/chaincode.go
I'm wondering: if the ledger is 'immutable and chained', how can we 'delete' the state?
Given that it is a ledger which is chained by each transaction or transactions, wouldn't it be impossible to delete state or at least corrupt the chains of hash?
Thank you!
There is a state database that stores keys and their values. This is different from the sequence of blocks that make up the blockchain. A key and its associated value can be removed from the state database using the DelState function. However, this does not mean that there is an alteration of blocks on the blockchain. The removal of a key and value would be stored as a transaction on the blockchain just as the prior addition and any modifications were stored as transactions on the blockchain.
Concerning different hashes, it is possible that block hashes could diverge if there is non-deterministic chaincode. Creating chaincode that is non-deterministic should be avoided. Here is a documentation topic that discusses non-deterministic chaincode.
The history of a key can be retrieved after the key is deleted. There is a GetHistoryForKey() API that retrieves the history and part of its response is an IsDeleted flag that indicates if the key was deleted. It would be possible to create a key, delete the key, and then create the key again; the GetHistoryForKey() API would track such a case.
The state database stores the current state, so the key and its value are deleted from the state database. The GetHistoryForKey() API reviews the chain history and not the state database to find prior key values.
There is an example that illustrates use of the GetHistoryForKey() API. See the getHistoryForMarble function.
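The separation between the state database and the block history can also be pictured with a small toy model. This is not the Fabric shim API (chaincode is typically written in Go, Java or JavaScript, and real Fabric adds endorsement, ordering and read/write sets); `ToyFabricLedger` and its method names are hypothetical and merely mirror the functions discussed above.

```python
class ToyFabricLedger:
    """Toy model: an append-only transaction log plus a current-state map."""

    def __init__(self):
        self.blocks = []  # append-only history; entries are never removed
        self.state = {}   # state database: current value per key

    def put_state(self, key, value):
        self.blocks.append({"key": key, "value": value, "is_delete": False})
        self.state[key] = value

    def del_state(self, key):
        # The deletion is itself recorded as a new transaction...
        self.blocks.append({"key": key, "value": None, "is_delete": True})
        # ...and only the current-state entry disappears.
        self.state.pop(key, None)

    def get_state(self, key):
        return self.state.get(key)

    def get_history_for_key(self, key):
        # Reads the log, not the state database.
        return [tx for tx in self.blocks if tx["key"] == key]


ledger = ToyFabricLedger()
ledger.put_state("marble1", "blue")
ledger.del_state("marble1")
ledger.put_state("marble1", "red")  # create the key again after deleting it

print(ledger.get_state("marble1"))
for tx in ledger.get_history_for_key("marble1"):
    print(tx)
```

The history for `marble1` contains three entries (create, delete, create), even though the state database only ever holds the latest value.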