I'm ramping up on learning Solidity, and have some ideas. At the moment I am curious if files/images can be put on the blockchain. I'm thinking an alternative would be some hybrid approach where some stuff is on the blockchain, and some stuff is in a more traditional file storage and uses address references to grab it. One issue I foresee is gas price of file uploads.
Is it possible to store images on the Ethereum blockchain?
It's absolutely possible!
Should you do it? Almost certainly not!
One issue I foresee is gas price of file uploads.
The cost of data storage is 640k gas per kilobyte of data.
The current gas price is approximately 15 Gwei (or 0.000000015 ETH).
At today's price, 1 ETH is approximately $200.
That works out at just under $2 per kilobyte.
It's not for me to tell you if this is too expensive for your application, but you should also consider that the prices of both gas and Ether vary dramatically over time, and you should expect periods when this number will be significantly higher.
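The arithmetic above can be reproduced with a few lines of Node.js. This is only a sketch: the figures (640k gas per kilobyte, 15 Gwei, $200 per ETH) are the ones quoted in this answer rather than live values, and storageCostUsd is a hypothetical helper name.

```javascript
// Rough storage-cost estimate using the figures quoted above (not live values).
const GAS_PER_KB = 640000; // gas to store 1 kB, as stated in the answer
const GWEI_PER_ETH = 1e9;

// Hypothetical helper: cost in USD of storing `kilobytes` of data on-chain.
function storageCostUsd(kilobytes, gasPriceGwei, ethPriceUsd) {
  const gas = kilobytes * GAS_PER_KB;
  const eth = (gas * gasPriceGwei) / GWEI_PER_ETH;
  return eth * ethPriceUsd;
}

console.log(storageCostUsd(1, 15, 200).toFixed(2));   // 1 kB
console.log(storageCostUsd(100, 15, 200).toFixed(2)); // a 100 kB image
```

With these inputs the estimate is about $1.92 per kilobyte, so roughly $192 for a 100 kB image. Plug in the current gas and Ether prices to see how far the number moves.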
Note: I tried to store the base64 string (more than 10,000 characters long) of a 100 KB image, but it wasn't accepted. When I tried a 1 KB image, it worked.
Yes. This is the Solidity code to do it:
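// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.0 <0.9.0;

contract ImgStorage {
    // Index of the next image slot.
    uint i = 0;
    // Maps a slot index to the base64 string(s) stored in that slot.
    mapping(uint => string[]) public base64_images;

    // Stores a base64-encoded image in the next free slot.
    function push(string memory base64_img) public {
        base64_images[i].push(base64_img);
        i++;
    }

    // Returns everything stored in slot n.
    function returnImage(uint n) public view returns (string[] memory) {
        return base64_images[n];
    }
}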
You can convert an image to base64 and vice versa online.
Here is Node.js code to convert an image to a base64 string:
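const imageToBase64 = require('image-to-base64');
const fs = require('fs');
// Convert the image to a base64 string and write it to a file.
imageToBase64('img/1kb.png')
  .then((data) => {
    fs.writeFile('1kb_png.md', data, (err) => {
      if (err) console.error(err); // only log when the write actually fails
    });
  })
  .catch((err) => console.error(err));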
I totally agree with @Peter Hall that storing the image on Ethereum is too costly.
So, what can you do instead?
You can store the image on IPFS. IPFS gives you a fixed-length hash for the file. You can then store this hash on Ethereum, which costs far less than storing the image itself.
Technically, yes, you could store very small images. But you shouldn't.
Preferred Alternative
Store the image in a distributed file store (for example, Swarm or IPFS), and store a hash of the image on-chain if it's really important for the image to be provably untampered. If that's not important, then maybe don't put anything on-chain.
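As a rough illustration of what "provably untampered" means here, the sketch below uses plain SHA-256 from Node's built-in crypto module. That is an assumption made for simplicity: IPFS and Swarm use their own content identifiers, and fileDigest is a hypothetical helper, not part of either project.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-256 digest of the file contents, as a hex string.
function fileDigest(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// Stand-ins for real image bytes; in practice you would read the file from disk.
const original = Buffer.from('pretend these are image bytes');
const tampered = Buffer.from('pretend these are image byteZ');

// This value is what you would record on-chain (always 32 bytes, 64 hex characters).
const storedDigest = fileDigest(original);

// Anyone who later fetches the file can recompute the digest and compare.
console.log(fileDigest(original) === storedDigest); // unchanged file matches
console.log(fileDigest(tampered) === storedDigest); // modified file does not
```

The on-chain cost is the same whether the image is 1 kB or 1 GB, because only the fixed-size digest is stored.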
What technical limit is there?
Primarily, the block's gas limit. Currently, Ethereum mainnet has an 8M gas block limit. Every new 32 bytes of storage uses 20k gas. So you can't store data that sums to more than 12.8 kB, because it doesn't fit in the block.
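The 12.8 kB figure follows directly from those two numbers. A minimal sketch, assuming the values quoted in this answer (both have changed over time on mainnet) and a hypothetical helper name:

```javascript
// Figures quoted in the answer above; treat them as a snapshot, not constants.
const BLOCK_GAS_LIMIT = 8000000; // 8M gas
const GAS_PER_WORD = 20000;      // gas per new 32-byte storage word
const BYTES_PER_WORD = 32;

// Hypothetical helper: upper bound on bytes of new storage in one block.
function maxStorableBytes(blockGasLimit) {
  const words = Math.floor(blockGasLimit / GAS_PER_WORD);
  return words * BYTES_PER_WORD;
}

console.log(maxStorableBytes(BLOCK_GAS_LIMIT)); // 12800
```

This is an upper bound only. It ignores the transaction's base cost and the cost of sending the data, so the practical limit is lower.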
Why shouldn't I use it for small files?
The blockchain wasn't designed for that usage (which is why other projects like Swarm and IPFS exist). It bloats and slows everything down, without providing you any benefit over other file storage systems. By analogy, you typically don't store files in a SQL database, either.
You can store images on the Ethereum blockchain, but it is too expensive because of the "blockspace premium" of a high-quality blockchain.
Other, more affordable decentralised storage solutions include
Chia
Filecoin (where Filecoin VM smart contracts can manipulate files)
Arweave
Storj
Storing images on-chain is an emphatic NO!
Storing images in a database is also not good practice; I'm assuming you just mean file storage solutions like S3 or Firebase. Storing images on a central server is okay, but it depends on what you want to achieve. There are also decentralized storage solutions like IPFS and Swarm that you could look into.
Ethereum is too heavy as well as too expensive for storing large blobs like images, video, and so on. Hence, some external storage is necessary for bigger objects. This is where the InterPlanetary File System (IPFS) comes into the picture. The Ethereum Dapp can hold a small amount of data, whereas for saving anything bigger, such as images, Word documents, PDF files, and so on, we use IPFS.
IPFS is an open-source protocol and network designed to create a peer-to-peer method of storing and sharing data. It is similar to BitTorrent.
To upload a PDF, Word, or image file to IPFS:
1. Put the PDF, Word, or image file in your working directory.
2. Tell IPFS to add the file, which generates a hash of the file. Note that a CIDv0 IPFS hash always starts with "Qm"; newer CIDv1 identifiers start differently (for example, "bafy...").
3. Your file is now available on the IPFS network.
Now suppose you have uploaded the file and want to share it with Bob. You send the hash of the file to Bob, and Bob uses the hash to request the file from IPFS. The file is then downloaded at Bob's end. The issue here is that anyone who obtains the hash can also access the file.
Sharing Data on IPFS by Asymmetric Cryptography
Let's say you want to upload a file to IPFS and share it only with Bob.
Bob sends you his public key. You encrypt the file with Bob's public key and then upload it to the IPFS network.
You send the hash of the file to Bob. Bob uses this hash to get the file.
Bob decrypts the file using the private key that corresponds to the public key used to encrypt it.
In asymmetric cryptography, a public key is derived from a private key, and if you lock something with a public key, the only key that will unlock it is the private key that the public key was derived from.
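The exchange with Bob can be sketched with Node's built-in crypto module. This is a minimal illustration rather than a full file-sharing implementation: RSA can only encrypt a small payload directly, so the example uses a short message. For real files the usual approach is hybrid: encrypt the file with a symmetric key and encrypt only that key with Bob's public key.

```javascript
const crypto = require('crypto');

// Bob generates a key pair and shares only the public key.
const bob = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// You encrypt with Bob's public key before uploading anywhere public.
// RSA only handles a small payload, so this is a short message, not a whole file.
const secret = Buffer.from('contents meant only for Bob');
const ciphertext = crypto.publicEncrypt(bob.publicKey, secret);

// Only the matching private key can recover the plaintext.
const recovered = crypto.privateDecrypt(bob.privateKey, ciphertext);
console.log(recovered.toString());
```

Anyone who finds the hash can still download the ciphertext, but without Bob's private key it is unreadable.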
A better way of dealing with files on the blockchain is to use a file storage service such as AWS S3, IPFS, or Swarm.
You can upload the files to one of these storage services, generate the hash key (which is used to access the file), and keep this key on the blockchain.
The advantages of this method are:
Low-cost solution
Easy and fast access to files using file storage searching algorithms
Lightweight
Flexibility to move from blockchain to DB or vice versa
If the file storage system is well protected against tampering, you can be assured that the files will not be accessed without the right hash key
Easy to perform migration of file storage from one service to another
Related
I have a finished art project and I am looking for a way to store the data on-chain, like deafbeef for example. The sources of my generated 6x6 pixel images are pictures, so I cannot recreate them by code. I guess I have to store their raw data in some form on-chain.
I am new to blockchain stuff. I know that tokens are stored on the chain, and I have minted some artworks on hen and OpenSea, but I generally do not understand how it all works in the background.
Can you recommend some well explained tutorials or articles on this topic?
Thank you!
Any reason why they need to be stored on-chain?
Normally images are stored off chain to reduce the cost of transactions.
You can upload the image to IPFS via Pinata, then create a JSON file that points to it and upload that through Pinata as well. The tokenURI would then be that JSON file.
For example,
tokenURI = "ipfs://<json_file_hash>"
Then in your JSON file:
{
"image": "<image_file_hash>"
}
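A small sketch of how the two pieces fit together. The helper names, the "name" field, and the ipfs:// prefix on the image value are assumptions for illustration; the hashes are placeholders that you would get back after pinning each file.

```javascript
// Hypothetical helper: build the metadata JSON that the tokenURI will point to.
// The "name" field and the ipfs:// prefix on the image are illustrative assumptions.
function buildMetadata(name, imageHash) {
  return JSON.stringify({ name: name, image: 'ipfs://' + imageHash }, null, 2);
}

// Hypothetical helper: the tokenURI is the ipfs:// URI of the uploaded JSON file.
function buildTokenUri(jsonFileHash) {
  return 'ipfs://' + jsonFileHash;
}

// Placeholder hashes; real ones are returned when the files are pinned.
const metadata = buildMetadata('Pixel #1', '<image_file_hash>');
console.log(metadata);
console.log(buildTokenUri('<json_file_hash>'));
```

The order matters: the image has to be uploaded first, because its hash goes into the JSON file, and the JSON file's hash is what ends up in the token.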
I'm in the process of developing a blockchain-based application for a client that wishes to store files securely. For this purpose I am using IPFS to store the files and the blockchain (more specifically, an Ethereum network) to store the hashes of the files, as is the case in most such applications.
However, the client insists on storing the files directly on the blockchain because of its linked-list structure, which ensures that the hash of every block depends on the previous block, so that all the hashes depend on one another.
Does IPFS have a similar feature in its data structure? I realize that the Merkle tree system ensures that tampering with any of the data chunks that the root hash references will change the root hash, and as such allows verification of shared files. However, is there any feature that makes the hashes of files dependent on each other?
Perhaps if the files were in some sort of directory structure?
IPFS blocks form a DAG (Directed Acyclic Graph). A blockchain is a specific kind of DAG where each node has only one child. As you say, the root block of a file contains an array of the hashes of the component blocks. Similarly, a directory object contains a dictionary that maps filenames to the hashes of those root blocks. So, if you add a directory to IPFS, you will have a single hash that validates the entire contents of the directory.
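The directory idea can be shown with a simplified sketch. It is not the actual IPFS format (which chunks files and uses its own object encoding and identifiers); it only assumes SHA-256 and a JSON-serialized mapping to show how one root hash comes to depend on every file.

```javascript
const crypto = require('crypto');

const sha256 = (data) => crypto.createHash('sha256').update(data).digest('hex');

// Hypothetical helper: hash a "directory" given as { filename: contents }.
// The directory object maps each filename to the hash of that file, and the
// directory hash is the hash of that (sorted, serialized) mapping.
function directoryHash(files) {
  const entries = Object.keys(files)
    .sort()
    .map((name) => [name, sha256(files[name])]);
  return sha256(JSON.stringify(entries));
}

const before = directoryHash({ 'a.txt': 'hello', 'b.txt': 'world' });
const after = directoryHash({ 'a.txt': 'hello', 'b.txt': 'World' });

console.log(before === after); // false: changing one file changes the root hash
```

Storing that single directory hash on-chain therefore commits to all the files at once, which is close to what the client is asking for.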
What is the main set of files required to orchestrate a new network from the data of an old Sawtooth network? (I don't want to extend the old Sawtooth network.)
I want to back up the essential files that are crucial for the operation of the network, as of the last block in the ledger.
I have a list of files that were generated by the Sawtooth validator with PoET consensus:
block-00.lmdb
poet-key-state-0371cbed.lmdb
block-00.lmdb-lock
poet_consensus_state-020a4912.lmdb
block-chain-id
poet_consensus_state-020a4912.lmdb-lock
merkle-00.lmdb
poet_consensus_state-0371cbed.lmdb
merkle-00.lmdb-lock
txn_receipts-00.lmdb
poet-key-state-020a4912.lmdb
txn_receipts-00.lmdb-lock
poet-key-state-020a4912.lmdb-lock
What is the significance of each file, and what are the consequences if it is not included when restarting the network or creating a new network with the old ledger data?
A full answer to this question could get long, so I will cover the most important parts here for the benefit of others who have the same question. It should help especially when deploying the network through Kubernetes. Similar questions are also asked frequently in the official RocketChat channel.
The essential files for the validator and PoET are stored in the /etc/sawtooth (keys and config) and /var/lib/sawtooth (data) directories by default, unless changed. Create a mounted volume for these so that they can be reused when a new instance is orchestrated.
Here's the file through which the default validator paths can be changed: https://github.com/hyperledger/sawtooth-core/blob/master/validator/packaging/path.toml.example
Note that the list of essential files in your question leaves out the keys, which play an important role in the network. In the case of PoET, each enclave's registration information is stored in the Validator Registry against the validator's public key. In the case of Raft/PBFT, the consensus engine uses the keys (member list information) to send peer-to-peer messages.
In the case of Raft, the data directory is /var/lib/sawtooth-raft-engine.
The significance of each file you listed may not matter to most people. However, here is an explanation of the important ones:
*-lock files are system generated. If you see these, then one of the processes must have opened the file for writing.
block-00.lmdb is the block store (the blockchain). It holds key-value pairs mapping block ID to block information. It's also possible to index blocks by other keys. The Hyperledger Sawtooth documentation is the right place to find the complete details.
merkle-00.lmdb stores the state root hash/global state. It is the Merkle tree represented as key-value pairs.
txn_receipts-00.lmdb is where the transaction execution status is stored upon success. It also holds information about any events associated with those transactions.
Here is a list of files from the Sawtooth FAQ:
https://sawtooth.hyperledger.org/faq/validator/#what-files-does-sawtooth-use
I was wondering what you recommend for running a user upload system with S3. I plan on using MongoDB for storing metadata such as the uploader, size, etc. How should I go about storing the actual file in S3?
Here are some of my ideas, what do you think is the best? All of these examples would involve saving the metadata to MongoDB.
1. Should I just store all the files in a bucket?
2. Maybe organize them by date (e.g. 6/8/2014/mypicture.png)?
3. Should I save them all in one bucket, but with an added string (such as d1JdaZ9-mypicture.png) to avoid duplicates?
4. Or should I generate a long string for a folder, and store the file in that folder (to retain the original file name), e.g. sh8sb36zkj391k4dhqk4n5e4ndsqule6/mypicture.png?
This depends primarily on how you intend to use the pictures and which objects/classes/modules/etc. in your code will actually deal with retrieving them.
If you find yourself wanting to do things like "all user uploads on a particular day", a simple naming convention with folders for the year, month and day, along with a folder at the top level for the user's unique ID, will solve the problem.
If you want to ensure uniqueness and avoid collisions in your bucket, you could generate a unique string too.
However, since you've got MongoDB, which (I'm assuming) will actually handle these queries for user uploads by date, etc., the choice of bucket structure is more aesthetic than functional.
If all you're storing in MongoDB is the key/URL, it doesn't really matter what the actual structure of your bucket is. Nevertheless, it makes sense to still split this up in some coherent way: maybe group all of a user's uploads and give each a unique name (either generate a unique name or add a unique prefix to the file name).
That being said, do you think there might be a point when you might look at changing how your images are stored? You might move to a CDN. A third party might come up with an even cheaper/better product which you might want to try. In a case like that, simply storing the keys/URLs in your MongoDB is not a good idea since you'll have to update every entry.
To make this relatively future-proof, I suggest you give your uploads a definite structure. I usually opt for:
bucket_name/user_id/yyyy/mm/dd/unique_name.jpg
Your database then only needs to store the file name and the upload time stamp.
You can introduce a middle layer in your logic (a new class perhaps or just a helper function/method) which then generates the URL for a file based on this info. That way, if you change your storage method later, you only need to make a small change in this middle layer (after migrating your files of course) and not worry about MongoDB.
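A minimal sketch of such a middle layer, assuming the database record holds the user ID, file name and upload timestamp. The base URL is a made-up placeholder and the function names are hypothetical.

```javascript
// Hypothetical middle layer: the database stores userId, fileName and uploadedAt,
// and this helper is the only place that knows the storage layout.
const BASE_URL = 'https://example-bucket.s3.amazonaws.com'; // assumed placeholder

function pad(n) {
  return String(n).padStart(2, '0');
}

// Builds user_id/yyyy/mm/dd/unique_name from the stored fields (UTC date).
function objectKey(userId, fileName, uploadedAt) {
  const d = new Date(uploadedAt);
  return [
    userId,
    d.getUTCFullYear(),
    pad(d.getUTCMonth() + 1),
    pad(d.getUTCDate()),
    fileName,
  ].join('/');
}

function fileUrl(userId, fileName, uploadedAt) {
  return BASE_URL + '/' + objectKey(userId, fileName, uploadedAt);
}

console.log(fileUrl('u42', 'd1JdaZ9-mypicture.png', '2014-06-08T12:00:00Z'));
```

If the files later move to a CDN or another provider, only BASE_URL (or fileUrl) changes; the MongoDB documents stay as they are.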
Ok, so I need some advice on which encryption method I should use for my current project. All the questions about this subject on here are to do with networking and passing encrypted data from one machine to another.
A brief summary of how the system works is:
I have some data that is held in tables in text format. I then use a tool to parse this data and serialize it to a .dat file. This works fine, but I need to encrypt this data as it will be stored with the application in a public place. The data won't be sent anywhere; it is simply read by the application. I just need it to be encrypted so that if it were to fall into the wrong hands, it would not be possible to read the data.
I am using the Crypto++ library for my encryption and I have read that it can perform most types of encryption algorithms. I have noticed, however, that most algorithms use a public and private key to encrypt/decrypt the data. This would mean I would have to store the private key with the data, which seems counterintuitive to me. Are there any ways that I can perform the encryption without storing a private key with the data?
I see no reason to use asymmetric crypto in your case. I see two decent solutions depending on the availability of internet access:
Store the key on a server. The user of the program gets the key back into local storage only after logging in to the server.
Use a key derivation function such as PBKDF2 to derive the key from a password.
Of course all of this fails if the attacker is patient and installs a keylogger and waits until you access the files the next time. There is no way to secure your data once your machine has been compromised.
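The second option can be sketched as follows. The example uses Node's built-in crypto module rather than Crypto++, and the iteration count, digest and helper name are illustrative assumptions, not recommendations.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a 32-byte (AES-256) key from a password.
// The iteration count and digest are illustrative choices, not a recommendation.
function deriveKey(password, salt) {
  return crypto.pbkdf2Sync(password, salt, 200000, 32, 'sha256');
}

// The salt is not secret; store it next to the encrypted file.
const salt = crypto.randomBytes(16);

const key = deriveKey('correct horse battery staple', salt);
console.log(key.length); // 32

// The same password and salt always give the same key, so no key is stored on disk.
console.log(deriveKey('correct horse battery staple', salt).equals(key)); // true
```

The key exists only in memory while the application runs, which is exactly what the keylogger caveat above is about.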
Short answer: don't bother.
Long answer: If you store your .DAT file with the application, you'll have to store the key somewhere too, most probably in the same place (maybe hidden in the code). So if a malicious user wants to break your encryption, all he has to do is look for that key, and that's it. It doesn't really matter which method or algorithm you use. Even if you don't store the decryption key with the application, it will get there eventually, and the malicious user can catch it with a debugger at run time (unless you're using a dedicated secure memory chip and running on a device that has the necessary protections).
That said, many times the mere fact that the data is encrypted is enough protection because the data is just not worth the trouble. If this is your case - then you can just embed the key in the code and use any symmetric algorithm available (AES would be the best pick).
A common way to solve your issue is:
Use a symmetric key algorithm to encrypt your data. Common algorithms are AES and Twofish. Most probably, you want to use CBC chaining.
Compute a digest (SHA-256) and sign it with an asymmetric algorithm (RSA) using your private key. This way you embed a signature and a public key to check it, making sure that even if your encryption key is compromised, other people won't be able to forge your data. Of course, if the application itself needs to update the data, you can't use this private key mechanism.
In any case, you should check
symmetric ciphers vs asymmetric ones
signature vs ciphering
mode of operation, meaning how you chain one block to the next for block ciphers such as AES and 3DES (CBC vs ECB)
As previously said, if your data is read and written by the same application, it will be very hard to prevent malicious users from stealing it. There are ways to hide keys in the code (you can search for whitebox cryptography), but it will definitely be fairly complex (and obviously will not rely on a simple external crypto library which can be easily templated to steal the key).
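The encrypt-then-sign scheme described in this answer can be sketched with Node's built-in crypto module (used here instead of Crypto++). The function names are hypothetical, the keys are generated on the spot for the demo, and signing the IV together with the ciphertext is my own choice.

```javascript
const crypto = require('crypto');

// Symmetric part: AES-256 in CBC mode with a random IV per file.
function encrypt(plaintext, key) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, data };
}

function decrypt(box, key) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, box.iv);
  return Buffer.concat([decipher.update(box.data), decipher.final()]);
}

// Asymmetric part: sign (with SHA-256) the IV plus ciphertext using the author's
// private key; the application ships only the public key to verify it.
const author = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const key = crypto.randomBytes(32); // in a real app this key must live somewhere

const box = encrypt(Buffer.from('table data'), key);
const signed = Buffer.concat([box.iv, box.data]);
const signature = crypto.sign('sha256', signed, author.privateKey);

console.log(crypto.verify('sha256', signed, author.publicKey, signature));
console.log(decrypt(box, key).toString());
```

Someone who extracts the symmetric key can read the data, but cannot produce a modified file that still passes the signature check, because the private key never ships with the application.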
If your application can read the data and people have access to that application, someone with enough motivation and time will eventually figure out (by disassembling your application) how to read the data.
In other words, all the information that is needed to decipher the encrypted data is already in the hands of the attacker. You have the consumer=attacker problem in all DRM-related designs, and this is why people can easily decrypt DVDs, Blu-rays, M4As, encrypted eBooks, etc.
Using public/private key pairs is called asymmetric encryption.
You could use a symmetric encryption algorithm instead; that way you would only need one key.
That key will still need to be stored somewhere (it could be in the executable). But if the user has access to the .dat file, he probably also has access to the exe, meaning he could still extract that information. And if he has access to the PC (and the needed rights), he could read all the information from memory anyway.
You could ask the user for a passphrase (aka password) and use that to encrypt symmetrically. This way you don't need to store the passphrase anywhere.