I was working with the IBM blockchain examples and deployed the car-lease-demo sample on a Linux system. I am not able to understand how the database stores the data. I read that there is a location "/var/hyperledger/production" where the database is supposed to live, but I could not find any such location.
Can anyone explain how the data is stored, how Hyperledger Fabric uses the database to store key-value pairs, and where the database with all the data is located?
I would also like to know whether we can use a different database configuration, such as a NoSQL database like Neo4j or MongoDB.
The default implementation uses LevelDB as the backing key-value store for the data, and it is present on every peer node. You can enter the peer's Docker container in CLI mode and see it for yourself.
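If you want to peek at what the peer actually wrote, one option (a sketch only, not part of the Fabric tooling) is to copy the LevelDB directory out of the container and open it with the third-party plyvel binding; the container name and paths below are assumptions:

```python
import plyvel  # third-party LevelDB binding: pip install plyvel

# Assumes the database directory was copied out of the peer container first,
# e.g. `docker cp <peer-container>:/var/hyperledger/production/db ./peer_db`
db = plyvel.DB("./peer_db", create_if_missing=False)

# Dump the raw key/value pairs the peer stored (both keys and values are bytes)
for key, value in db.iterator():
    print(key[:60], value[:60])

db.close()
```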
Yes, you can change the default DB to any other NoSQL DB. Here is an example of setting up CouchDB with Hyperledger Fabric.
As you can see, CouchDB is hosted in a separate container linked to the peer node via an open port (look at the Docker Compose file for the connection details). You can do the same for any other NoSQL DB and use the correct PUT and GET APIs in chaincode to access them. But you will have to make sure that the data gets replicated to all the databases in time to maintain the consistency of the blockchain network.
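Just to illustrate how CouchDB itself stores those key-value pairs as JSON documents, here is a sketch against CouchDB's plain HTTP API (this is not the chaincode shim; the host, credentials, and database name are assumptions):

```python
import requests  # pip install requests

# Hypothetical CouchDB instance exposed by the linked container
COUCH = "http://admin:adminpw@localhost:5984"
DB = "vehicles"

# Create the database (CouchDB returns 412 if it already exists)
requests.put(f"{COUCH}/{DB}")

# PUT a key/value pair: the document id acts as the key, the JSON body as the value
requests.put(f"{COUCH}/{DB}/CAR0", json={"make": "Toyota", "colour": "blue"})

# GET it back
doc = requests.get(f"{COUCH}/{DB}/CAR0").json()
print(doc["make"], doc["colour"])
```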
What is the best way to replicate data from Oracle GoldenGate on-premise to AWS (SQL or NoSQL)?
I was just checking this for Azure.
My company is looking for solutions for moving data to the cloud, with the following requirements:
Minimal impact on on-prem legacy/3rd-party systems.
No Oracle DB instances on the cloud side.
Minimum "hops" for the data between the source and destination.
PaaS over IaaS solutions.
Out-of-the-box features over native code and in-house development.
Oracle Server 12c or above.
Some custom filtering solution.
Some custom transformations.
** Filtering can be done in GoldenGate, in NiFi, in Azure mapping data flows, or in ksqlDB.
Solutions are divided into two groups:
If the solution is allowed to touch/read the log files of the Oracle server,
you can use Azure ADF, Azure Synapse, K2View, Apache NiFi, or the Oracle CDC adapter for Big Data (check versions) to move data directly to the cloud, buffered by Kafka; note, however, that the records inside Kafka will be in a special-schema JSON format (see the consumer sketch after this list).
If you must use the GoldenGate trail file as input to your sync/ETL pipeline, you can:
use a custom data provider that translates the trail file into a NiFi flowfile (you need to write it yourself; see this 2-star project on GitHub for a direction);
use a GitHub project with GoldenGate for Big Data and Kafka over Kafka Connect to also get translated SQL DML and DDL statements, which makes the solution much more readable.
Other solutions are corner cases, but I hope this gives you what you needed.
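As a rough illustration of the Kafka-buffered path mentioned above, a downstream consumer could read those JSON change records like this (the topic, broker, and field names are assumptions based on the GoldenGate for Big Data JSON format):

```python
import json
from kafka import KafkaConsumer  # kafka-python

# Hypothetical topic and broker; the OGG for Big Data Kafka handler
# publishes each change record as a JSON message.
consumer = KafkaConsumer(
    "ogg.changes",
    bootstrap_servers=["broker1:9092"],
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="earliest",
)

for record in consumer:
    change = record.value
    # Typical fields in the OGG JSON format: table, op_type, before/after images
    print(change.get("table"), change.get("op_type"))
```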
In my company's case, we have Oracle as the source DB and Snowflake as the target DB. We've built the following processing sequence:
The on-premise OGG Extract works with the on-premise Oracle DB.
The data pump sends trails to another host.
On this host we have an OGG for Big Data Replicat that processes the trails and then sends the result as JSON to an AWS S3 bucket.
Since Snowflake can handle JSON as a data source and works with S3 buckets, it loads the JSON files into staging tables, where further processing takes place.
You can read more about this approach here: https://www.snowflake.com/blog/continuous-data-replication-into-snowflake-with-oracle-goldengate/
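For illustration, the Snowflake side of such a pipeline could be driven like this (a sketch using the Snowflake Python connector; the stage, table, and credentials are placeholders, and the staging table is assumed to have a single VARIANT column):

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials and object names
conn = snowflake.connector.connect(
    account="my_account", user="loader", password="secret",
    warehouse="LOAD_WH", database="STAGING", schema="OGG",
)
cur = conn.cursor()

# Assumes an external stage OGG_S3_STAGE pointing at the S3 bucket that
# OGG for Big Data writes its JSON files into.
cur.execute("""
    COPY INTO ogg_staging
    FROM @OGG_S3_STAGE
    FILE_FORMAT = (TYPE = 'JSON')
""")

conn.close()
```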
I'm using Docker for a project; the main goal is to keep the application available even if one of the nodes (it's a 6-node cluster with Docker Swarm) goes down.
The application is basically a Django app that saves images from users, among other models. I'm currently saving the images as files, but since I would need to specify a volume locally on a single machine, I would like to know whether it would be better to save the images in a database cluster, so they would still be available even if a whole node goes down. Or is there another way?
Edit: note that the cluster runs locally and doesn't have internet access.
The two options are to perform the file sharing via the database or via the file system.
For file-system sharing, you can use something like GlusterFS: each container appears to mount a host-local volume, but the volume is actually shared between the hosts via GlusterFS.
To my mind, if it's your own application (i.e. you can modify it at will), saving the files in the database would be the easier approach for most developers.
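For example, a minimal sketch of keeping the image bytes in the database instead of on a node-local volume (the model and field names are made up):

```python
from django.db import models


class UserImage(models.Model):
    """Hypothetical model storing the uploaded image in the database itself."""
    name = models.CharField(max_length=255)
    content_type = models.CharField(max_length=100)
    data = models.BinaryField()  # the raw image bytes
    uploaded_at = models.DateTimeField(auto_now_add=True)


def save_upload(uploaded_file):
    """Store a Django UploadedFile (e.g. request.FILES['image']) in the DB."""
    return UserImage.objects.create(
        name=uploaded_file.name,
        content_type=uploaded_file.content_type,
        data=uploaded_file.read(),
    )
```

Serving an image back is then just a matter of returning an HttpResponse with the stored bytes and content type.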
The best solution is often to go for a hosted option (such as MongoDB Atlas). Making a database resilient and highly available is really hard, and unless you are an expert on Docker and MongoDB, I would strongly recommend going for a hosted option.
I have a Hyperledger Fabric network set up on my local machine with a single validating node. I am developing a chaincode and would like to clear my blockchain. I have read that Hyperledger Fabric stores the database under /var/hyperledger. However, I do not see this hyperledger directory under /var. Is there another location for this directory? My development platform is a Mac, and I am using docker-compose to start my Hyperledger Fabric network.
Hyperledger Fabric stores the database in /var/hyperledger/production/db within the file system of the validating peer. You can navigate to the validating peer's file system with a command like docker exec -it substitute_container_name bash. I am not aware of another location for the database. If the instructions at https://hub.docker.com/r/ibmblockchain/fabric-peer/ for using the Hyperledger Docker images are followed, then the database location should be /var/hyperledger/production/db.
To clear the blockchain, the easiest way is to stop the Docker container and run it again; given that you only have one validating peer, you should not need to worry about data consistency. Also, try to use the latest Fabric release, since these kinds of problems have improved a lot. Regarding the error,
no rows in result set
make sure that you have specified the right organization name and department when requesting user validation against the CA. The parameters you send must exist in the CA database; otherwise, you will receive that error.
My Django application (a PoC, not a final product) with a backend library uses a SQLite database - read only. The SQLite database is part of the repo and deployed to Heroku. This is working fine.
I have the requirement to allow updates to this database via the Django admin interface. This is not a Django-managed database, so from Django's point of view it is just a binary file.
I could allow a FileField to handle this, overwriting the database; I guess this would work on a self-managed server, but I am on Heroku and have the constraints imposed by its disk-backed (ephemeral) storage. My SQLite file is not my webapp's database, but the same limitations apply: I cannot write to the webapp's filesystem and get any guarantee that the new data will be visible to the running webapp.
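Roughly what I had in mind for the FileField approach (a sketch only; the target path and URL name are placeholders, and on Heroku the write would only land on one dyno's ephemeral filesystem):

```python
import os

from django import forms
from django.conf import settings
from django.shortcuts import redirect, render


class SqliteUploadForm(forms.Form):
    db_file = forms.FileField()


def upload_sqlite(request):
    # On Heroku this write is not durable and other dynos will not see it.
    if request.method == "POST":
        form = SqliteUploadForm(request.POST, request.FILES)
        if form.is_valid():
            target = os.path.join(settings.BASE_DIR, "data", "readonly.sqlite")
            with open(target, "wb") as out:
                for chunk in form.cleaned_data["db_file"].chunks():
                    out.write(chunk)
            return redirect("upload-done")  # hypothetical URL name
    else:
        form = SqliteUploadForm()
    return render(request, "upload.html", {"form": form})
```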
I can think of alternatives, all with drawbacks:
Put the SQLite database on another server (a "media" server) and access it remotely: this would severely impact performance. Besides, accessing SQLite databases over the network does not seem easy.
Create a deploy script for the customer to upload the database via the usual deploy mechanisms. Since the customer is not technically skilled and I cannot provide direct support, this is unfeasible.
Move out of Heroku to a self-managed server, so I can implement this quick-and-dirty upload without complications.
Do you have another suggestion?
PythonAnywhere.com
Deploy your app there and you can easily access and update all of your files, and your SQLite3 database will be updated in real time without losing data.
Heroku (herokuapp.com) erases your SQLite3 database every 24 hours, which is why it is not preferred for web apps that rely on SQLite3.
What are some options to avoid the latency of pointing local django development servers to a remote MySQL database?
If developers use local MySQL databases to avoid the latency, what are some useful tools to sync schema updates of the remote db with the local db and avoid manually creating, downloading, and loading dumps?
Thanks!
One possibility is to configure the remote MySQL database to replicate to the developers' local machines - assuming you have control of the remote database's configuration.
See the MySQL docs for replication notes. Using MySQL replication, the remote node would be the master and the developer machines would be slaves. The main advantage of this approach is that your developer machines would always remain synchronized with the master database. One possible disadvantage (depending on the number of developer machines you are slaving) is a degradation of the remote database's performance due to the extra load introduced by replication.
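On each developer machine, Django would then simply point at the local replica; a hedged sketch of a per-developer settings override (the database name and credentials are placeholders):

```python
# settings_local.py -- hypothetical per-developer override
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myproject",        # assumed schema name
        "USER": "dev",
        "PASSWORD": "dev-password",
        "HOST": "127.0.0.1",        # the local replicated slave, not the remote master
        "PORT": "3306",
    }
}
```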
It sounds like you want to do schema migrations. Basically, it's a way to log schema changes so that you can update and even roll back along with your source changes (if you change a model, you also check in a new migration that has up and down commands). While this will likely become an official feature at some point, there are several third-party solutions to choose from. It's really a personal preference; here are some popular ones (see the sketch after this list for what such a migration looks like):
South
Django Evolution
dmigrations
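To give a feel for what one of these migrations looks like, here is a hand-written South-style migration (a sketch; the table and column names are made up):

```python
from django.db import models
from south.db import db


class Migration:
    def forwards(self, orm):
        # "up": add a nullable column to an existing table
        db.add_column("blog_post", "subtitle",
                      models.CharField(max_length=200, null=True))

    def backwards(self, orm):
        # "down": roll the change back
        db.delete_column("blog_post", "subtitle")
```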
I use a combination of South for schema migrations and JSON fixtures (or SQL dumps) of useful test data stored in the project's VCS repo. It works pretty seamlessly.