Cloud database technologies seem to be proliferating, with more and more overlap between them.
A comparative approach might help to make sense of it.
What exactly are the differences between Google Cloud Firestore and Google Cloud Spanner?
Cloud Firestore is:
A flexible, NoSQL (non-relational) scalable database for mobile, web, and server development from Firebase and Google Cloud Platform.
Cloud Spanner, on the other hand, is:
A horizontally scalable, strongly consistent, relational database service.
So the main difference between them is that one is a non-relational database while the other is relational. Furthermore, Cloud Firestore is also a real-time database: clients that listen for changes are notified as soon as the data they subscribed to changes.
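To make the real-time part concrete, here is a minimal sketch using the `google-cloud-firestore` Python client and its `on_snapshot` listener. The collection name and the helper names `summarize_changes` and `watch_collection` are made up for illustration, and the sketch assumes the client library is installed and application default credentials are configured.

```python
def summarize_changes(changes):
    """Turn Firestore DocumentChange-like objects into (kind, doc_id) pairs."""
    # change.type.name is one of "ADDED", "MODIFIED", "REMOVED".
    return [(change.type.name, change.document.id) for change in changes]


def watch_collection(collection_name, on_change=print):
    """Subscribe to a collection and report every change (hypothetical helper)."""
    # Imported lazily so summarize_changes works without the client library.
    from google.cloud import firestore

    db = firestore.Client()  # assumes application default credentials

    def _callback(col_snapshot, changes, read_time):
        # Firestore calls this once with the initial state, then on every change.
        for kind, doc_id in summarize_changes(changes):
            on_change(f"{kind}: {doc_id}")

    # Returns a watch object; call .unsubscribe() on it to stop listening.
    return db.collection(collection_name).on_snapshot(_callback)
```

Calling `watch_collection("cities")` would then print one line per added, modified, or removed document until you unsubscribe. Spanner has no equivalent push-style listener in its client API; you query it with SQL instead.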
Cloud Firestore is a fast, fully managed, serverless, cloud-native NoSQL document
database that simplifies storing, syncing, and querying data for your mobile, web, and
IoT apps at global scale. Its client libraries provide live synchronization and offline
support, and its security features and integrations with Firebase and GCP accelerate
building truly serverless apps.
Cloud Firestore supports ACID transactions. With automatic multi-region replication and strong consistency, your data is safe
and available, even when disasters strike. Cloud Firestore even allows you to run
sophisticated queries against your NoSQL data without any degradation in
performance.
Cloud Spanner is a service built for the cloud specifically to combine the benefits of
relational database structure with non-relational horizontal scale.
This service can provide petabytes of capacity and offers transactional consistency at
global scale, schemas, SQL, and automatic, synchronous replication for high
availability. Use cases include financial applications and inventory applications
traditionally served by relational database technology.
I deploy my app to AWS.
AWS offers RDS, which supports industry-standard DBMSs such as PostgreSQL, MySQL, and Oracle.
These DBMSs can also run on a development machine (in Docker), which makes it easy to achieve dev/prod parity.
I'm looking for a database specialized for time series with which I can achieve dev/prod parity as well.
AWS has Timestream, which is specialized for time series, but I don't know of a local equivalent for it.
An EC2-hosted database is probably possible, but I would prefer to be lazy and have Amazon manage the database cluster for me.
What options do I have?
Apache Druid is a very good time-series database that can easily be deployed in local development environments and in multiple cloud environments.
Druid is offered as a fully managed cloud service on AWS by Imply.
The fully managed variant of Druid is called Imply Cloud.
More information: https://imply.io/product/imply-cloud
You should try Amazon Timestream.
It is a non-relational, fully managed service built specifically to collect, store, and process time-series data. The arrival of masses of IoT data is expected to push time-series technology into wider use, which is why Amazon introduced Timestream.
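For a feel of what writing to Timestream looks like, here is a rough sketch with `boto3` and the `timestream-write` client. The dimension name `device_id`, the measure names, and the helper names are only illustrative assumptions, and the database and table are assumed to exist already.

```python
import time


def build_record(device_id, measure_name, value, timestamp_ms=None):
    """Build one Timestream record for a single numeric reading."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        # "device_id" is a made-up dimension name for this example.
        "Dimensions": [{"Name": "device_id", "Value": str(device_id)}],
        "MeasureName": measure_name,
        "MeasureValue": str(value),      # values are sent as strings
        "MeasureValueType": "DOUBLE",
        "Time": str(timestamp_ms),
        "TimeUnit": "MILLISECONDS",
    }


def write_reading(database, table, record):
    """Send one record to an existing Timestream table (needs AWS credentials)."""
    import boto3  # imported lazily so build_record works without boto3

    client = boto3.client("timestream-write")
    return client.write_records(
        DatabaseName=database, TableName=table, Records=[record]
    )
```

Keeping the record construction separate from the write call means the same record-building code can be unit tested on a development machine, which helps a little with the dev/prod parity concern even though Timestream itself has no local equivalent.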
I am new to AWS cloud services.
I am doing a learning project to better understand them.
I was assigned a project to prepare a new environment in the cloud, to which my team will later migrate their applications. The stakeholders have come up with some technical and business requirements:
Due to budget constraints, the company cannot afford a dedicated DB engineer, so they want to outsource database management to a cloud provider, to store and maintain the customer information received by a PHP application. You must pick the right solution from AWS, which should be a Platform as a Service. It should also provide high availability, patching, and backups. (Hint: create a DB subnet group.)
Which AWS service could I use to implement this requirement?
Please let me know if I need to provide more details.
Thank you in advance.
You can use the Database as a Service (DBaaS) options from AWS. There are plenty of choices depending on the data and its requirements; the official AWS documentation gives an overview of them.
Analyze your requirements and learn about those choices to make an educated decision.
If you are looking for a relational database,
it's Amazon RDS, a PaaS offering from Amazon Web Services. It is a highly available, scalable relational database service that you can set up and run in the cloud.
As for backups, you can enable automated daily backups and use the backup retention period to define how many days you want to keep them.
Amazon RDS supports several relational database engines, such as MySQL, PostgreSQL, and Oracle.
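As a rough illustration of how those requirements map onto RDS settings, here is a sketch using `boto3`'s `create_db_instance`. The engine, instance class, storage size, and retention period are placeholder values I picked for the example, and the DB subnet group is assumed to have been created beforehand.

```python
def build_rds_request(identifier, subnet_group, username, password):
    """Collect create_db_instance parameters for a small managed MySQL instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                 # placeholder engine
        "DBInstanceClass": "db.t3.micro",  # placeholder size
        "AllocatedStorage": 20,            # GiB, placeholder
        "MasterUsername": username,
        "MasterUserPassword": password,
        "DBSubnetGroupName": subnet_group,  # the subnet group from the hint
        "MultiAZ": True,                    # standby in another AZ for availability
        "BackupRetentionPeriod": 7,         # days of automated backups to keep
        "AutoMinorVersionUpgrade": True,    # let AWS apply minor engine patches
    }


def create_instance(request):
    """Create the instance (needs AWS credentials and permissions)."""
    import boto3  # imported lazily so build_rds_request works without boto3

    return boto3.client("rds").create_db_instance(**request)
```

`MultiAZ`, `BackupRetentionPeriod`, and `AutoMinorVersionUpgrade` correspond to the high availability, backup, and patching requirements respectively.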
If you are looking for a document database,
you have a few options, mainly DynamoDB and the newer Amazon DocumentDB.
For any cloud service provider (GCP in this context), isn't it necessary for the provider to let consumers choose the data residency and data processing region for all services? Otherwise, serverless offerings will face serious adoption issues. Please clarify.
Google Cloud has two types of products: those deployed in a specified location and those available globally.
You can deploy resources in a specific location (regional or multi-regional) for:
Compute: Compute Engine, App Engine, Google Kubernetes Engine, Cloud Functions
Storage & Databases: Cloud Storage, Bigtable, Spanner, Cloud SQL, Firestore, Memorystore, Persistent Disk...
Big Data & Machine Learning: BigQuery, Composer, Dataflow, Dataproc, AI training
Networking: VPC, Cloud Load Balancing
Developer Tools...
The following products are available only globally: Networking, Big Data (Pub/Sub), Machine Learning APIs such as the Vision API, Management Tools, Developer Tools, and IAM.
For a detailed list, please check the Google Cloud locations documentation.
Even for a globally available product such as Pub/Sub, it is possible to specify where messages are stored.
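For Pub/Sub, this is done with a message storage policy on the topic. A minimal sketch with the `google-cloud-pubsub` Python client follows; the project, topic, and region names are placeholders, and the helper names are made up for the example.

```python
def build_topic_request(project_id, topic_id, regions):
    """Build a create_topic request that restricts where messages are persisted."""
    return {
        "name": f"projects/{project_id}/topics/{topic_id}",
        "message_storage_policy": {
            # Messages published to this topic are stored only in these regions.
            "allowed_persistence_regions": list(regions),
        },
    }


def create_topic(request):
    """Create the topic (needs the client library and credentials)."""
    from google.cloud import pubsub_v1  # imported lazily

    publisher = pubsub_v1.PublisherClient()
    return publisher.create_topic(request=request)
```

For example, `create_topic(build_topic_request("my-project", "orders", ["europe-west1"]))` would keep the stored messages for that topic in `europe-west1`.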
If data protection is the concern, be aware that Google Cloud Platform encrypts customer data at rest by default, using several layers of encryption. Encryption of data in transit is a separate mechanism and is documented separately.
Let's say a company has an application with a database hosted on AWS and also has a read replica on AWS. Then that same company wants to build out a data analytics infrastructure in Google Cloud -- to take advantage of data analysis and ML services in Google Cloud.
Is it necessary to create an additional read replica within the Google Cloud context? If not, is there an alternative strategy that is frequently used in this context to bridge the two cloud services?
While services like Amazon Relational Database Service (RDS) provide read-replica capabilities, these work only between managed database instances on AWS.
If you are replicating a database between providers, then you are probably running the database yourself on virtual machines rather than using a managed service. This means the databases appear just like any resource on the Internet, so you can connect them exactly the way you would connect two resources across the internet. However, you would be responsible for managing, monitoring, deploying, etc. This takes away from much of the benefit of using cloud services.
Replicating between storage services like Amazon S3 would be easier since it is just raw data rather than a running database. Also, Big Data is normally stored in raw format rather than being loaded into a database.
If the existing infrastructure is on a cloud provider, then try to perform the remaining activities on the same cloud provider.
We currently use AWS RDS for our databases, and we have defined some insert and update triggers on our tables. I would like to know whether BigQuery also supports triggers.
thanks
BigQuery is a data warehouse product, similar to Amazon Redshift and Amazon Athena, and it has no trigger support.
If you have used AWS RDS so far, you should look at Google Cloud SQL.
Google Cloud SQL is an easy-to-use service that delivers fully managed
SQL databases in the cloud. Google Cloud SQL provides either MySQL or
PostgreSQL databases.
If you have a heavy load, check out Google Cloud Spanner; it is even better suited as a fully scalable relational database.
Cloud Spanner is the only enterprise-grade, globally-distributed, and
strongly consistent database service built for the cloud specifically
to combine the benefits of relational database structure with
non-relational horizontal scale.
BigQuery doesn't have this feature, as stated in the answer above.
However, it emits events through its audit logs. You can inspect them and trigger actions with Cloud Functions (or Cloud Run), as described in:
https://cloud.google.com/blog/topics/developers-practitioners/how-trigger-cloud-run-actions-bigquery-events
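The handler that receives the audit log entry mostly has to decide whether the event is one it cares about. Here is a small sketch of that check. The field names follow the BigQuery audit log payload as used in the linked post (`resource.labels.dataset_id`, `protoPayload.resourceName`, `protoPayload.metadata.tableDataChange.insertedRowsCount`), but treat them as an assumption and verify them against your own log entries; the function name is made up.

```python
def inserted_rows(entry, dataset_id, table_id):
    """Return the number of rows inserted into the given table, or 0 if the
    audit log entry is about something else."""
    try:
        labels = entry["resource"]["labels"]
        payload = entry["protoPayload"]
        resource_name = payload["resourceName"]
        # The count arrives as a string in the JSON payload.
        rows = int(payload["metadata"]["tableDataChange"]["insertedRowsCount"])
        if labels["dataset_id"] != dataset_id:
            return 0
        if not resource_name.endswith(f"tables/{table_id}"):
            return 0
        return rows
    except (KeyError, TypeError, ValueError, AttributeError):
        # Entries without these fields are not insert events; ignore them.
        return 0
```

Inside a Cloud Function or Cloud Run service you would parse the incoming request body as JSON, pass it to `inserted_rows`, and run your "trigger" logic only when the result is greater than zero.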
Regards