I need to design and display a Compute Engine snapshot report for different projects in the cloud in Data Studio. For this, I am trying to use the Google Compute Engine snapshots API below to retrieve the data.
https://compute.googleapis.com/compute/v1/projects/my-project/global/snapshots
The data may change every day depending on the snapshots created from the disks, so the report should always display the up-to-date data.
Can this REST API be called directly from Google Data Studio?
Alternatively, what is the best/simplest way to display the response in Data Studio?
You can use a Community Connector in Data Studio to directly pull the data from the API.
Currently, there is no way to connect GCP Compute Engine (GCE) resource data or use the REST API in Data Studio. The only GCP products that can currently be connected as data sources are the following:
BigQuery
Cloud Spanner
Cloud SQL for MySQL
Google Cloud Storage
MySQL
PostgreSQL
A possible way to design and display a Compute Engine snapshot report for different projects in Data Studio is to create a Google Apps Script that calls the snapshots REST API and writes the results into a Google Sheet, and then use that sheet as a data source in Data Studio (a sketch of the API call follows).
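Purely as an illustration of the same snapshots.list call, here is a minimal Python sketch (the Apps Script version would make the equivalent REST request); the project ID and the printed fields are placeholders, and the output could just as well be written into a Google Sheet:

```python
# Minimal sketch: list Compute Engine snapshots for one project.
# "my-project" and the printed fields are placeholders.
import google.auth
from googleapiclient import discovery

credentials, _ = google.auth.default()
compute = discovery.build("compute", "v1", credentials=credentials)

request = compute.snapshots().list(project="my-project")
while request is not None:
    response = request.execute()
    for snapshot in response.get("items", []):
        # Each item is a Snapshot resource; pick the fields the report needs.
        print(snapshot["name"], snapshot["diskSizeGb"], snapshot["creationTimestamp"])
    # Follow the pagination until all snapshots have been listed.
    request = compute.snapshots().list_next(
        previous_request=request, previous_response=response
    )
```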
Additionally, if you have any questions regarding Data Studio, I would suggest reviewing the following resources:
Data Studio Help Center
Data Studio Help Community
EDIT: My apologies, it seems that there is a way to show snapshot API response data in Data Studio, by using a Community Connector to pull the data directly from the API.
What is the best way to replicate data from Oracle GoldenGate on-premise to AWS (SQL or NoSQL)?
I was just checking this for Azure. My company is looking for solutions for moving data to the cloud, with the following requirements:
Minimal impact for on-prem legacy/3rd party systems.
No oracle db instances on the cloud side.
Minimum "hops" for the data between the source and destination.
PaaS over IaaS solutions.
Out of the box features over native code and in-house development.
Oracle Server 12c or above
Some custom filtering solution
Some custom transformations
** Filtering can be done in GoldenGate, in NiFi, in Azure mapping data flows, or in ksqlDB
The solutions are divided into the following cases:
If the solution is allowed to touch/read the log files of the Oracle server,
you can use Azure ADF, Azure Synapse, K2View, Apache NiFi, or the Oracle CDC adapter for Big Data (check versions) to move the data directly to the cloud, buffered by Kafka. Note, however, that the records inside Kafka will be in a special-schema JSON format (see the sketch after this list).
If you must use a GoldenGate trail file as the input to your sync/ETL pipeline, you can:
use a custom data provider that translates the trail file into a FlowFile for NiFi (you need to write it yourself; see this 2-star project on GitHub for a direction)
use the GitHub project with GoldenGate for Big Data and Kafka over Kafka Connect, which also gives you translated SQL DML and DDL statements and makes the solution much more readable
Other solutions are corner cases, but I hope this gives you what you needed.
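To make the Kafka point above concrete, here is a minimal Python sketch of a consumer reading those CDC records. The topic, broker, and the JSON field names ("op_type", "before", "after") follow the common OGG for Big Data JSON formatter layout, but treat them as assumptions and check them against your actual formatter configuration:

```python
# Minimal sketch: consume OGG-style CDC records from Kafka and pick the row image.
# Topic/broker names and the field names used below are assumptions.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "oracle.cdc.events",                # hypothetical topic name
    bootstrap_servers="broker:9092",    # hypothetical broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    record = message.value
    op = record.get("op_type")          # e.g. I / U / D for insert / update / delete
    if op == "D":
        row = record.get("before", {})  # deletes only carry the before image
    else:
        row = record.get("after", {})   # inserts and updates carry the after image
    # Apply the custom filtering/transformations here before loading downstream.
    print(record.get("table"), op, row)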
In my company's case we have Oracle as a source db and Snowflake as a target db. We've built the following processing sequence:
On-premise OGG Extract works with on-premise Oracle DB.
The Data Pump process sends the trails to another host.
On that host, an OGG for Big Data Replicat processes the trails and sends the result as JSON to an AWS S3 bucket.
Since Snowflake can handle JSON as a data source and works with S3 buckets, it loads the JSON files into staging tables, where further processing takes place.
You can read more about this approach here: https://www.snowflake.com/blog/continuous-data-replication-into-snowflake-with-oracle-goldengate/
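As a rough illustration of that final load step, here is a minimal Python sketch using the Snowflake connector, assuming an external stage over the S3 bucket and a staging table with a single VARIANT column already exist; the connection parameters, stage, and table names are placeholders:

```python
# Minimal sketch: load the JSON files from an S3 external stage into a
# Snowflake staging table. All names and credentials are placeholders.
import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="...",
    warehouse="ETL_WH", database="RAW", schema="STAGING",
)
try:
    cur = conn.cursor()
    # staging_ogg_events is assumed to have a single VARIANT column,
    # so each JSON document lands as one row for further processing.
    cur.execute("""
        COPY INTO staging_ogg_events
        FROM @ogg_s3_stage
        FILE_FORMAT = (TYPE = 'JSON')
    """)
finally:
    conn.close()
```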
I am totally new to the cloud. I started a few weeks ago with the Azure cloud, and we are setting up a project using many different Azure products.
At the moment we are thinking about setting the project up in a way that we are not locked in to Microsoft and are able to switch to GCP or AWS. For most of the products we use I have found similar ones in the other clouds, but I wonder whether there is something like Azure Data Factory in AWS or GCP? I could not find anything in my first Google research.
Best and thanks for your help
If you need a good comparison between the different clouds (Azure, AWS, Google, Oracle, and Alibaba), use this site: http://comparecloud.in/
Example for your case with "Azure Data Factory":
You could use a mix of those products:
Cloud Data Fusion: a managed data integration service for building ETL/ELT pipelines; the closest GCP equivalent to Azure Data Factory.
Cloud Dataprep: This is a version of Trifacta. Good for data cleaning.
Cloud Composer (https://cloud.google.com/composer): If you need to orchestrate workflows/ETLs, Cloud Composer will do it for you. It is a managed Apache Airflow, which means it will handle complex dependencies (a minimal DAG sketch follows this list).
If you just need to trigger a job on a daily basis, Cloud Scheduler is your friend.
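As mentioned for Cloud Composer above, here is a minimal, hypothetical Airflow DAG sketch showing a daily schedule and a simple dependency chain; the import path assumes Airflow 1.x as shipped with Composer, and the task commands are placeholders:

```python
# Minimal sketch of a Cloud Composer (Airflow) DAG: daily schedule plus a
# simple dependency chain. The bash commands are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(
    dag_id="daily_etl_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")
    load = BashOperator(task_id="load", bash_command="echo load")

    # Airflow resolves and enforces this dependency order.
    extract >> transform >> load
```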
You can check the link here, which is a cloud services mapping.
I have a database on a Google Cloud SQL instance. I want to connect the database to pgBadger, which is used to analyse queries. I have tried various methods, but they all ask for the log file location.
I believe there are two major limitations preventing an easy setup that would allow you to use pgBadger with the logs generated by a Cloud SQL instance.
The first is the fact that Cloud SQL logs are processed by Stackdriver and can only be accessed through it. It is actually possible to export logs from Stackdriver; however, the resulting format and destination will still not meet the requirements for using pgBadger, which leads to the second major limitation.
Cloud SQL does not allow changes to all of the required configuration directives. The main one is log_line_prefix, which currently does not follow the format pgBadger requires, and it is not possible to change it. You can see which flags are supported in Cloud SQL in the Supported flags documentation.
In order to use pgBadger you would need to reformat the log entries while exporting them to a location where pgBadger can do its job. Stackdriver can stream the logs through Pub/Sub, so you could develop a small app to process and store them in the format you need.
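A minimal sketch of such an app, assuming a Stackdriver sink that exports the Cloud SQL postgres.log entries to a Pub/Sub subscription; the project and subscription names, the LogEntry fields used, and the exact prefix pgBadger expects are all assumptions you would need to verify against real exported entries:

```python
# Minimal sketch: pull exported Cloud SQL log entries from Pub/Sub and rewrite
# them as plain log lines for pgBadger. Names and fields are placeholders.
import json
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "cloudsql-pg-logs")

def callback(message):
    entry = json.loads(message.data.decode("utf-8"))    # exported LogEntry as JSON
    # Rebuild a prefix in the shape pgBadger can parse; adjust to your needs.
    line = "{} [{}]: {}".format(
        entry.get("timestamp", ""),
        entry.get("resource", {}).get("labels", {}).get("database_id", ""),
        entry.get("textPayload", ""),
    )
    with open("postgresql.log", "a") as out:            # or write to GCS, etc.
        out.write(line + "\n")
    message.ack()

future = subscriber.subscribe(subscription_path, callback=callback)
future.result()  # block and keep pulling messages
```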
I hope this helps.
Could anyone provide some pointers on how to implement data lineage for a DW-type solution built on Google BigQuery, using Google Cloud Storage as the source and Google Cloud Composer as the workflow manager to run a series of SQL statements?
If you have your data in Cloud Storage, you might want to use something like GoogleCloudStorageToBigQueryOperator to first load your data into BigQuery, and then BigQueryOperator to run your queries; a minimal DAG sketch is below.
Then you can see how your different DAGs, tasks, etc. are running in the Airflow web UI inside Composer.
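For illustration, a minimal DAG sketch along those lines, using Airflow 1.10.x import paths as bundled with Composer at the time; the bucket, dataset, table, and SQL are placeholders:

```python
# Minimal sketch: load files from GCS into BigQuery, then run a SQL transform.
# All resource names and the SQL statement are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

with DAG(
    dag_id="gcs_to_bq_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load = GoogleCloudStorageToBigQueryOperator(
        task_id="load_from_gcs",
        bucket="my-source-bucket",                      # placeholder bucket
        source_objects=["exports/*.csv"],               # placeholder objects
        destination_project_dataset_table="my_dataset.staging_table",
        source_format="CSV",
        write_disposition="WRITE_TRUNCATE",
    )
    transform = BigQueryOperator(
        task_id="run_sql",
        sql="SELECT * FROM `my_dataset.staging_table`",  # placeholder SQL
        destination_dataset_table="my_dataset.final_table",
        write_disposition="WRITE_TRUNCATE",
        use_legacy_sql=False,
    )
    load >> transform
```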
We have an in-house project which we are developing in our cloud environment.
We want to use Power BI as the visualization tool. Can you please suggest whether we can publish Power BI files to a cloud environment other than Azure?
This functionality is coming to some degree in SQL Server Reporting Services vNext.