BigQuery Multi Table has no outputs. Please check that the sink calls addOutput at some point error from Multiple database table plugin - google-cloud-platform

I'm trying to ingest data from different tables with in same database using Data fusion Multiple database tables plugin to bigquery tables using multiple big query tables sink. I write 3 different custom SQL and add them inside the plugin section which is under "Data Section Mode" > "Custom SQL Statements".
The problem is When I preview or deploy and run the pipeline I get the error "BigQuery Multi Table has no outputs. Please check that the sink calls addOutput at some point."
What I try to figure out this problem;
Run custom SQL on database and worked properly.
Create pipelines that are specific for custom SQLs but it's like 1 table ingestion from sql server to bigquery table as sink. it worked properly.
Try different Data Section Mode under multiple database tables plugin that is Table Allow List , works but it's just insert all data with no option to transform any column or filtering. Did that one to see if plugin can reach the database and able to read data ,it can read.
Data Pipeline - Multiple Database Tables Plugin Config - 1
Data Pipeline - Multiple Database Tables Plugin Config - 2
As a conclusion I would like to ingest data from one database with multiple tables with in one data pipeline. If possible I would like to do it with writing custom sqls for each tables.
Open for any advice and try.
Thank you.

Related

Create PowerBI Datamart from Azure Analysis Service

I am trying to create PowerBI Datamart from Azure Analyis service. There is a datamodel available in the Azure Analysis Service and I can connect using URL and Database Name. The datamodel has ~100 tables present in it and relationship also setup. So my question is, if I want to create a PowerBI datamart from the Azure Analyis service datamode, I need to do the Get Data option of PowerBI datamart and connect to Azure Analyis service, select table, select fields 100 time for getting all the tables of Azure Analyis service datamode into my PowerBI datamart? Is there any import function available where I can import all the tables in a single time?
Why do you want to copy data from AAS into a database?
The reason you find it difficult is that it's an odd thing to do. The query designer for AAS/SSAS generates MDX queries which are indented to run aggregate queries that return a handful of rows, and are wholly unsuitable for extracting whole tables. If you try, the queries will just run forever and fail.
It's possible to extract data from AAS/SSAS tabular models, but you must use DAX not MDX, and so you need to the Power Query or "Transform Data" window, and use the advanced editor.
Each query to load a table should look like this, eg to load the 'Customer' table:
let
Dax = "evaluate Customer",
Source = AnalysisServices.Database("asazure://southcentralus.asazure.windows.net/myserver", "mydatabase", [Query=Dax])
in
Source

Data Fusion replication pipeline is not syncing data in Google Bigquery

Hi we want to replicate the data from Mysql(source) to GoogleBigquery(destination) we adopted the method described by google Docs with Data fusion replication pipeline as mentioned in Link
https://cloud.google.com/data-fusion/docs/tutorials/replicating-data/mysql-to-bigquery
Berief of what we are doing:
Enabling bin log in MY SQL for CDC(Change data Capture)
creating a replication pipeline in data fusion
starting the pipeline and syncing the data
we are successfully able to create MySql data in comupute engine and enabling bin-log for CDC and provided all necessary permission to user for the data replication pipeline in my SQL
we are successful in creating a data Fusion instance and able to create a replication pipeline
replication pipeline is able to fetch our SQL database details and target Big query is also set
On starting the pipeline it is tracking the Changes successfully (Insert,update and delete ) and table Schema is also created in Bigquery Successfully automatically.
But we are getting PROBLEM that no data is getting transsferred to Bigquery table. In log what i have seen is loading batch of 1 event in to statging Bucket
sharing the screenshot also
able to fetch every change from MYSQL but data is not transferring to bigquery
table schema was created but data is not transferred
loading batch of 1 event in to statging Bucket we are using developer mode and waited for more than 90 mins
The issue might be happening because there may be a schema/data type mismatch with the BigQuery table and the source MYSQL database table on the columns.
For example: if you have a column in source table, in BigQuery this column is of INT64 datatype with a length of 19, while in the source database table, it is Integer type with a length of 10, so you need to update the length of columns as per your datasize.

Optimize data load from Azure Cosmos DB to Power BI

Currently we have a problem with loading data when updating the report data with respect to the DB, since it has too many records and it takes forever to load all the data. The issue is how can I load only the data from the last year to avoid taking so long to load everything. As I see, trying to connect to the COSMO DB in the box allows me to place an SQL query, but I don't know how to do it in this type of non-relational database.
Example
Power BI has an incremental refresh feature. You should be able to refresh the current year only.
If that still doesn’t meet expectations I would look at a preview feature called Azure Synapse Link which automatically pulls all Cosmos DB updates out into analytical storage you can query much faster in Azure Synapse Analytics in order to refresh Power BI faster.
Depending on the volume of the data you will hit a number of issues. First is you may exceed your RU limit, slowing down the extraction of the data from CosmosDB. The second issue will be the transforming of the data from JSON format to a structured format.
I would try to write a query to specify the fields and items that you need. That will reduce the time of processing and getting the data.
For SQL queries it will be some thing like
SELECT * FROM c WHERE c.partitionEntity = 'guid'
For more information on the CosmosDB SQL API syntax please see here to get you started.
You can use the query window in Azure to run the SQL commands, or Azure Storage Explorer to test the query, then move it to Power BI.
What is highly recommended is to extract the data into a place where is can be transformed into a strcutured format like a table or csv file.
For example use Azure Databricks to extract, then turn the JSON format into a table formatted object.
You do have the option of using running Databricks notebook queries in CosmosDB, or Azure DataBricks in its own instance. One other option would to use change feed to send the data and an Azure Function to send and shred the data to Blob Storage and query it from there, using Power BI, DataBricks, Azure SQL Database etc.
In the Source of your Query, you can make a select based on the CosmosDB _ts system property, like:
Query ="SELECT * FROM XYZ AS t WHERE t._ts > 1609455599"
In this case, 1609455599 is the timestamp which corresponds to 31.12.2020, 23:59:59. So, only data from 2021 will be selected.

Migrate data to SQL DW for multiple tables

I'm currently using Azure Data Factory to move over data from an Azure SQL database to an Azure DW instance.
This works fine for one table, but I have a lot of tables I'd like to move over. Using Azure Data Facory, it looks like I need to create a set of Source/Sink datasets and pipelines for every table in the database.
Is there a way to move multiple tables across without have to set up each table in the manner described above?
The copy operation allows you to select multiple tables to move in a single pipeline. From the Azure SQL Data Warehouse portal you can follow this process to setup a multi-table pipeline:
Click on the Load Data button
Select Azure Data Factory
Create a new data factory or use an existing one - ensure that the Load Data select is chosen
Select the Run once now option
Choose your Azure SQL Database source and enter the credentials
On the Select Tables screen, select multiple tables
Continue the Pipeline, save and execute

AWS Glue control column order in console

I'm just starting to experiment with AWS glue and I've successfully been able to pull data from my Aurora MySQL environment into my PostgreSQL DB. When the crawler creates the data catalog for the table I'm experimenting with, all the columns are out of order, and then when the job creates the destination table, the columns again are out of order, I'm assuming because it's created based off of what the crawler generated. How can I make the table structure in the catalog match what's in the source DB?
You can simply open the tabke that create by crawler then click on "edit schema", click on the number at the start of each row and change them, that are the order number of the rows.