I am performing a MySQL to BigQuery data migration using the JDBC to BigQuery template in Dataflow.
But while running the "select * from table1" query on MySQL, I also want to insert the selected data into another table in the same database, for reasons of my own.
How can I run both the select and the insert queries in the Dataflow template? I got an error when I separated the two queries with a semicolon.
The JDBC to BigQuery template will write all the data you read to the table specified under "BigQuery output table" (<my-project>:<my-dataset>.<my-table>), so there is no need to write an insert statement.
(The parameter is "outputTable" for gcloud/REST)
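For reference, a minimal gcloud invocation of the template might look like the sketch below; the job name, project, bucket, connection details, and table names are placeholders, and the driver JAR path will depend on your environment.
gcloud dataflow jobs run mysql-to-bq \
  --gcs-location gs://dataflow-templates/latest/Jdbc_to_BigQuery \
  --region us-central1 \
  --parameters \
connectionURL='jdbc:mysql://10.0.0.1:3306/mydb',\
driverClassName=com.mysql.cj.jdbc.Driver,\
driverJars=gs://my-bucket/drivers/mysql-connector-j-8.0.33.jar,\
query='select * from table1',\
outputTable=my-project:my_dataset.my_table,\
bigQueryLoadingTemporaryDirectory=gs://my-bucket/tmp
The query parameter is expected to hold a single SELECT statement, which is consistent with the error you saw when using a semicolon.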
As @PeterKim mentioned, the JDBC to BigQuery template might not be the best approach for your use case.
You could use that template as a reference and modify it to write into MySQL; in this post you will find an implementation showing how to insert into a MySQL database.
After modifying the pipeline source code you can create a custom template.
Related
I'm trying to ingest data from different tables within the same database using the Data Fusion Multiple Database Tables plugin, writing to BigQuery tables through the BigQuery Multi Table sink. I wrote 3 different custom SQL statements and added them in the plugin section under "Data Section Mode" > "Custom SQL Statements".
The problem is that when I preview, or deploy and run, the pipeline, I get the error "BigQuery Multi Table has no outputs. Please check that the sink calls addOutput at some point."
What I have tried in order to figure out this problem:
Ran the custom SQL directly on the database; it worked properly.
Created pipelines specific to each custom SQL, i.e. a single-table ingestion from SQL Server to a BigQuery table sink; they worked properly.
Tried the other Data Section Mode of the Multiple Database Tables plugin, Table Allow List. It works, but it simply inserts all the data with no option to transform columns or filter rows. I did that to check whether the plugin can reach the database and read data, and it can.
Data Pipeline - Multiple Database Tables Plugin Config - 1
Data Pipeline - Multiple Database Tables Plugin Config - 2
In conclusion, I would like to ingest data from multiple tables in one database within a single data pipeline. If possible, I would like to do it by writing a custom SQL statement for each table.
I am open to any advice and willing to try it.
Thank you.
I am creating a dataset in AWS QuickSight using custom SQL which I prepare/test in Athena. However, unless I qualify each join/table as "databasename".table, the QuickSight custom SQL fails. I have tried the statement below, but it failed. Is it possible to instruct the query to run against a specific DB at the beginning of the query?
USING AwsDataCatalog."databasename"
In the data preparation, on the custom SQL page, in the left pane, you should be able to choose the database name (Schema).
If you do not set that, it will use Athena's default schema, so you have to fully qualify all table names.
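For illustration, a fully qualified custom SQL statement might look like the sketch below; the database, table, and column names are placeholders.
SELECT o.order_id, c.customer_name
FROM AwsDataCatalog."databasename"."orders" AS o
JOIN AwsDataCatalog."databasename"."customers" AS c
  ON o.customer_id = c.customer_id;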
I am really new to GCP and I am trying to write a query in BigQuery to fetch all data from one BigQuery table and insert it all into another BigQuery table.
I am trying the following query, where Project1.DataSet1.Table1 is the table I am trying to read the data from, and Project2.Dataset2.Table2 is the table I am trying to insert all the data into, with the same naming:
SELECT * FROM `Project1.DataSet1.Table1` LIMIT 1000
insert INTO `Project2.Dataset2.Table2`
But I am receiving a query error message.
Does anyone know how to solve this issue?
There may be a couple of comments:
The syntax is different => insert into ... select ... and so on; see DML statements in standard SQL.
Such an approach to data copying might not be optimal in terms of time and cost. It might be better to use bq cp -f ... commands (see "BigQuery Copy — How to copy data efficiently between BigQuery environments" and the bq command-line tool reference), if that is possible in your case.
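As a sketch, a forced table copy with the bq tool could look like this (the project, dataset, and table names are placeholders; -f overwrites the destination table without prompting):
bq cp -f Project1:DataSet1.Table1 Project2:Dataset2.Table2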
The correct syntax of the query is as suggested by @al-dann. I will try to explain further with a sample query below:
Query:
insert into `Project2.Dataset2.Table2`
select * from `Project1.DataSet1.Table1`
Input Table:
This will insert values into the second table as below:
Output Table:
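Note that insert into ... select * relies on the two tables having matching schemas; if the column order differs, you may need to list the columns explicitly, as in the hypothetical example below (the column names are placeholders):
insert into `Project2.Dataset2.Table2` (id, name)
select id, name from `Project1.DataSet1.Table1`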
I am new to Informatica Cloud. I have a list of queries ready in my table, like below.
Now I want to take the queries from this table one by one, use each as a source query, and load whatever results it returns into the target. All tables are already created in the source and target.
I just need to copy the data based on the dynamic queries kept in one of my SQL tables.
If anyone has any ideas, please share your thoughts with me. It would be a great help.
The source connection will be the connector to your source database, and the Source Type will be query. From there, it depends on how you are managing your variables. See this thread on the Informatica Network for links to multiple examples.
Read the table as you normally would in the cloud. Then pass each record into the SQL transformation for execution. Configure where the SQL transformation has to execute, and it will run the queries in the database you want.
You can use a SQL task to run dynamic SQL queries.
Link to the SQL task approach: https://www.datastackpros.com/2019/12/informatica-cloud-incremental-load_14.html
I want to provide runtime values to the query in SELECT & CREATE TABLE statements. What are the ways to parameterize Athena SQL queries?
I tried the PREPARE & EXECUTE statements from Presto, however they do not work in the Athena console. Do we need an external script, e.g. in Python, to call them?
PREPARE my_select1
FROM SELECT * from NATION;
EXECUTE my_select1 USING 1;
The SQL and HiveQL Reference documentation does not list PREPARE or EXECUTE as available commands.
You would need to fully construct your SELECT statement before sending it to Amazon Athena.
You have to upgrade to Athena engine version 2; this seems to be supported as of 2021-03-12, but I can't find an official announcement:
https://docs.aws.amazon.com/athena/latest/ug/querying-with-prepared-statements.html
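With engine version 2, a parameterized query would look roughly like the sketch below; the table and column names are only examples:
PREPARE my_select1 FROM SELECT * FROM nation WHERE nationkey = ?;
EXECUTE my_select1 USING 1;
DEALLOCATE PREPARE my_select1;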
Athena does not support parameterized queries. However, you can create user-defined functions that you can call in the body of a query. Refer to this to learn more about UDFs.
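As an illustration only, calling a UDF in an Athena query looks roughly like the following; the function name, its signature, and the Lambda function name are placeholders:
USING EXTERNAL FUNCTION to_upper(input VARCHAR) RETURNS VARCHAR LAMBDA 'my-athena-udf-lambda'
SELECT to_upper(name) FROM nation;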