I was tasked with automating row updates in Coda, and I am currently testing out the upsert rows action. I want to configure it to run only if the value of {{steps.trigger.event.body.references[2].name}} is 'add_to_Coda'. How can I do that in Pipedream?
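Is something like the following code step, placed before the Coda upsert action, the right approach? (A rough sketch, assuming Pipedream's Python code steps and the pd.steps / pd.flow.exit helpers; it just mirrors the same trigger path.)

def handler(pd: "pipedream"):
    # Read the value that {{steps.trigger.event.body.references[2].name}} refers to.
    name = pd.steps["trigger"]["event"]["body"]["references"][2]["name"]
    if name != "add_to_Coda":
        # End the workflow early so the Coda upsert step never runs.
        pd.flow.exit("references[2].name is not add_to_Coda, skipping Coda upsert")
    return {"name": name}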
When a query job is executed from the bq command-line tool with the --batch option, a single statement gets BATCH priority. But if it is a set of statements, the parent SCRIPT job is assigned BATCH while the individual statements are assigned INTERACTIVE priority. The same thing happens with a CALL to a stored procedure.
The priorities were observed from the INFORMATION_SCHEMA.JOBS view. The same behavior happens from the Python API as well.
When a parent script job runs with BATCH priority, shouldn't the child jobs get BATCH priority as well? I did not find anything in the documentation that explains this; maybe there is a reason for it.
Steps to reproduce:
bq query --batch --use_legacy_sql=False "select current_timestamp();"
-- This produces one entry in INFORMATION_SCHEMA.JOBS: QUERY/SELECT/BATCH
bq query --batch --use_legacy_sql=False "select current_timestamp();select current_timestamp();"
-- This produces 3 entries; the parent SCRIPT job is assigned BATCH, but the two child SELECT jobs get INTERACTIVE. (see image)
Note: without the --batch flag, all three entries in JOBS are INTERACTIVE.
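For reference, here is the same thing reproduced from the Python client (a sketch, assuming the google-cloud-bigquery package; the INFORMATION_SCHEMA query at the end is how I read the priorities, adjust region-us to your region):

from google.cloud import bigquery

client = bigquery.Client()

# Submit a multi-statement script with BATCH priority.
job_config = bigquery.QueryJobConfig(priority=bigquery.QueryPriority.BATCH)
parent = client.query(
    "select current_timestamp(); select current_timestamp();",
    job_config=job_config,
)
parent.result()  # wait for the script to finish

# Read back the priorities: the parent SCRIPT job shows BATCH,
# while the two child SELECT jobs show INTERACTIVE.
check = client.query(
    "SELECT job_id, parent_job_id, job_type, statement_type, priority "
    "FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT "
    f"WHERE job_id = '{parent.job_id}' OR parent_job_id = '{parent.job_id}'"
)
for row in check:
    print(row.job_id, row.parent_job_id, row.job_type, row.statement_type, row.priority)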
It is possible to get INTERACTIVE job priority even if your query is scheduled with BATCH priority. If the query has not started within 24 hours of being queued, it changes to INTERACTIVE priority, which makes the query execute as soon as possible. BATCH and INTERACTIVE queries use the same resources.
You can go to this link for reference.
I am working on a use case where I need to trigger a DAG when records are inserted into a BigQuery table.
I am using Eventarc and listening for the insertJob event it provides for BigQuery.
It is working almost fine, but I get 2 events whenever I insert records. An event is also generated when I query the table, and the DAG gets triggered 2 times.
This is my Eventarc setting:
Your Eventarc configuration works correctly. When you perform a manual query in the UI, you get at least 2 insertJob entries.
Let's have a deeper look:
You have this entry first,
then this one.
Focus your attention on the last lines: you can see a "dryRun" attribute.
Indeed, in the UI, a first dry-run query is performed to validate the query and to get the bytes-billed value (the volume of data processed by the query, displayed in the upper right corner).
Hence the 2 insert jobs: one with dry run, one without (the real query execution).
That being said, you have to check in your Cloud Function whether the dry run parameter is set in the event body.
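A minimal sketch of that check in a Python CloudEvent function (the exact location of the dry-run flag depends on the audit log format your trigger delivers; the path below is an assumption and should be adjusted to match what you actually see in the event body):

import functions_framework

@functions_framework.cloud_event
def handle_insert_job(cloud_event):
    # Triggered by Eventarc on BigQuery insertJob audit log entries.
    payload = cloud_event.data.get("protoPayload", {})

    # Assumed path to the dry-run flag; inspect a real event body and
    # adapt this navigation to the shape you actually receive.
    job_config = (
        payload.get("serviceData", {})
        .get("jobInsertRequest", {})
        .get("resource", {})
        .get("jobConfiguration", {})
    )
    if job_config.get("dryRun"):
        print("Dry-run job detected, skipping DAG trigger")
        return

    print("Real insert job, triggering DAG")
    # ... trigger the DAG here ...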
I am trying to create an ETL process, and I have the desired data stored in BigQuery. Every time I want to run my process in Dataprep, this error pops up:
The schema of the BigQuery table does not match the recipe (...)
To solve it I have to manually re-import the table from BigQuery, so my question is:
Is there a way to automate the manual re-import of the table on every scheduled run in order to solve this error?
Note: I found this question with a similar issue but the solution given was manual and I want an automated solution.
I am trying to run a SQL activity on a Redshift cluster through Data Pipeline. After the SQL activity, a few log fields need to be written to a table in Redshift [such as the number of rows affected and the error message (if any)].
Requirement:
If the SQL activity finishes successfully, the table should be written with the 'error' column as null;
else, if the SQL activity fails with any error, that particular error message needs to be written to the 'error' column in the Redshift table.
Can we achieve this through Data Pipeline? If yes, how can we achieve it?
Thanks,
Ravi.
Unfortunately you cannot do this directly with SqlActivity in Data Pipeline. The workaround is to write a Java program (or any executable) that does what you want and schedule it via Data Pipeline using ShellCommandActivity.
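A minimal sketch of such an executable in Python (the original suggestion is a Java program, but any executable works; the connection details, the SQL statement, and the etl_audit_log table here are hypothetical placeholders):

import psycopg2

# Hypothetical Redshift connection; ShellCommandActivity would run this script.
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="etl_user",
    password="...",
)
conn.autocommit = True

SQL_TO_RUN = "UPDATE sales SET processed = true WHERE processed = false;"

rows_affected, error_message = 0, None
with conn.cursor() as cur:
    try:
        cur.execute(SQL_TO_RUN)
        rows_affected = cur.rowcount
    except Exception as exc:
        error_message = str(exc)
    # Record the outcome: 'error' stays null on success, holds the message on failure.
    cur.execute(
        "INSERT INTO etl_audit_log (activity, rows_affected, error) VALUES (%s, %s, %s)",
        ("sql_activity", rows_affected, error_message),
    )
conn.close()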
I am a newbie in ETL and will be using Informatica soon for one of the requirements we have.
The requirement is that Informatica needs to monitor a table in Oracle for certain "trigger data" and as soon as that data is available in that table, Informatica should start executing steps in its workflow.
Is it possible to do this? If yes, could someone please point me to a link/document where this is explained?
Many thanks.
No, it is not possible (checked in PowerCenter 9.5.1).
The Event-Wait task supports only two types of events:
predefined events (the task instructs the Integration Service to wait for the specified indicator file to appear before continuing),
user-defined events (the event is triggered by an Event-Raise task somewhere in the workflow).
Yes, it is possible, but you will need a script, which can be created with the following steps (a sketch follows the list).
-- Create a shell script that checks whether data is present in the table; you can do this just by taking a count of the table.
-- If the count is greater than zero, create an empty file, say DUMMY.txt (by using the touch command), at a specified path.
-- In your Informatica scheduling, either via the scheduler or via a script, check every 5 minutes whether the file is present.
-- If the file is present, call your Informatica workflow and delete the DUMMY file.
-- Once the workflow is completed, start the process again.
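A minimal sketch of the first two steps (the steps above describe a shell script; this Python equivalent assumes the cx_Oracle driver, and the trigger table, credentials, and flag-file path are placeholders):

import pathlib
import cx_Oracle

FLAG_FILE = pathlib.Path("/informatica/flags/DUMMY.txt")  # path the scheduler watches

# Hypothetical connection and trigger table.
conn = cx_Oracle.connect("etl_user", "password", "dbhost:1521/ORCLPDB1")
cur = conn.cursor()
cur.execute("SELECT COUNT(*) FROM trigger_data")
(count,) = cur.fetchone()
cur.close()
conn.close()

# If data is present, create the empty indicator file for the next check.
if count > 0 and not FLAG_FILE.exists():
    FLAG_FILE.touch()

The scheduler or wrapper script then starts the workflow (for example with pmcmd startworkflow) and deletes DUMMY.txt once it has been picked up, so the cycle can start again.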