I have an Oracle RDS instance configured as a DMS source with an S3 target.
After full load and during ongoing replication, when I update a row with a new value, the DMS file that is created shows only the columns that were updated, but I want the whole row in its current state in the database.
Example:
| client_id | client_name | age |
| :---: | :---: | :----: |
| 1 | John Smith| 46|
| 2 | Jane Doe | 25 |
I then update John's age to 47. I would expect the DMS file to look like this:
| Op | DMS_TIMESTAMP | client_id | client_name | age |
| :---: | :----: | :---: | :---: | :---: |
| u | 2022-01-01 12:00:00 | 1 | John Smith | 47 |
However, the file I receive looks like this:
| Op | DMS_TIMESTAMP | client_id | client_name | age |
| :---: | :----: | :---: | :---: | :---: |
| u | 2022-01-01 12:00:00 | 1 | null | 47 |
According to the docs, the DMS row should represent the current state of the row, but all of the columns that are not part of the primary key seem to be missing, despite the row having the correct values in the database. Am I missing a configuration?
It turns out I was missing the part of the documentation that explains that if you want the values of all the columns of a row, you need to apply the following to the table:
ALTER TABLE table_name ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
As I needed to apply this to all the tables in a schema, I created this loop to apply it:
-- Run with SET SERVEROUTPUT ON so the messages are visible
BEGIN
    FOR i IN (
        SELECT owner,
               table_name
        FROM   all_tables
        WHERE  owner = 'SCHEMA_OWNER'
    ) LOOP
        BEGIN
            DBMS_OUTPUT.PUT_LINE('Attempting to alter ' || i.table_name || ' at ' || CURRENT_TIMESTAMP);
            EXECUTE IMMEDIATE 'ALTER TABLE ' || i.owner || '.' || i.table_name
                || ' ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS';
        EXCEPTION
            WHEN OTHERS THEN
                -- e.g. the ALL-columns log group already exists on the table
                DBMS_OUTPUT.PUT_LINE(i.table_name || ' alteration failed at ' || CURRENT_TIMESTAMP || ': ' || SQLERRM);
        END;
    END LOOP;
END;
/
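As a quick follow-up check (a sketch, assuming you can read the ALL_LOG_GROUPS dictionary view), you can list which tables already carry the ALL-columns log group:
SELECT owner, table_name, log_group_type
FROM   all_log_groups
WHERE  owner = 'SCHEMA_OWNER'
AND    log_group_type = 'ALL COLUMN LOGGING';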
I want to make a query that counts borrowing reports per month, but I saved my data in Unix time.
Table name: borrow
Attributes: borrowingID, dateOfBorrow, dateOfReturn, statusBook
For example, dateOfBorrow is 167077440, and I just want the count per specific month: Jan, Feb, etc.
I am expecting:
| Month | Total |
| ------| ----- |
| Jan | 2 |
| Feb | 5 |
| Mar | 5 |
...etc
select from_unixtime(167077440),from_unixtime(167077440,'%b')
+--------------------------+-------------------------------+
| from_unixtime(167077440) | from_unixtime(167077440,'%b') |
+--------------------------+-------------------------------+
| 1975-04-18 19:24:00 | Apr |
+--------------------------+-------------------------------+
1 row in set (0.001 sec)
See the manual: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_from-unixtime and https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_date-format
But are you really interested in 1975?
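Assuming the stored values are Unix epoch seconds, here is a minimal sketch of the per-month count (the table and column names are the ones from the question; months from all years are combined):
SELECT DATE_FORMAT(FROM_UNIXTIME(dateOfBorrow), '%b') AS Month,
       COUNT(*) AS Total
FROM borrow
GROUP BY MONTH(FROM_UNIXTIME(dateOfBorrow)),
         DATE_FORMAT(FROM_UNIXTIME(dateOfBorrow), '%b')
ORDER BY MONTH(FROM_UNIXTIME(dateOfBorrow));
Grouping by the month number alongside the name keeps the ORDER BY valid under ONLY_FULL_GROUP_BY and sorts the rows in calendar order; add a WHERE on YEAR(FROM_UNIXTIME(dateOfBorrow)) if you only want a single year.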
I want to join a table to itself so that the value from one row's column is applied to all rows
with the same ID. The logic is: when Workflow = 'PICK', take that row's Day and use that Day for the whole ID.
TABLE
ID | Workflow | Day |
1 | PICK | 2022.11.01 |
1 | tew | 2022.11.02 |
1 | wte | 2022.11.03 |
1 | | 2022.11.04 |
1 | | 2022.11.05 |
2 | PICK | 2022.11.06 |
The expected answer would be:
ID | Workflow | Day |
1 | PICK | 2022.11.01 |
1 | tew | 2022.11.01 |
1 | wte | 2022.11.01 |
1 | | 2022.11.01 |
1 | | 2022.11.01 |
2 | PICK | 2022.11.06 |
The Workflow = 'PICK' date was applied to the whole ID as the date.
You might consider a window function instead of a self join.
WITH sample_data AS (
SELECT 1 ID, 'PICK' Workflow, '2022.11.01' Day UNION ALL
SELECT 1 ID, 'tew' Workflow, '2022.11.02' Day UNION ALL
SELECT 1 ID, 'wte' Workflow, '2022.11.03' Day UNION ALL
SELECT 1 ID, null Workflow, '2022.11.04' Day UNION ALL
SELECT 1 ID, null Workflow, '2022.11.05' Day UNION ALL
SELECT 2 ID, 'PICK' Workflow, '2022.11.06' Day
)
SELECT ID, Workflow,
STRING_AGG(IF(Workflow = 'PICK', Day, NULL)) OVER (PARTITION BY ID) AS Day
FROM sample_data;
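Since each ID carries exactly one 'PICK' row in this sample, MAX over the partition would work just as well as STRING_AGG (a sketch against the same sample_data CTE):
SELECT ID, Workflow,
       MAX(IF(Workflow = 'PICK', Day, NULL)) OVER (PARTITION BY ID) AS Day
FROM sample_data;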
If you need to use a self join, you can consider the query below as well, which produces the same output.
SELECT t.* EXCEPT(Day), s.Day
FROM sample_data t
LEFT JOIN sample_data s ON t.ID = s.ID AND s.Workflow = 'PICK';
I have data showing me the dates grouped like this:
For confidentiality reasons, I had to remove the Customer Description detail.
How do I repeat the date column the same way you repeat the Row Labels in an Excel Pivot?
I've looked, but couldn't find a solution to this - this option should be available.
EDIT
When you have the following source data in Excel:
Date | Customer | Item Description | Qty Out | Unit Price | Sales
------------------------------------------------------------------------------
14/08/2020 | Customer 1 | Item 11 | 4.00 | 65.00 | 260.00
14/08/2020 | Customer 2 | Item 12 | 56.00 | 12.00 | 672.00
14/08/2020 | Customer 3 | Item 13 | 64.00 | 35.00 | 2,240.00
14/08/2020 | Customer 4 | Item 14 | 29.00 | 65.00 | 1,885.00
15/08/2020 | Customer 2 | Item 15 | 746.00 | 12.00 | 8,952.00
15/08/2020 | Customer 3 | Item 16 | 14.00 | 75.00 | 1,050.00
15/08/2020 | Customer 4 | Item 17 | 45.00 | 741.00 | 33,345.00
15/08/2020 | Customer 5 | Item 18 | 456.00 | 125.00 | 57,000.00
15/08/2020 | Customer 6 | Item 19 | 925.00 | 17.00 | 15,725.00
16/08/2020 | Customer 4 | Item 20 | 6.00 | 532.00 | 3,192.00
16/08/2020 | Customer 5 | Item 21 | 56.00 | 94.00 | 5,264.00
16/08/2020 | Customer 6 | Item 22 | 546.00 | 37.00 | 20,202.00
You then pivot this data using Microsoft Excel, where you get the following:
You then choose the option to Repeat Item Labels, as can be seen below:
After selecting this, you get the expected results I require in Power BI:
Is there not a function available like this in Power BI?
Just adding this for your reference as a workaround. Check the image below, with a custom column created in the Power Query Editor:
date_customer = Date.ToText([Date]) & " : " & [Customer]
Then add both Date and date_customer at the Matrix row level. The output is as below (using your sample data).
ANOTHER OPTION: add Date and Customer to the Matrix rows, and the output will be as below (using your sample data).
This is also a meaningful output, as dates show as a group header. But if the requirement is to show the repeated date on every row, consider the first option.
I have one workflow which contains five sessions. I am looking for a query, using the Informatica repository tables/views, that gives me output like the below; I have not been able to come up with a query that gives the desired result.
workflow-name | session-name | source-count | target-count | session-start-time | session-end-time
If you have access to the repository metadata tables, then you can use the query below.
Metadata Tables used in query:
OPB_SESS_TASK_LOG
OPB_TASK_INST_RUN
OPB_WFLOW_RUN
Here the repository user is INFA_REP and the workflow name is wf_emp_load.
SELECT w.WORKFLOW_NAME,
t.INSTANCE_NAME,
s.SRC_SUCCESS_ROWS,
s.TARG_SUCCESS_ROWS,
t.START_TIME,
t.END_TIME
FROM INFA_REP.OPB_SESS_TASK_LOG s
INNER JOIN INFA_REP.OPB_TASK_INST_RUN t
ON s.INSTANCE_ID=t.INSTANCE_ID
AND s.WORKFLOW_RUN_ID=t.WORKFLOW_RUN_ID
INNER JOIN INFA_REP.OPB_WFLOW_RUN w
ON w.WORKFLOW_RUN_ID=t.WORKFLOW_RUN_ID
WHERE w.WORKFLOW_RUN_ID =
(SELECT MAX(WORKFLOW_RUN_ID)
FROM INFA_REP.OPB_WFLOW_RUN
WHERE WORKFLOW_NAME='wf_emp_load')
ORDER BY t.START_TIME
Output
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| WORKFLOW_NAME | INSTANCE_NAME | SRC_SUCCESS_ROWS | TARG_SUCCESS_ROWS | START_TIME | END_TIME |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| wf_emp_load | s_emp_load | 14 | 14 | 10-JUN-18 18:31:24 | 10-JUN-18 18:31:26 |
| wf_emp_load | s_emp_revert | 14 | 14 | 10-JUN-18 18:31:27 | 10-JUN-18 18:31:28 |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
My application creates a log file every 10 minutes, which I want to store in DynamoDB in an aggregated way: e.g. 144 log files per day, 1008 log files per week, or ~4,400 log files per month.
I have different partition keys, but for the sake of simplicity I have used only a single partition key in the following examples.
The straightforward solution would be to have different tables, e.g.:
Table "TenMinLogsDay":
id (=part.key) | date (=sort key) | cntTenMinLogs | data
-------------- | ---------------- | ------------- | -------------------------------
1 | 2017-04-30 | 144 | some serialized aggregated data
1 | 2017-05-01 | 144 | some serialized aggregated data
1 | 2017-05-02 | 144 | some serialized aggregated data
1 | 2017-05-03 | 144 | some serialized aggregated data
Table "TenMinLogsWeek":
id (=part.key) | date (=sort key) | cntTenMinLogs | data
-------------- | ---------------- | ------------- | -------------------------------
1 | 2017-05-01 | 1008 | some serialized aggregated data
1 | 2017-05-08 | 1008 | some serialized aggregated data
1 | 2017-05-15 | 1008 | some serialized aggregated data
Table "TenMinLogsMonth":
id (=part.key) | date (=sort key) | cntTenMinLogs | data
-------------- | ---------------- | ------------- | -------------------------------
1 | 2017-05-01 | 4464 | some serialized aggregated data
1 | 2017-06-01 | 4320 | some serialized aggregated data
1 | 2017-07-01 | 4464 | some serialized aggregated data
I would prefer, however, a combined table. Out of the box, DynamoDB does not seem to support this.
Also, I want to query either the daily OR the weekly OR the monthly aggregated items, so I don't want to use the filter feature for this.
The following solution would be possible, but it seems like a poor hack:
Table "TenMinLogsCombined":
id (=part.key) | date (=sort key) | week (=LSI sort key) | month (=LSI sort key) | cntTenMinLogs | data
-------------- | ---------------- | -------------------- | --------------------- | ------------- | -----
1 | 2017-04-30 | (empty) | (empty) | 144 | ...
1 | 2017-05-01 | (empty) | (empty) | 144 | ...
1 | 0017-05-01 | 2017-05-01 | (empty) | 1008 | ...
1 | 1017-05-01 | (empty) | 2017-05-01 | 4464 | ...
1 | 2017-05-02 | (empty) | (empty) | 144 | ...
1 | 2017-05-03 | (empty) | (empty) | 144 | ...
Explanation:
By using the years "0017" and "1017" instead of "2017", I can query the date range, e.g. 2017-05-01 to 2017-05-04, and DynamoDB won't read the items starting with 0017 or 1017.
For week or month range queries such a hack is not required: items that lack an LSI sort-key attribute simply do not appear in that index (the LSIs are sparse).
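As an illustration of that day-range read, here is a sketch using DynamoDB's PartiQL interface (the table and attribute names are the ones from the example; a Query call with an equivalent KeyConditionExpression behaves the same way). Only the daily items come back, because the weekly (0017-...) and monthly (1017-...) rows fall outside the BETWEEN range on the sort key:
SELECT * FROM "TenMinLogsCombined"
WHERE "id" = 1 AND "date" BETWEEN '2017-05-01' AND '2017-05-04'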
Does anybody know of a better way to achieve this?