Query to get workflow statistics such as source count, target count, start time and end time of each session - Informatica

I have one workflow which contains five sessions. I am looking for a query against the Informatica repository tables/views that gives me output like the one below. I have not been able to write a query that gives the desired result.
workflow-name session-name source-count target-count session-start-time session-end-time

If you have access to the repository metadata tables, you can use the query below.
Metadata tables used in the query:
OPB_SESS_TASK_LOG
OPB_TASK_INST_RUN
OPB_WFLOW_RUN
Here the repository user is INFA_REP, and the workflow name is wf_emp_load.
SELECT w.WORKFLOW_NAME,
       t.INSTANCE_NAME,
       s.SRC_SUCCESS_ROWS,
       s.TARG_SUCCESS_ROWS,
       t.START_TIME,
       t.END_TIME
FROM INFA_REP.OPB_SESS_TASK_LOG s
INNER JOIN INFA_REP.OPB_TASK_INST_RUN t
    ON s.INSTANCE_ID = t.INSTANCE_ID
    AND s.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
INNER JOIN INFA_REP.OPB_WFLOW_RUN w
    ON w.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
WHERE w.WORKFLOW_RUN_ID =
    (SELECT MAX(WORKFLOW_RUN_ID)
     FROM INFA_REP.OPB_WFLOW_RUN
     WHERE WORKFLOW_NAME = 'wf_emp_load')
ORDER BY t.START_TIME
Output
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| WORKFLOW_NAME | INSTANCE_NAME | SRC_SUCCESS_ROWS | TARG_SUCCESS_ROWS | START_TIME         | END_TIME           |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
| wf_emp_load   | s_emp_load    |               14 |                14 | 10-JUN-18 18:31:24 | 10-JUN-18 18:31:26 |
| wf_emp_load   | s_emp_revert  |               14 |                14 | 10-JUN-18 18:31:27 | 10-JUN-18 18:31:28 |
+---------------+---------------+------------------+-------------------+--------------------+--------------------+
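To make the join logic easier to try out without a repository, here is a minimal sketch that runs the same query shape against an in-memory SQLite database. The three tables are simplified stand-ins containing only the columns the query uses, the INFA_REP schema prefix is dropped, and run 100 is made-up sample data added to show that only the latest run (the highest WORKFLOW_RUN_ID) is returned. The real repository tables have many more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-ins for the repository tables (assumption: only the
# columns referenced by the query are modelled here).
conn.executescript("""
CREATE TABLE OPB_WFLOW_RUN (WORKFLOW_RUN_ID INTEGER, WORKFLOW_NAME TEXT);
CREATE TABLE OPB_TASK_INST_RUN (WORKFLOW_RUN_ID INTEGER, INSTANCE_ID INTEGER,
                                INSTANCE_NAME TEXT, START_TIME TEXT, END_TIME TEXT);
CREATE TABLE OPB_SESS_TASK_LOG (WORKFLOW_RUN_ID INTEGER, INSTANCE_ID INTEGER,
                                SRC_SUCCESS_ROWS INTEGER, TARG_SUCCESS_ROWS INTEGER);

-- Run 100 is invented sample data for an older run; run 101 mirrors the output above.
INSERT INTO OPB_WFLOW_RUN VALUES (100, 'wf_emp_load'), (101, 'wf_emp_load');
INSERT INTO OPB_TASK_INST_RUN VALUES
    (100, 1, 's_emp_load',   '2018-06-09 18:31:24', '2018-06-09 18:31:26'),
    (101, 1, 's_emp_load',   '2018-06-10 18:31:24', '2018-06-10 18:31:26'),
    (101, 2, 's_emp_revert', '2018-06-10 18:31:27', '2018-06-10 18:31:28');
INSERT INTO OPB_SESS_TASK_LOG VALUES
    (100, 1, 10, 10), (101, 1, 14, 14), (101, 2, 14, 14);
""")

QUERY = """
SELECT w.WORKFLOW_NAME, t.INSTANCE_NAME,
       s.SRC_SUCCESS_ROWS, s.TARG_SUCCESS_ROWS,
       t.START_TIME, t.END_TIME
FROM OPB_SESS_TASK_LOG s
INNER JOIN OPB_TASK_INST_RUN t
    ON s.INSTANCE_ID = t.INSTANCE_ID
    AND s.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
INNER JOIN OPB_WFLOW_RUN w
    ON w.WORKFLOW_RUN_ID = t.WORKFLOW_RUN_ID
WHERE w.WORKFLOW_RUN_ID =
    (SELECT MAX(WORKFLOW_RUN_ID) FROM OPB_WFLOW_RUN WHERE WORKFLOW_NAME = ?)
ORDER BY t.START_TIME
"""

# One row per session of the most recent run of the named workflow.
rows = conn.execute(QUERY, ("wf_emp_load",)).fetchall()
for row in rows:
    print(row)
```

Passing the workflow name as a bound parameter rather than a literal is a design choice of this sketch; the query text is otherwise the same as above.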

How to use prefetch_related to retrieve multiple rows similar to SQL result

I have a question about the usage of prefetch_related. Based on my understanding, I need to use prefetch_related for reverse foreign key relationships.
As an example, I have a User(id, name) model and a SchoolHistory(id, start_date, school_name, user_id[FK user.id]) model. A user can have multiple school history records.
If I query the database using the following SQL query:
SELECT
    user.id,
    name,
    start_date,
    school_name
FROM user
LEFT JOIN school_history ON school_history.user_id = user.id
the expected result would be:
| User ID | Name | Start Date | School |
| 1 | Human | 1/1/2022 | Michigan |
| 1 | Human | 1/1/2021 | Wisconsin |
| 2 | Alien | | |
This is the current result that I’m getting instead with ORM and a serializer:
| User ID | Name | school_history |
| 1 | Human | [{start_date:1/1/2022 , school:Michigan}, {start_date:1/1/2021 , school:Wisconsin}] |
| 2 | Alien | [] |
This is the ORM query that I’m using:
from django.db.models import Prefetch

User.objects.prefetch_related(
    Prefetch(
        'school_history',
        queryset=SchoolHistory.objects.order_by('start_date')
    )
)
Is there a way for the ORM query to return a result similar to the SQL one? I want multiple rows if there are multiple schools associated with that user.
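One possible approach, sketched here in plain Python so it can run without Django, is to flatten the nested serializer output into one row per school record, with a single blank row for users that have none. The function name flatten_users and the dictionary keys are hypothetical and simply follow the nested result shown above. Within Django itself, something like User.objects.values('id', 'name', 'school_history__start_date', 'school_history__school_name') would typically produce one row per joined record as well, assuming the reverse relation is named school_history; that call is not exercised here.

```python
def flatten_users(users):
    """Turn nested user/school_history records into flat, SQL-like rows."""
    rows = []
    for user in users:
        # Users without history still get one row, with empty school fields.
        histories = user["school_history"] or [None]
        for history in histories:
            rows.append((
                user["id"],
                user["name"],
                history["start_date"] if history else None,
                history["school"] if history else None,
            ))
    return rows


# Nested data shaped like the serializer output in the question.
nested = [
    {"id": 1, "name": "Human", "school_history": [
        {"start_date": "1/1/2022", "school": "Michigan"},
        {"start_date": "1/1/2021", "school": "Wisconsin"},
    ]},
    {"id": 2, "name": "Alien", "school_history": []},
]

rows = flatten_users(nested)
for row in rows:
    print(row)
```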

AWS Oracle DMS show full row each time

I have an Oracle RDS instance configured with DMS with an S3 target.
After the full load, during ongoing replication, when I update a row with a new value, the DMS file that is created only shows the columns that were updated, but I want the whole row in its current state in the database.
Example:
| client_id | client_name | age |
| :---: | :---: | :----: |
| 1 | John Smith| 46|
| 2 | Jane Doe | 25 |
I then update John's age to be 47. I would expect the DMS file to look like this:
| Op | DMS_TIMESTAMP | client_id | client_name | age |
| :---: | :----: | :---: | :---: | :---: |
| u | 2022-01-01 12:00:00 | 1 | John Smith | 47 |
However the file I receive looks like this:
| Op | DMS_TIMESTAMP | client_id | client_name | age |
| :---: | :----: | :---: | :---: | :---: |
| u | 2022-01-01 12:00:00 | 1 | null | 47 |
According to the docs the DMS row should represent the current state of the row but all of my columns that are not a primary key seem to be missing, despite the row having correct values in the database. Am I missing a configuration?
I was missing a part of the documentation that explains that if you want the values of all the columns of a row, you need to apply the following to the table:
alter table table_name ADD SUPPLEMENTAL LOG DATA (all) columns;
As I needed to apply this for all the tables in a schema, I created this loop to apply it.
BEGIN
    FOR I IN (
        SELECT
            table_name,
            owner
        FROM
            ALL_TABLES
        WHERE
            owner = 'SCHEMA_OWNER'
    ) LOOP
        -- Print table name
        BEGIN
            DBMS_OUTPUT.PUT_LINE('Attempting to alter ' || I.table_name || ' at ' || current_timestamp);
            EXECUTE IMMEDIATE 'alter table SCHEMA_OWNER.' || I.table_name || ' ADD SUPPLEMENTAL LOG DATA (all) columns';
        EXCEPTION
            WHEN OTHERS THEN
                DBMS_OUTPUT.PUT_LINE(I.table_name || ' alteration failed at ' || current_timestamp);
        END;
    END LOOP;
END;

Add column to existing table in rds

I have a table in RDS with two columns: id and the user's activity at a given time, with the values active/away. I receive user activity every day, so I need to add a user activity column to that table every day. Any ideas how to do it? Right now I have the table with the first two columns in RDS, but I am stuck on how to add columns to it.
+-------+------------+------------+
| id    | 2020-08-13 | 2020-08-14 |
+-------+------------+------------+
| 12345 | active     | away       |
+-------+------------+------------+
You could use alter table ... add column, but this is not the right way to solve the problem.
In a relational database, you add additional rows for repeated data, not additional columns. So your table should look like this:
+-------+-------------+--------+
| id    | status_date | status |
+-------+-------------+--------+
| 12345 | 2020-08-13  | active |
| 12345 | 2020-08-14  | away   |
+-------+-------------+--------+
Then you add a new row using an insert.
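The row-per-day layout can be sketched as follows, using an in-memory SQLite database so it runs anywhere. The table name user_status, the composite primary key and the helper record_status are assumptions made for illustration; on RDS the same idea applies with whichever engine the instance runs. The last step shows that the date-per-column view can still be rebuilt in application code when needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per user per day (table name and key are illustrative assumptions).
conn.execute("""
    CREATE TABLE user_status (
        id          INTEGER NOT NULL,
        status_date TEXT    NOT NULL,
        status      TEXT    NOT NULL,
        PRIMARY KEY (id, status_date)
    )
""")


def record_status(user_id, status_date, status):
    """Store one day's activity as a new row instead of a new column."""
    conn.execute(
        "INSERT INTO user_status (id, status_date, status) VALUES (?, ?, ?)",
        (user_id, status_date, status),
    )


record_status(12345, "2020-08-13", "active")
record_status(12345, "2020-08-14", "away")

rows = conn.execute(
    "SELECT id, status_date, status FROM user_status ORDER BY id, status_date"
).fetchall()

# Rebuild the wide, date-per-column view on demand.
wide = {}
for user_id, status_date, status in rows:
    wide.setdefault(user_id, {})[status_date] = status

print(rows)
print(wide)
```

Because each day is just another insert, the schema never changes, and queries such as "all users who were away on a given date" become a plain WHERE clause.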

Django 1.4/1.5 control GROUP BY combined with HAVING

Say I have the following data:
| id | user_id | time |
| 1 | 1 | 10.0 |
| 2 | 1 | 12.0 |
| 3 | 2 | 11.0 |
| 4 | 2 | 13.0 |
What I want is to query this table such that I get the MIN(time) per user_id:
| id | user_id | time |
| 1 | 1 | 10.0 |
| 3 | 2 | 11.0 |
So my SQL would look like this:
SELECT id, user_id, time FROM table GROUP BY user_id HAVING MIN(time)
However, when trying to use .annotate(Min('time')), Django will GROUP BY on arbitrary (wrong) fields. For example see the following code and the resulting (simplified) SQL:
>>> Table.objects.annotate(Min('time')).query
SELECT id, user_id, time, MIN(time) FROM table GROUP BY id, user_id, time
>>> Table.objects.values('id', 'time').annotate(Min('time')).query
SELECT id, time, MIN(time) FROM table GROUP BY id, time
The resulting SQL is far from my desired output. I'm currently working around this by using raw SQL, however this defeats the purpose of using an ORM in the first place. Also, the resulting code is difficult to reuse as normal .filter() cannot be applied.
There are similar questions about this type of querying, however they are rather old and do not incorporate changes to Django since 1.3.
It's not pretty code, but using Django's extra() queryset method with where as an argument, you can get back the desired result set, i.e.:
Table.objects.extra(where=['id IN (SELECT id FROM table_name'
' GROUP BY user_id HAVING MIN(time))'])
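The subquery above selects id without aggregating it, which stricter databases reject. As a hedged alternative, the sketch below uses a correlated subquery that keeps each row whose time equals the minimum for its user_id. It runs against an in-memory SQLite table named table_name holding the sample data from the question. The same WHERE string could presumably be passed as extra(where=[...]), but that is an assumption and is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_name (id INTEGER PRIMARY KEY, user_id INTEGER, time REAL);
INSERT INTO table_name VALUES (1, 1, 10.0), (2, 1, 12.0), (3, 2, 11.0), (4, 2, 13.0);
""")

# Keep only rows whose time is the minimum for that user_id.
WHERE_CLAUSE = (
    "time = (SELECT MIN(t2.time) FROM table_name t2 "
    "WHERE t2.user_id = table_name.user_id)"
)

rows = conn.execute(
    "SELECT id, user_id, time FROM table_name WHERE " + WHERE_CLAUSE + " ORDER BY id"
).fetchall()

for row in rows:
    print(row)
```

If two rows of the same user share the minimum time, both are returned, so a tie-breaker would be needed when exactly one row per user is required.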

How to get a list of hosts connected to a mysql server

I am trying to get a list of hosts connected to a MySQL server. How can I get this?
What should I do after connecting to the MySQL server?
Code snippets would really help.
Also, what's the best API to use to connect to MySQL from C++?
One way you could do it is to execute the query show processlist, which will give you a table with Id, User, Host, db, Command, Time, State and Info columns. Remember that your show processlist query will be part of the output.
You can try this query: select distinct host from information_schema.processlist;
For example, there are multiple connections from 10.9.0.10 and one local connection.
mysql> select distinct host from information_schema.processlist;
+-----------------+
| host            |
+-----------------+
| 10.9.0.10:63668 |
| 10.9.0.10:63670 |
| 10.9.0.10:63664 |
| 10.9.0.10:63663 |
| 10.9.0.10:63666 |
| 10.9.0.10:63672 |
| 10.9.0.10:63665 |
| 10.9.0.10:63671 |
| 10.9.0.10:63669 |
| 10.9.0.10:63667 |
| localhost       |
|                 |
+-----------------+
12 rows in set (0,00 sec)
If you want only hosts (not different connections), you can try something like this: select distinct substring_index(host,':',1) from information_schema.processlist;
Example:
mysql> select distinct substring_index(host,':',1) from information_schema.processlist;
+-----------------------------+
| substring_index(host,':',1) |
+-----------------------------+
| 10.9.0.10                   |
| localhost                   |
|                             |
+-----------------------------+
3 rows in set (0,00 sec)
As you can see, MySQL shows one empty row. This is normal (I have a daemon process):
mysql> select distinct substring_index(host,':',1),`command` from information_schema.processlist;
+-----------------------------+---------+
| substring_index(host,':',1) | command |
+-----------------------------+---------+
| 10.9.0.10                   | Sleep   |
| localhost                   | Query   |
|                             | Daemon  |
+-----------------------------+---------+
You can remove it with where `command`!="Daemon" or where `host`!=''.
Here is a good link with a query that also counts connections per host and shows which users are connected: http://blog.shlomoid.com/2011/08/how-to-easily-see-whos-connected-to.html
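If the host values are fetched into application code instead, the same port stripping and per-host counting can be sketched in a few lines. The helper names host_only and connections_per_host are hypothetical, and the input list simply reproduces the host column from the example output above rather than querying a live server.

```python
from collections import Counter


def host_only(host):
    """Mimic substring_index(host, ':', 1): keep the text before the first colon."""
    return host.split(":", 1)[0]


def connections_per_host(processlist_hosts):
    """Count connections per host, skipping empty hosts (e.g. daemon threads)."""
    return Counter(host_only(h) for h in processlist_hosts if h)


# Host values as in the example output: ten connections from 10.9.0.10,
# one local connection and one empty daemon entry.
hosts = ["10.9.0.10:%d" % port for port in range(63663, 63673)] + ["localhost", ""]

counts = connections_per_host(hosts)
for host, count in counts.items():
    print(host, count)
```

Skipping empty strings plays the same role as the where `host`!='' filter in the SQL version.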