How can I optimize this Postgres query? - django

I have a very slow query that I found in my logs and I don't know how to optimize it with an index.
This is the query and the explain:
SELECT
"myapp_image"."id",
"myapp_image"."deleted",
"myapp_image"."title",
"myapp_image"."subject_type",
"myapp_image"."data_source",
"myapp_image"."objects_in_field",
"myapp_image"."solar_system_main_subject",
"myapp_image"."description",
"myapp_image"."link",
"myapp_image"."link_to_fits",
"myapp_image"."image_file",
"myapp_image"."uploaded",
"myapp_image"."published",
"myapp_image"."updated",
"myapp_image"."watermark_text",
"myapp_image"."watermark",
"myapp_image"."watermark_position",
"myapp_image"."watermark_size",
"myapp_image"."watermark_opacity",
"myapp_image"."user_id",
"myapp_image"."plot_is_overlay",
"myapp_image"."is_wip",
"myapp_image"."size",
"myapp_image"."w",
"myapp_image"."h",
"myapp_image"."animated",
"myapp_image"."license",
"myapp_image"."is_final",
"myapp_image"."allow_comments",
"myapp_image"."moderator_decision",
"myapp_image"."moderated_when",
"myapp_image"."moderated_by_id",
"auth_user"."id",
"auth_user"."password",
"auth_user"."last_login",
"auth_user"."is_superuser",
"auth_user"."username",
"auth_user"."first_name",
"auth_user"."last_name",
"auth_user"."email",
"auth_user"."is_staff",
"auth_user"."is_active",
"auth_user"."date_joined",
"myapp_userprofile"."id",
"myapp_userprofile"."deleted",
"myapp_userprofile"."user_id",
"myapp_userprofile"."updated",
"myapp_userprofile"."real_name",
"myapp_userprofile"."website",
"myapp_userprofile"."job",
"myapp_userprofile"."hobbies",
"myapp_userprofile"."timezone",
"myapp_userprofile"."about",
"myapp_userprofile"."premium_counter",
"myapp_userprofile"."company_name",
"myapp_userprofile"."company_description",
"myapp_userprofile"."company_website",
"myapp_userprofile"."retailer_country",
"myapp_userprofile"."avatar",
"myapp_userprofile"."exclude_from_competitions",
"myapp_userprofile"."default_frontpage_section",
"myapp_userprofile"."default_gallery_sorting",
"myapp_userprofile"."default_license",
"myapp_userprofile"."default_watermark_text",
"myapp_userprofile"."default_watermark",
"myapp_userprofile"."default_watermark_size",
"myapp_userprofile"."default_watermark_position",
"myapp_userprofile"."default_watermark_opacity",
"myapp_userprofile"."accept_tos",
"myapp_userprofile"."receive_important_communications",
"myapp_userprofile"."receive_newsletter",
"myapp_userprofile"."receive_marketing_and_commercial_material",
"myapp_userprofile"."language",
"myapp_userprofile"."seen_realname",
"myapp_userprofile"."seen_email_permissions",
"myapp_userprofile"."signature",
"myapp_userprofile"."signature_html",
"myapp_userprofile"."show_signatures",
"myapp_userprofile"."post_count",
"myapp_userprofile"."autosubscribe",
"myapp_userprofile"."receive_forum_emails"
FROM "myapp_image"
LEFT OUTER JOIN "myapp_apps_iotd_iotd"
ON ("myapp_image"."id" = "myapp_apps_iotd_iotd"."image_id")
INNER JOIN "auth_user"
ON ("myapp_image"."user_id" = "auth_user"."id")
LEFT OUTER JOIN "myapp_userprofile"
ON ("auth_user"."id" = "myapp_userprofile"."user_id")
WHERE ("myapp_image"."is_wip" = FALSE
AND NOT ("myapp_image"."id" IN (SELECT
U0."id" AS Col1
FROM "myapp_image" U0
LEFT OUTER JOIN "myapp_apps_iotd_iotdvote" U1
ON (U0."id" = U1."image_id")
WHERE U1."id" IS NULL)
)
AND "myapp_apps_iotd_iotd"."id" IS NULL
AND "myapp_image"."id" < 372320
AND "myapp_image"."deleted" IS NULL)
ORDER BY "myapp_image"."id" DESC
LIMIT 1;
QUERY PLAN:
Limit (cost=36302.74..36302.75 rows=1 width=1143) (actual time=1922.839..1923.002 rows=1 loops=1)
-> Sort (cost=36302.74..36302.75 rows=1 width=1143) (actual time=1922.836..1922.838 rows=1 loops=1)
Sort Key: myapp_image.id DESC
Sort Method: top-N heapsort Memory: 26kB
-> Nested Loop Left Join (cost=17919.42..36302.73 rows=1 width=1143) (actual time=1332.216..1908.796 rows=3102 loops=1)
-> Nested Loop (cost=17919.14..36302.40 rows=1 width=453) (actual time=1332.195..1867.675 rows=3102 loops=1)
-> Hash Left Join (cost=17918.85..36302.09 rows=1 width=321) (actual time=1332.164..1815.315 rows=3102 loops=1)
Hash Cond: (myapp_image.id = myapp_apps_iotd_iotd.image_id)
Filter: (myapp_apps_iotd_iotd.id IS NULL)
Rows Removed by Filter: 722
-> Seq Scan on myapp_image (cost=17856.32..35882.67 rows=135958 width=321) (actual time=1329.110..1801.409 rows=3824 loops=1)
Filter: ((NOT is_wip) AND (deleted IS NULL) AND (NOT (hashed SubPlan 1)) AND (id < 372320))
Rows Removed by Filter: 305733
SubPlan 1
-> Gather (cost=1217.31..17856.31 rows=1 width=4) (actual time=36.399..680.882 rows=305712 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Left Join (cost=217.31..16856.21 rows=1 width=4) (actual time=52.855..536.185 rows=101904 loops=3)
Hash Cond: (u0.id = u1.image_id)
Filter: (u1.id IS NULL)
Rows Removed by Filter: 2509
-> Parallel Seq Scan on myapp_image u0 (cost=0.00..14672.82 rows=128982 width=4) (actual time=0.027..175.375 rows=103186 loops=3)
-> Hash (cost=123.25..123.25 rows=7525 width=8) (actual time=52.601..52.602 rows=7526 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 358kB
-> Seq Scan on myapp_apps_iotd_iotdvote u1 (cost=0.00..123.25 rows=7525 width=8) (actual time=0.038..33.074 rows=7526 loops=3)
-> Hash (cost=35.57..35.57 rows=2157 width=8) (actual time=3.013..3.015 rows=2157 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 117kB
-> Seq Scan on myapp_apps_iotd_iotd (cost=0.00..35.57 rows=2157 width=8) (actual time=0.014..1.480 rows=2157 loops=1)
-> Index Scan using auth_user_id_pkey on auth_user (cost=0.29..0.31 rows=1 width=132) (actual time=0.012..0.012 rows=1 loops=3102)
Index Cond: (id = myapp_image.user_id)
-> Index Scan using myapp_userprofile_user_id on myapp_userprofile (cost=0.29..0.33 rows=1 width=690) (actual time=0.008..0.008 rows=1 loops=3102)
Index Cond: (auth_user.id = user_id)
Planning time: 1.722 ms
Execution time: 1925.867 ms
(34 rows)
There is a long Seq Scan on myapp_image and I have tried adding the following index but it made it even slower:
create index on myapp_image using btree (is_wip, deleted, id);
What could be my optimization strategy?
This query is generated by the Django ORM, and I don't yet know which code path produces it.

Based on the generated query (column list abbreviated to * here):
SELECT *
FROM "myapp_image"
LEFT OUTER JOIN "myapp_apps_iotd_iotd"
ON ("myapp_image"."id" = "myapp_apps_iotd_iotd"."image_id")
INNER JOIN "auth_user"
ON ("myapp_image"."user_id" = "auth_user"."id")
LEFT OUTER JOIN "myapp_userprofile"
ON ("auth_user"."id" = "myapp_userprofile"."user_id")
WHERE ("myapp_image"."is_wip" = false
AND NOT ("myapp_image"."id" IN (SELECT U0."id" AS Col1
FROM "myapp_image" U0
LEFT OUTER JOIN "myapp_apps_iotd_iotdvote" U1
ON (U0."id" = U1."image_id")
WHERE U1."id" IS NULL))
AND "myapp_apps_iotd_iotd"."id" IS NULL
AND "myapp_image"."id" < 372320
AND "myapp_image"."deleted" IS NULL)
ORDER BY "myapp_image"."id" DESC
LIMIT 1;
I propose adding this index:
create index on myapp_image using btree (id, is_wip, deleted);
-- id as a first column
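Separately from the index, it may be worth looking at the NOT IN subquery. It selects every image that has no vote, so negating it just means "the image has at least one vote". Assuming id is the primary key of both tables (the Django default, so never NULL), that condition can be written as an EXISTS check against the votes table. Below is a minimal sketch using Python's sqlite3 with made-up tables image and vote (stand-ins for myapp_image and myapp_apps_iotd_iotdvote, with invented data) that only checks that the two forms return the same rows. It says nothing about Postgres timings; run EXPLAIN ANALYZE on the rewritten query to see whether it actually helps.

```python
import sqlite3


def voted_image_ids():
    """Run the ORM-style NOT IN filter and an EXISTS rewrite on toy data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-ins for the real tables.
    conn.executescript("""
        CREATE TABLE image (id INTEGER PRIMARY KEY, is_wip INTEGER NOT NULL);
        CREATE TABLE vote  (id INTEGER PRIMARY KEY, image_id INTEGER NOT NULL);
        INSERT INTO image (id, is_wip) VALUES (1, 0), (2, 0), (3, 1), (4, 0);
        INSERT INTO vote (image_id) VALUES (2), (2), (4);
    """)
    # Shape of the generated query: exclude images that have no vote.
    original = conn.execute("""
        SELECT id FROM image
        WHERE id NOT IN (
            SELECT u0.id FROM image u0
            LEFT JOIN vote u1 ON u0.id = u1.image_id
            WHERE u1.id IS NULL)
        ORDER BY id""").fetchall()
    # Rewrite: keep images that have at least one vote.
    rewritten = conn.execute("""
        SELECT id FROM image
        WHERE EXISTS (SELECT 1 FROM vote v WHERE v.image_id = image.id)
        ORDER BY id""").fetchall()
    conn.close()
    return original, rewritten


original, rewritten = voted_image_ids()
print(original, rewritten)
```

Both queries return images 2 and 4 here. In the original plan the subquery produces about 305,000 ids from a votes table of about 7,500 rows, which is why probing the votes table directly looks like a promising thing to test.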

Related

Fast query becomes super slow inside a stored procedure

I have a problem with a query that, for privacy reasons, I can't show you (I will provide the execution plan, however).
When I execute this query outside a stored procedure it is quite fast (20 seconds for over 70,000 rows), but when I execute it inside a stored procedure it becomes super slow (5 minutes for the same data). How is this possible?
What can I try to improve the performance?
I have already tried changing how the query is executed, by rewriting it as dynamic SQL and by putting the result into a temporary table, but the performance did not change.
Subquery Scan on t (cost=16149.34..16152.92 rows=53 width=775) (actual time=4978.471..5382.467 rows=77616 loops=1)
Buffers: shared hit=11847, temp read=6433 written=6439
-> Unique (cost=16149.34..16151.86 rows=53 width=711) (actual time=4978.465..5268.859 rows=77616 loops=1)
Buffers: shared hit=11847, temp read=6433 written=6439
-> Sort (cost=16149.34..16149.47 rows=53 width=711) (actual time=4978.464..5139.298 rows=171110 loops=1)
Sort Key: HIDDEN_TABLE1.aaa, HIDDEN_TABLE1.bbb, HIDDEN_TABLE1.ccc, HIDDEN_TABLE1.ddd, HIDDEN_TABLE1.eee, HIDDEN_TABLE1.rrr, HIDDEN_TABLE1.jjj, HIDDEN_TABLE1.contract_element, HIDDEN_TABLE1.hhh, HIDDEN_TABLE1.lll, HIDDEN_TABLE1.abc, (sum(HIDDEN_TABLE1.original_amount) OVER (?)), HIDDEN_TABLE1.uuu, (sum(HIDDEN_TABLE1.amount) OVER (?)), (max(HIDDEN_TABLE1.posting_date) OVER (?)), HIDDEN_TABLE1.qqq, HIDDEN_TABLE2.ttt, (CASE WHEN ((COALESCE(HIDDEN_TABLE1.asdf, '0'::numeric) = '0'::numeric) AND ((HIDDEN_TABLE1.gfd)::text <> 'YTD'::text)) THEN '1'::numeric WHEN ((HIDDEN_TABLE1.gfd)::text = 'YTD'::text) THEN CASE WHEN (COALESCE(fx2.uyt, '0'::numeric) = '0'::numeric) THEN '1'::numeric ELSE (fx1.uyt / fx2.uyt) END ELSE HIDDEN_TABLE1.asdf END)
Sort Method: external merge Disk: 25896kB
Buffers: shared hit=11847, temp read=6433 written=6439
-> WindowAgg (cost=16144.51..16147.82 rows=53 width=711) (actual time=2394.118..3317.757 rows=171110 loops=1)
Buffers: shared hit=11847, temp read=3196 written=3199
-> Sort (cost=16144.51..16144.64 rows=53 width=645) (actual time=2394.081..2785.327 rows=171110 loops=1)
Sort Key: HIDDEN_TABLE1.aaa, HIDDEN_TABLE1.bbb, HIDDEN_TABLE1.ddd, HIDDEN_TABLE1.rrr, HIDDEN_TABLE1.jjj, HIDDEN_TABLE1.h, HIDDEN_TABLE1.hhh, HIDDEN_TABLE1.uuu, HIDDEN_TABLE1.qqq, HIDDEN_TABLE2.ttt, HIDDEN_TABLE1.lll, HIDDEN_TABLE1.abc
Sort Method: external merge Disk: 25568kB
Buffers: shared hit=11847, temp read=3196 written=3199
-> Gather (cost=1590.77..16142.99 rows=53 width=645) (actual time=13.657..503.346 rows=171110 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=11847
-> Hash Left Join (cost=590.77..15137.69 rows=22 width=645) (actual time=19.160..283.159 rows=57037 loops=3)
Hash Cond: ((((HIDDEN_TABLE1.lll)::text || 'LEA'::text) = (fx2.cod_scenario)::text) AND ((HIDDEN_TABLE1.abc)::numeric(18,0) = fx2.cod_periodo) AND ((HIDDEN_TABLE1.qqq)::text = (fx2.cod_valuta)::text))
Buffers: shared hit=11847
-> Hash Left Join (cost=300.18..14830.75 rows=22 width=640) (actual time=7.164..206.819 rows=57037 loops=3)
Hash Cond: ((((HIDDEN_TABLE1.lll)::text || 'LEA'::text) = (fx1.cod_scenario)::text) AND ((HIDDEN_TABLE1.abc)::numeric(18,0) = fx1.cod_periodo) AND ((HIDDEN_TABLE1.uuu)::text = (fx1.cod_valuta)::text))
Buffers: shared hit=11616
-> Hash Join (cost=9.58..14523.82 rows=22 width=635) (actual time=0.166..111.711 rows=57037 loops=3)
Hash Cond: ((HIDDEN_TABLE1.field_x)::text = (HIDDEN_TABLE2.field_y)::text)
Buffers: shared hit=11363
-> Parallel Seq Scan on HIDDEN_TABLE1 (cost=0.00..14510.62 rows=905 width=130) (actual time=0.013..67.993 rows=145561 loops=3)
Filter: (((uuu)::text <> (qqq)::text) AND ((COALESCE(booking_type, 'JOURNAL'::character varying))::text = 'JOURNAL'::text) AND ((measure)::text = '-'::text))
Rows Removed by Filter: 5918
Buffers: shared hit=11197
-> Hash (cost=9.57..9.57 rows=1 width=1042) (actual time=0.081..0.082 rows=11 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=104
-> Nested Loop (cost=0.27..9.57 rows=1 width=1042) (actual time=0.049..0.076 rows=11 loops=3)
Buffers: shared hit=104
-> Seq Scan on HIDDEN_TABLE2 (cost=0.00..1.24 rows=1 width=1032) (actual time=0.017..0.020 rows=11 loops=3)
Filter: (((setting_field_x)::text = 'VVVVVVV'::text) AND ((active)::text = 'Y'::text))
Rows Removed by Filter: 5
Buffers: shared hit=3
-> Index Only Scan using pk_conto on conto c (cost=0.27..8.29 rows=1 width=10) (actual time=0.004..0.004 rows=1 loops=33)
Index Cond: (cod_conto = (HIDDEN_TABLE2.field_y)::text)
Heap Fetches: 33
Buffers: shared hit=101
-> Hash (cost=154.67..154.67 rows=7767 width=22) (actual time=6.931..6.931 rows=7767 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 484kB
Buffers: shared hit=231
-> Seq Scan on HIDDEN_TABLE_3 fx1 (cost=0.00..154.67 rows=7767 width=22) (actual time=0.012..1.413 rows=7767 loops=3)
Buffers: shared hit=231
-> Hash (cost=154.67..154.67 rows=7767 width=22) (actual time=11.962..11.962 rows=7767 loops=3)
Buckets: 8192 Batches: 1 Memory Usage: 484kB
Buffers: shared hit=231
-> Seq Scan on HIDDEN_TABLE_3 fx2 (cost=0.00..154.67 rows=7767 width=22) (actual time=0.005..5.334 rows=7767 loops=3)
Buffers: shared hit=231
Planning Time: 2.465 ms
Execution Time: 5395.802 ms
I would like to bring the execution time inside the stored procedure close to that of the query executed outside it.

BigQuery analytic function not giving expected results

I am writing a query in BigQuery and need to filter records based on a grouping column and another column in the table.
Specifically: if a value of the grouping column (mnt) has more than one row, I have to look at the second column (zel) and keep only the rows where zel = 'X'. If the mnt value has only one row, that row should pass through unfiltered.
I wrote a query for this using row_number as well as the rank and dense_rank functions, but I noticed that all three return the same value for every row in a group.
Please see the code below:
#standardsql
with t1 as (SELECT mnt,
case when rank() over (partition by ltrim(rtrim(mnt)) order by
ltrim(rtrim(mnt)) asc) >1 then 'Y' else 'N' end
as flag,
rank() over (partition by mnt order by mnt) as rn,
dense_rank() over (partition by mnt order by mnt) as drn, FROM
projectname.datasetname.tablename1),
t2 as ( SELECT
mnt,
rel,
lif,
lts,
lokez FROM projectname.datasetname.tablename2
WHERE lts <> "" AND _PARTITIONTIME = TIMESTAMP(CURRENT_DATE()) ) ,
t3 as (SELECT
lif,
lifn,
lts,
par FROM `projectname.datasetname.tablename3`)
,t4 as (SELECT rcv FROM `projectname.datasetname.tablename4` WHERE mes
= 'PRO')
select * from (
SELECT t1.mnt as mnt,
t1.flag,
t1.rn,
t1.drn,
t2.rel as zel,
t2.lokez as ZLOEKZ,
t4.rcv as Zrcv
FROM t1 left join t2 on replace(t1.mnt, '00000000', '') =
REPLACE(t2.mnt, '00000000', '') AND t1.lif = t2.lif and t2.lts <> ""
and
case when t1.flag = 'Y' and t2.rel ='X' then 1
when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and t2.rel
is null) then 1
when t1.flag = 'Y' and t2.rel <>'X' then 2
else 3
end = 1
left join t3 ON t1.lif = t3.lif AND t2.lts = t3.lts AND
t3.par = 'BA' left join t4 on t4.rcv = t3.lifn and t2.lokez is null )
where ZLOEKZ is null order by mnt
As you can see, I am using a CASE expression, and even that does not seem to work correctly. Here is the CASE condition again:
case when t1.flag = 'Y' and t2.rel ='X' then 1
when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and
t2.rel
is null) then 1
when t1.flag = 'Y' and t2.rel <>'X' then 2
else 3
end = 1
The record count did not match what I expected, so I added the following lines to check whether the analytic functions were giving the result I wanted:
rank() over (partition by mnt order by mnt) as rn,
dense_rank() over (partition by mnt order by mnt) as drn
Strangely, for the same mnt value the rank, dense_rank and row_number functions all assign the same value. What am I doing wrong here?
mnt flag rn drn rel lokez rcv
100 N 1 1 X abc 123
100 N 1 1 null xyz 123
100 N 1 1 null def 234
This is my output.
For the same mnt value the flag is set to N instead of Y, and rank and dense_rank both return 1 for all three rows instead of 1, 2, 3. I understand that for rank, but I did not expect it from dense_rank.
I have tried to describe the issue as clearly as I can; please let me know if I can clarify anything.
Any help is appreciated, thanks.
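SELECT * EXCEPT(ct) FROM (
SELECT *, COUNT(*) OVER(PARTITION BY mnt) AS ct
FROM your_joined_result -- placeholder: the query that produces mnt and zel
) WHERE ct=1 or zel='X'
This snippet covers the filter you described: keep a row if its mnt group has only one row, or if zel = 'X'. Adapt it to the rest of your query.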
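As for why rank and dense_rank both return 1: the window is partitioned by mnt and also ordered by mnt, so every row inside a partition has the same sort key and they all tie for first place. A per-partition COUNT(*) does not depend on ordering, which is why it works as the "more than one row" test. Here is a minimal sketch of both points, using Python's sqlite3 (SQLite 3.25 or newer for window functions) instead of BigQuery. The table name joined and the extra mnt value 200 are made up for illustration, and since SQLite has no SELECT * EXCEPT the columns are listed explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the sample output, plus one single-row group.
conn.executescript("""
    CREATE TABLE joined (mnt TEXT, zel TEXT, lokez TEXT);
    INSERT INTO joined VALUES
        ('100', 'X',  'abc'),
        ('100', NULL, 'xyz'),
        ('100', NULL, 'def'),
        ('200', NULL, 'ghi');
""")

# Ordering by the partition column makes all rows in a partition tie,
# so RANK and DENSE_RANK are 1 everywhere; COUNT(*) gives the group size.
ranks = conn.execute("""
    SELECT mnt,
           RANK()       OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
           DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn,
           COUNT(*)     OVER (PARTITION BY mnt)              AS ct
    FROM joined
    ORDER BY mnt""").fetchall()

# The filter from the answer: single-row groups pass, otherwise zel must be 'X'.
kept = conn.execute("""
    SELECT mnt, zel, lokez FROM (
        SELECT *, COUNT(*) OVER (PARTITION BY mnt) AS ct FROM joined
    ) WHERE ct = 1 OR zel = 'X'
    ORDER BY mnt""").fetchall()

conn.close()
print(ranks)
print(kept)
```

In this toy data the three rows for mnt 100 all get rank 1 and dense_rank 1 with a count of 3, and the filter keeps the zel = 'X' row for 100 plus the lone row for 200.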

How should I set the distkey for a left join with conditionals in Redshift?

I have a query that looks like this:
select
a.col1,
a.col2,
b.col3
from
a
left join b on (a.id=b.id and b.attribute_id=3)
left join c on (a.id=c.id and c.attribute_id=4)
Even setting the distkey to id gets me a DS_BCAST_INNER in the query plan and I end up with extraordinary query time for a mere 1 million rows.
Setting the id to be the distribution key should co-locate the data and remove the need for the broadcast.
create table a (id int distkey, attribute_id int, col1 varchar(10), col2 varchar(10));
create table b (id int distkey, attribute_id int, col3 varchar(10));
create table c (id int distkey, attribute_id int);
You should see an explain plan something like this:
admin#dev=# explain select
a.col1,
a.col2,
b.col3
from
a
left join b on (a.id=b.id and b.attribute_id=3)
left join c on (a.id=c.id and c.attribute_id=4);
QUERY PLAN
--------------------------------------------------------------------------
XN Hash Left Join DS_DIST_NONE (cost=0.09..0.23 rows=3 width=99)
Hash Cond: ("outer".id = "inner".id)
-> XN Hash Left Join DS_DIST_NONE (cost=0.05..0.14 rows=3 width=103)
Hash Cond: ("outer".id = "inner".id)
-> XN Seq Scan on a (cost=0.00..0.03 rows=3 width=70)
-> XN Hash (cost=0.04..0.04 rows=3 width=37)
-> XN Seq Scan on b (cost=0.00..0.04 rows=3 width=37)
Filter: (attribute_id = 3)
-> XN Hash (cost=0.04..0.04 rows=1 width=4)
-> XN Seq Scan on c (cost=0.00..0.04 rows=1 width=4)
Filter: (attribute_id = 4)
(11 rows)
Time: 123.315 ms
If the tables contain 3 million rows or fewer and are written to infrequently, it should be safe to use DISTSTYLE ALL. If you do use DISTSTYLE KEY, verify that distributing your tables does not cause row skew (check with the following query):
select "schema", "table", skew_rows from svv_table_info;
"skew_rows" is the ratio of the number of rows in the slice with the most rows to the number of rows in the slice with the fewest. It should be close to 1.00.

Informatica Repository Query to get Workflow, Session, Mapping and Source/Target of Mapping

To clean up unused IPC sources, I need a repository query that returns the workflow, session, mapping and source/target of each mapping. I started by joining REP_LOAD_SESSIONS and REP_TBL_MAPPING on mapping_id, but only a fraction of the mappings seem to be present in the joined output.
I can't find the right tables to join to get the job done.
Any help will be greatly appreciated!
I was struggling with the same issue. Here is my query; I hope it helps:
SELECT SUBJECT_AREA,SESSIONNAME,MPGANDP MAPPINGNAME,SOURCENAMES,TARGET_NAMES,INSTANCE_NAME,LOOKUPTABLENAME,CASE WHEN OBJECTTYPE='Lookup ' THEN CONNECTION ELSE CNX_NAME END CONNECTIONNAME,USER_NAME
FROM
( SELECT * FROM
(SELECT SUBJECT_AREA,SESSION_ID,MPGANDP, MPNGID,OBJECTTYPE,INSTANCE_NAME,MAX(LOOKUPTABLE) LOOKUPTABLENAME, MAX(CONNECTION) CONNECTION
--,LISTAGG(SQLQUERY, '' ) WITHIN GROUP (ORDER BY SQLQUERY) SQLOVERRIRDE
FROM
(
SELECT CASE WHEN MAPPING_NAME=PARENT_MAPPING_NAME THEN MAPPING_NAME ELSE MAPPING_NAME||','||PARENT_MAPPING_NAME END MPGANDP, B.MAPPING_ID MPNGID,
SUBSTR(WIDGET_TYPE_NAME,1,INSTR(WIDGET_TYPE_NAME,' ')) OBJECTTYPE, INSTANCE_NAME, CASE WHEN UPPER(ATTR_NAME) ='CONNECTION INFORMATION' THEN ATTR_VALUE ELSE NULL END CONNECTION,
ATTR_NAME, ATTR_VALUE,SUBJECT_AREA, --A.*,B.*,C.*
--CASE WHEN ATTR_NAME='Sql Query' OR ATTR_NAME='Lookup Sql Override' THEN ATTR_VALUE END SQLQUERY,
CASE WHEN ATTR_NAME='Lookup table name' THEN ATTR_VALUE END LOOKUPTABLE,
CASE WHEN ATTR_NAME='Sql Query' OR ATTR_NAME='Lookup Sql Override' THEN SUBSTR(ATTR_VALUE,INSTR(UPPER(ATTR_VALUE),'FROM'),15) END SQLQUERYV
FROM REP_WIDGET_INST A
INNER JOIN REP_ALL_MAPPINGS B
ON A.MAPPING_ID = B.MAPPING_ID
INNER JOIN REP_WIDGET_ATTR C
ON A.WIDGET_ID = C.WIDGET_ID
WHERE A.WIDGET_TYPE IN (2, 11,3)
--AND MAPPING_NAME<>PARENT_MAPPING_NAME
--AND B.MAPPING_ID=515
--AND PARENT_SUBJECT_AREA='EDW'
AND ATTR_NAME IN ( 'Connection Information','Lookup Sql Override','Lookup table name','Sql Query')
) , OPB_SESSION
WHERE MPNGID=MAPPING_ID
GROUP BY SUBJECT_AREA,MPGANDP, MPNGID,OBJECTTYPE,INSTANCE_NAME,SESSION_ID
) T1
INNER JOIN
(SELECT OPB_TASK_INST.WORKFLOW_ID,OPB_TASK_INST.TASK_ID ,OPB_TASK_INST.INSTANCE_NAME SESSIONNAME
FROM OPB_TASK_INST
WHERE OPB_TASK_INST.TASK_TYPE IN (68) --,70)
START WITH WORKFLOW_ID IN (SELECT TASK_ID FROM OPB_TASK WHERE TASK_TYPE = 71 AND /* **************SPECIFY WORKFLOW NAME HERE*********/ TASK_NAME='wf_TEST')
CONNECT BY PRIOR OPB_TASK_INST.TASK_ID = OPB_TASK_INST.WORKFLOW_ID ) WFSESSCONN
ON TASK_ID=SESSION_ID
INNER JOIN
( SELECT MAPPING_ID MAPID,LISTAGG(SOURCE_NAME,',') WITHIN GROUP (ORDER BY SOURCE_NAME) SOURCENAMES
FROM REP_SRC_MAPPING E
GROUP BY SUBJECT_AREA,MAPPING_NAME,MAPPING_ID ) SOURCENAMES
ON MAPID=MPNGID
LEFT JOIN
(SELECT DISTINCT SUBJECT_AREA SA,TASK_NAME,INSTANCE_NAME INSNAME,CNX_NAME,SESSION_ID SSID
FROM
REP_ALL_TASKS A,
REP_SESS_WIDGET_CNXS B
WHERE
A.TASK_ID = B.SESSION_ID
) T2
ON SESSION_ID=SSID
AND INSNAME=INSTANCE_NAME
AND SUBJECT_AREA=SA
LEFT JOIN
( SELECT SUBJECT_AREA SAT, SESSION_NAME SESSNT, SESSION_ID SSIDT, LISTAGG(WIDGET_NAME,',') WITHIN GROUP (ORDER BY WIDGET_NAME) AS TARGET_NAMES
FROM (SELECT distinct SUBJECT_AREA,SESSION_NAME,SESSION_ID,WIDGET_NAME
FROM REP_SESS_TBL_LOG
WHERE TYPE_NAME='Target Definition' )
GROUP BY SUBJECT_AREA,SESSION_NAME,SESSION_ID
)
ON SESSION_ID=SSIDT
)
LEFT JOIN OPB_CNX
ON TRIM(OBJECT_NAME)=TRIM(CASE WHEN OBJECTTYPE='Lookup ' THEN CONNECTION ELSE CNX_NAME END)
ORDER BY SUBJECT_AREA,SESSIONNAME,MPGANDP,INSTANCE_NAME

Common table expression with a bottom-to-top approach

I have an Agent table and a hierarchy table.
CREATE TABLE [dbo].[Agent](
[AgentID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [varchar](50) NULL,
[LastName] [varchar](50) NULL,
CONSTRAINT [PK_Agent] PRIMARY KEY CLUSTERED
(
[AgentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Hierarchy](
[HierarchyID] [int] IDENTITY(1,1) NOT NULL,
[AgentID] [int] NULL,
[NextAgentID] [int] NULL,
CONSTRAINT [PK_Hierarchy] PRIMARY KEY CLUSTERED
(
[HierarchyID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
--Insert to Agent
INSERT INTO [Agent]([FirstName],[LastName])VALUES('C1','C1');
INSERT INTO [Agent]([FirstName],[LastName])VALUES('C2','C2');
INSERT INTO [Agent]([FirstName],[LastName])VALUES('C3','C3');
INSERT INTO [Agent]([FirstName],[LastName])VALUES('C4','C4');
SELECT * FROM Agent;
AgentID FirstName LastName
1 C1 C1
2 C2 C2
3 C3 C3
4 C4 C4
--Insert to Hierarchy
INSERT INTO [Hierarchy] ([AgentID],[NextAgentID]) VALUES (1,NULL);
INSERT INTO [Hierarchy] ([AgentID],[NextAgentID]) VALUES (2,1);
INSERT INTO [Hierarchy] ([AgentID],[NextAgentID]) VALUES (3,2);
INSERT INTO [Hierarchy] ([AgentID],[NextAgentID]) VALUES (2,4);
INSERT INTO [Hierarchy] ([AgentID],[NextAgentID]) VALUES (4,NULL);
SELECT * FROM Hierarchy;
HierarchyID AgentID NextAgentID
1 1 NULL
2 2 1
3 3 2
4 2 4
5 4 NULL
I used a common table expression to determine the levels from bottom to top:
WITH AgentHierarchy(AgentID, NextAgentID, HierarchyLevel)
AS
(
SELECT
H1.AgentID,
H1.NextAgentID,
1 HierarchyLevel
FROM Hierarchy H1
WHERE NOT EXISTS (SELECT 1 FROM Hierarchy H2 WHERE H2.NextAgentID = H1.AgentID)
UNION ALL
SELECT
H.AgentID,
H.NextAgentID,
(AgentHierarchy.HierarchyLevel + 1) HierarchyLevel
FROM Hierarchy H
INNER JOIN AgentHierarchy ON AgentHierarchy.NextAgentID = H.AgentID
)
SELECT DISTINCT
AgentID,
NextAgentID,
HierarchyLevel
FROM AgentHierarchy
ORDER BY AgentID, NextAgentID, HierarchyLevel;
Result is:
AgentID NextAgentID HierarchyLevel
1 NULL 3
2 1 2
3 2 1
4 NULL 1
2 4 1
I need the result to look like this instead:
AgentID NextAgentID HierarchyLevel
1 NULL 1
2 1 1
3 2 1
3 1 2
4 NULL 1
2 4 1
3 4 2
In short, every ancestor in the hierarchy should be returned recursively, with its level, working from the bottom up. Please help.
I found the answer:
WITH AgentHierarchy(AgentID, NextAgentID, HierarchyLevel)
AS
(
SELECT
H1.AgentID,
H1.NextAgentID,
1 HierarchyLevel
FROM Hierarchy H1
--WHERE NOT EXISTS (SELECT 1 FROM Hierarchy H2 WHERE H2.NextAgentID = H1.AgentID)
UNION ALL
SELECT
AgentHierarchy.AgentID,
H.NextAgentID,
(AgentHierarchy.HierarchyLevel + 1) HierarchyLevel
FROM Hierarchy H
INNER JOIN AgentHierarchy ON AgentHierarchy.NextAgentID = H.AgentID
)
SELECT
AgentHierarchy.AgentID,
NextAgentID,
HierarchyLevel
FROM AgentHierarchy
WHERE NOT (NextAgentID IS NULL AND HierarchyLevel > 1);
I made the following changes:
Removed the WHERE clause from the anchor query.
Selected the CTE's AgentID (instead of H.AgentID) in the second SELECT, after UNION ALL.
Added a WHERE clause to the final SELECT (outside the CTE) to remove junk rows that have a NULL NextAgentID at levels above 1.
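To double-check the result, here is a small sketch that replays the sample Hierarchy rows in SQLite through Python's sqlite3 module. SQLite wants WITH RECURSIVE where SQL Server accepts a plain WITH, and the table definition is simplified; otherwise the CTE is the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified copy of the Hierarchy table with the sample rows from the question.
conn.executescript("""
    CREATE TABLE Hierarchy (
        HierarchyID INTEGER PRIMARY KEY,
        AgentID     INTEGER,
        NextAgentID INTEGER
    );
    INSERT INTO Hierarchy (AgentID, NextAgentID) VALUES
        (1, NULL), (2, 1), (3, 2), (2, 4), (4, NULL);
""")

rows = conn.execute("""
    WITH RECURSIVE AgentHierarchy(AgentID, NextAgentID, HierarchyLevel) AS (
        -- Anchor: every row is a level-1 link (no WHERE clause).
        SELECT H1.AgentID, H1.NextAgentID, 1
        FROM Hierarchy H1
        UNION ALL
        -- Recursive step: keep the starting agent, climb one level up.
        SELECT AgentHierarchy.AgentID, H.NextAgentID,
               AgentHierarchy.HierarchyLevel + 1
        FROM Hierarchy H
        INNER JOIN AgentHierarchy ON AgentHierarchy.NextAgentID = H.AgentID
    )
    SELECT AgentID, NextAgentID, HierarchyLevel
    FROM AgentHierarchy
    WHERE NOT (NextAgentID IS NULL AND HierarchyLevel > 1)
    ORDER BY AgentID, HierarchyLevel, NextAgentID""").fetchall()

conn.close()
for row in rows:
    print(row)
```

This returns seven rows, the same set as the required output in the question (row order aside), including the two level-2 rows that link agent 3 to agents 1 and 4.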
Let me know if anyone has questions.