I'm not getting the first record below returned in my CTE query (shown later):
Here's my table:
Key  ParentID  ChildID    (DateJoined field removed here)
1    0         1
3    1         83
4    1         84
6    83        85
7    85        86
8    83        87
My CTE Query produces the following results:
ID  Name           Date Joined  Parent ID  Parent Name     Level
83  Hanks, James   2014-09-13   1          Golko, Richard  1
84  Hanks, James   2014-09-13   1          Golko, Richard  1
85  Walker, Jamie  2014-09-13   83         Hanks, James    2
87  Newman, Betty  2014-09-20   83         Hanks, James    2
86  Adams, Ken     2014-09-13   85         Walker, Jamie   3
How can I also return the first record, the one with ParentID = 0?
When I call the following sproc like this:
EXEC UCU_RTG_ProgramStructure_GetMemberTree 0,4
I still only get results starting with ParentID = 1, as shown above.
Here's my CTE Query:
CREATE PROCEDURE [dbo].[UCU_RTG_ProgramStructure_GetMemberTree]
    @ParentID int,
    @MaxLevel int
AS
WITH matrix
AS
(
    -- initialization
    SELECT UserID, DateJoined, ParentID, 1 AS lvl
    FROM dbo.UCU_RTG_ProgramStructure
    WHERE ParentID = @ParentID

    UNION ALL

    -- recursive execution
    SELECT p.UserID, p.DateJoined, p.ParentID, lvl + 1
    FROM dbo.UCU_RTG_ProgramStructure p
    INNER JOIN matrix m ON p.ParentID = m.UserID
    WHERE lvl < @MaxLevel
)
SELECT matrix.UserID,
       u.LastName + ', ' + u.FirstName AS Member,
       DateJoined,
       ParentID,
       u2.LastName + ', ' + u2.FirstName AS Parent,
       lvl
FROM matrix
INNER JOIN dbo.Users u  ON u.UserID  = matrix.UserID
INNER JOIN dbo.Users u2 ON u2.UserID = matrix.ParentID
ORDER BY ParentID
The CTE query is fine except that it doesn't return the ParentID = 0 record(s).
Thanks...
I finally figured it out after looking over my post to make sure it was correct: the final SELECT clause is wrong:
SELECT matrix.UserID,
       u.LastName + ', ' + u.FirstName AS Member,
       DateJoined,
       ParentID,
       u2.LastName + ', ' + u2.FirstName AS Parent,
       lvl
FROM matrix
INNER JOIN dbo.Users u  ON u.UserID  = matrix.UserID
INNER JOIN dbo.Users u2 ON u2.UserID = matrix.ParentID
The last INNER JOIN has to be changed to a LEFT JOIN, because there is no UserID 0 for ParentID 0 to join to.
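For reference, here is that final SELECT with only the second join changed (nothing in the CTE itself needs to change):

SELECT matrix.UserID,
       u.LastName + ', ' + u.FirstName AS Member,
       DateJoined,
       ParentID,
       u2.LastName + ', ' + u2.FirstName AS Parent,  -- NULL for the ParentID = 0 row
       lvl
FROM matrix
INNER JOIN dbo.Users u  ON u.UserID  = matrix.UserID
LEFT JOIN  dbo.Users u2 ON u2.UserID = matrix.ParentID  -- LEFT JOIN so the root row is kept
ORDER BY ParentID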
Hope this helps someone else with recursive CTE queries.
How can the data frame below be modified:
df <- data.frame (ID = c(1, 2, 2, 3), Name = c("Luke", "Pete", "Marie", "Frank"), Age = c(25, 34, 66, 45))
ID Name Age
1 Luke 25
2 Pete 34
2 Marie 66
3 Frank 45
I want to remove the duplicated ID and change it to the next available ID:
ID Name Age
1 Luke 25
2 Pete 34
4 Marie 66
3 Frank 45
Thanks for the help.
I have a Redshift database with the following entries:
Table name: subscribers

time_at              calc_subscribers  calc_unsubscribers  current_subscribers
2021-07-02 07:30:00  0                 0                   0
2021-07-02 07:45:00  39                8                   0
2021-07-02 08:00:00  69                17                  0
2021-07-02 08:15:00  67                21                  0
2021-07-02 08:30:00  48                23                  0
The goal is to calculate current_subscribers from the previous row's value:
current_subscribers = calc_subscribers - calc_unsubscribers + previous_current_subscribers
I do the following:
UPDATE subscribers sa
SET current_subscribers = COALESCE( sa.calc_subscribers - sa.calc_unsubscribers + sub.previous_current_subscribers,0)
FROM (
SELECT
time_at,
LAG(current_subscribers, 1) OVER
(ORDER BY time_at desc) previous_current_subscribers
FROM subscribers
) sub
WHERE sa.time_at = sub.time_at
The problem is that the subquery "sub" is evaluated against the values currently stored in the table, so previous_current_subscribers is always 0 rather than being carried forward row by row. The result is therefore current_subscribers = calc_subscribers - calc_unsubscribers + 0. I have also already tried it with a CTE, unfortunately without success.
The result should look like this:
time_at              calc_subscribers  calc_unsubscribers  current_subscribers
2021-07-02 07:30:00  0                 0                   0
2021-07-02 07:45:00  39                8                   31
2021-07-02 08:00:00  69                17                  83
2021-07-02 08:15:00  67                21                  129
2021-07-02 08:30:00  48                95                  82
I am grateful for any ideas.
The problem you are running into is that you want to use the result of one row in the calculation of the current row. That is recursive, which I think you could do in this case, but it is expensive.
The result you are looking for is the sum of all calc_subscribers for this row and the previous rows, minus the sum of all calc_unsubscribers for this row and the previous rows. That is just the difference of two windowed sums:
sum(calc_subscribers) over (order by time_at rows unbounded preceding)
  - sum(calc_unsubscribers) over (order by time_at rows unbounded preceding) as current_subscribers
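Put together, a rough sketch of how that could be written back into the table, reusing the UPDATE ... FROM shape from the question (the running_subscribers alias is just a name I picked; treat the whole statement as a sketch rather than tested Redshift code):

UPDATE subscribers sa
SET current_subscribers = sub.running_subscribers
FROM (
    SELECT
        time_at,
        SUM(calc_subscribers)   OVER (ORDER BY time_at ROWS UNBOUNDED PRECEDING)
      - SUM(calc_unsubscribers) OVER (ORDER BY time_at ROWS UNBOUNDED PRECEDING)
          AS running_subscribers          -- running total up to and including this row
    FROM subscribers
) sub
WHERE sa.time_at = sub.time_at;

Unlike the LAG attempt, the window sums do not depend on current_subscribers at all, so it does not matter that the subquery sees the pre-update values.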
I have an MS Access 2010 application which writes back to a (backend) SQL Server. The table has student ID, test score and rank as columns. The application has a form which takes input from users. When a new student enters his/her ID, score and rank, the rest of the ranks must be updated in the SQL table based on the inserted rank.
For example, if a new student has a score 79 and rank 5, the current student at rank 5 must be changed to 6, the sixth rank to seventh, and so on.
Before:
Student_ID Score Rank
1 89 1
16 88 2
25 84 3
3 81 4
7 78 5
15 75 6
12 72 7
17 70 8
56 65 9
9 64 10
After:
Student_ID Score Rank
1 89 1
16 88 2
25 84 3
3 81 4
7 78 6
15 75 7
12 72 8
17 70 9
56 65 10
9 64 11
10 75 5
Remove the rank field and create a query that calculates the rank (row number) on the fly. To speed this up, use a collection as shown here:
Public Function RowCounter( _
ByVal strKey As String, _
ByVal booReset As Boolean, _
Optional ByVal strGroupKey As String) _
As Long
' Builds consecutive RowIDs in select, append or create query
' with the possibility of automatic reset.
' Optionally a grouping key can be passed to reset the row count
' for every group key.
'
' Usage (typical select query):
' SELECT RowCounter(CStr([ID]),False) AS RowID, *
' FROM tblSomeTable
' WHERE (RowCounter(CStr([ID]),False) <> RowCounter("",True));
'
' Usage (with group key):
' SELECT RowCounter(CStr([ID]),False,CStr([GroupID])) AS RowID, *
' FROM tblSomeTable
' WHERE (RowCounter(CStr([ID]),False) <> RowCounter("",True));
'
' The Where statement resets the counter when the query is run
' and is needed for browsing a select query.
'
' Usage (typical append query, manual reset):
' 1. Reset counter manually:
' Call RowCounter(vbNullString, False)
' 2. Run query:
' INSERT INTO tblTemp ( RowID )
' SELECT RowCounter(CStr([ID]),False) AS RowID, *
' FROM tblSomeTable;
'
' Usage (typical append query, automatic reset):
' INSERT INTO tblTemp ( RowID )
' SELECT RowCounter(CStr([ID]),False) AS RowID, *
' FROM tblSomeTable
' WHERE (RowCounter("",True)=0);
'
' 2002-04-13. Cactus Data ApS. CPH
' 2002-09-09. Str() sometimes fails. Replaced with CStr().
' 2005-10-21. Str(col.Count + 1) reduced to col.Count + 1.
' 2008-02-27. Optional group parameter added.
' 2010-08-04. Corrected that group key missed first row in group.
  Static col      As New Collection
  Static strGroup As String

  On Error GoTo Err_RowCounter

  If booReset = True Then
    Set col = Nothing
  ElseIf strGroup <> strGroupKey Then
    Set col = Nothing
    strGroup = strGroupKey
    col.Add 1, strKey
  Else
    col.Add col.Count + 1, strKey
  End If

  RowCounter = col(strKey)

Exit_RowCounter:
  Exit Function

Err_RowCounter:
  Select Case Err
    Case 457
      ' Key is present.
      Resume Next
    Case Else
      ' Some other error.
      Resume Exit_RowCounter
  End Select

End Function
Study the in-line comments for typical usage.
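If you would rather have the rank computed on the fly on the SQL Server side instead of in Access, a window function does the same job. This is only a sketch: dbo.StudentScores is a made-up name standing in for your backend table, and it assumes Score is really what drives the ordering.

-- Hypothetical table name; Student_ID and Score are the columns from the question.
SELECT Student_ID,
       Score,
       ROW_NUMBER() OVER (ORDER BY Score DESC) AS RankOnTheFly
FROM dbo.StudentScores;

As with the Access approach above, nothing is stored, so there is nothing to renumber when a new student is inserted.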
I am trying to find the mean value of all other observations in the same group.
My data is like
Value Name Group Mean_all_other
544 Pete 1 ....
997 Sara 1 ....
772 Tom 1 ....
725 Tris 2 ....
872 Lulu 2 ....
434 Mica 2 ....
728 Tina 2 ....
827 Bo 3 ....
322 Zu 3 ....
..... ... ... ...
I know that proc means can give you the mean value within groups.
But here I want to create the mean value of all the others in the same group.
In this case, Pete's Mean_all_other would show 884.5, which equals (997+772)/2.
Similarly, Sara = (544+772)/2 = 658 and Tris = (872+434+728)/3 = 678.
Does anyone have any idea?
Consider a PROC SQL solution using a correlated subquery that averages the same Group for each row while excluding the current Name. The query below uses SAS's not-equal operator ^= and its mean() function; in regular SQL these would be the <> operator and avg(), both of which SAS also accepts.
proc sql;
create table NewTable as
select * from
(select main.Value, main.Name, main.Group,
(select mean(sub.Value)
from CurrentTable sub
where sub.Group = main.Group
and sub.Name ^= main.Name) As Mean_all_other
from CurrentTable as main);
quit;
* Value Name Group Mean_all_other
* 544 Pete 1 884.5
* 997 Sara 1 658
* 772 Tom 1 770.5
* 725 Tris 2 678
* 872 Lulu 2 629
* 434 Mica 2 775
* 728 Tina 2 677
* 827 Bo 3 322
* 322 Zu 3 827
Once you have the mean for each whole group, the case-deleted mean for each observation is much easier to calculate. I'd suggest doing this with a double DOW loop:
data have;
input Value Name $ Group;
cards;
544 Pete 1
997 Sara 1
772 Tom 1
725 Tris 2
872 Lulu 2
434 Mica 2
728 Tina 2
827 Bo 3
322 Zu 3
;
run;
data want;
  do _N_ = 1 by 1 until (last.group);
    set have;
    by group;
    value_sum = sum(value_sum, value);
    value_count = sum(value_count, 1);
  end;
  do _N_ = 1 to _N_;
    set have;
    mean_all_other = (value_sum - value) / (value_count - 1);
    output;
  end;
  drop value_:;
run;
PROC SQL will happily remerge the summary statistics for you. Note that this syntax probably will not work in other SQL implementations, but works fine in SAS. You can use the DIVIDE function to avoid dividing by zero for groups with only one member.
create table want as
select *
, divide(sum(value) - value, n(value) - 1) as mean_all_other
from have
group by group
;
For other SQL implementations you will need to re-merge the aggregate results yourself.
create table want as
select a.*
, divide(b.sum_value - a.value, b.n_value - 1) as mean_all_other
from have a
, (select group,sum(value) as sum_value,n(value) as n_value
from have
group by group
) b
where a.group = b.group
;
If the value of VALUE could be missing then you need to add a CASE statement to handle those cases.
create table want as
select *
, case when (missing(value)) then mean(value)
else divide(sum(value) - value, n(value) - 1)
end as mean_all_other
from have
group by group
;
I have a bit of a complicated sql query I need to do, and I'm a bit stuck. I'm using SQLite if that changes anything.
I have the following table structure:
Table G
---------
G_id (primary key) | Other cols ...
====================================
21
22
23
24
25
26
27 (no g_to_s_map)
28
.
Table S
---------
S_id (primary key) | S_num | Other cols.....
====================================
1 1101
2 1102
3 1103
4 1104
5 1105
6 1106
7 1107 (no g_to_s_map, no s_to_t_map)
8 1108 (no g_to_s_map, there IS an s_to_t_map)
9 1109 (there is an g_to_s_map, but no s_to_t map)
.
Table T
---------
T_id (primary key) | Other cols...
==================================
1
2
Then I also have two mapping tables:
Table G_to_S_Map (1:1 mapping, unique values of both g_id and s_id)
----------
G_id (foreign key ref g)| S_id (foreign key ref s)
===================================================
21 1
22 2
23 3
24 4
25 5
26 6
28 9
.
Table S_to_T_Map (many:1 mapping, many unique s_id to a t_id)
----------
S_id (foreign key ref s) | T_id (foreign key ref t)
===================================================
1 1
2 1
3 1
4 2
5 2
6 2
8 2
Given only a T_id and a G_id, I need to be able to update the G_to_S_Map with the first S_id corresponding to the specified T_id (in the S_to_T_Map) that is NOT in the G_to_S_Map
The first thing I was thinking of was just getting any S_id's that corresponded to the T_id in the S_to_T_Map:
SELECT S_id FROM S_to_T_Map where T_id = GIVEN_T_ID;
Then presumably I would join those values somehow with the G_to_S_Map using a left/right join maybe, and then look for the first value which doesn't exist on one of the sides? Then I'd need to do an insert into the G_to_S_Map based on that S_id and the GIVEN_G_ID value or something.
Any suggestions on how to go about this? Thanks!
Edit: Added some dummy data:
I believe this should work:
INSERT INTO G_To_S_Map (G_id, S_id)
(SELECT :inputGId, a.S_id
FROM S_To_T_Map as a
LEFT JOIN G_To_S_Map as b
ON b.S_id = a.S_id
AND b.G_id = :inputGId
WHERE a.T_id = :inputTId
AND b.G_id IS NULL
ORDER BY a.S_id
LIMIT 1);
EDIT:
If you're wanting to do the order by a different table, use this version:
INSERT INTO G_To_S_Map (G_id, S_id)
(SELECT :inputGId, a.S_id
FROM S_To_T_Map as a
JOIN S as b
ON b.S_id = a.S_id
LEFT JOIN G_To_S_Map as c
ON c.S_id = a.S_id
AND c.G_id = :inputGId
WHERE a.T_id = :inputTId
AND c.G_id IS NULL
ORDER BY b.S_num
LIMIT 1);
(As an aside, I really hope your tables aren't actually named like this, because that's a terrible thing to do. The use of Map, especially, should probably be avoided)
EDIT:
Here's some example test data. Have I missed something? Did I conceptualize the relationships incorrectly?
S_To_T_Map
================
S_ID T_ID
1 1
2 1
3 1
1 2
1 3
3 3
G_To_S_Map
==================
G_ID S_ID
1 1
3 1
2 1
3 2
2 3
3 3
Resulting joined data:
(CTEs used to generate cross-join test data)
Results:
=============================
G_TEST T_TEST S_ID
1 1 3
2 1 2
1 3 3
EDIT:
Ah, okay, now I get the problem. My issue was that I was assuming there was some sort of many-to-one relationship between S and G. As this is not the case, use this amended statement:
INSERT INTO G_To_S_Map (G_id, S_id)
(SELECT :inputGId, a.S_id
FROM S_To_T_Map as a
JOIN S as b
ON b.S_id = a.S_id
LEFT JOIN G_To_S_Map as c
ON c.S_id = a.S_id
OR c.G_id = :inputGId
WHERE a.T_id = :inputTId
AND c.G_id IS NULL
ORDER BY b.S_num
LIMIT 1);
Specifically, the line checking G_To_S_Map for a row containing the G_id needed to be switched from an AND to an OR; the key requirement that had not been specified previously was that both G_id and S_id are unique in G_To_S_Map.
This statement will not insert a line if either the provided G_Id has been mapped previously, or if all S_Ids mapped to the given T_Id have been mapped.
Hmm, the following seems to work nicely, although I haven't combined an "insert" with it yet.
SELECT s.S_id
FROM S AS s
INNER JOIN (
    SELECT st.S_id
    FROM S_to_T_Map AS st
    WHERE st.T_id = ????
      AND NOT EXISTS (SELECT * FROM G_to_S_Map AS gs WHERE gs.S_id = st.S_id)
) rslt ON s.S_id = rslt.S_id
ORDER BY s.S_num ASC
LIMIT 1;
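In case it helps, here is a sketch of that same SELECT wrapped in the INSERT form used in the earlier answer, reusing the :inputGId and :inputTId placeholders from there (untested, so treat it as a sketch):

-- Sketch only: the SELECT above as the source of the INSERT used earlier.
INSERT INTO G_to_S_Map (G_id, S_id)
SELECT :inputGId, s.S_id              -- :inputGId is the given G_id, as in the earlier answer
FROM S AS s
INNER JOIN (
    SELECT st.S_id
    FROM S_to_T_Map AS st
    WHERE st.T_id = :inputTId         -- the given T_id
      AND NOT EXISTS (SELECT * FROM G_to_S_Map AS gs WHERE gs.S_id = st.S_id)
) rslt ON s.S_id = rslt.S_id
ORDER BY s.S_num ASC
LIMIT 1;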