Replicated Latin Square Design Analysis Using lm and ANOVA

I'm looking to run lm and ANOVA for the following Latin Square design:
                       Time
          Subject   1        2        3
Square 1     1      9 (B)    3 (C)    6 (A)
             2      18 (A)   6 (B)    12 (C)
             3      12 (C)   15 (A)   5 (B)
Square 2     4      14 (C)   8 (B)    11 (A)
             5      17 (A)   9 (C)    9 (B)
             6      7 (B)    7 (A)    7 (C)
I found that if the replication uses different rows but the same columns (that is, we have 6 different subjects, and the subjects are nested within each replicate, R/S), then the command is:
lm.1 <- lm(A ~ R/S + D + T + R)
anova(lm.1)
But if we now assume that the replication uses the same rows and the same columns, i.e. S1=S4, S2=S5, and S3=S6, how would the lm code change?
Thank you.
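To make the two scenarios concrete, here is a small Python sketch (not R) that puts the table into long format, one row per observation. The column names follow the formula in the question, and I am assuming A is the response, R the square (replicate), S the subject, T the time, and D the treatment letter. The `same_subjects` flag and both function names are made up for illustration: the flag recodes subjects 4..6 as 1..3 for the case S1=S4, S2=S5, S3=S6.

```python
# Responses and treatments copied from the table in the question.
# subject -> [(response, treatment) at time 1, time 2, time 3]
TABLE = {
    1: [(9, "B"), (3, "C"), (6, "A")],
    2: [(18, "A"), (6, "B"), (12, "C")],
    3: [(12, "C"), (15, "A"), (5, "B")],
    4: [(14, "C"), (8, "B"), (11, "A")],
    5: [(17, "A"), (9, "C"), (9, "B")],
    6: [(7, "B"), (7, "A"), (7, "C")],
}


def long_format(same_subjects):
    """Return one dict per observation with keys R, S, T, D, A (assumed meanings)."""
    rows = []
    for subject, cells in TABLE.items():
        square = 1 if subject <= 3 else 2
        # Hypothetical recoding: if both squares use the same people,
        # subjects 4, 5, 6 are really subjects 1, 2, 3 again.
        s = (subject - 1) % 3 + 1 if same_subjects else subject
        for time, (response, treatment) in enumerate(cells, start=1):
            rows.append({"R": square, "S": s, "T": time, "D": treatment, "A": response})
    return rows


def is_latin(rows, square):
    """Check that every treatment occurs once per subject and once per time in a square."""
    cells = [r for r in rows if r["R"] == square]
    per_subject = {(r["S"], r["D"]) for r in cells}
    per_time = {(r["T"], r["D"]) for r in cells}
    return len(per_subject) == len(cells) and len(per_time) == len(cells)


nested = long_format(same_subjects=False)   # S takes the values 1..6
crossed = long_format(same_subjects=True)   # S takes the values 1..3 in both squares
print(sorted({r["S"] for r in nested}), sorted({r["S"] for r in crossed}))
```

With the recoded layout each subject label appears in both squares, so S is crossed with R instead of nested in it, and that is the difference the model formula has to reflect.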

How to calculate sum distinct in quicksight

Partner  UserID  Marks  Group
A        1       4      AM
A        2       7      AM
A        1       4      AM
B        3       5      CM
C        4       6      TM
B        3       5      CM
I want to calculate sum of 'Marks' for each partner excluding double rows.
I've tried (sum(maxOver(Marks, [UserID, Partner], PRE_AGG))). But it's giving me a table like :
Partner  Marks
A        15
B        10
C        6
Whereas, I want a table as below :
Partner  Marks
A        11
B        5
C        6
Thank you for your help, cheers!
You can create a calculated field with a countOver() function to detect the duplicate rows, and then use it as a filter in a sumIf() function.
Example:
sumIf({Marks}, countOver({Marks}, [{Partner}, {UserID}, {Marks}, {Group}], PRE_AGG) = 1)
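As a cross-check of the expected totals outside QuickSight, here is a minimal Python sketch that keeps one copy of each fully identical row and then sums Marks per Partner. The function name is made up, and the rows are copied from the question's table.

```python
from collections import defaultdict

# (Partner, UserID, Marks, Group) rows from the question.
rows = [
    ("A", 1, 4, "AM"),
    ("A", 2, 7, "AM"),
    ("A", 1, 4, "AM"),
    ("B", 3, 5, "CM"),
    ("C", 4, 6, "TM"),
    ("B", 3, 5, "CM"),
]


def sum_distinct_marks(rows):
    """Sum Marks per Partner, counting each fully identical row only once."""
    totals = defaultdict(int)
    for partner, _user, marks, _group in set(rows):  # set() keeps one copy per identical row
        totals[partner] += marks
    return dict(totals)


print(sorted(sum_distinct_marks(rows).items()))
# [('A', 11), ('B', 5), ('C', 6)]
```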

Problem defining appropriate measure in report

The table below indicates a minimal example of my raw data:
Product  Order Day  Customer  Units Ordered  Units Delivered
P        Apr 1      X         4              3
P        Apr 2      X         4              3
P        Apr 1      Y         3              1
P        Apr 1      Z         3              1
Q        Apr 1      Z         3              1
Q        Apr 2      W         3              2
R        Apr 3      X         1              0
R        Apr 4      Y         2              0
R        Apr 5      Z         8              8
R        Apr 6      Z         6              6
Based on this I am able to create the following table as a PBI report where I give a product summary:
Product  # diff. Customers Ordered  Total Ordered  Total Delivered  Service Rate  Product-level Service Test
P        3                          14             8                0.57          1
Q        2                          6              3                0.5           0
R        3                          17             14               0.82          1
The final column in the above summary checks whether more than 50% is being delivered. This report can be filtered on Order Day (time filter) as well as on Customer.
Now, in a similar fashion, I would like to create a customer summary report:
Customer  # diff. Products Ordered  # Products Passing Service Test
X         2                         1
Y         2                         0
Z         3                         1
W         1                         1
It basically summarizes the product report after filtering for specific customers. My problem is the final column called "Products Passing Service Test". I am not able to define an appropriate measure in order to get the right numbers displayed in this column. I tried some other approaches but then it does not work well with the time filter on Customer Orders.
Anyone that can help? Thank you very much!
JD
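The numbers in the customer table are consistent with applying the service test per (customer, product) pair: for X, product P is 6/8 = 0.75 and passes, while R is 0/1 and fails, giving 1. Under that reading, which is my assumption, the logic can be sketched in plain Python as below. The function name and the `threshold` parameter are made up for illustration. In a measure this would roughly correspond to iterating over the distinct products in the current filter context and counting those whose service rate exceeds 0.5.

```python
from collections import defaultdict

# (Product, Order Day, Customer, Units Ordered, Units Delivered) from the question.
ORDERS = [
    ("P", "Apr 1", "X", 4, 3),
    ("P", "Apr 2", "X", 4, 3),
    ("P", "Apr 1", "Y", 3, 1),
    ("P", "Apr 1", "Z", 3, 1),
    ("Q", "Apr 1", "Z", 3, 1),
    ("Q", "Apr 2", "W", 3, 2),
    ("R", "Apr 3", "X", 1, 0),
    ("R", "Apr 4", "Y", 2, 0),
    ("R", "Apr 5", "Z", 8, 8),
    ("R", "Apr 6", "Z", 6, 6),
]


def customer_summary(orders, threshold=0.5):
    """Return {customer: (products ordered, products passing the service test)}."""
    # Aggregate ordered/delivered units per (customer, product) pair first.
    totals = defaultdict(lambda: [0, 0])
    for product, _day, customer, ordered, delivered in orders:
        totals[(customer, product)][0] += ordered
        totals[(customer, product)][1] += delivered
    # Then count, per customer, the products and the products above the threshold.
    summary = defaultdict(lambda: [0, 0])
    for (customer, _product), (ordered, delivered) in totals.items():
        summary[customer][0] += 1
        if ordered and delivered / ordered > threshold:
            summary[customer][1] += 1
    return {customer: tuple(counts) for customer, counts in summary.items()}


print(customer_summary(ORDERS))
```

A time filter amounts to filtering `ORDERS` on the day before calling the function, so the test is always evaluated on the rows that remain visible.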

cumulative average powerbi by month

I have below dataset.
Math Literature Biology date student
4 2 5 2019-08-25 A
4 5 4 2019-08-08 A
5 4 5 2019-08-23 A
5 5 5 2019-08-15 A
5 5 5 2019-07-19 A
5 5 5 2019-07-15 A
5 5 5 2019-07-03 A
5 5 5 2019-06-26 A
1 1 2 2019-06-18 A
2 3 3 2019-06-14 A
5 5 5 2019-05-01 A
2 1 3 2019-04-26 A
I need to develop a solution in Power BI so that the output shows the cumulative average per subject per month.
For example
           | April  May  June  July   August
Math       | 2      3.5  3     3.75   4
Literature | 1      3    3     3.75   3.83
Biology    | 3      4    3.6   4.125  4.33
Can you help?
You can use a matrix visualization for this.
Create a month-year column and use it in the columns of the matrix.
Use the Average of Math, Literature, and Biology as the values.
In the Format pane, go to Values --> Show on rows and switch it on.
This should give the view you are looking for. You can edit the value headers to suit your requirements.
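Note that the figures in the question are running averages over all rows up to and including each month, which is not the same as a plain per-month average. As a reference for checking whatever measure you end up with, here is a minimal Python sketch (not DAX) that reproduces them from the sample data. The function name is made up for illustration.

```python
from collections import defaultdict

# (Math, Literature, Biology, date) rows for student A, copied from the question.
GRADES = [
    (4, 2, 5, "2019-08-25"),
    (4, 5, 4, "2019-08-08"),
    (5, 4, 5, "2019-08-23"),
    (5, 5, 5, "2019-08-15"),
    (5, 5, 5, "2019-07-19"),
    (5, 5, 5, "2019-07-15"),
    (5, 5, 5, "2019-07-03"),
    (5, 5, 5, "2019-06-26"),
    (1, 1, 2, "2019-06-18"),
    (2, 3, 3, "2019-06-14"),
    (5, 5, 5, "2019-05-01"),
    (2, 1, 3, "2019-04-26"),
]


def cumulative_monthly_average(grades):
    """Return {"YYYY-MM": (math, literature, biology)} running averages."""
    by_month = defaultdict(list)
    for math, literature, biology, date in grades:
        by_month[date[:7]].append((math, literature, biology))
    result = {}
    sums, count = [0, 0, 0], 0
    for month in sorted(by_month):  # ISO dates sort chronologically as strings
        for marks in by_month[month]:
            sums = [s + m for s, m in zip(sums, marks)]
            count += 1
        # Average over every row seen so far, not just this month's rows.
        result[month] = tuple(round(s / count, 3) for s in sums)
    return result


for month, averages in cumulative_monthly_average(GRADES).items():
    print(month, averages)
```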

Amazon Redshift - Joining table and finding out unmatched rows

I have two tables whose pseudo structure would be something as follows:
User_master
user pfid
------------
reno 2
andrew 3
reno 4
rosh 5
rosh 8
john 7
HR_master
user pfid
-------------
andrew 3
reno 4
rosh 9
john 12
Roaster_master
user pfid
--------------
andrew 3
reno 4
rosh 10
john 12
I need to join all 3 tables on the column user and find the rows in HR_master whose pfid doesn't match any equivalent entry in User_master. Note that one of the entries for "reno" matches, while none of the entries for "rosh" match.
It would have been an easy task if there were only one entry per user in User_master; the complication arises because of the multiple rows.
The expected output is
USM.user  USM.pfid  HRM.pfid  RM.pfid
-----------------------------------------
rosh      5|8       9         10
john      7         12        12
As asked, here is the query that I have compiled:
select
    UM.email, UM.pfid as UMpfid,
    HRM.pfid, RM.pfid
from user_master UM
left join HR_master HRM on (HRM.email = UM.email)
left join Roaster_master RM on (RM.email = UM.email)
where UM.pfid != HRM.pfid
The above query returns "reno" as well, whereas it should not, because one of the rows in User_master for "reno" has a matching pfid.
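The row-by-row `!=` comparison flags "reno" because the (reno, 2) row differs from HR's 4, even though (reno, 4) matches. What is wanted is "no User_master row for this user has the HR pfid", which in SQL is usually written as a NOT EXISTS style anti-join. Here is a small Python sketch of that logic on the sample data. The function name is made up, and the tables are keyed on user as in the pseudo structure.

```python
from collections import defaultdict

# (user, pfid) rows copied from the three tables in the question.
USER_MASTER = [("reno", 2), ("andrew", 3), ("reno", 4), ("rosh", 5), ("rosh", 8), ("john", 7)]
HR_MASTER = [("andrew", 3), ("reno", 4), ("rosh", 9), ("john", 12)]
ROASTER_MASTER = [("andrew", 3), ("reno", 4), ("rosh", 10), ("john", 12)]


def unmatched(user_master, hr_master, roaster_master):
    """Return HR rows whose pfid matches none of that user's User_master pfids."""
    um_pfids = defaultdict(list)
    for user, pfid in user_master:
        um_pfids[user].append(pfid)
    rm_pfid = dict(roaster_master)  # assumes one Roaster_master row per user
    result = []
    for user, hr_pfid in hr_master:
        # Keep the row only if NO User_master entry for this user has the same pfid.
        if hr_pfid not in um_pfids[user]:
            joined = "|".join(str(p) for p in um_pfids[user])
            result.append((user, joined, hr_pfid, rm_pfid.get(user)))
    return result


for row in unmatched(USER_MASTER, HR_MASTER, ROASTER_MASTER):
    print(row)
# ('rosh', '5|8', 9, 10)
# ('john', '7', 12, 12)
```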

Stuck on exercise dealing with (mathematical) sets and sub-sets [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I've been thinking about this for quite a few days and I can't really seem to be finding any answers for b).
It goes like this fellas:
Johnny has taken a very important course and wants a lot of his friends to find out about his success by posting on Facebook (// yes, stupid, I know). Johnny knows N users, represented by numbers from 1 to N. Between them there are m friendships of the form i,j where i and j are users; N, m != 0. A user cannot be friends with himself, and a friendship is mutual: each of the two users is friends with the other one.
Johnny wants to find out who the most 'connected' people in his friends list are, so that his post will spread well across Facebook. For this, Johnny has to find the biggest sub-set of well-known users. In this sub-set, each user has at least k friends who are also present in the sub-set (k != 0).
Input: N, m and k on the same line, separated by a single space, followed by a sequence of 2*m natural numbers (each in the interval [1,N]).
Output (standard):
a) The number of friends of each user, in order from 1 to N.
b) The members of the biggest sub-set of users with the property that each user in the set has at least k friends (which, again, can be found in that specific sub-set). If there is no such sub-set for a given k, print "NO".
For this problem you can't use any specialised libraries, so i'm stuck
with the standards.
Again, this is concerning the mathematical concept of sets, NOT the C++ specialised set, multiset, etc libraries.
a) is pretty easy but like I said, b) is giving me some trouble.
Examples:
1) Input:
5 5 2
1 2 5 1 3 2 4 5 1 4
Output:
a) 3 2 1 2 2
b) 1 4 5
2) Input:
5 5 3
1 2 5 1 3 2 4 5 1 4
Output:
a) 3 2 1 2 2
b) NO
3) Input:
11 18 3
1 8 4 7 7 10 11 10 2 1 2 3 8 9 8 3 9 3 9 2 5 6 5 11 1 4 10 6 7 6 2 8 11 7 11 6
Output:
a) 3 4 3 2 2 4 4 4 3 3 4
b) 2 3 6 7 8 9 10 11
Any help would be appreciated. Also, sorry for the bulky content, it had to be roughly translated. :)
Thx a lot
The problem calls for you to compute the k-core of an N-node graph with m edges. There's a simple algorithm for this: while the lowest degree vertex has degree less than k, delete it. The remaining vertices are the desired subset. Use a bucket queue to keep the nodes sorted by degree for efficient operation.
On second thought, we just need to track (1) the degree of each node (2) which nodes have degree less than k. In untested Python:
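import collections


def kcore(edges, k):
    # Build an undirected adjacency map: node -> set of neighbours.
    neighbors = collections.defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Nodes that currently have fewer than k neighbours must be removed.
    bad = {u for (u, neigh) in neighbors.items() if len(neigh) < k}
    while bad:
        u = bad.pop()
        # Removing u lowers the degree of each neighbour, which may make it bad too.
        for v in neighbors[u]:
            neighbors[v].remove(u)
            if len(neighbors[v]) < k:
                bad.add(v)
        del neighbors[u]
    # Whatever survives is the k-core (empty if no such subset exists).
    return set(neighbors)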
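For the exact input/output format in the question, here is a self-contained sketch of the same peeling idea using only lists (no sets or other containers), which may be easier to port to C++ arrays. The function name `solve` is made up, and it assumes the input lists each friendship only once.

```python
def solve(text):
    """Parse "N m k" plus 2*m numbers; return (degrees of users 1..N, k-core members)."""
    numbers = [int(token) for token in text.split()]
    n, m, k = numbers[0], numbers[1], numbers[2]
    pairs = numbers[3:3 + 2 * m]

    adj = [[] for _ in range(n + 1)]  # adj[0] is unused; users are 1..n
    for i in range(0, len(pairs), 2):
        u, v = pairs[i], pairs[i + 1]
        adj[u].append(v)
        adj[v].append(u)

    degree = [len(friends) for friends in adj]
    answer_a = degree[1:]  # slicing copies, so later decrements do not affect part a)

    # Start from every user with fewer than k friends and peel repeatedly.
    removed = [False] * (n + 1)
    stack = [u for u in range(1, n + 1) if degree[u] < k]
    for u in stack:
        removed[u] = True
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not removed[v]:
                degree[v] -= 1
                if degree[v] < k:
                    removed[v] = True
                    stack.append(v)

    core = [u for u in range(1, n + 1) if not removed[u]]
    return answer_a, core


degrees, core = solve("11 18 3\n1 8 4 7 7 10 11 10 2 1 2 3 8 9 8 3 9 3 9 2 "
                      "5 6 5 11 1 4 10 6 7 6 2 8 11 7 11 6")
print("a)", " ".join(map(str, degrees)))
print("b)", " ".join(map(str, core)) if core else "NO")
```

Each user is pushed onto the stack at most once and each friendship is looked at a constant number of times, so the running time is linear in N + m.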