Calculation using weights - stata

I have the data shown below (all columns except Price2, which is what I want to compute):
+------+-----------+----------+-------+------+-------------+
| Name | Day | Time | Price | Size | Price2 |
+------+-----------+----------+-------+------+-------------+
| A | 24-Mar-08 | 10:30:01 | 1 | 3 | 0.333333333 |
| A | 24-Mar-08 | 10:30:01 | 4 | 4 | 1.777777778 |
| A | 24-Mar-08 | 10:30:01 | 3 | 2 | 0.666666667 |
| A | 24-Mar-08 | 11:03:12 | 1 | 4 | 0.8 |
| A | 24-Mar-08 | 11:03:12 | 4 | 1 | 0.8 |
| A | 25-Mar-08 | 10:30:01 | 3 | 4 | 2 |
| A | 25-Mar-08 | 10:30:01 | 8 | 2 | 2.666666667 |
| A | 25-Mar-08 | 11:13:59 | 3 | 2 | 0.428571429 |
| A | 25-Mar-08 | 11:13:59 | 2 | 4 | 0.571428571 |
| A | 25-Mar-08 | 11:13:59 | 5 | 5 | 1.785714286 |
| A | 25-Mar-08 | 11:13:59 | 3 | 3 | 0.642857143 |
| A | 25-Mar-08 | 11:59:01 | 1 | 5 | 1 |
| B | 24-Mar-08 | 10:30:01 | 3 | 6 | 2.571429 |
| B | 24-Mar-08 | 10:30:01 | 4 | 1 | 0.571428 |
| B | 24-Mar-08 | 11:30:01 | 3 | 2 | 2 |
| B | 24-Mar-08 | 11:30:01 | 5 | 1 | 1.666667 |
| B | 25-Mar-08 | 11:30:01 | 7 | 3 | 1.909090909 |
| B | 25-Mar-08 | 11:30:01 | 4 | 6 | 2.181818182 |
| B | 25-Mar-08 | 11:30:01 | 2 | 2 | 0.363636364 |
| B | 25-Mar-08 | 12:00:00 | 6 | 2 | 6 |
+------+-----------+----------+-------+------+-------------+
I want to calculate Price2 in Stata, which is Price multiplied by Size and divided by the sum of Size for each second.

My solution is similar to that of @Andrey Ampilogov, and like him I don't see where all your results come from.
clear
input str1 Name str9 (Day Time) Price Size Price2
A 24-Mar-08 "10:30:01" 1 3 0.333333333
A 24-Mar-08 "10:30:01" 4 4 1.777777778
A 24-Mar-08 "10:30:01" 3 2 0.666666667
A 24-Mar-08 "11:03:12" 1 4 0.8
A 24-Mar-08 "11:03:12" 4 1 0.8
A 25-Mar-08 "10:30:01" 3 4 2
A 25-Mar-08 "10:30:01" 8 2 2.666666667
A 25-Mar-08 "11:13:59" 3 2 0.428571429
A 25-Mar-08 "11:13:59" 2 4 0.571428571
A 25-Mar-08 "11:13:59" 5 5 1.785714286
A 25-Mar-08 "11:13:59" 3 3 0.642857143
A 25-Mar-08 "11:59:01" 1 5 1
B 24-Mar-08 "10:30:01" 3 6 1.8
B 24-Mar-08 "10:30:01" 4 1 0.4
B 24-Mar-08 "11:30:01" 3 2 0.6
B 24-Mar-08 "11:30:01" 5 1 0.5
B 25-Mar-08 "11:30:01" 7 3 1.909090909
B 25-Mar-08 "11:30:01" 4 6 2.181818182
B 25-Mar-08 "11:30:01" 2 2 0.363636364
B 25-Mar-08 "12:00:00" 6 2 6
end
egen den = total(Size), by(Name Day Time)
gen wanted = (Price * Size)/den
list, sepby(Name Day Time)
+------------------------------------------------------------------------+
| Name Day Time Price Size Price2 den wanted |
|------------------------------------------------------------------------|
1. | A 24-Mar-08 10:30:01 1 3 .3333333 9 .3333333 |
2. | A 24-Mar-08 10:30:01 4 4 1.777778 9 1.777778 |
3. | A 24-Mar-08 10:30:01 3 2 .6666667 9 .6666667 |
|------------------------------------------------------------------------|
4. | A 24-Mar-08 11:03:12 1 4 .8 5 .8 |
5. | A 24-Mar-08 11:03:12 4 1 .8 5 .8 |
|------------------------------------------------------------------------|
6. | A 25-Mar-08 10:30:01 3 4 2 6 2 |
7. | A 25-Mar-08 10:30:01 8 2 2.666667 6 2.666667 |
|------------------------------------------------------------------------|
8. | A 25-Mar-08 11:13:59 3 2 .4285714 14 .4285714 |
9. | A 25-Mar-08 11:13:59 2 4 .5714286 14 .5714286 |
10. | A 25-Mar-08 11:13:59 5 5 1.785714 14 1.785714 |
11. | A 25-Mar-08 11:13:59 3 3 .6428571 14 .6428571 |
|------------------------------------------------------------------------|
12. | A 25-Mar-08 11:59:01 1 5 1 5 1 |
|------------------------------------------------------------------------|
13. | B 24-Mar-08 10:30:01 3 6 1.8 7 2.571429 |
14. | B 24-Mar-08 10:30:01 4 1 .4 7 .5714286 |
|------------------------------------------------------------------------|
15. | B 24-Mar-08 11:30:01 3 2 .6 3 2 |
16. | B 24-Mar-08 11:30:01 5 1 .5 3 1.666667 |
|------------------------------------------------------------------------|
17. | B 25-Mar-08 11:30:01 7 3 1.909091 11 1.909091 |
18. | B 25-Mar-08 11:30:01 4 6 2.181818 11 2.181818 |
19. | B 25-Mar-08 11:30:01 2 2 .3636364 11 .3636364 |
|------------------------------------------------------------------------|
20. | B 25-Mar-08 12:00:00 6 2 6 2 6 |
+------------------------------------------------------------------------+

First, generate the sum of Size for each Name-Day-Time group. Then do the rest of the math: multiply Price by Size and divide by that sum:
bys Name Day Time: egen sumSize = total(Size)
gen Price2 = Price * Size / sumSize
Also check the group with Name = "B", Day = "24-Mar-08", Time = "10:30:01": the Price2 values in your example and the recalculated Price2 do not match there. The other values match.
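As a cross-check outside Stata, the same two steps (group total of Size, then Price * Size divided by that total) can be sketched in plain Python. This is only an illustration: the function name `weighted_prices` is made up here, and the rows are the first two groups from the question.

```python
from collections import defaultdict

# (Name, Day, Time, Price, Size) rows: the first two groups from the question
rows = [
    ("A", "24-Mar-08", "10:30:01", 1, 3),
    ("A", "24-Mar-08", "10:30:01", 4, 4),
    ("A", "24-Mar-08", "10:30:01", 3, 2),
    ("A", "24-Mar-08", "11:03:12", 1, 4),
    ("A", "24-Mar-08", "11:03:12", 4, 1),
]


def weighted_prices(rows):
    """Return Price * Size / (sum of Size within the same Name, Day, Time)."""
    den = defaultdict(int)
    # First pass: total Size per group (the role of egen ... total(Size))
    for name, day, time, price, size in rows:
        den[(name, day, time)] += size
    # Second pass: one weighted value per original row
    return [
        price * size / den[(name, day, time)]
        for name, day, time, price, size in rows
    ]


price2 = weighted_prices(rows)
for row, value in zip(rows, price2):
    print(row, round(value, 6))
```

The values agree with the `wanted` column in the listing above for these two groups.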

Related

Conditional count Measure

I have data looking like this:
| ID |OpID|
| -- | -- |
| 10 | 1 |
| 10 | 2 |
| 10 | 4 |
| 11 |null|
| 12 | 3 |
| 12 | 4 |
| 13 | 1 |
| 13 | 2 |
| 13 | 3 |
| 14 | 2 |
| 14 | 4 |
Here an OpID of 4 means both 1 and 2.
I would like to count the occurrences of 1, 2 and 3 in OpID across distinct IDs.
For the data above, the count for OpID 1 would be 4, for 2 would be 4, and for 3 would be 2.
If an ID has an OpID of 4 but already has rows for both 1 and 2, the 4 adds nothing. But if 4 exists and only 1 (or only 2) is present, the count for 2 (or 1) is incremented.
The expected output would be:
|OpID|Count|
| 1 | 4 |
| 2 | 4 |
| 3 | 2 |
(Going to be using the results in a column chart)
Hope this makes sense...
Edit: there are other columns too, and an (ID, OpID) pair can be duplicated, so a group-by is needed first.
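The counting rule can be sketched in Python to make the expected numbers concrete. This is not a DAX measure, only an illustration of the logic: `EXPANSION` and `count_ops` are names made up for this sketch, and `None` stands in for null.

```python
from collections import Counter, defaultdict

# (ID, OpID) pairs from the question; None stands in for null
rows = [
    (10, 1), (10, 2), (10, 4),
    (11, None),
    (12, 3), (12, 4),
    (13, 1), (13, 2), (13, 3),
    (14, 2), (14, 4),
]

# OpID 4 means "1 and 2"; every other OpID stands for itself
EXPANSION = {4: {1, 2}}


def count_ops(rows):
    """Count, per OpID, the number of distinct IDs that carry it."""
    ops_by_id = defaultdict(set)
    for id_, op in rows:
        if op is None:
            continue  # null OpIDs contribute nothing
        # Sets absorb duplicates, so 4 adds nothing when 1 and 2 already exist
        ops_by_id[id_] |= EXPANSION.get(op, {op})
    counts = Counter()
    for ops in ops_by_id.values():
        counts.update(ops)
    return dict(sorted(counts.items()))


counts = count_ops(rows)
print(counts)
```

Because each ID's OpIDs are collected into a set, duplicated (ID, OpID) rows do not inflate the counts.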

How to sum all working days for each month but restart from 0 every month in Power BI DAX

I would like to know how I can get a running sum of working days that starts over at the beginning of each month.
This is my DateTable now, with this query for Work Days Sum:
Work Days Sum =
CALCULATE (
    SUM ( 'DateTable'[Is working Day] ),
    ALL ( 'DateTable' ),
    'DateTable'[Date] <= EARLIER ( 'DateTable'[Date] )
)
Date | Month Order | Is working day | Work Days Sum |
January - 21 331
2022/01/01 | 1 | 0 | |
2022/01/02 | 1 | 0 | |
2022/01/03 | 1 | 1 | 1 |
2022/01/04 | 1 | 1 | 2 |
2022/01/05 | 1 | 1 | 3 |
2022/01/06 | 1 | 1 | 4 |
.....
2022/01/27 | 1 | 1 | 19 |
2022/01/28 | 1 | 1 | 20 |
2022/01/29 | 1 | 0 | 20 |
2022/01/30 | 1 | 0 | 20 |
2022/01/31 | 1 | 1 | 21 |
February 20 890
2022/02/01 | 2 | 1 | 22 |
2022/02/02 | 2 | 1 | 23 |
2022/02/03 | 2 | 1 | 24 |
2022/02/04 | 2 | 1 | 25 |
|
|
V
Date | Month Order | Is working day | Work Days Sum |
January - 21 21
2022/01/01 | 1 | 0 | |
2022/01/02 | 1 | 0 | |
2022/01/03 | 1 | 1 | 1 |
2022/01/04 | 1 | 1 | 2 |
2022/01/05 | 1 | 1 | 3 |
2022/01/06 | 1 | 1 | 4 |
.....
2022/01/27 | 1 | 1 | 19 |
2022/01/28 | 1 | 1 | 20 |
2022/01/29 | 1 | 0 | 20 |
2022/01/30 | 1 | 0 | 20 |
2022/01/31 | 1 | 1 | 21 |
February 20 41
2022/02/01 | 2 | 1 | 1 |
2022/02/02 | 2 | 1 | 2 |
2022/02/03 | 2 | 1 | 3 |
2022/02/04 | 2 | 1 | 4 |
2022/02/05 | 2 | 0 | 4 |
.....
Any idea on how I can change my DAX query to produce the second table (below the down arrow) would be much appreciated.
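The desired behaviour is a running count that resets whenever the month changes. A minimal Python sketch of that logic follows. It is not DAX, and the calendar is an assumption: weekdays are treated as working days with no holidays, which happens to give 21 working days for January 2022 as in the tables above. In DAX terms the same idea would mean also restricting the filter to the current row's month.

```python
from datetime import date, timedelta


def running_work_days(days):
    """days: list of (date, is_working_day) sorted by date.

    Returns (date, running count) pairs; the count restarts each month.
    """
    result = []
    current_month = None
    running = 0
    for d, is_working in days:
        month = (d.year, d.month)
        if month != current_month:
            # New month: restart the running total
            current_month = month
            running = 0
        running += is_working
        result.append((d, running))
    return result


# Hypothetical calendar: Monday to Friday count as working days
start = date(2022, 1, 1)
days = []
for n in range(36):  # 2022-01-01 through 2022-02-05
    d = start + timedelta(days=n)
    days.append((d, int(d.weekday() < 5)))

totals = dict(running_work_days(days))
print(totals[date(2022, 1, 31)], totals[date(2022, 2, 1)])
```

Keying the reset on (year, month) rather than on the month number alone keeps January of different years apart.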

Rank categories by sum (Power BI)

I need to rank products for my dashboard. Each day, we store sales of products. In result we have this dataset example:
+-----------+------------+-------+
| product | date | sales |
+-----------+------------+-------+
| coffee | 11/03/2019 | 15 |
| coffee | 12/03/2019 | 10 |
| coffee | 13/03/2019 | 28 |
| coffee | 14/03/2019 | 1 |
| tea | 11/03/2019 | 5 |
| tea | 12/03/2019 | 2 |
| tea | 13/03/2019 | 6 |
| tea | 14/03/2019 | 7 |
| Chocolate | 11/03/2019 | 30 |
| Chocolate | 11/03/2019 | 4 |
| Chocolate | 11/03/2019 | 15 |
| Chocolate | 11/03/2019 | 10 |
+-----------+------------+-------+
My attempt
I actually managed to rank my products, but not in the way I wanted: the rank increases by the number of rows. For example, Chocolate is first but has 4 rows, so coffee is ranked 5 and not 2.
+-----------+------------+-------+-----+------+
| product | date | sales | sum | rank |
+-----------+------------+-------+-----+------+
| coffee | 11/03/2019 | 15 | 54 | 5 |
| coffee | 12/03/2019 | 10 | 54 | 5 |
| coffee | 13/03/2019 | 28 | 54 | 5 |
| coffee | 14/03/2019 | 1 | 54 | 5 |
| tea | 11/03/2019 | 5 | 20 | 9 |
| tea | 12/03/2019 | 2 | 20 | 9 |
| tea | 13/03/2019 | 6 | 20 | 9 |
| tea | 14/03/2019 | 7 | 20 | 9 |
| Chocolate | 11/03/2019 | 30 | 59 | 1 |
| Chocolate | 11/03/2019 | 4 | 59 | 1 |
| Chocolate | 11/03/2019 | 15 | 59 | 1 |
| Chocolate | 11/03/2019 | 10 | 59 | 1 |
+-----------+------------+-------+-----+------+
sum field formula:
sum =
SUMX(
    FILTER(
        Table1;
        Table1[product] = EARLIER(Table1[product])
    );
    Table1[sales]
)
rank field formula:
rank = RANKX(
    ALL(Table1);
    Table1[sum]
)
As you can see, we get the following ranking:
1 : Chocolate
5 : Coffee
9 : Tea
Improvements
I would like to transform the previous result into :
1 : Chocolate
2 : Coffee
3 : Tea
Can you help me improve my ranking system and get a marvelous 1, 2, 3 instead of this ugly and impractical 1, 5, 9?
If you don't know the answer, you can help by simply upvoting the question ♥
Fortunately, this is an easy fix.
If you look at the documentation for the RANKX function, you'll notice an optional ties argument which you can set to Skip or Dense. The default is Skip but you want Dense. Try this:
rank =
RANKX(
    ALL(Table1);
    Table1[sum];
    ;;
    "Dense"
)
(Those extra ; delimiters are there since we aren't specifying the optional value or order arguments.)
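To see why the default produces 1, 5, 9 while dense ranking produces 1, 2, 3, here is a small Python sketch of the two tie-handling rules applied to the per-row sums. It only mimics the ranking logic and is not how RANKX is implemented; `skip_rank` and `dense_rank` are names made up for this illustration.

```python
from collections import defaultdict

# (product, sales) rows from the question
sales = [
    ("coffee", 15), ("coffee", 10), ("coffee", 28), ("coffee", 1),
    ("tea", 5), ("tea", 2), ("tea", 6), ("tea", 7),
    ("Chocolate", 30), ("Chocolate", 4), ("Chocolate", 15), ("Chocolate", 10),
]

# Per-product totals, like the "sum" calculated column
totals = defaultdict(int)
for product, amount in sales:
    totals[product] += amount

# One sum per row, as in the table: every row repeats its product total
row_sums = [totals[product] for product, _ in sales]


def skip_rank(value, values):
    """Rank = 1 + number of rows with a strictly larger value."""
    return 1 + sum(v > value for v in values)


def dense_rank(value, values):
    """Rank = 1 + number of distinct strictly larger values."""
    return 1 + len({v for v in values if v > value})


skip = {p: skip_rank(t, row_sums) for p, t in totals.items()}
dense = {p: dense_rank(t, row_sums) for p, t in totals.items()}
print(skip)
print(dense)
```

With skip ranking every duplicated row above a product pushes its rank down, which is where the jump from 1 to 5 to 9 comes from.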

SAS: add one column from tableB to tableA

I have two tables that look like the ones below. I want to add the column score from tableB to tableA to get tableC. How can I do this in SAS?
The only rule is to add a column named "score" to tableA whose value is the same as the "score" column in tableB (which has the same value in every row).
+----+---+---+---+
| id | b | c | d |
+----+---+---+---+
| 1 | 5 | 7 | 2 |
| 2 | 6 | 8 | 3 |
| 3 | 7 | 8 | 1 |
| 4 | 5 | 7 | 2 |
| 5 | 6 | 8 | 3 |
| 6 | 7 | 8 | 1 |
+----+---+---+---+
tableA
+---+---+-------+
| e | f | score |
+---+---+-------+
| 3 | 7 | 11 |
| 4 | 6 | 11 |
| 5 | 5 | 11 |
+---+---+-------+
tableB
+----+---+---+---+-------+
| id | b | c | d | score |
+----+---+---+---+-------+
| 1 | 5 | 7 | 2 | 11 |
| 2 | 6 | 8 | 3 | 11 |
| 3 | 7 | 8 | 1 | 11 |
| 4 | 5 | 7 | 2 | 11 |
| 5 | 6 | 8 | 3 | 11 |
| 6 | 7 | 8 | 1 | 11 |
+----+---+---+---+-------+
tableC
If the "id" is present in both tables, you can use the following to create Table C:
PROC SQL;
CREATE TABLE tableC AS
SELECT a.*, b.score
FROM tableA a JOIN tableB b
ON a.id = b.id;
QUIT;
Please confirm whether this is what you need.
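The tableB shown in the question has no id column, so a join on id may not apply to it. The rule as stated (copy the single score value from tableB onto every row of tableA) can be sketched in Python as follows. This is an illustration of the logic, not SAS code, and `add_constant_score` is a name made up here.

```python
# tableA and tableB as lists of dicts, using the values from the question
table_a = [
    {"id": 1, "b": 5, "c": 7, "d": 2},
    {"id": 2, "b": 6, "c": 8, "d": 3},
    {"id": 3, "b": 7, "c": 8, "d": 1},
    {"id": 4, "b": 5, "c": 7, "d": 2},
    {"id": 5, "b": 6, "c": 8, "d": 3},
    {"id": 6, "b": 7, "c": 8, "d": 1},
]
table_b = [
    {"e": 3, "f": 7, "score": 11},
    {"e": 4, "f": 6, "score": 11},
    {"e": 5, "f": 5, "score": 11},
]


def add_constant_score(table_a, table_b):
    """Attach tableB's single score value to every row of tableA."""
    scores = {row["score"] for row in table_b}
    if len(scores) != 1:
        # The question states that all scores in tableB are the same
        raise ValueError("expected exactly one distinct score in tableB")
    (score,) = scores
    return [{**row, "score": score} for row in table_a]


table_c = add_constant_score(table_a, table_b)
for row in table_c:
    print(row)
```

The check on the number of distinct scores makes the assumption explicit instead of silently picking one value.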

How to increase column values?

How to increase column values from:
1 | 1 | 7.317073
2 | 1 | 14.634146
3 | 1 | 24.390244
4 | 2 | 7.317073
5 | 2 | 14.634146
6 | 2 | 24.390244
To:
1 | 1 | 7.317073
2 | 1 | 14.634146
3 | 1 | 24.390244
4 | 2 | 7.317073
5 | 2 | 14.634146
6 | 2 | 24.390244
7 | 3 | 7.317073
8 | 3 | 14.634146
9 | 3 | 24.390244
10 | 4 | 7.317073
11 | 4 | 14.634146
12 | 4 | 24.390244
I'm using Open Office.
Assuming that the top-left corner is A1, set the fourth row as follows:
A4: =A3+1
B4: =ROUNDUP(A4/3)
C4: =C1
Then fill them down to row 12.
For column A, simply selecting the first three rows, grabbing the fill handle (the black square at the bottom right of the range) and dragging down as far as needed should be sufficient.
An alternative to ROUNDUP, entered in B1 and copied down, is:
=INT((ROW()-1)/3)+1
For column C, do the same as for column A but with Ctrl held down.
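The pattern behind these formulas is: column A counts rows, column B is the row number divided by 3 and rounded up, and column C cycles through the first three values. A short Python sketch of that arithmetic, with `build_rows` as a name made up for this illustration:

```python
import math

# The three repeating values from column C
base = [7.317073, 14.634146, 24.390244]


def build_rows(n_rows, base):
    """Return (row number, group number, value) tuples for rows 1..n_rows."""
    k = len(base)
    return [
        # ceil(i / k) plays the role of ROUNDUP(A/3);
        # (i - 1) % k cycles through the base values like C4 = C1
        (i, math.ceil(i / k), base[(i - 1) % k])
        for i in range(1, n_rows + 1)
    ]


rows = build_rows(12, base)
for row in rows:
    print(row)
```

Here `math.ceil(i / 3)` and `(i - 1) // 3 + 1` give the same group number, which mirrors the ROUNDUP and INT variants above.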