Subtract Set Value at Aggregated Level - powerbi

Values are reported for two groups by quarter.
In DAX, I need to summarize all the data, but I also need to subtract 5 from each quarter (20 for the full year) in 2021 for Group 1, without letting the value go below 0.
The adjustment only affects:
Group 1
2021
However, I also need to keep the detail data without the adjustment, so I can't do this in Power Query.
Data:
Group  Date        Value
1      01/01/2020  10
1      02/01/2020  9
1      03/01/2020  10
1      04/01/2020  8
1      05/01/2020  10
1      06/01/2020  11
1      07/01/2020  18
1      08/01/2020  2
1      09/01/2020  1
1      10/01/2020  0
1      11/01/2020  1
1      12/01/2020  0
1      01/01/2021  1
1      02/01/2021  12
1      03/01/2021  12
1      04/01/2021  3
1      05/01/2021  13
1      06/01/2021  14
1      07/01/2021  7
1      08/01/2021  1
1      09/01/2021  0
1      10/01/2021  1
1      11/01/2021  2
1      12/01/2021  1
2      01/01/2020  18
2      02/01/2020  7
2      03/01/2020  6
2      04/01/2020  8
2      05/01/2020  12
2      06/01/2020  13
2      07/01/2020  14
2      08/01/2020  8
2      09/01/2020  7
2      10/01/2020  6
2      11/01/2020  5
2      12/01/2020  4
2      01/01/2021  12
2      02/01/2021  18
2      03/01/2021  19
2      04/01/2021  20
2      05/01/2021  12
2      06/01/2021  12
2      07/01/2021  7
2      08/01/2021  18
2      09/01/2021  16
2      10/01/2021  15
2      11/01/2021  13
2      12/01/2021  1
Result:
Qtr/Year  Group 1 Value  Group 2 Value  Total
Q1-2020   29             31             60
Q2-2020   29             33             62
Q3-2020   21             29             50
Q4-2020   1              15             16
2020      80             108            188
Q1-2021   20             49             69
Q2-2021   25             44             69
Q3-2021   3              41             44
Q4-2021   0              29             29
2021      48             163            211

I'd suggest summarizing at the Year/Quarter/Group granularity and summing that up as follows:
SumValue =
VAR Summary =
    SUMMARIZE (
        Table2,
        Table2[Year],
        Table2[Qtr],
        Table2[Group],
        "#RawValue", SUM ( Table2[Value] ),
        "#RemoveValue", IF ( Table2[Year] = 2021 && Table2[Group] = 1, 5 )
    )
RETURN
    SUMX ( Summary, MAX ( [#RawValue] - [#RemoveValue], 0 ) )
(This assumes the amount to remove for a year is the same as for four quarters.)
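To check the arithmetic outside Power BI, here is a minimal Python sketch of the same idea: sum to quarter level, subtract 5, floor at 0, then add the quarters up. The monthly values are Group 1's 2021 rows from the question; the function and variable names are made up for illustration.

```python
# Group 1 monthly values for 2021, copied from the question's data.
group1_2021 = [1, 12, 12, 3, 13, 14, 7, 1, 0, 1, 2, 1]
REMOVE_PER_QUARTER = 5  # adjustment stated in the question


def adjusted_quarters(monthly, remove):
    """Sum each block of three months, subtract `remove`, and floor at 0."""
    quarters = [sum(monthly[i:i + 3]) for i in range(0, 12, 3)]
    return [max(q - remove, 0) for q in quarters]


adjusted = adjusted_quarters(group1_2021, REMOVE_PER_QUARTER)
year_from_quarters = sum(adjusted)

# Applying the full-year adjustment (4 * 5) in one step gives a different
# number, because the Q4 floor at 0 is then never triggered.
year_in_one_step = max(sum(group1_2021) - 4 * REMOVE_PER_QUARTER, 0)

print(adjusted, year_from_quarters, year_in_one_step)
```

The quarter-by-quarter total matches the 48 in the expected result, while the one-step yearly calculation does not, which is why the measure iterates at Year/Quarter/Group granularity.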

Related

How to add a row where there is a disruption in series of numbers in Stata

I'm attempting to format a table of 40 different age-race-sex strata to be inputted into R-INLA and noticed that it's important to include all strata (even if they are not present in a county). These should be zeros. However, at this point my table only contains records for strata that are not empty. I can identify places where strata are missing for each county by looking at my strata variable and finding the breaks in the series 1 through 40 (marked with a red x in the image below).
In these places (marked by the red x) I need to add the missing rows and fill in the corresponding county code, strata code, population=0, and the correct corresponding race, sex, age code for the strata.
If I can figure out a way to add an empty row in the spaces with the red Xs from the image, and correctly assign the strata code (and county code) to these empty/missing rows, I am able to populate the rest of the values with the code below:
replace race = 1 if strata == 4
replace sex = 1 if strata == 4
replace age = 4 if strata == 4
...etc
I'm wondering if there is a way to add the missing rows using an if statement that considers the fact that there are supposed to be forty strata for each county code. It would be ideal if this could populate the correct county code and strata code as well!
Dataex sample data:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float OID str5 fips_statecounty double population byte(race sex age) float strata
1 "" 672 1 1 1 1
2 "" 1048 1 1 2 2
3 "" 883 1 1 3 3
4 "" 1129 1 1 4 4
5 "" 574 1 2 1 5
6 "" 986 1 2 2 6
7 "" 899 1 2 3 7
8 "" 1820 1 2 4 8
9 "" 96 2 1 1 9
10 "" 142 2 1 2 10
11 "" 81 2 1 3 11
12 "" 99 2 1 4 12
13 "" 71 2 2 1 13
14 "" 125 2 2 2 14
15 "" 103 2 2 3 15
16 "" 162 2 2 4 16
17 "" 31 3 1 1 17
18 "" 32 3 1 2 18
19 "" 18 3 1 3 19
20 "" 31 3 1 4 20
21 "" 22 3 2 1 21
22 "" 28 3 2 2 22
23 "" 28 3 2 3 23
24 "" 44 3 2 4 24
25 "" 20 4 1 1 25
26 "" 24 4 1 2 26
27 "" 21 4 1 3 27
28 "" 43 4 1 4 28
29 "" 19 4 2 1 29
30 "" 26 4 2 2 30
31 "" 24 4 2 3 31
32 "" 58 4 2 4 32
33 "" 6 5 1 1 33
34 "" 11 5 1 2 34
35 "" 13 5 1 3 35
36 "" 7 5 1 4 36
37 "" 7 5 2 1 37
38 "" 9 5 2 2 38
39 "" 10 5 2 3 39
40 "" 11 5 2 4 40
41 "01001" 239 1 1 1 1
42 "01001" 464 1 1 2 2
43 "01001" 314 1 1 3 3
44 "01001" 232 1 1 4 4
45 "01001" 284 1 2 1 5
46 "01001" 580 1 2 2 6
47 "01001" 392 1 2 3 7
48 "01001" 440 1 2 4 8
49 "01001" 41 2 1 1 9
50 "01001" 38 2 1 2 10
51 "01001" 23 2 1 3 11
52 "01001" 26 2 1 4 12
53 "01001" 34 2 2 1 13
54 "01001" 52 2 2 2 14
55 "01001" 40 2 2 3 15
56 "01001" 50 2 2 4 16
57 "01001" 4 3 1 1 17
58 "01001" 2 3 1 2 18
59 "01001" 3 3 1 3 19
60 "01001" 6 3 2 1 21
61 "01001" 4 3 2 2 22
62 "01001" 6 3 2 3 23
63 "01001" 4 3 2 4 24
64 "01001" 1 4 1 4 28
65 "01003" 1424 1 1 1 1
66 "01003" 2415 1 1 2 2
67 "01003" 1680 1 1 3 3
68 "01003" 1823 1 1 4 4
69 "01003" 1545 1 2 1 5
70 "01003" 2592 1 2 2 6
71 "01003" 1916 1 2 3 7
72 "01003" 2527 1 2 4 8
73 "01003" 68 2 1 1 9
74 "01003" 82 2 1 2 10
75 "01003" 52 2 1 3 11
76 "01003" 54 2 1 4 12
77 "01003" 72 2 2 1 13
78 "01003" 129 2 2 2 14
79 "01003" 81 2 2 3 15
80 "01003" 106 2 2 4 16
81 "01003" 10 3 1 1 17
82 "01003" 14 3 1 2 18
83 "01003" 8 3 1 3 19
84 "01003" 4 3 1 4 20
85 "01003" 8 3 2 1 21
86 "01003" 14 3 2 2 22
87 "01003" 17 3 2 3 23
88 "01003" 10 3 2 4 24
89 "01003" 4 4 1 1 25
90 "01003" 1 4 1 3 27
91 "01003" 2 4 1 4 28
92 "01003" 2 4 2 1 29
93 "01003" 3 4 2 2 30
94 "01003" 4 4 2 3 31
95 "01003" 10 4 2 4 32
96 "01003" 5 5 1 1 33
97 "01003" 4 5 1 2 34
98 "01003" 3 5 1 3 35
99 "01003" 5 5 1 4 36
100 "01003" 5 5 2 2 38
end
label values race race
label values sex sex
My answer to your previous question, "Nested for-loop: error variable already defined", detailed how to create a minimal dataset with all strata present. You should therefore just merge that with your main dataset and replace missing values on the absent strata with whatever your other software expects, which seems to be zeros.
The most obvious complication at this point is that you need to factor in a county variable. I can't see any information on how many counties you have in your dataset, which may affect what is practical. You should be able to break the preparation down into two steps: first, prepare a minimal county dataset with identifiers only; then merge that with a complete strata dataset.
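The logic of that merge can be sketched in Python (not Stata) as a cross of counties with strata 1 to 40, filling zeros where no row was observed. The `decode_strata` mapping is an assumption inferred from the sample data (8 strata per race, 4 per sex within race), and all names here are hypothetical.

```python
N_STRATA = 40  # number of strata per county, as stated in the question


def decode_strata(strata):
    """Map a strata code 1..40 to (race, sex, age).

    Assumes the coding visible in the sample: strata = (race-1)*8 + (sex-1)*4 + age.
    """
    race, rest = divmod(strata - 1, 8)
    sex, age = divmod(rest, 4)
    return race + 1, sex + 1, age + 1


def fill_missing(rows):
    """rows: list of (county, strata, population) for non-empty strata only.

    Returns one (county, strata, population, race, sex, age) tuple for every
    county/strata combination, with population 0 where no row was observed.
    """
    seen = {(county, strata): pop for county, strata, pop in rows}
    counties = sorted({county for county, _, _ in rows})
    full = []
    for county in counties:
        for strata in range(1, N_STRATA + 1):
            pop = seen.get((county, strata), 0)
            full.append((county, strata, pop) + decode_strata(strata))
    return full


# A few rows from county "01001" in the sample; strata 20 is one of the gaps.
observed = [("01001", 19, 3), ("01001", 21, 6), ("01001", 28, 1)]
complete = fill_missing(observed)
print(len(complete), complete[19], complete[27])
```

In Stata the same shape would come from building the county and strata identifier datasets separately and combining them, as described above; the sketch only shows what the completed table should contain.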

Stata: Changing Number Format

I am using estpost and esttab to export tabulation results in Stata.
sysuse auto, clear
estpost tabulate turn foreign
esttab ., cells("b(fmt(0))") unstack
---------------------------------------------------
(1)
Domestic Foreign Total
b b b
---------------------------------------------------
31 1 0 1
32 0 1 1
33 1 1 2
34 2 4 6
35 2 4 6
36 1 8 9
37 2 2 4
38 1 2 3
39 1 0 1
40 6 0 6
41 4 0 4
42 7 0 7
43 12 0 12
44 3 0 3
45 3 0 3
46 3 0 3
48 2 0 2
51 1 0 1
Total 52 22 74
---------------------------------------------------
N 74
---------------------------------------------------
Although I can change the format of the cells, I couldn't find a way to change the format of the number of observations (N) or of the column totals. I tried adding obs(fmt(%10.2fc)) as an esttab option, but it didn't work.

Duplication of data entries by id if they meet a certain condition

In the original choice data set, individuals (id) are captured making purchases (choice) among all the product options possible (assortchoice is a product code). Every individual always faces the same set of products to choose from; as a result the value of choice is always either 0 or 1 ("was the product chosen or not?").
clear
input id assortchoice choice sumchoice
2 12 1 2
2 13 0 2
2 14 0 2
2 15 0 2
2 16 0 2
2 17 0 2
2 18 0 2
2 19 0 2
2 20 0 2
2 21 0 2
2 22 0 2
2 23 1 2
3 12 1 1
3 13 0 1
3 14 0 1
3 15 0 1
3 16 0 1
3 17 0 1
3 18 0 1
3 19 0 1
3 20 0 1
3 21 0 1
3 22 0 1
3 23 0 1
4 12 1 3
4 13 0 3
4 14 1 3
4 15 1 3
4 16 0 3
4 17 0 3
4 18 0 3
4 19 0 3
4 20 0 3
4 21 0 3
4 22 0 3
4 23 0 3
end
I wrote the following code to count how many choices each individual made:
egen sumchoice=total(choice), by(id)
In this example, individual 3 (id=3) chose only one product (sumchoice=1), individual 2 made two choices (sumchoice=2), and individual 4 made three choices (sumchoice=3).
Since this is choice data, I need to transform every instance of multiple choices into sets of single choices.
What I mean by that: if an individual made two purchases, I need to duplicate that individual's choice set twice; for an individual who made three purchases, I need to replicate the choice set three times, so that the final structure looks like the data set below.
clear
input id transaction assortchoice choice
2 1 12 1
2 1 13 0
2 1 14 0
2 1 15 0
2 1 16 0
2 1 17 0
2 1 18 0
2 1 19 0
2 1 20 0
2 1 21 0
2 1 22 0
2 1 23 0
2 2 12 0
2 2 13 0
2 2 14 0
2 2 15 0
2 2 16 0
2 2 17 0
2 2 18 0
2 2 19 0
2 2 20 0
2 2 21 0
2 2 22 0
2 2 23 1
3 1 12 1
3 1 13 0
3 1 14 0
3 1 15 0
3 1 16 0
3 1 17 0
3 1 18 0
3 1 19 0
3 1 20 0
3 1 21 0
3 1 22 0
3 1 23 0
4 1 12 1
4 1 13 0
4 1 14 0
4 1 15 0
4 1 16 0
4 1 17 0
4 1 18 0
4 1 19 0
4 1 20 0
4 1 21 0
4 1 22 0
4 1 23 0
4 2 12 0
4 2 13 0
4 2 14 1
4 2 15 0
4 2 16 0
4 2 17 0
4 2 18 0
4 2 19 0
4 2 20 0
4 2 21 0
4 2 22 0
4 2 23 0
4 3 12 0
4 3 13 0
4 3 14 0
4 3 15 1
4 3 16 0
4 3 17 0
4 3 18 0
4 3 19 0
4 3 20 0
4 3 21 0
4 3 22 0
4 3 23 0
end
Update:
transaction indicates the order of the transaction:
bysort id assortchoice (choice): gen transaction=_n
Hence, choice=1 should appear only once per transaction.
The answer isn't quite "use expand" as there is a twist that you don't want exact replicates.
expand sumchoice
bysort id assortchoice (choice) : replace choice = 0 if _n != _N & choice == 1
list if id == 2 , sepby(assortchoice)
+-----------------------------------+
| id assort~e choice sumcho~e |
|-----------------------------------|
1. | 2 12 0 2 |
2. | 2 12 1 2 |
|-----------------------------------|
3. | 2 13 0 2 |
4. | 2 13 0 2 |
|-----------------------------------|
5. | 2 14 0 2 |
6. | 2 14 0 2 |
|-----------------------------------|
7. | 2 15 0 2 |
8. | 2 15 0 2 |
|-----------------------------------|
9. | 2 16 0 2 |
10. | 2 16 0 2 |
|-----------------------------------|
11. | 2 17 0 2 |
12. | 2 17 0 2 |
|-----------------------------------|
13. | 2 18 0 2 |
14. | 2 18 0 2 |
|-----------------------------------|
15. | 2 19 0 2 |
16. | 2 19 0 2 |
|-----------------------------------|
17. | 2 20 0 2 |
18. | 2 20 0 2 |
|-----------------------------------|
19. | 2 21 0 2 |
20. | 2 21 0 2 |
|-----------------------------------|
21. | 2 22 0 2 |
22. | 2 22 0 2 |
|-----------------------------------|
23. | 2 23 0 2 |
24. | 2 23 1 2 |
+-----------------------------------+
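For comparison, here is a small Python sketch of the target layout described in the question, where each transaction keeps exactly one chosen product. It is not a translation of the expand code above: it numbers the chosen products in order and writes one full choice set per chosen product. The function name and tuple layout are illustrative assumptions, and an id with no chosen product at all would produce no rows.

```python
def split_transactions(rows):
    """rows: list of (id, assortchoice, choice) with choice in {0, 1}.

    Returns (id, transaction, assortchoice, choice) rows, one complete
    choice set per chosen product, with a single 1 in each set.
    """
    by_id = {}
    for pid, product, choice in rows:
        by_id.setdefault(pid, []).append((product, choice))

    out = []
    for pid in sorted(by_id):
        products = sorted(by_id[pid])
        chosen = [product for product, choice in products if choice == 1]
        for transaction, pick in enumerate(chosen, start=1):
            for product, _ in products:
                out.append((pid, transaction, product, 1 if product == pick else 0))
    return out


# Individual 2 from the question: products 12..23, with 12 and 23 chosen.
rows = [(2, p, 1 if p in (12, 23) else 0) for p in range(12, 24)]
result = split_transactions(rows)
print([r for r in result if r[3] == 1])
```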

SQL Update Subsequent Column OFFSET FETCH NEXT

I'd like to know whether there is a way to do automatic looping or a batch counter when updating a SQL column, for example with the OFFSET / FETCH NEXT method.
QUESTION: The table below has 20 rows. I'd like to update the DealerId column so that the first 4 rows get the values 1, 2, 3, 4 and each subsequent group of 4 rows repeats 1, 2, 3, 4.
Something like this:
NEED TO MODIFY TABLE
Id DealerId
1 1 1
2 2 2
3 3 3
4 4 4
5 5 1
6 6 2
7 7 3
8 8 4
9 9 1
10 10 2
11 11 3
12 12 4
13 13 1
14 14 2
15 15 3
16 16 4
17 17 1
18 18 2
19 19 3
20 20 4
ORIGINAL TABLE
Id DealerId StoreId TerminalId MessageNo CreatedDate
1 1 86 5027 029500021201403031434350039 2014-03-03 14:34:37.347
2 2 86 5027 029500021201403031434350039 2014-03-05 10:31:59.903
3 3 86 5027 029500021201403031434350039 2014-03-05 10:33:41.293
4 4 86 5027 029500021201403031434350039 2014-03-05 10:46:50.057
5 5 86 5027 029500021201403031434350039 2014-03-05 10:50:23.910
6 6 33 5338 004000003201403051508010255 2014-03-05 15:08:03.247
7 7 26 5595 704201181201403061024330013 2014-03-06 10:24:34.590
8 8 26 5595 704201181201403061026180022 2014-03-06 10:26:19.517
9 9 33 5338 004000003201403061043150312 2014-03-06 10:43:16.013
10 10 86 5027 029500021201403031434350039 2014-03-06 14:27:51.717
11 11 86 5027 029500021201403031434350039 2014-03-06 14:38:40.593
12 12 86 5027 029500021201403031434350039 2014-03-06 14:44:25.947
13 13 521 4905 051100003002447 2014-03-07 12:51:07.487
14 14 521 4905 051100003002447 2014-03-07 12:55:07.300
15 15 521 4905 051100003002447 2014-03-07 12:56:24.793
16 16 521 4905 051100003002447 2014-03-07 12:57:43.123
17 17 521 4905 051100003002447 2014-03-07 14:15:11.093
18 18 632 5120 088800003201403071441280026 2014-03-07 14:41:29.733
19 19 632 5120 088800003201403071456500050 2014-03-07 14:56:51.727
20 20 632 5120 088800003201403071459240064 2014-03-07 14:59:24.953
Assuming that all ids are consecutive, starting from 1:
In MySQL:
update OriginalTable
set DealerId = mod(id-1, 4)+1
and in Microsoft SQL Server:
update OriginalTable
set DealerId = ((id-1)%4)+1
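The modulo expression is easy to sanity-check outside the database. This Python sketch evaluates the same formula for ids 1 to 20; the function name is made up for illustration.

```python
def dealer_for(row_id, cycle=4):
    """Same arithmetic as ((id-1) % 4) + 1 in the UPDATE statements."""
    return (row_id - 1) % cycle + 1


assignments = [dealer_for(i) for i in range(1, 21)]
print(assignments)
```

Subtracting 1 before taking the remainder and adding 1 afterwards shifts the result from the range 0..3 to 1..4, so id 4 maps to 4 rather than 0.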
And if the ids are not consecutive (or do not start from 1), you can use a cursor to update the rows one by one:
DECLARE c1 CURSOR FOR
    SELECT id, dealerId
    FROM OriginalTable
    ORDER BY id, dealerId
OPEN c1
declare @id int
declare @dealerId int
declare @i int
set @i = 1
FETCH NEXT FROM c1
INTO @id, @dealerId
while @@FETCH_STATUS = 0
BEGIN
    update OriginalTable
    set dealerId = @i
    where current of c1
    -- cycle the counter through 1, 2, 3, 4
    if (@i < 4)
        set @i = @i + 1
    else
        set @i = 1
    FETCH NEXT FROM c1
    INTO @id, @dealerId
END
CLOSE c1
DEALLOCATE c1
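What the cursor loop computes is a counter based on each row's position in id order rather than on the id value itself. A minimal Python sketch of that idea, with hypothetical ids that have gaps:

```python
def assign_by_rank(ids, cycle=4):
    """Assign 1..cycle repeatedly by position in sorted id order."""
    return {row_id: rank % cycle + 1 for rank, row_id in enumerate(sorted(ids))}


# Hypothetical ids with gaps and not starting from 1.
dealer_ids = assign_by_rank([3, 7, 8, 15, 20, 21])
print(dealer_ids)
```

In SQL Server, a ROW_NUMBER() ordered by id could supply the same rank in a set-based update; that variant is a suggestion, not something the answer above uses.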

Using the jpeg quantization matrix of one image to compress another image

I have two images, A and B, and I need to estimate B's quantization table and compress A using this table. What is the best way to do this, using libjpeg or, even better, OpenCV?
I've used libjpeg's 'djpeg' utility to find the quantization table of an image, but I'm not sure how to interpret its output and use it with libjpeg. Besides, I need to find this matrix and do the compression from inside my program, which (I think) makes 'djpeg' unusable in this case.
The following is the output of 'djpeg' for a test image, run with:
djpeg -v -v cat1.jpg > /dev/null
Start of Image
JFIF APP0 marker: version 1.01, density 96x96 1
Define Quantization Table 0 precision 0
5 3 3 5 7 12 15 18
4 4 4 6 8 17 18 17
4 4 5 7 12 17 21 17
4 5 7 9 15 26 24 19
5 7 11 17 20 33 31 23
7 11 17 19 24 31 34 28
15 19 23 26 31 36 36 30
22 28 29 29 34 30 31 30
Define Quantization Table 1 precision 0
5 5 7 14 30 30 30 30
5 6 8 20 30 30 30 30
7 8 17 30 30 30 30 30
14 20 30 30 30 30 30 30
30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30
Start Of Frame 0xc0: width=450, height=320, components=3
Component 1: 2hx2v q=0
Component 2: 1hx1v q=1
Component 3: 1hx1v q=1
Define Huffman Table 0x00
0 1 5 1 1 1 1 1
1 0 0 0 0 0 0 0
Define Huffman Table 0x10
0 2 1 3 3 2 4 3
5 5 4 4 0 0 1 125
Define Huffman Table 0x01
0 3 1 1 1 1 1 1
1 1 1 0 0 0 0 0
Define Huffman Table 0x11
0 2 1 2 4 4 3 4
7 5 4 4 0 1 2 119
Start Of Scan: 3 components
Component 1: dc=0 ac=0
Component 2: dc=1 ac=1
Component 3: dc=1 ac=1
Ss=0, Se=63, Ah=0, Al=0
End Of Image
Thanks in advance!
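As a starting point for reading the trace, here is a rough Python sketch that pulls the quantization tables out of text laid out like the output above. It assumes each "Define Quantization Table" header is followed by exactly eight rows of eight integers, as in this trace; it does not call libjpeg and does not do the compression step. The function name is hypothetical.

```python
import re


def parse_quant_tables(trace):
    """Return {table_index: 8x8 list of ints} from djpeg-style trace text."""
    tables = {}
    lines = trace.splitlines()
    i = 0
    while i < len(lines):
        match = re.match(r"\s*Define Quantization Table (\d+)", lines[i])
        if match:
            # The eight lines after the header hold the table rows.
            rows = [[int(x) for x in ln.split()] for ln in lines[i + 1:i + 9]]
            tables[int(match.group(1))] = rows
            i += 9
        else:
            i += 1
    return tables


# Table 0 copied from the trace in the question.
sample = """Define Quantization Table 0 precision 0
5 3 3 5 7 12 15 18
4 4 4 6 8 17 18 17
4 4 5 7 12 17 21 17
4 5 7 9 15 26 24 19
5 7 11 17 20 33 31 23
7 11 17 19 24 31 34 28
15 19 23 26 31 36 36 30
22 28 29 29 34 30 31 30
Start Of Frame 0xc0: width=450, height=320, components=3
"""
tables = parse_quant_tables(sample)
print(tables[0][0])
```

The "q=0" and "q=1" entries on the Component lines of the trace indicate which table index each colour component uses.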