Avoid double counting in Power BI sum
I'm trying to count the number of factory operators used to manufacture parts during a shift, but I am double-counting them, as this example illustrates:
Machine Groups
- Group A: Machines 1 and 2, employing 3 operators per shift
- Group B: Machines 3 and 4, employing 2 operators per shift
Shift Output

| Group | Machine | Operators | Item    | Quantity |
|-------|---------|-----------|---------|----------|
| Grp A | Mach 1  | 3         | Nuts    | 1000     |
| Grp A | Mach 2  | 3         | Bolts   | 500      |
| Grp B | Mach 3  | 2         | Washers | 2000     |
| Grp B | Mach 4  | 2         | Springs | 1500     |
| Total |         | 10        |         | 5000     |
So the total quantity of parts is correct, but the total number of operators is wrong: it should be 5. Operators are being double-counted because each group's operators appear on one row for every part they make.
I have tried using an implicit sum on the Operators column, and also a DAX measure:

Sum Operators = SUM(Production[Operators])
I have also tried a matrix rather than a simple table, but get the same result.
(There will not always be 2 items per shift; there could sometimes be 3 or 4.)
If I understand you correctly, you can:
1. Click the down arrow on the Operators entry in the Values well.
2. Select the option to create a new quick measure.
3. Choose Average (or Min or Max, since they would all be the same) by Group.
4. Delete the original Operators entry and rename the new one.

Note: I am very new to this, so there may be more efficient methods.
I've had a read and a play, and have come up with an alternative that also works:
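For reference, one measure-based way to get this result (my sketch, not necessarily the alternative meant above; the table name Production is taken from the measure in the question) is to sum a single Operators value per group:

```dax
Operators (No Double Count) =
SUMX (
    SUMMARIZE ( Production, Production[Group], Production[Operators] ),
    Production[Operators]
)
```

SUMMARIZE reduces the table to one row per distinct (Group, Operators) pair, so each group's operators are counted once: 3 + 2 = 5 at the total level, while the quantity total is unaffected.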
Trying to find Top 10 products within categories through Regex
I have a ton of products, separated into different categories. I've aggregated each product's revenue within its category, and I now need to locate the top 10. The issue is that not every product has sold within a given timeframe, and some categories don't even have 10 products, leaving me with fewer than 10 values. As an example, these are some of the values:

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,3,5,6,20,46,47,53,78,92,94,111,115,139,161,163,208,278,291,412,636,638,729,755,829,2673
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,57,124,158,207,288,547
0,0,90,449,1590,10492
0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,7,12,14,32,32,37,62,64,64,64,94,100,103,109,113,114,114,129,133,148,152,154,160,167,177,188,205,207,207,209,214,214,224,225,238,238,244,247,254,268,268,285,288,298,301,305,327,333,347,348,359,362,368,373,402,410,432,452,462,462,472,482,495,511,512,532,566,597,599,600,609,620,636,639,701,704,707,728,747,768,769,773,805,833,899,937,1003,1049,1150,1160,1218,1230,1262,1327,1377,1396,1474,1532,1547,1565,1760,1768,1836,1962,1963,2137,2293,2423,2448,2451,2484,2529,2609,3138,3172,3195,3424,3700,3824,4310,4345,4415,4819,4943,5083,5123,5158,5334,5734,6673,7160,7913,9298,9349,10148,11047,11078,12929,18535,20756,28850,63447
63,126

How would you get as close as possible to capturing the top 10 within a category, and how would you ensure that only products that have actually sold are included? And all of this through regex.

My current setup only finds the top 3 and is very basic:

Step 1: ^.*\,(.*\,.*\,.*)$ - finding the top 3
Step 2: ^(.*)\,.*\,.*$ - finding the lowest value of the top 3 products
Step 3: Checking if the original revenue value is higher than, or equal to, the Step 2 value.
Step 4: If yes, then bestseller; otherwise just an empty value.
Thanks in advance
You didn't specify a programming language, so I'm going with JavaScript here, but this regex is quite compatible with almost any regex flavor:

(?:[1-9]\d*,){0,9}[1-9]\d*$

- (?:[1-9]\d*,){0,9} - between 0 and 9 times, find a number followed by a comma; zero revenue is ignored
- [1-9]\d* - guarantee a non-zero revenue one time
- $ - end-line anchor

https://regex101.com/r/1xBQD3/1

If your data were to have leading zeros, like 0,0,00090,00449,01590,10492, for some reason, then you would need this regex, which is 33% more expensive: (?:0*[1-9]\d*,){0,9}0*[1-9]\d*$
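A quick sketch of the regex in action (Node.js; the sample rows come from the question):

```javascript
// At most ten trailing non-zero values from an ascending, comma-separated list
const topTen = /(?:[1-9]\d*,){0,9}[1-9]\d*$/;

const row = "0,0,90,449,1590,10492";
console.log(row.match(topTen)[0]); // "90,449,1590,10492"

// A row with no sales never matches, so zero-revenue products are excluded
console.log("0".match(topTen)); // null
```

Because the lists are sorted ascending and the pattern is anchored at the end, the match is always the largest (up to) ten non-zero values.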
Limiting decimal length in sas
I have 2 datasets which I am comparing. I have taken the difference between each column in the two datasets, but SAS is returning these differences to 15-16 decimal places. How can I limit the output to 8 decimal places? For example, I have column A in dataset 1 and column A in dataset 2, and I have created a new column newA = dataset 1 A - dataset 2 A. The result comes out as 0.0009876543210987654; I want to see it as 0.00098765, i.e. to 8 decimal places.
Use the ROUND function, ROUND(diffvar, 1e-8), or apply a 10.8 format to the difference variable. Or use PROC COMPARE with the FUZZ= option.
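The difference between the two suggestions, illustrated in Python for clarity (ROUND changes the stored value, while a format only changes the display):

```python
diff = 0.0009876543210987654

# Numeric rounding, like SAS ROUND(diffvar, 1e-8): the stored value changes
rounded = round(diff, 8)

# Display-only rounding, like a SAS 10.8 format: the stored value is untouched
print(f"{diff:.8f}")  # 0.00098765
```

If the variable feeds further calculations, prefer the format: it keeps full precision internally while showing 8 decimal places.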
Log function with negative observations
I have the following data and I would like to apply the log() function:

v1
2
3
4
-1
5

Expected output:

v1    log
2     0.30 ~ log(2)
3     0.48 ~ log(3)
4     0.60 ~ log(4)
-1    .
5     0.70 ~ log(5)

This is just a simplified version of the problem. There are 35,000 observations in my dataset, and I could not find any simple rule like drop if v1 <= 0 to solve it. Without screening my data first, one method that came to mind is to use a for loop and run log() over the observations, but I couldn't find anything telling me how to do that.
Stata will return missing if asked to take the logarithm of zero or negative values. So generate log_x = log(x) and generate log_x = log(x) if x > 0 will have precisely the same result: missing values in the observations with problematic values.

The bigger question here is statistical. Why do you want to take logarithms of such a variable anyway? If your idea is to transform a variable, then other transformations are available. If the variable is a response or outcome variable, then a generalized linear model with logarithmic link will work even if there are some zero or negative values; the idea is just that the mean function should remain positive. There have been many, many threads raising these issues on Cross Validated and Statalist.

I can't imagine why you think a loop is either needed or helpful here. With generate statements of the kind above, Stata automatically loops over observations.
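A quick check of this behaviour, sketched in plain Python. (Note that the expected values in the question are actually base-10 logs: Stata's log() is the natural log, while log10() gives base 10.) Here None plays the role of Stata's missing value:

```python
import math

v1 = [2, 3, 4, -1, 5]

# Non-positive values get a missing result instead of raising an error
log_v1 = [round(math.log10(x), 2) if x > 0 else None for x in v1]
print(log_v1)  # [0.3, 0.48, 0.6, None, 0.7]
```

The one-liner, like Stata's generate, handles the whole column at once; no explicit loop over observations is needed.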
How to find the minimal number of steps to make an array contain all identical elements?
I have an array of N items, for example:

4 2 1 1

I want to make all the numbers the same in the minimum number of operations, and I can perform only one type of operation: if I add 1 to any number, I have to subtract 1 from another one. In our example:

Adding 1 to the 4th element and subtracting 1 from the 1st one: 3 2 1 2
Adding 1 to the 3rd element and subtracting 1 from the 1st one: 2 2 2 2

An array element can be 0 but not negative. I am coding this in C++.
Here are some hints:

- Each operation adds 1 to one element and subtracts 1 from another, so the total sum of the elements does not change.
- Since average = sum / n, the average does not change either.
- When a1 = a2 = ... = an, every element equals that average.

Use these hints, and you can figure out an algorithm.
How do I generate a mean by year and industry in Stata
I'm trying to generate in Stata the mean per year (e.g. 2002-2012) for each industry (by 2-digit SIC codes, so c. 50 different industries). I found how to do it for one year with:

by sic_2digit, sort: egen test = mean(oancf_at_rsd10) if fyear == 2004

Is there a more efficient way to do this instead of repeating the command 10 times by hand and then adding the values together?
You can specify more than one variable with by::

by sic_2digit fyear, sort: egen test = mean(oancf_at_rsd10)

Check out the help for by:, which gives the syntax and an example, and also the help for collapse.
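What the two-variable by: computes, sketched in plain Python with invented values (the column names are taken from the question): like egen mean(), every observation receives the mean of its (industry, year) group:

```python
from collections import defaultdict

# Toy rows of (sic_2digit, fyear, oancf_at_rsd10); the values are made up
rows = [
    ("10", 2004, 0.25), ("10", 2004, 0.75),
    ("10", 2005, 0.60),
    ("20", 2004, 1.00), ("20", 2004, 3.00),
]

totals = defaultdict(lambda: [0.0, 0])
for sic, year, value in rows:
    totals[(sic, year)][0] += value
    totals[(sic, year)][1] += 1

means = {key: s / n for key, (s, n) in totals.items()}

# Like egen, attach the group mean to every observation in the group
test = [means[(sic, year)] for sic, year, _ in rows]
print(test)  # [0.5, 0.5, 0.6, 2.0, 2.0]
```

If you want one row per group instead of a new column on every row (the means dictionary here), that is what collapse gives you in Stata.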