PowerBI using customer %% as X-Axis - powerbi

I'm looking to create a line graph with accumulated revenue and -gross profit as 0-100% Value Lines on the Y-axis and then have 0-100% of the amount of customers on the X-axis.
I've managed to get the Y-axis and the accumulated lines working using Months on the X-axis. That's not the hard part though, I need to get customer count 0 - 100% on the X-AXIS and I cannot seem to figure it out.
No particular parameters desired i.r.t. the customer accumulation. We just want to be able to see how much the sales are rising along with the relative distinct count of customers in our database.
This way we can see that the first 20% of customers hold i.e. 50% of the revenue etc.
It's a bit weird, I've tried adding custom columns to calculate the percentage of a customer to the grand total of distinct counts but I cannot seem to get it to accumulate. Perhaps i'm looking entirely in the wrong direction and there's a better solution to this. I'd appreciate any help!
KR,
Maarten

You can make the x-axis as a calculated column. If you don't already have a Customer dimension table, then you can create a new calculated table as follows:
CustomerAxis =
VAR CustomerRev =
ADDCOLUMNS (
DISTINCT ( Data[Customer] ),
"CustomerRevenue", CALCULATE ( SUM ( Data[Revenue] ) )
)
VAR TotalRev = SUM ( Data[Revenue] )
VAR TotalCustomers = DISTINCTCOUNT ( Data[Customer] )
VAR CumulativeCols =
ADDCOLUMNS (
CustomerRev,
"CumulativeRevenue",
SUMX (
FILTER ( CustomerRev, [CustomerRevenue] >= EARLIER ( [CustomerRevenue] ) ),
[CustomerRevenue]
),
"CumulativeCount",
COUNTROWS (
FILTER ( CustomerRev, [CustomerRevenue] >= EARLIER ( [CustomerRevenue] ) )
)
)
RETURN
ADDCOLUMNS (
CumulativeCols,
"% of Customers", DIVIDE ( [CumulativeCount], TotalCustomers ),
"% of Revenue", DIVIDE ( [CumulativeRevenue], TotalRev )
)
Then you can drag and drop these last two % columns into a line chart to get the desired curve.
PBIX file I created: https://www.dropbox.com/s/w6trky7t0h42gkp/Pareto.pbix?dl=0

Related

How can I get DAX to return just the record with a date closest to the slicer?

I'm hoping someone can help as I've completely run out of ideas.
I'm working on performance reporting data, producing a number of visuals to summarise the most recent data. To allow users to retrospectively produce reports from previous quarters, I have added a date slicer as a way to "View data as at xxxx date".
Here's a rough representation of my data table - the due dates are in English format (dd/mm/yyyy):
The ratings are calculated in another system (based on a set of targets), so there are no calculated columns here. In reality, there are a lot more measures that report on different time periods (some weekly, some annually, etc) and there are different lags before the data is "due".
I eventually managed to get a measure that returned the latest actual:
MostRecentActual =
VAR SlicerDate = MAX ( Dates[Day] )
RETURN
CALCULATE (
SUM ( Data[Actual] ),
Data[Data due] <= SlicerDate,
LASTDATE ( Data[Data due] )
)
I'm not completely sure I've done it right but it seems to work. I'd be happier if I understood it properly, so explanations or alternatives would be welcomed.
What I'm trying to do now is a basic summary pie chart at the beginning which shows the proportion of the measures that were red, amber, green or unrated as at the date selected. So I would need it to count the number of each rating, but only one for each measure and only for the date that is closest to (but before) the slicer date, which would vary depending on the measure. So using the above three measures, if the slicer was set to 10/10/2019 (English format - dd/mm/yyyy), it would count the RAGs for Q3 2019/20 for measures A an C and for Q2 2019/20 for measure B as there is a time lag which means the data isn't ready until the end of the month. Results:- A: Amber, B: Green, C:Red.
If I were able to create the measure that counted these RAGs, I would then want to add it to a pie chart, with a legend that is "Rating", so it would split the chart up appropriately. I currently can't seem to be able to do that without it counting all dates before the slicer (not just the most recent) or somehow missing ratings from the total for reasons I don't understand.
Any help would be very gratefully received.
Many thanks
Ben
Further update. I've been working on this for a while!
I have created a COUNTAX measure to try to do what I was wanting to do. In some circumstances, it works, but not all and not in the crucial ones. My measure is:
TestCountaxpt2 =
VAR SlicerDate = MAX ( Dates[Date] )
VAR MinDiff =
MINX (
FILTER (
ALL ( Data ),
Data[Ref] IN VALUES ( Data[Ref] ) &&
Data[Data due] <= SlicerDate
),
ABS ( SlicerDate - Data[Data due] )
)
VAR thisdate =
MINX (
FILTER (
ALL ( Data ),
Data[Ref] IN VALUES ( Data[Ref] ) &&
ABS ( SlicerDate - Data[Data due] ) = MinDiff
),
Data[Data due]
)
RETURN
COUNTAX (
FILTER ( Data, Data[Data due] = thisdate && Data[Ref] IN VALUES ( Data[Ref] ) ),
Data[RAG]
)
It produces the following table for a subset of the performance measures, which looks almost ok:
Table showing the result of the TestCountaxpt2 measure:
The third column is the measure above and it seems to be counting one RAG per measure and the dates look correct as the slicer is set to 3rd January 2020. The total for column 3 confuses me. I don't know what that is counting and I don't understand why it doesn't add up to 7.
If I add in the RAG column from the data table, it goes a bit more wrong:
Same table but with RAG Rating added:
The pie chart that is produced is also wrong. It should show 2 Green, 2 Red, 2 Grey (no rating) and 1 Amber. This is what happens.......
Pie chart for the DAX measure, with RAG Rating in the legend:
I can see what it is doing, which is to work out the most recent due date to the slicer in the whole table and using that (which is 1st Jan 2020) whereas I want it to calculate this separately for each measure.
Link to PBIX:
https://drive.google.com/file/d/1RTokOjAUADGHNXvZcnCCSS3Dskgc_4Cc/view?usp=sharing
Reworking the formula to count the ratings:
RAGCount =
VAR SlicerDate =
MAX ( Dates[Day] )
RETURN
COUNTAX (
ADDCOLUMNS (
SUMMARIZE (
FILTER ( Data, Data[Data due] <= SlicerDate ),
Data[Ref],
"LastDateDue", LASTDATE ( Data[Data due] )
),
"CountRAG", CALCULATE (
COUNTA ( Data[RAG] ),
Data[Data due] = EARLIER ( [LastDateDue] )
)
),
[CountRAG]
)
Here's the table it produces:
The reason for Total = 4 for the third column is straightforward. The SelectDate is maximal over all of the Refs in the table and there are only four Refs that match that date.
To fix this and get the totals you're after, you'll need to iterate over each Ref and calculate the SlicerDate for each independently and only then do your lookups or sums.
I haven't tested this code but it should give you an idea of a direction to try:
MostRecentActual =
VAR SlicerDate = MAX ( Dates[Day] )
RETURN
SUMX (
ADDCOLUMNS (
SUMMARIZE (
FILTER ( Data, Data[Data due] <= SlicerDate ),
Data[Ref],
"LastDateDue", LASTDATE ( Data[Data due] )
),
"SumActual", CALCULATE (
SUM ( Data[Actual] ),
Data[Data due] = EARLIER ( [LastDateDue] )
)
),
[SumActual]
)
Going inside to outside,
FILTER the table to ignore any dates beyond the SlicerDate.
Calculate the LastDateDue for each Ref using SUMMARIZE.
Sum the Actual column for each Ref value using its specific LastDateDue.
Iterate over this summary table to add up SumActual across all Refs in the current scope.
Note that for 4, only the Total row in your visual will contain multiple Refs since the innermost Data table inside FILTER is not the entire Data table but only the piece visible in the local filter context.

Power BI: Use unrelated date table to create a measure

Hi I have been struggling with this for a bit now and I hope I didn't miss a previous question. I am trying to get a count of the vehicles we have based on an EOMONTH date. We are buying and selling cars on a regular basis and for reporting we need to know how many we had at the end of each month and the report is a rolling 12 months.
I've tried creating the relationship with the purchasedate of the vehicle to the date of my date table but when I create the measure (Used to calculate the number of vehicles purchased but haven't been sold):
SalesBlank = CALCULATE(
COUNT(Vehicles[MVANumber]),
FILTER(Vehicles, Vehicles[purchasedate] <= RELATED('Date'[EOMONTH]) && ISBLANK(Vehicles[saledate])))
I only get a count of vehicles purchased that month and don't have a sale date - I'm not surprised because my relationship with the date table is the purchase date.
How can I set up a measure to look at the date table and filter the vehicles table with this logic:
purchasedate <= date[EOMONTH] && ISBLANK(salesdate)
Any help would be greatly appreciated!!
Thanks,
Matt
Sample Data and Desired Results
Relationships
If I understand you correctly, you want to get a count of the vehicles on hand at the end of each month. That could be calculated by counting the vehicles with a purchase date less than or equal to the selected end of month and subtracting the count of vehicles with a sale date less than or equal to the selected end of month.
You can create an active relationship between Vehicle[PurchaseDate] and Date[Date]. Then create an inactive relationship based upon Vehicles[SaleDate] and Date[Date].
You could use a measure that is something like this:
Inventory Count =
VAR MaxDate =
MAX ( 'Date'[Date] )
VAR MinDate =
CALCULATE ( MIN ( 'Date'[Date] ), ALL ( 'Date' ) )
VAR Purch =
CALCULATE (
COUNT ( 'Vehicles'[VehicleID] ),
DATESBETWEEN ( 'Date'[Date], MinDate, MaxDate )
)
VAR Sales =
CALCULATE (
COUNT ( 'Vehicles'[VehicleID] ),
USERELATIONSHIP ( 'Date'[Date], Vehicles[Sale Date] ),
DATESBETWEEN ( 'Date'[Date], MinDate, MaxDate )
)
VAR MaxPurDt =
CALCULATE ( MAX ( 'Vehicles'[Purchase Date] ), ALL ( 'Vehicles' ) )
VAR MaxSlDt =
CALCULATE ( MAX ( 'Vehicles'[Sale Date] ), ALL ( 'Vehicles' ) )
RETURN
IF (
MIN ( 'Date'[Date] ) <= MaxPurDt
|| MIN ( 'Date'[Date] ) <= MaxSlDt,
Purch - Sales
)
This measure gets a cumulative count of purchases and a cumulative count of sales and then subtracts them. The IF check is to avoid propagation of cumulative totals beyond the maximum date in the Vehicle table.
I'm not sure how to interpret your showing of just 3 months in the desired results. This will produce the same answers as what you have, but without a filter applied to the table, it starts at 3/31/2016 (the date of the first sale).
Edit: There's probably a more efficient way along the lines you were thinking, but it is escaping me at the moment.

DAX ALLEXCEPT to sum by category of multiple dimension tables

I would like to calculate total by category. The category is in the dimension table.
Here is sample file:
DAX ALLEXCEPT total by category.pbix
I have the following model:
These are my expected results. Total by Color:
I thought I could achieve expected results by the following measure:
ALLEXCEPT_color =
CALCULATE (
[Sales],
ALLEXCEPT (
FactTable, -- surprisingly 'dim1' table in that place gives wrong results
dim1[Color]
)
)
Or alternatively using method suggested by Alberto Ferrari https://www.sqlbi.com/articles/using-allexcept-versus-all-and-values/:
ALL_VALUES_color =
CALCULATE (
[Sales],
ALL (FactTable), -- again, 'dim1' produces wrong results, has to be FactTable
VALUES ( dim1[Color] )
)
Both these measures work and return proper results. However they multiply displayed results making Cartesian product of all the dimensions. Why? How to prevent it?
I achieve expected results with measure:
Expected_Results_Color =
IF (
ISBLANK ( [Sales] ),
BLANK (),
[ALLEXCEPT_color]
)
Probably I am missing something about ALLEXCEPT function so I do not get what I want for the first shot. What is the logic behind using ALLEXCEPT function with multiple tables, especially with far off dimensions, away from the center of star schema.
What pattern to use? Here I found promising solution which looks like this:
ByCategories =
CALCULATE (
SUM ( FactTable[Sales] ),
ALLEXCEPT (
dim1,
dim1[Color]
),
ALLEXCEPT (
dim2,
dim2[Size]
),
ALLEXCEPT (
dim3,
dim3[Scent]
)
)
But as I tested it before it does not work. It does not aggregate [Sales] by dimensions but produces [Sales] as they are.
So I found out that this is the correct direction:
ByCategories =
CALCULATE (
SUM ( FactTable[Sales] ),
ALLEXCEPT (
FactTable, -- here be difference
dim1[Color],
dim2[Size],
dim3[Scent]
)
)
I speculate there might be also another way.
Measure =
var MyTableVariable =
ADDCOLUMNS (
VALUES ( dim1[color] ),
"GroupedSales", [Sales]
)
RETURN
...
If only we could retrieve single scalar value of GroupedSales from MyTableVariable and match it with appropriate color in table visual.
I would be very grateful for any further insights in calculating total for category.
This is expected behaviour.
Power BI tables will include every row for which any measure in the table does not evaluate to BLANK().
ALLEXCEPT stops the values in the id and size columns from affecting the filter context when [Sales] is computed, and so every possible value for these two columns will give the same (non-blank) result (this causes the cartesian product that you see).
For example, on the (a, black, big) row, the filter context for the measures contains:
FactTable[id] = {"a"}
dim1[color] = {"black"}
dim2[size] = {"big"}
Then CALCULATE([Sales], ALLEXCEPT(...)) removes the FactTable[id] and dim2[size] from the filter context when evaluating [Sales]; so the new filter context is just:
dim1[color] = {"black"}
[Sales] in this filter context is not BLANK(), so the row is included in the result.
The proper way to fix this is to wrap the result in an IF, as you do in your Expected_Results_Color measure, or to add a filter on [Sales] not Blank to the table in Power BI.

DAX AVERAGEX including a 0 for total average

My table represents users working on a production line. Each row in the table provides the number of units a user produced within a 15 minute window. I am trying to calculate Units/Hour per User (which seems to be working fine), but my overall Average seems to be off.
Table and results of my measure:
Row by row it is what I am looking for. But the total average of 179.67 is wrong. It should be 196. I think for the 11:30 timestamp, Leondro did not have any work, and it is including a 0 for him. I would like to exclude that.
Measure:
UPH =
var unitshour = CALCULATE(SUM(Table1[Units]) / (DISTINCTCOUNT(Table1[DateTime])/4))
var users = AVERAGEX( VALUES(Table1[DateTime]), DISTINCTCOUNT(Table1[Username]))
RETURN
unitshour/ users
I don't think 196 is the number you want if you want to treat each time period equally. I'd suggest this alternative:
UPH =
AVERAGEX (
VALUES ( Table1[DateTime] ),
CALCULATE ( 4 * SUM ( Table1[Units] ) / DISTINCTCOUNT ( Table1[Username] ) )
)
If you want each time period to be weighted by the number of users in that time period, then the 196 it what you want.
UPHUserWeighted =
VAR Summary =
SUMMARIZE (
Table1,
Table1[DateTime],
Table1[Username],
"UPH", 4 * SUM ( Table1[Units] ) / DISTINCTCOUNT ( Table1[Username] )
)
RETURN AVERAGEX ( Summary, [UPH] )

Calculate cumulative sum of summarized table column

I having trouble calculating the cumulative sum of a column on PowerBI.
I have a big offer table and I want to run a pareto analysis on it. Following many tutorials, I created a SUMMARIZED table by offer and a sum of their sales. So the table definition is:
summary = SUMMARIZE(big_table; big_table[offer]; "offer sales"; sum(big_table[sales]))
Many of the forums and stackoverflow answers I found have direct me to the following formula for cumulative sum on column:
cum_sales =
CALCULATE(
sum([offer_sales]);
FILTER(
ALLSELECTED(summary);
summary[offer_sales] <= max( summary[offer_sales])
)
)
However the resulting table is not correct:
What I need is simply to have the offers ordered by sales descending and then add the current row's sales amount to the previous row's sales,
So I excepted numbers closer to:
1st row: 1.5M
2nd row: 2.1M
3rd row: 2.6M and so on
But (maybe) because of my data structure and (certainly) lack of knowledge on how PowerBI works, I'm not getting the right results...
Total Amount = SUM ( 'Fact'[Amount] )
Offer Visual Cumulative =
VAR OfferSum =
ADDCOLUMNS (
ALLSELECTED ( 'Offer'[Offer] ),
"amt", [Total Amount]
)
VAR CurrentOfferAmount = [Total Amount]
VAR OffersLessThanCurrent =
FILTER (
OfferSum,
[amt] <= CurrentOfferAmount
)
RETURN
SUMX (
OffersLessThanCurrent,
[amt]
)
There's no need to pre-aggregate to a summary table. We can handle that as in the measure above.
This assumes a single fact table named 'Fact', and a table of distinct offers, 'Offer'.
Depending on what you're doing in terms of other filters on 'Offer', you may need to instead do as below:
Offer Visual Cumulative =
VAR OfferSum =
ADDCOLUMNS (
ALLSELECTED ( 'Offer'[Offer] ),
"amt", CALCULATE ( [Total Amount], ALLEXCEPT ( 'Offer', 'Offer'[Offer] ) )
)
...
The rest of the measure would be the same.
The measure is fairly self-documenting in its VARs. The first VAR, OfferSum is a table with columns ('Offer'[Offer], [amt]). This will include all offers displayed in the current visual. CurrentOfferAmount is the amount for the offer on the current row/axis label of the visual. OffersLessThanCurrent takes OfferSum and filters it. Finally, we iterate OffersLessThanCurrent and add up the amounts.
Here's a sample: