Composing comparison groups based on different categorical columns in PowerBI/DAX - powerbi

I have a dataset from a survey, consisting of unique rows (each representing a respondent who has answered a series of questions).
The following columns are in the dataset:
Respondent ID (unique)
Business Unit (categorical)
Level (categorical)
Gender (categorical)
Columns for each question with an answer from 1 to 9
I transposed the data to be able to calculate avg, stdev etc for each of the questions, and make visuals for each of the categories. So far so good.
But now I want to take it one step further:
I want to build a comparison visual, where I would like users of the dashboard to be able to choose which "groups" they want to compare, but from different categories (BU, Gender, Level).
E.g., compare the average scores for all questions between Business Unit = X (=group 1), and Gender = Male (=group 2).
It would be even better if people could "compose" their comparison groups from all of the categories.
E.g., compare the average score for all questions between BU=X-Gender=Male-Level=Manager and BU=X (so in essence compare a subgroup with the BU total), or any of the combinations.
Any insights on how to do this would be much appreciated!
Thanks in advance 🙂
Dieter from Brussels

Related

Looking for DAX formula to calculate percentage per category for a year for a multiple choice ques

I have done almost everything but not able to achieve the desired solution.
I have two surveys for 2020 and 2021 which have multiple-choice questions.
So there are two tables:
Table 1
Table 2
Since it's a multiple-choice question, I want to calculate the percentage of respondents who chose, e.g., Western Europe as an attractive region, based on the total number of respondents for a survey. For this, I have linked the above two tables using Survey ID.
I want a measure to calculate the last column in below table 3:
Table 3
E.g., for Western Europe, 3 respondents in the 2021 survey marked it as an attractive region. The percentage I want is 3 divided by 4 (respondents in the 2021 survey). The percentage I am getting is 3/9 (respondents in the 2020 and 2021 surveys combined).
This is the DAX I am using:
Denominator = CALCULATE(SUM('Table 2'[Weightage]), ALLEXCEPT('Table 2', 'Table 2'[Survey year]))
%category = DIVIDE(SUM('Table 1'[Weightage]), [Denominator], 0)
Please suggest a change to the DAX to get the desired percentage. Since it's a multiple-choice question, I am facing issues. I really need this resolved; please suggest what can be done. Thank you.

Calculating percentage filtering by rows which have the same key (groups of rows)

I have survey data: a database of students' evaluations of their professors for each course they took in a given semester. Each row is basically a questionnaire, containing a single student's evaluation. Columns carry data about which professor, on which course and at what time, and the questions asked. It's a flat table, with nearly 70k rows per semester.
In the end, I have to be able to calculate the percentage of positive values within each group (Likert scale, only considering values 3 and 4 as positive mentions). Each group (or course, basically) is defined by a composite key (course id, section, year, semester, teacher id), which I've put in a calculated column as a string. Each resulting group key is a value on a row, repeated for the rows that pertain to the same group (so that I end up with, say, 20 students evaluating a professor on a course, all having the same group key).
My brain is fried. I've tried unpivoting the question columns, using declared variables to hold the temporary values, and calculating with filters, but I've ended up with nothing. Any assistance or experience with this sort of analysis would be helpful. I can provide sample data if required.
Hope you guys have a better day than I'm having lol

Modify measure to only affect Totals in Matrix Table

I'm new to PBI Desktop & StackOverflow in general. If anyone can help me out with this situation, much appreciated.
I created this matrix table where it displays the count of male/female students in a student division (I just made up these numbers). I created a measure called 'Percentage of males' that calculates the number of male students and divide it by the total number of students in each division.
The calculation wasn't the hard part, but the issue is when I place the measure in the values field. Instead of showing for totals only, it appears under the female & male subcategories as well (refer to image). Does anyone know if there's a way to modify the measure so that it only affects the total?
I tried looking at some videos, but couldn't really find one for this situation.
Proportion of Males = DIVIDE(
    COUNTX(FILTER(Sheet2, Sheet2[M/F] = "M"), Sheet2[M/F]),
    COUNTA(Sheet2[M/F]),
    0
)

Power BI: Opportunity model (relating 3 tables)

I'm trying to create an "opportunity" calculation model, but the output results that I'm getting are not accurate. A sample problem would be McDonald's and Burger King, who sell food in various regions. Some regions have both BK and McD in the area selling similar food types; in others both are present but can't fulfill the same order type (an example would be zip code 10049, where BK and McD both exist, but McD sells burgers and BK sells salads, so BK can cover the area but can't fulfill what a potential customer wants).
In the example spreadsheet, there are three tabs, first with McD sales, second with BK sales, and the third reconciles the naming convention between McD's and BK's orders.
I started by connecting the tables with relationships. I figured I need to connect McD to BK by Zip, then McD to Crossreference. Due to many-to-many relationship limitations in PBI, I'm forced to create lookup tables with unique values for zips, and order names. Looks a bit messy, but does the job. The problem is that I can look up the zip code connections, but not the sales for the potential orders.
Relationships:
This is a clear example of how things don't work. Zip code 10048 sums up McD's sales and displays it for each BK order type. The expected output would be $5 for angus and $3 for onion soup, $8 in total.
If I try to connect crossreference BK order names to BK orders, then I get an ambiguity error.
Spreadsheet data file:
https://docs.google.com/spreadsheets/d/1WM9OD7voApax7uNJ6_bJk75zfj9FQN9tSf2jU1hXl7c/edit?usp=sharing
Excel and Power BI files: https://drive.google.com/drive/folders/1hOdP5ZglHcqo_dk2GMlXr6Xmc5Ywm6nj?usp=sharing
I don't think you'll be able to do this exactly how you want to. It would basically be equivalent to creating multiple active relationships between two tables, and Power BI doesn't let you do that.
There are some workarounds though. For example, you can pull over the McD[order] values into your BK table using a calculated column:
McD Order = MAXX(FILTER(Crossreference, Crossreference[BK] = BK[order]), Crossreference[McD])
This will allow you to pull across the price from the McD table using a lookup or similar max type function:
Price = LOOKUPVALUE(McD[price], McD[order], BK[McD Order], McD[Zip], BK[Zip])
or
Price = MAXX(FILTER(McD, McD[Zip] = BK[Zip] && McD[order] = BK[McD Order]), McD[price])
Once you do that, you can work entirely on the BK table.
Note that some price rows will have nulls since there was no corresponding McD row with matching zip and order. (I suppose you could take the median price of those orders over the zip codes they do exist in and plug that in those cases...) If the price is uniform across zip codes, then this can be made simpler.
Also, notice that when you put the price into a table and use an implicit measure on it, it will likely default to a sum and you'll get $6 for 10048 and angus since you have duplicate rows. Switching to max will get you the $3 if that's what you prefer.
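If you would rather not depend on the visual's default aggregation, an explicit measure is one option. This is only a sketch: it assumes the calculated column above was added to the BK table under the name Price, and the measure name is made up for illustration.

```dax
-- Assumes BK[Price] is the calculated column defined above.
-- "Max McD Price" is a hypothetical measure name.
-- MAX returns a single price even when duplicate rows exist for a zip/order pair.
Max McD Price = MAX ( BK[Price] )
```

Placing this measure in the table instead of the raw column should show $3 rather than $6 for 10048 and angus, since duplicate rows are no longer summed.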
This type of merging is also possible to do in the query editor, but I couldn't play with that on the pbix you included since I don't have access to the data source on your C: drive.

How to display data from different data source tables in a single table in Power BI

I have a couple of different tables in my Report, for demonstration purposes lets say that I have 1 data source that is Actual Invoice amounts and then I have another table that is Forecasted amounts. Each table has several dimensions that are the same between them, let say Country, Region, Product Classification and Product.
What I want is to be able to display a table/matrix that pulls information from both of these data sources like this
Description               Invoice  Forecast  vs Forecast
USA                           300       325          92%
  East                        150       175          86%
    Product Grouping 1        125       125         100%
      Product 1                50        75          67%
      Product 2                75        50         150%
    Product Grouping 3         25        50          50%
      Product 3                25        50          50%
  West                        150       150         100%
    Product Grouping 1         75       100          75%
      Product 1                25        50          50%
      Product 2                50        50         100%
    Product Grouping 3         75        50         150%
      Product 3                75        50         150%
I have not been able to figure out a way to combine the information from the multiple data source into a single matrix table, so any help would be appreciated. The one thing that I did find was somebody hard coded the structure of the rows into a separate data source and then used DAX expressions to pull in the pieces of information into the columns, but I don't like this solution because the structure of the rows is not constant.
What you're asking about is a common part of the star schema: combining facts from different fact tables together into a single visual or report.
What Not To Do (That You Might Be Tempted To)
What you don't want to do is combine the 2 fact tables into a single table in your Power BI data model. That's a lot of work and there's absolutely no need, especially since there are likely dimensions that the 2 fact tables do not have in common (e.g. actual amounts might be associated with a customer dimension, but forecast amounts wouldn't be).
What you also don't want to do is relate the 2 fact tables to each other in any way. Again, that's a lot of work. (Especially since there's no natural way to relate them at the row level.)
What To Do
Generally, you handle 2 fact tables the same way you handle a single fact table. First, you have your dimensions (country, region, classification, product, date, customer). Then you load your fact tables, and join them to the dimensions. You do not join your fact tables to each other. You then create measures (i.e. DAX expressions).
When you want to combine measures from the two facts together in a single matrix, you only use rows/columns that are meaningful to both fact tables. For example, actual amounts might be associated with a customer, but forecast amounts aren't. So you can't include customer information in the matrix. Another possibility is that actual amounts are recorded each day, whereas forecasts were done for the whole month. In this situation, you could put month in your matrix (since that's meaningful to both), but you wouldn't want to use date because Power BI wouldn't know how to divide up forecasts to individual dates.
As long as you're only using dimensions & attributes that are meaningful to both fact tables, you can easily create a matrix as you envision above. Simply drag on the attributes you want, then add the measures (i.e. DAX expressions).
The Invoice & Forecast columns would both be measures. The two measures from different fact tables can be combined into a 3rd measure for the vs. Forecast measure. Everything will work as long as you're just using dimensions/attributes that mean something to both fact tables.
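As a rough sketch, the three measures could look like the following. The table and column names (Invoices[Amount], Forecasts[Amount]) are assumptions for illustration, since the question doesn't show the actual model; substitute your own.

```dax
-- Hypothetical fact tables: Invoices and Forecasts, each with an Amount column
-- and each related to the shared dimension tables (not to each other).

-- Measure 1: total invoiced amount
Invoice = SUM ( Invoices[Amount] )

-- Measure 2: total forecasted amount
Forecast = SUM ( Forecasts[Amount] )

-- Measure 3: ratio of the two; DIVIDE returns BLANK when the forecast is zero or blank
vs Forecast = DIVIDE ( [Invoice], [Forecast] )
```

Each definition is entered as its own measure. With the third one formatted as a percentage, the USA row in the example would show 300 / 325, i.e. about 92%.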
I don't see anything in your proposed pivot table that strikes me as problematic.
Other Situations
If you have a situation where forecasts are at a month level and actual is at a date level, then you may be wondering how you'd relate them both to the same date dimension. This situation is called having different granularities, and there's a good article here I'd recommend reading that has advice: https://www.daxpatterns.com/handling-different-granularities/. Indeed, there's a whole section on comparing budget with revenue that you might find useful.
Finally, you mention that someone hard-coded the structure of the rows and used DAX expressions to build everything. This does, admittedly, sound like overkill. The goal with Power BI is flexibility. Once you have your facts, measures & dimensions, you can combine them in any way that makes sense. Hard-coding the rows eliminates that flexibility, and is a good clue that something isn't right. (Another good clue that something isn't right is when DAX expressions seem really complicated for something that should be easy)
I hope my answer helps. It's a general answer since your question is general. If you have specific questions about your specific situation, definitely post additional questions. (Sample data, a description of the model, the problem you're seeing, and what you want to see is helpful to get a good answer.)
If you're brand new to Power BI, data models, and the star schema, Alberto Ferrari and Marco Russo have an excellent book that I'd recommend reading to get a crash course: https://www.sqlbi.com/books/analyzing-data-with-microsoft-power-bi-and-power-pivot-for-excel/