I have a data model in Power BI that, among other things, has the following tables:
Employees (dimension; employee ID/name)
Jobs (dimension; contains details about the job, including job ID)
Employee history (snapshot table; contains a record for each day an associate was in a job)
Job Budget History (snapshot table; contains a record for each day a job was budgeted)
Calendar table
The tables are modeled like so (simplified version):
In Power BI, I am trying to make a simplified table view that contains measures based on both the budget history as well as the employee history for the most recent day in the dataset (simple count rows/distinct count of calendar table)
However, attempting to do so gives me the results below if I try to put both measures on the table. Basically it appears to be doing a cross join between each table and matching associates with jobs they don't have (this happens when the budget is added).
Of course, if I just do one of the singular measures everything works perfectly. I am fairly certain it is because there is no real connection between the 'employee' and the 'budget history' in this relationship, so it is just joining everything on the date without any context.
I have tried several things, such as making inactive relationships with USERELATIONSHIP(), using visual-level filters, etc., but I'm not sure what the best option in this situation would be. (I am trying to avoid bidirectional relationships if possible.)
Ideally, the table should show that on this date Joe was present as President, Sally was present as an Operator, and the Manager position had nobody, but all three were budgeted.
Any advice is appreciated. I have attached a simplified mockup pbix file for reference.
PBIX File
This is a complicated problem for many reasons. I was able to produce this report:
by removing the "Name" field from the table and replacing it with a measure:
Employee Name =
CALCULATE(
    SELECTEDVALUE(Employees[Name]),
    CROSSFILTER(Employees[Employee_ID], Employee_History[Employee_ID], BOTH)
)
It looks exactly like the report you want, but if you have additional requirements, you'll need to make sure that this approach works for you.
If this is acceptable, a brief explanation:
The root cause of the issue is the missing Employee-Budget relationship. When you put Name in the table, it acts as a filter that doesn't propagate to the budget table, which causes a Cartesian product.
Removing Name from the table eliminates the need for the filter propagation, but then you won't see employee names. I solved this by pulling employee names with the measure, where the required propagation is forced by the CROSSFILTER function (essentially, it's like a temporary bidirectional relationship that exists only when you need it, so it does not negatively affect the rest of the model).
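To make the Cartesian product concrete, here is a minimal Python sketch of the behaviour described above. It is only an analogy, not DAX and not how the Power BI engine is implemented; the sample rows are made up to mirror the Joe/Sally example, and the function names are hypothetical.

```python
# Hypothetical sample data mirroring the example in the question.
employees = {1: "Joe", 2: "Sally"}
employee_history = [  # (employee_id, job) rows for the latest date
    (1, "President"),
    (2, "Operator"),
]
budget_history = ["President", "Operator", "Manager"]  # jobs budgeted that day

def budgeted(job, name_filter=None):
    # Budget history has no employee column, so a filter on Name cannot
    # restrict it: the result is the same for every name.
    return 1 if job in budget_history else 0

def employee_count(job, name):
    return sum(1 for emp_id, j in employee_history
               if j == job and employees[emp_id] == name)

# Name as a column on the visual: every (name, job) pair with any
# non-blank measure is shown, so each name appears next to every job.
rows_with_name = [(n, j) for n in employees.values() for j in budget_history
                  if employee_count(j, n) or budgeted(j, n)]

# Name as a measure: iterate over jobs only, then look the name up
# through the history rows (roughly what SELECTEDVALUE returns).
def employee_name(job):
    names = {employees[e] for e, j in employee_history if j == job}
    return names.pop() if len(names) == 1 else None

rows_with_measure = [(employee_name(j), j) for j in budget_history]
```

With Name as a column, the sketch yields six rows (two names times three jobs); with the name looked up per job, it yields one row per budgeted job, and the Manager row has no name.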
I would like two measures that SUM the Sales[Value] for all the Sales[ID] that have a specific StatusID in SalesStatus.
One that can filter on Sales[Date], and one that can filter on SalesStatus[statusDate]
Diagram
Regards,
Anders
In this scenario I would consider modifying your model to have only two tables by combining what appears to be two FACT tables (sales, sales status). Depending on what your data consists of I would either UNION the two tables after joining and then treat the date in your Sales table as another status date (i.e. shipped complete or sale finished, whatever that date represents) OR I would join the two tables and have two relationships to the date table.
This will create a data duplication issue, since you will ideally end up with the value column in your final fact table. If you go with the union option, you can force the user to select a single sales status, effectively removing the sales duplication. If you end up with two connections to the date table, you can use the USERELATIONSHIP() function to write the two different sales measures, and the one that uses the date from the Sales table will need some clever tricks to ensure the measure does not count duplicated data. I would try to UNION the tables, though.
For more details, I would research what are referred to as semi-additive fact tables in data warehousing. There is a great article from SQLBI on the subject. I have tried setting up models like the one you diagrammed, and even when I could get them to work through intense DAX measures, they would produce unexpected results and perform poorly. I find the semi-additive fact table pattern to be a much cleaner solution once you get past the data duplication that results.
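As a rough illustration of the union option, the following Python sketch builds one combined fact table and shows why a single status must be selected. The rows, the "Sold" label, and the assumption that SalesStatus carries the sale's ID are all hypothetical, since the original diagram is not available.

```python
from datetime import date

# Hypothetical rows; column names are assumptions based on the question.
sales = [
    {"ID": 1, "Date": date(2020, 1, 5), "Value": 100},
    {"ID": 2, "Date": date(2020, 1, 7), "Value": 250},
]
sales_status = [
    {"ID": 1, "StatusID": "Shipped", "statusDate": date(2020, 1, 6)},
    {"ID": 1, "StatusID": "Invoiced", "statusDate": date(2020, 1, 9)},
    {"ID": 2, "StatusID": "Shipped", "statusDate": date(2020, 1, 8)},
]

value_by_id = {s["ID"]: s["Value"] for s in sales}

# Join: each status row picks up the Value of its sale.
status_rows = [
    {"ID": r["ID"], "StatusID": r["StatusID"], "Date": r["statusDate"],
     "Value": value_by_id[r["ID"]]}
    for r in sales_status
]
# Union: the sale date becomes one more status ("Sold" is a made-up label).
sold_rows = [
    {"ID": s["ID"], "StatusID": "Sold", "Date": s["Date"], "Value": s["Value"]}
    for s in sales
]
fact = sold_rows + status_rows

def total_value(status_id):
    # Value is repeated once per status, so always filter to one status.
    return sum(r["Value"] for r in fact if r["StatusID"] == status_id)

# Summing without a status filter counts each sale once per status row.
naive_total = sum(r["Value"] for r in fact)
```

Here total_value("Sold") gives the true sales total of 350, while naive_total is 800 because each Value is counted once per status. That is the duplication a forced single-status selection avoids.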
Example:
Tableau automatically groups measures together but Power BI Desktop doesn't natively support this. I find it annoying to have to place measures under imported tables as the measures don't really belong to those "parent tables" (and quite often take input from multiple tables — which one would you consider the "parent"?)
So I experimented with some workarounds and I'm sharing the successful (as of the date of this post) ones here:
Method 1 (recommended): "Model" view > "Enter data" to enter a manual data table. Give a name like "_Measures_" so it appears on top of tables, and keep only the default dummy column "Column". Create/move measures under it, then right click to delete that "Column". Now you're left with a blank table that groups those measures under it.
Method 2: "Data" view > "New table" to create a DAX calculated table. The rest is the same as above, except that for a DAX calculated table you can't delete the dummy column; you can only hide it.
You can also "Enter data" using Power Query Editor but I don't recommend going with that extra step -- workarounds are supposed to be quick (and dirty) anyways!
Final results look like this (note the difference of the icons):
I've imported three tables into Power BI using a REST API, have added relationships between them, and am now trying to add fields from the various tables to a table on the canvas. The tables (from a Human Resources database) are named Employees, Job History, and Salary History.
Employees is joined to Job History using EmployeeID as a 1:Many relationship, and also to Salary History using EmployeeID on a 1:Many relationship.
I can add fields from the Employee table and EITHER the Salary History OR the Job History table to the table on the report canvas. However, if I try to add fields from all three tables, I'm seeing the error 'Can't display the data because Power BI can't determine the relationship between two or more fields'.
Could anyone advise where I'm going wrong? Many thanks.
If I understand correctly, you have a model like in this picture:
The way PBI filters work is: The 1: side table filters the N: side table. Filters propagate that way. So in this case, you can filter JobHistory with data from Employees, and SalaryHistory with data from Employees. But the 2 fact tables can't relate because the filters don't propagate that way.
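The propagation rule can be sketched in a few lines of Python. This is a simplified model of single-direction filtering, not Power BI's actual engine; the function name is hypothetical and the table names come from the question.

```python
# Relationships as (one_side, many_side); filters flow from the first
# table to the second, never the other way round.
relationships = [
    ("Employees", "Job History"),
    ("Employees", "Salary History"),
]

def tables_filtered_by(table):
    """Return the tables reached by a filter placed on `table`."""
    reached, frontier = set(), [table]
    while frontier:
        current = frontier.pop()
        for one_side, many_side in relationships:
            if one_side == current and many_side not in reached:
                reached.add(many_side)
                frontier.append(many_side)
    return reached

from_employees = tables_filtered_by("Employees")
from_job_history = tables_filtered_by("Job History")
```

A filter on Employees reaches both history tables, whereas a filter on Job History reaches nothing, which is why fields from the two fact tables cannot be related to each other directly.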
Look into DAX functions like RELATED(), RELATEDTABLE() and USERELATIONSHIP() that might work for you.
Without that, I don't think you can use data from the 3 tables, since you have a model with 2 Fact Tables.
Someone correct me if I'm wrong.
I have created a Power Pivot model, shown in the image below. I am trying to include the "IncurredLoss" value and have it sliced by time. Written Premium is in the fact table and is displaying correctly. I am aiming for IncurredLoss to display in a similar fashion.
I have tried the following solutions:
Add new related column: Related(LossSummary[IncurredLoss]). Result: No data
DAX Summary Measure: =CALCULATE(SUM(LossSummary[IncurredLoss])). Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
Simply adding the Incurred Loss column to the Pivot Table panel. Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
A few other notes:
LossKey joins LossSummary to PolicyPremiumFact
Reportdate joins PolicyPremiumFact to the Calendar.
There is 1 row in LossSummary per date and Policy. LossKey contains this information and is the PK on that table.
Any ideas, clarifications or pointers are most certainly welcome. Thank you!
The related column should work. I was able to get it to work in both Excel 2016 and Power BI Desktop. Rather than bombarding you with questions, I'll try and walk through how I would troubleshoot further, in the hopes it gets you to a solution faster:
First, check the PolicyPremiumFact table inside Power Pivot and see if the IncurredLossRelated field is blank or not. If it is consistently blank, then the related column isn't working. The primary reason the related column wouldn't work is if there's a problem with your relationships. Things I would check:
Ensure that the relationships are between the fields you think they are between (i.e. you didn't accidentally join LossKey in one table to a different field in the other table)
Ensure that the joined fields contain the same data (i.e. you didn't call a field LossKey, but in fact, it isn't the LossKey at all)
Ensure that the joined fields are the same data type in Power Pivot (this is most common with dates: e.g. joining a text field that looks like a date to an actual date field may work, but not act as expected)
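The data type check is the easiest one to overlook, so here is a small Python analogy of what a mismatch does to a lookup. The values are hypothetical, and Power Pivot may coerce or reject mismatched types differently from Python; the sketch only shows why keys that look alike need not match.

```python
from datetime import date

# Hypothetical keys: the fact table stores dates as text,
# while the calendar table stores real dates.
fact_report_dates = ["2016-01-31", "2016-02-29"]
calendar = {date(2016, 1, 31): "Jan 2016", date(2016, 2, 29): "Feb 2016"}

# Text keys never equal date keys, so every lookup comes back empty.
raw_matches = [calendar.get(d) for d in fact_report_dates]

# Converting both sides to the same type first makes the join behave.
typed_matches = [calendar.get(date.fromisoformat(d)) for d in fact_report_dates]
```

The raw lookups all return None even though the text looks like valid dates; after conversion, each row finds its calendar entry.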
If none of the above is the problem, it doesn't hurt to walk through your data for a given date in Power Pivot. E.g. filter your PolicyPremiumFact table to a specific date and look at the LossKeys. Then go to the LossSummary table and filter to those LossKeys. Stepping through like this might reveal an oversight (e.g. maybe the LossKeys weren't fully loaded into your model).
If none of the above reveals anything, or if the related column is not blank inside Power Pivot, my suggestion would be to try a newer version of Excel (e.g. Excel 2016), or the most recent version of Power BI Desktop.
If the issue still occurs in the most recent version of Excel/Power BI Desktop, then there's something else going on with your data model that's impacting the RELATED calculation. If that's the case, it would be very helpful if you could mock up your file with sample data that reproduces the problem and share it.
One final suggestion I have is to consider restructuring your tables before they arrive in your data model. In your case, I'd recommend restructuring PolicyPremiumFact to include all the facts from LossSummary, rather than having a separate table joined to your primary fact table. This is what you're doing with the RELATED field to some extent, but it's cleaner to do before or as your data is imported into Power Pivot (e.g. using SQL or Power Query) rather than in DAX.
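A minimal sketch of that restructuring, written in Python rather than SQL or Power Query, might look like the following. The rows and key values are invented, and the column names WrittenPremium and ReportDate are assumptions based on the question's description.

```python
from collections import defaultdict

# Hypothetical LossSummary: LossKey -> IncurredLoss (one row per date and policy).
loss_summary = {"P1-2016-01": 40.0, "P2-2016-01": 0.0, "P1-2016-02": 15.0}

# Hypothetical PolicyPremiumFact rows.
policy_premium_fact = [
    {"LossKey": "P1-2016-01", "ReportDate": "2016-01", "WrittenPremium": 100.0},
    {"LossKey": "P2-2016-01", "ReportDate": "2016-01", "WrittenPremium": 80.0},
    {"LossKey": "P1-2016-02", "ReportDate": "2016-02", "WrittenPremium": 100.0},
    {"LossKey": "P3-2016-02", "ReportDate": "2016-02", "WrittenPremium": 60.0},
]

# Left join: keep every fact row and add IncurredLoss where a LossKey matches.
flattened = [
    {**row, "IncurredLoss": loss_summary.get(row["LossKey"])}
    for row in policy_premium_fact
]

# With the loss on the fact rows, slicing by date is a plain group-by.
incurred_by_date = defaultdict(float)
for row in flattened:
    incurred_by_date[row["ReportDate"]] += row["IncurredLoss"] or 0.0
```

Once IncurredLoss sits on the same rows as the report date, it is sliced by time in the same way as Written Premium; fact rows without a matching LossKey simply carry no loss.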
Hope some of this helps.
I have two tables from Azure SQL in PowerBI, using direct query:
EMP(empID PK)
contactInfo(contactID PK, empID FK, contactDetail)
which have an obvious one-to-many relationship from EMP.empID to contactInfo.empID. The foreign key constraint is successfully enforced.
However I can only create a many-to-one relationship (contactInfo.empID to EMP.empID) in PowerBI. If I ever try the opposite, PowerBI always automatically converts the relationship to many-to-one (by swapping the from and to column), which prevents me from creating visuals. Does PowerBI think the two are equivalent?
Update:
What I'm doing is to just create a table in PowerBI showing the join results of these two tables. The foreign key constraint is contactInfo.empID REFERENCES EMP.empID, which is many-to-one. That should not be a problem, I guess, since I can directly query the join using SQL.
Please also suggest if I should create the foreign key in the opposite direction.
More info on failure to create visual
The exact error message is:
Can't display the data because Power BI can't determine
the relationship between two or more fields.
Version: 2.43.4647.541 (PBIDesktop)
To reproduce the error:
DB schema is as follows:
What I want is a table in PowerBI showing the contact and sales info of an employee, that is, joining all four tables. The error occurs when the Values of the table visual contain "empName, contactDetail, contactType, productName"; however, the error does NOT occur if I only include "empName, contactDetail, contactType" or "empName, productName". At first I thought the problem might lie in the relationship between contactInfo and emp, but it now seems to be more complicated. I guess it may be caused by multiple one-to-many relationships?
Expanding my comments to make an answer:
Root of the Problem
In your data model, a single employee can have multiple contacts and multiple sales. But, there's no way for Power BI to know which contactDetail corresponds to which productName, or vice versa (which it needs to know to display them together in a table).
Deeper Explanation
Let's say you have 1 emp row, that joins to 10 rows in the sales table, and 13 rows in the contactInfo table. In SQL, if you start from the emp row and outer join to the other 2 tables, you'll get back (1*10)*(1*13) rows (130 rows in total). Each row in the contactInfo table is repeated for each row in the sales table.
That repetition can be a problem if you do something like sum the sales and don't realize a single sales record is repeated 13 times but might be fine otherwise (e.g. if you just want a list of sales and all associated contacts).
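The row multiplication is easy to reproduce with an in-memory SQLite database from Python. The table layout below is a stripped-down assumption (the salesID column in particular is hypothetical); only the 1, 10 and 13 row counts come from the explanation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empID INTEGER PRIMARY KEY);
    CREATE TABLE sales (salesID INTEGER PRIMARY KEY, empID INTEGER);
    CREATE TABLE contactInfo (contactID INTEGER PRIMARY KEY, empID INTEGER);
    INSERT INTO emp VALUES (1);
""")
# One employee with 10 sales rows and 13 contact rows.
conn.executemany("INSERT INTO sales VALUES (?, 1)", [(i,) for i in range(10)])
conn.executemany("INSERT INTO contactInfo VALUES (?, 1)", [(i,) for i in range(13)])

# Outer join from emp to both child tables.
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM emp
    LEFT JOIN sales ON sales.empID = emp.empID
    LEFT JOIN contactInfo ON contactInfo.empID = emp.empID
""").fetchone()[0]

# How often a single sales row appears in that result.
repeats_per_sale = conn.execute("""
    SELECT COUNT(*)
    FROM sales
    JOIN contactInfo ON contactInfo.empID = sales.empID
    WHERE sales.salesID = 0
""").fetchone()[0]
```

The join returns 130 rows, and each individual sale shows up 13 times, once per contact.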
Power BI vs. SQL
Power BI works slightly differently. Power BI is designed primarily to aggregate numbers, and then break them down by different attributes. E.g. sales by product. Sales by contact. Sales by day. In order to do this, Power BI needs to know 100% how to divide numbers up between the attributes on your table.
At this point, I'll note that your database diagram doesn't include any obvious metrics that you'd use Power BI to aggregate. However, Power BI doesn't know that. It behaves the same whether you have metrics to aggregate or not. (And failing all else, Power BI can always count your rows to make a metric.)
Let's say that you have a metric on your sales table called Amt Sold. If you bring in the empName, productName, and Amt Sold columns, Power BI will know exactly how to divide up Amt Sold between empName and ProductName. There's no problem.
Now add in contactDetail. Using your database diagram, Power BI has no way of knowing how an Amt Sold metric in the sales table relates to a given contactDetail. It might know that $100 belongs to empID 27. And that empID 27 corresponds to 3 records in the contactInfo table. But it has no way of knowing how to divide up the $100 between those 3 contacts.
In SQL, what you'd get is 3 contacts, each showing the $100 amount sold. But in Power BI, that would imply $300 was sold, which isn't the case. Even equally dividing the $100 up would be misleading. What if the $100 belonged entirely to 1 contact? So instead, Power BI shows the error you're seeing.
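The arithmetic behind that $300 can be sketched in Python. The contact details are invented for illustration; only empID 27, the $100 total and the three contacts come from the example above.

```python
# Hypothetical data: empID 27 sold $100 in total and has 3 contacts.
amt_sold_by_emp = {27: 100}
contacts = [(27, "home phone"), (27, "work phone"), (27, "email")]

# SQL-style join: the employee total is repeated on every contact row.
joined = [(detail, amt_sold_by_emp[emp_id]) for emp_id, detail in contacts]

# Adding up the joined rows counts the same $100 three times.
apparent_total = sum(amount for _, amount in joined)
actual_total = sum(amt_sold_by_emp.values())
```

The joined rows add up to 300 while the real total is 100, and nothing in the data says how the 100 should be split across the three contacts.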
My Recommendations
If you can, I recommend changing your data model before you bring it in. Power BI works best with a single fact table, which would contain your metrics (like amount sold). You then join this fact table to as many lookup tables as you like (e.g. customer, product, etc.), directly. This allows you to slice & dice your metrics with any combination of attributes from any of the lookup tables. I would recommend checking out the star schema data model and the concept of lookup tables: powerpivotpro.com/2016/02/data-modeling-power-pivot-power-bi
At the very least, you would want to flatten your tables (i.e. merge the contactInfo and sales tables into a single table before importing them into your data model).
It may be that Power BI isn't the best tool for what you're trying to accomplish. If all you want is a table showing all sales & contact info for an employee, without any associated metrics, a regular reporting tool + SQL query might be a better way to go.
Side Note: You can't reverse a many:one relationship to get past this error. The emp table contains one row per empID. Both the contactInfo and sales tables contain multiple rows with the same empID. This means the emp table is necessarily the "one" side of the relationship to both those tables. You can't arbitrarily change that.