I have a table of dates (every date from 2003 to 2035) in my data model but am wondering if I need to create relationship(s) between this and my other data tables, wondered if anyone could please share best practice?
If so, my main table has several columns of dates so which would I link to?
To be honest, I am thinking I shouldn't create a relationship as any filtering of the date table will then only filter my model by which date column I have a relationship with?
I hope all that makes sense. It's more of an abstract question at the moment but my ultimate goal is to create some kind of rolling average.
Thanks in advance.
The best practise is clearly to create a relationship between your date table and your data table (fact table I assume). But you have to choose the most relevant column to make your link, knowing that it's preferred to not make multiple relationships between the same tables.
If you have a "snapshot date" column, you could make the link with this one to see the status for that period for example. It really is up to you.
If the filtering is annoying to you, you can always disable it on the visuals.
I hope it helps.
Related
I'm trying to make a measure that needs two "many to many" relationships between the same tables to work.
Is this possible ? i'm trying to make it work but it seems that the PowerBi program doesn't permit this from happening.
It's a name column and a date column. i use filters based on this two columns but it only permits one relation. So, if i create a date relation the name filter doesn't work and vice versa.
Someone knows another way around it ?
The data is sensitive so i can't post it, but if needed i can try to make a fake one that shares its characteristics.
I would like two measures that SUM the Sales[Value] for all the Sales[ID] that have a specific StatusID in SalesStatus.
One that can filter on Sales[Date], and one that can filter on SalesStatus[statusDate]
Diagram
Regards,
Anders
In this scenario I would consider modifying your model to have only two tables by combining what appears to be two FACT tables (sales, sales status). Depending on what your data consists of I would either UNION the two tables after joining and then treat the date in your Sales table as another status date (i.e. shipped complete or sale finished, whatever that date represents) OR I would join the two tables and have two relationships to the date table.
This will create a duplicated data issue as you will ideally result in having the value column in your final fact table. If you go with the union option, you can force the user to select a single sales status effectively removing the sales duplication. If you end up with two connections to the date table, you can use the USERELATIONSHIP() function to write the two different sales measures, and the one that uses the date from the Sales table will need some clever tricks to ensure the data does measure does not duplicate. I would try to UNION the tables though.
For more details, I would research what's referred to as SEMI-ADDITIVE fact tables in datawarehousing. There is a great article from SQL BI on the subject. I have tried setting up models like you diagrammed and even if I could get them to work through intense DAX measures, they would produce unexpected results and have poor performance. I find the Semi additive fact table pattern to be a much cleaner solution once you get passed the data duplication that results.
Example:
I wanted to know what would be the best approach for creating the dim tables. Can I maintain it as a single table with all fields and use them as required or create separate dim tables and use them individually.
Can someone please help me out here
PS: I'm a beginner here.
Creating 1 table per dimension is the best practice. In data warehouse concept, you will get 4 types of schema as below-
Start Schema
Snowflakes Schema
Galaxy Schema
Combined Schema
People select any of the above based on their Data type/nature, requirement and other parameter. But in all case, there are single table per dimension. This is easy to maintain and give better performance.
I've imported three tables into Power BI using a REST API, have added relationships between them, and am now trying to add fields from the various tables to a table on the canvas. The table names (from a Human Resources database) are named Employees, Job History, and Salary History.
Employees is joined to Job History using EmployeeID as a 1:Many relationship, and also to Salary History using EmployeeID on a 1:Many relationship.
I can add fields from the Employee table and EITHER the Salary History OR the Job History table to the table on the report canvas. However, if I try to add fields from all three tables, I'm seeing the error 'Can't display the data because Power BI can't determine the relationship between two or more fields'.
Could anyone advise where I'm going wrong? Many thanks.
If I understand correctly, you have a model like in this picture:
The way PBI filters work is: The 1: side table filters the N: side table. Filters propagate that way. So in this case, you can filter JobHistory with data from Employees, and SalaryHistory with data from Employees. But the 2 fact tables can't relate because the filters don't propagate that way.
Look into DAX measures like RELATED(), RELATEDTABLE() and USERELATIONSHIP() that might work for you.
Without that, I don't think you can use data from the 3 tables, since you have a model with 2 Fact Tables.
Someone correct me if I'm wrong.
I have created a powerpivot model include in the image below. I am trying to include the "IncurredLoss" value and have it sliced by time. Written Premium is in the fact table and is displaying correctly. I am aiming for IncurredLoss to display in a similar fashion
I have tried the following solutions:
Add new related column: Related(LossSummary[IncurredLoss]). Result: No data
DAX Summary Measure: =CALCULATE(SUM(LossSummary[IncurredLoss])). Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
Simply adding the Incurred Loss column to the Pivot Table panel. Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
A few other notes:
LossKey joins LossSummary to PolicyPremiumFact
Reportdate joins PolicyPremiumFact to the Calendar.
There is 1 row in LossSummary per date and Policy. LossKey contains this information and is the PK on that table.
Any ideas, clarifications or pointers are most certainly welcome. Thank you!
The related column should work. I was able to get it to work in both Excel 2016 and Power BI Desktop. Rather than bombarding you with questions, I'll try and walk through how I would troubleshoot further, in the hopes it gets you to a solution faster:
First, check the PolicyPremiumFact table inside Power Pivot and see if the IncurredLossRelated field is blank or not. If it is consistently blank, then the related column isn't working. The primary reason the related column wouldn't work is if there's a problem with your relationships. Things I would check:
Ensure that the relationships are between the fields you think they are between (i.e. you didn't accidentally join LossKey in one table to a different field in the other table)
Ensure that the joined fields contain the same data (i.e. you didn't call a field LossKey, but in fact, it isn't the LossKey at all)
Ensure that the joined fields are the same data type in Power Pivot (this is most common with dates: e.g. joining a text field that looks like a date to an actual date field may work, but not act as expected)
If none of the above are the problem, it doesn't hurt to walk through your data for a given date in Power Pivot. E.g. filter your PolicyPremiumFact table to a specific date and look at the LossKeys. Then go the LossSummary table and filter to those LossKeys. Stepping through like this might reveal an oversight (e.g. maybe the LossKeys weren't fully loaded into your model).
If none of the above reveals anything, or if the related column is not blank inside Power Pivot, my suggestion would be to try a newer version of Excel (e.g. Excel 2016), or the most recent version of Power BI Desktop.
If the issue still occurs in the most recent version of Excel/Power BI Desktop, then there's something else going on with your data model that's impacting the RELATED calculation. If that's the case, it would be very helpful if you could mock up your file with sample data that reproduces the problem and share it.
One final suggestion I have is to consider restructuring your tables before they arrive in your data model. In your case, I'd recommend restructuring PolicyPremiumFact to include all the facts from LossSummary, rather than having a separate table joined to your primary fact table. This is what you're doing with the RELATED field to some extent, but it's cleaner to do before or as your data is imported into Power Pivot (e.g. using SQL or Power Query) rather than in DAX.
Hope some of this helps.