I recently changed my data source to an Azure SQL DB, keeping the table/column names and data types exactly the same as before for a smooth transition. However, for two of the tables the new columns are not recognized in the original DAX measures, even though they are named the same; this is also causing issues in Tabular Editor and DAX Studio.
I have worked through the ETL process and can't see any issues anywhere. The only thing that seemed to work was changing the column name in Power Query to the exact same name. Any idea whether this is on the SQL side or an issue with Power BI?
I expected this to work without any issues.
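For reference, the rename workaround was a single step in Power Query along these lines (the column names here are hypothetical; presumably the incoming name differed from the original in some invisible way, e.g. casing or whitespace):

    // Rename the incoming column back to the exact name the DAX measures expect.
    // "customername" / "CustomerName" are placeholder names.
    Renamed = Table.RenameColumns(Source, {{"customername", "CustomerName"}})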
When I get data into Power BI, I can edit the query as well as make edits to the model.
What is the difference between edits performed in the Query Editor vs. during modelling?
When you edit the query, you use Power Query, with its own Query Editor user interface. The steps you apply are recorded in the "M" language. Use Power Query to extract, transform, and finally load data into the Data Model.
Once the data is in the Data Model, you use DAX to create measures that you use in visuals. You can also use DAX to add more columns or even tables to the data model.
Whether to use Power Query or DAX to add columns or tables to the data model depends on a variety of factors. Some things are dead easy to do in Power Query, but harder to achieve with DAX, and vice versa. If you create a column with a formula that depends on a DAX measure, then you can only do that with DAX, because Power Query is not aware of the measures that are created after the load into the data model.
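For example, a column whose formula depends on a measure can only be created in DAX. A minimal sketch (the Sales table and the [Total Sales] measure are assumed names):

    -- Calculated column: each row's share of total sales. Power Query cannot
    -- see the [Total Sales] measure, so only DAX can add this column.
    Pct of All Sales =
    DIVIDE (
        [Total Sales],
        CALCULATE ( [Total Sales], ALL ( Sales ) )
    )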
Power Query is very powerful, but the M code syntax is very different from the Excel formula syntax or the VBA macro language. Learning to write advanced M code can be quite challenging.
DAX, on the other hand, behaves very similarly to Excel formulas. Many Excel functions can even be used in DAX verbatim. If you know Excel, you've already got a head start on DAX, and you can ease your way into it by learning additional functions and then expanding into more complex formulas.
The latter is probably the reason why many data manipulations are done in DAX, even though they could just as well have been done in Power Query.
There are also some efficiencies in data storage and performance. With SQL sources, for example, Power Query makes use of query folding: its transformations are translated into SQL and actually performed at the data source, i.e. on the SQL Server side rather than in the desktop client, and only the final query result is transferred to the client.
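For example, with an Azure SQL source, a query like the following folds completely (server, database, and table names are made up); you can verify this by right-clicking the last applied step and choosing View Native Query:

    let
        // The filter and column selection below are translated into a single
        // SQL statement and executed on the server, not on the client.
        Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
        Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
        Recent = Table.SelectRows(Orders, each [OrderDate] >= #date(2023, 1, 1)),
        Trimmed = Table.SelectColumns(Recent, {"OrderID", "OrderDate", "Amount"})
    in
        Trimmed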
Edit after comment: When the data is loaded into the data model, an algorithm processes the data and sorts it in the way that is most efficient for maximum compression and minimum storage. I don't have any concrete numbers, but adding a column in Power Query will generally result in a smaller footprint than adding the same column with DAX. Read more about the compression engine, VertiPaq, here: https://towardsdatascience.com/inside-vertipaq-in-power-bi-compress-for-success-68b888d9d463
But apart from that, it mainly comes down to personal preference based on skill and experience.
By the way, many of your questions can be answered by reading through the Microsoft documentation, e.g. https://learn.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction
My Power BI model contains the following tables:
However, when I connect to this model from DAX Studio, I get the following:
Please note that all of these local date tables show the same metadata as the date table. Is this a potential problem? I believe this might be the reason why this report is comparatively slow during development.
These are date tables automatically created by Power BI for every field of type "Date", to enable the date hierarchy (year/quarter/month/day) for them.
If you want to remove them, go to File/Options, then under "Current File" uncheck "Auto Date/Time":
These hidden tables will disappear. If you want to turn this feature off for all future/new files (I always do), uncheck a similar option under "Global".
I have created the Power Pivot model shown in the image below. I am trying to include the "IncurredLoss" value and have it sliced by time. Written Premium is in the fact table and is displaying correctly; I am aiming for IncurredLoss to display in a similar fashion.
I have tried the following solutions:
Add new related column: Related(LossSummary[IncurredLoss]). Result: No data
DAX Summary Measure: =CALCULATE(SUM(LossSummary[IncurredLoss])). Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
Simply adding the Incurred Loss column to the Pivot Table panel. Result: Sum of everything in LossSummary[IncurredLoss] (not time sliced)
A few other notes:
LossKey joins LossSummary to PolicyPremiumFact
Reportdate joins PolicyPremiumFact to the Calendar.
There is 1 row in LossSummary per date and Policy. LossKey contains this information and is the PK on that table.
Any ideas, clarifications or pointers are most certainly welcome. Thank you!
The related column should work; I was able to get it to work in both Excel 2016 and Power BI Desktop (the pattern that worked for me is sketched further below). Rather than bombarding you with questions, I'll try to walk through how I would troubleshoot further, in the hope it gets you to a solution faster:
First, check the PolicyPremiumFact table inside Power Pivot and see if the IncurredLossRelated field is blank or not. If it is consistently blank, then the related column isn't working. The primary reason the related column wouldn't work is if there's a problem with your relationships. Things I would check:
Ensure that the relationships are between the fields you think they are between (i.e. you didn't accidentally join LossKey in one table to a different field in the other table)
Ensure that the joined fields contain the same data (i.e. you didn't call a field LossKey, but in fact, it isn't the LossKey at all)
Ensure that the joined fields are the same data type in Power Pivot (this is most common with dates: e.g. joining a text field that looks like a date to an actual date field may work, but not act as expected)
If none of the above are the problem, it doesn't hurt to walk through your data for a given date in Power Pivot. E.g. filter your PolicyPremiumFact table to a specific date and look at the LossKeys. Then go to the LossSummary table and filter to those LossKeys. Stepping through like this might reveal an oversight (e.g. maybe the LossKeys weren't fully loaded into your model).
If none of the above reveals anything, or if the related column is not blank inside Power Pivot, my suggestion would be to try a newer version of Excel (e.g. Excel 2016), or the most recent version of Power BI Desktop.
If the issue still occurs in the most recent version of Excel/Power BI Desktop, then there's something else going on with your data model that's impacting the RELATED calculation. If that's the case, it would be very helpful if you could mock up your file with sample data that reproduces the problem and share it.
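For reference, here is the pattern that worked for me, as a minimal sketch using the table and column names from your description:

    -- Calculated column on PolicyPremiumFact; RELATED pulls the value across
    -- the LossSummary (one) -> PolicyPremiumFact (many) relationship on LossKey.
    = RELATED ( LossSummary[IncurredLoss] )

    -- A measure on top of that column then slices by time through the
    -- ReportDate -> Calendar relationship:
    Total Incurred Loss := SUM ( PolicyPremiumFact[IncurredLossRelated] )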
One final suggestion I have is to consider restructuring your tables before they arrive in your data model. In your case, I'd recommend restructuring PolicyPremiumFact to include all the facts from LossSummary, rather than having a separate table joined to your primary fact table. This is what you're doing with the RELATED field to some extent, but it's cleaner to do before or as your data is imported into Power Pivot (e.g. using SQL or Power Query) rather than in DAX.
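A sketch of that restructuring in Power Query (assuming query names that match your tables):

    let
        // Left-join LossSummary onto the fact table by LossKey, then expand
        // the column(s) needed, so IncurredLoss lives on PolicyPremiumFact.
        Merged = Table.NestedJoin(
            PolicyPremiumFact, {"LossKey"},
            LossSummary, {"LossKey"},
            "LossDetail", JoinKind.LeftOuter
        ),
        Expanded = Table.ExpandTableColumn(Merged, "LossDetail", {"IncurredLoss"})
    in
        Expanded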
Hope some of this helps.
I have been working with Power BI for a while now, and I often get confused when I browse through its help topics. They often refer to the functions and formulas being used as DAX functions or Power Query functions, but I am unable to tell the difference between the two. Please guide me.
M and DAX are two completely different languages.
M is used in Power Query (a.k.a. Get & Transform in Excel 2016) and the query tool for Power BI Desktop. Its functions and syntax are very different from Excel worksheet functions. M is a mashup query language used to query a multitude of data sources. It contains commands to transform data and can return the results of the query and transformations to either an Excel table or the Excel or Power BI data model.
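A minimal example of an M query (the table and column names are assumed):

    let
        // Read a table from the current workbook and keep only positive amounts.
        Source = Excel.CurrentWorkbook(){[Name = "SalesTable"]}[Content],
        Filtered = Table.SelectRows(Source, each [Amount] > 0)
    in
        Filtered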
More information about M can be found here and using your favourite search engine.
DAX stands for Data Analysis eXpressions. DAX is the formula language used in Power Pivot and Power BI Desktop. DAX uses functions to work on data that is stored in tables. Some DAX functions are identical to Excel worksheet functions, but DAX has many more functions to summarize, slice and dice complex data scenarios.
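A small illustration (table and column names are assumed): SUM reads just like its Excel counterpart, while CALCULATE has no Excel equivalent at all:

    -- Works much like Excel's SUM, but over a model column:
    Total Sales := SUM ( Sales[Amount] )

    -- DAX-only: re-evaluates the measure under a modified filter context.
    Sales 2023 := CALCULATE ( [Total Sales], 'Date'[Year] = 2023 )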
There are many tutorials and learning resources for DAX if you know how to use a search engine. Or start here.
In essence: First you use Power Query (M) to query data sources, clean and load data. Then you use DAX to analyze the data in Power Pivot. Finally, you build pivot tables (Excel) or data visualisations with Power BI.
M is the first step of the process, getting data into the model.
In Power BI, when you right-click on a dataset and select Edit Query, you're working in M (also called Power Query). There's a hint about this in the title bar of the edit window, which reads "Power Query Editor" (but you have to know that M and Power Query are essentially the same thing). Also, when you click the Get Data button, the steps you take generate M code for you.
DAX is used in the report pane of Power BI Desktop, and is predominantly used to aggregate (slice and dice) the data, add measures, etc.
There is a lot of crossover between the two languages (e.g. you can add columns and merge tables in both; see the sketch below). Some discussion on when to choose which is here and here.
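For instance, adding a hypothetical Margin column could be done at load time in M:

    // Power Query: the column is computed while the data is loaded.
    AddedMargin = Table.AddColumn(Sales, "Margin", each [Revenue] - [Cost], type number)

or afterwards in the model as a DAX calculated column:

    -- DAX: same result, computed in the model after load.
    Margin = Sales[Revenue] - Sales[Cost]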
Think of Power Query / M as the ETL language used to shape and store your physical tables in Power BI and/or Excel. Then think of DAX as the language you use after the data has been queried from the source, to calculate totals, perform analysis, and so on.
M (Power Query): Query-Time Transformations to shape the data while you are extracting it
DAX: In-Memory Transformations to analyze data after you've extracted it
One other thing worth mentioning re performance optimisation: you should "prune" your dataset (remove rows / remove columns) as far "upstream" in the data processing sequence as possible. This means such operations are better done in Power Query than in DAX; some further advice from MS is here: https://learn.microsoft.com/en-us/power-bi/power-bi-reports-performance
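A sketch of what pruning upstream can look like as early query steps (column names are made up):

    // Drop wide or unused columns first, then filter rows, so later steps
    // (and the data model) only ever see the reduced table.
    Pruned = Table.SelectRows(
        Table.RemoveColumns(Source, {"FreeTextNotes", "LegacyColumn"}),
        each [Year] >= 2022
    )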
In Power BI, I've got some query tables generated from imported data. All the data comes in as type 'Any', and I'm trying to automatically detect the type of the data in each column.
Some of the queries generate tables with columns based on the incoming data; I don't know what the columns are going to be until the query runs and sets up the table (the data comes from an Azure blob). As I will have quite a few tables to maintain, whose columns can change (possibly with new columns being added) on any data refresh, it would be unmanageable to go through all of them each time and press 'Detect Data Type' on the columns.
So I'm trying to figure out how to do a 'Detect Data Type' in the query formula language, to attach to the end of the query that generates the table columns. I've tried grabbing the first entry in a column and doing Value.Type(column{0}), but this comes out as 'Text' for a column that holds integers. Pressing 'Detect Data Type' does, however, correctly identify the type as 'Whole Number'.
Does anyone know how to detect a column's entry types?
P.S. I'm not too worried about a column possibly holding values of different data types
You seem to have multiple issues here, and your solution will be fragile; there's a better way. But let's first deal with column type detection. Power Query uses the 'any' data type as its go-to data type. You can write a function that samples the rows of a column in a table, does a best-match data type detection, and then explicitly sets the data type of the column (a sketch follows below). This is messy and tricky, since you need to do it once per column. It might be workable for a fixed schema, but for a dynamic schema you'll run into a couple of things very quickly. First, you'll need to write some crazy PQ code to list all the columns and run your function on each. This will work the first time, but might break on subsequent refreshes, because data model changes are not allowed during refresh. If you're using a tool like Power BI Desktop, you'll be able to fix things up; if you publish your report to the Power BI service, you'll just see refresh errors.
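Here is a sketch of that per-column detection function, under the assumption that incoming values may be text-encoded (which is why Value.Type on a raw value reports Text); all names are illustrative:

    let
        DetectColumnTypes = (tbl as table) as table =>
            let
                // Sample the first non-null value of a column and guess a type,
                // try-converting text-encoded numbers and dates.
                GuessType = (columnName as text) as type =>
                    let
                        Values = List.RemoveNulls(Table.Column(tbl, columnName)),
                        Sample = if List.IsEmpty(Values) then null else List.First(Values)
                    in
                        if Sample = null then type text
                        else if Sample is number then type number
                        else if Sample is date then type date
                        else if Sample is datetime then type datetime
                        else if (try Number.From(Sample) otherwise null) <> null then type number
                        else if (try Date.From(Sample) otherwise null) <> null then type date
                        else type text,
                // Build {column, type} pairs for every column and apply them.
                Transforms = List.Transform(Table.ColumnNames(tbl), each {_, GuessType(_)})
            in
                Table.TransformColumnTypes(tbl, Transforms)
    in
        DetectColumnTypes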
Dynamic Schemas will suffer the same data model change issue I mentioned above.
The alternative solution, which you won't have these problems with, is using a DirectQuery data source instead of Power Query. If you load your data into Azure SQL or a Tabular Model, the reporting layer will pick up the updated fields automatically, so you don't have to work around them using PQ.