Why does filtering my Dataverse table in Power Query increase my file size?

I'm using Power Query in Power BI to retrieve tables from Dynamics 365 using the Dataverse connector in Import mode. I'm currently having a problem with a table that has 23 columns and 6M+ rows, where a lot of the keys are alphanumeric GUIDs (which I intend to replace with numeric keys). So far I have retrieved a base table from D365 (TableBase) and referenced it in another query (Table0) to do some basic transformations: changing text types to integers, replacing nulls with 0, and changing column labels. My next step is to create three smaller fact tables referencing Table0 (Table1, Table2, Table3) by filtering on a category code for each table.
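For reference, the Table0 step is roughly the following (a sketch only; the column names are placeholders, not my real Dataverse schema):

let
    Source  = TableBase,
    Renamed = Table.RenameColumns(Source, {{"cr123_categorycode", "CategoryCode"}}),
    Typed   = Table.TransformColumnTypes(Renamed, {{"CategoryCode", Int64.Type}}),
    NoNulls = Table.ReplaceValue(Typed, null, 0, Replacer.ReplaceValue, {"CategoryCode"})
in
    NoNulls

Table1 through Table3 would then just filter this, e.g. Table.SelectRows(Table0, each [CategoryCode] = 1).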
I have two problems occurring:
When I create a dimension table, left join it to the fact table on the GUID, and then expand the join to include only the numeric key I created in the dimension table, it gives an error saying the file size is too large.
When I try filtering Table1 to return fewer rows (the current table has 6,213,553 rows; I want to get it down to 567,458), it also gives an error saying the file size is too large.
The error message I get is this: Microsoft SQL: Return records size cannot exceed 83886080. Make sure to filter result set to tailor it to your report. (83886080 bytes is exactly 80 MB, so this looks like a hard cap on the payload a single query can return.)
Of course my alternative is to use the OData connector for Dynamics 365 instead of the Dataverse connector, but it's dreadfully slow, and one simple model I built with the OData connector no longer refreshes at all because the refresh takes so long. I'm also confused as to why filtering a table down to fewer rows throws this error, since I'm trying to reduce the size of the result.
Do you all have any suggestions on why this is happening and how I can get around it? I know the most obvious answer is to remove some columns from the fact table, and I'm going to talk with one of my consumers about doing that, because it's the only solution I can figure out at this point.
I appreciate any help you can provide!

Related

Join 2 tables in Power BI

I need help on this issue as I don't have any experience in Power BI. I want to join 2 tables in Power BI that share the same column, Part_Number. How can I make these 2 tables match by Part_Number and return the value?
Recon Table
Inventory Table
I would like to have Part Number, Part Name, QTY, and Total Quantity as the result. I hope I can get the clarification I need. Thanks a lot!
For this case you simply need to merge the tables. It doesn't look like you have done much research on the matter, though, so it's hard to understand exactly what you need help with.
To merge your two tables in Power Query, I would right-click in the left-hand Queries pane and select Merge Queries as New.
After that, simply follow the on-screen instructions and select your two tables and their respective key columns. After merging, you can choose to disable load on your two original tables to save space in your data model, but this depends on your requirements.
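The generated step looks roughly like this (a sketch; I'm assuming your queries are named Recon and Inventory, that Part_Number is the key in both, and guessing which columns live on the Inventory side):

let
    // left join Inventory onto Recon by the shared key, then expand the wanted columns
    Source   = Table.NestedJoin(Recon, {"Part_Number"}, Inventory, {"Part_Number"}, "Inventory", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Source, "Inventory", {"Part Name", "Total Quantity"})
in
    Expanded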
If this were my data model, I would think about why joining these tables is necessary at all, instead of keeping these two tables as fact tables and creating a third table to handle the part number dimension with its associated part metadata.
Read the docs: Merge queries in Power Query

Can you replace fields from a query with fields from a split query in Power BI?

I have a report in Power BI that cannot refresh because the data from the table is too large:
The amount of data on the gateway client has exceeded the limit for a single table. Please consider reducing the use of highly repetitive strings values through normalized keys, removing unused columns, or upgrading to Power BI Premium
I have tried to shrink the columns used in the dataset to the best of my ability, but it is still too large to refresh. As a test, instead of using a single query to retrieve the data, I made two queries that split the columns roughly half and half and then linked them back together in Power BI using their ID column. The test data refresh appeared to start working once the table's data was split into two separate queries.
Please correct me if there is a better way to trim the data down so the dataset can refresh, but for now this is the best solution I see. What I am wondering is: now that my data is split into two separate queries, what is the best way to point my existing visualizations, which are linked to the full, non-refreshable query, at the split, refreshable queries? It looks like I would have to recreate the visuals from scratch, but if there is a way to simply mass-replace the fields, that would save a lot of time. The split queries I created both have the same fields as the non-split query, and each half is roughly like the sketch below.
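(A sketch with placeholder column names; FullQuery stands in for my original query and ID is its key:)

let
    Source = FullQuery,
    // keep the key plus the first half of the columns
    Half1  = Table.SelectColumns(Source, {"ID", "ColA", "ColB"})
in
    Half1

The second query keeps {"ID", "ColC", "ColD"}, and the two result tables are related on ID in the model.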

Power BI / Power Query - M language - playing with data inside group table

Hello M language masters!
I have a question about working with grouped rows, where Power Query creates a nested table of data. But maybe it is better to start from the beginning.
Important information! I will be asking, as an example, only about adding an index. I know there are different ways to reach that particular result, but for this question I need an answer about working on the nested tables themselves, because I want to reuse the technique for other actions (e.g. sorting a group table, adding columns to a group table).
In my sample data source, I have a list of fake transactions. I want to add an index for each Salesman, to count operations for each of them.
Sample Data
So I just added this file as a data source in Power BI. In Power Query, I grouped the rows by name. This step created for me a column containing a table for each Salesman, which stores all of his or her operations.
Grouping result
And now I want to add an index column in each of those tables. I know this is possible by adding a new column to the main table, which will store a new table with the index added:
Custom column function
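(The custom column formula is roughly this, assuming the grouped step is named GroupedRows and the nested-table column is Operations:)

= Table.AddColumn(GroupedRows, "OperationsIndexed", each Table.AddIndexColumn([Operations], "Index", 1, 1))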
And in each of those tables I have an index. That is good, but now I have an extra column (one holding the table without the index, and one holding the table with the index):
Result - a little bit messy
So I want to ask if there is any way to add such an index directly to the tables in the Operations column, without creating the additional column. My approach seems a bit messy and I want to find something cleaner. Does anyone know a smart solution for that?
Thank you in advance.
Artur
Sure, you may do it inside the Table.Group function:
// each group's nested table (_) gets the index added before it is stored in Operations
= Table.Group(Source, {"Salesman"}, {{"Operations", each Table.AddIndexColumn(_, "i", 1, 1)}})
P.S. To add an existing index column from the outer table into each nested table, use this code:
// the replacer receives the nested Operations table (a), the row's outer [index]
// value (b), and the dummy new value (c), and returns the nested table with the
// outer index added as a column
= Table.ReplaceValue(PreviousStep, each [index], 0, (a, b, c) => Table.AddColumn(a, "index", each b), {"Operations"})

PowerBI / PowerPivot - Data not aggregating by time frame

I have created a Power Pivot model, included in the image below. I am trying to include the IncurredLoss value and have it sliced by time. Written Premium is in the fact table and is displaying correctly; I am aiming for IncurredLoss to display in a similar fashion.
I have tried the following solutions:
Adding a new related column: RELATED(LossSummary[IncurredLoss]). Result: no data.
A DAX summary measure: =CALCULATE(SUM(LossSummary[IncurredLoss])). Result: the sum of everything in LossSummary[IncurredLoss], not time-sliced.
Simply adding the IncurredLoss column to the Pivot Table panel. Result: the sum of everything in LossSummary[IncurredLoss], not time-sliced.
A few other notes:
LossKey joins LossSummary to PolicyPremiumFact
Reportdate joins PolicyPremiumFact to the Calendar.
There is 1 row in LossSummary per date and Policy. LossKey contains this information and is the PK on that table.
Any ideas, clarifications or pointers are most certainly welcome. Thank you!
The related column should work; I was able to get it to work in both Excel 2016 and Power BI Desktop. Rather than bombarding you with questions, I'll walk through how I would troubleshoot further, in the hope it gets you to a solution faster:
First, check the PolicyPremiumFact table inside Power Pivot and see whether the IncurredLossRelated field is blank or not. If it is consistently blank, then the related column isn't working, and the primary reason a related column doesn't work is a problem with your relationships. Things I would check:
Ensure that the relationships are between the fields you think they are between (i.e. you didn't accidentally join LossKey in one table to a different field in the other table)
Ensure that the joined fields contain the same data (i.e. you didn't call a field LossKey, but in fact, it isn't the LossKey at all)
Ensure that the joined fields are the same data type in Power Pivot (this is most common with dates: e.g. joining a text field that looks like a date to an actual date field may work, but not act as expected)
If none of the above is the problem, it doesn't hurt to walk through your data for a given date in Power Pivot. E.g. filter your PolicyPremiumFact table to a specific date and look at the LossKeys, then go to the LossSummary table and filter to those LossKeys. Stepping through like this might reveal an oversight (e.g. maybe the LossKeys weren't fully loaded into your model).
If none of the above reveals anything, or if the related column is not blank inside Power Pivot, my suggestion would be to try a newer version of Excel (e.g. Excel 2016), or the most recent version of Power BI Desktop.
If the issue still occurs in the most recent version of Excel/Power BI Desktop, then there's something else going on with your data model that's impacting the RELATED calculation. If that's the case, it would be very helpful if you could mock up your file with sample data that reproduces the problem and share it.
One final suggestion I have is to consider restructuring your tables before they arrive in your data model. In your case, I'd recommend restructuring PolicyPremiumFact to include the facts from LossSummary, rather than keeping a separate table joined to your primary fact table. This is what you're doing with the RELATED field to some extent, but it's cleaner to do it before or while your data is imported into Power Pivot (e.g. using SQL or Power Query) rather than in DAX.
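In Power Query, that restructuring could be sketched like this (assuming the table and key names above, and that IncurredLoss is the only fact you need to bring over):

let
    // pull IncurredLoss onto the premium fact table via the shared LossKey
    Source   = PolicyPremiumFact,
    Merged   = Table.NestedJoin(Source, {"LossKey"}, LossSummary, {"LossKey"}, "Loss", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Loss", {"IncurredLoss"})
in
    Expanded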
Hope some of this helps.

Power Query Formula Language - Detect type of columns

In Power BI, I've got some query tables generated from imported data. All the data comes in as type 'Any', and I'm trying to automatically detect the type of the data in each column.
Some of the queries generate tables with columns based on the incoming data; I don't know what the columns are going to be until the query runs and sets up the table (the data comes from an Azure blob). Since I will have quite a few tables to maintain, whose columns can change (possibly with new columns being added) on any data refresh, it would be unmanageable to go through all of them each time and press 'Detect Data Type' on the columns.
So I'm trying to figure out how to do a 'Detect Data Type' in the query formula language and attach it to the end of the query that generates the table columns. I've tried grabbing the first entry in a column and doing Value.Type(column{0}), but this comes out as 'Text' for a column that holds integers, whereas pressing 'Detect Data Type' correctly identifies the type as 'Whole Number'.
Does anyone know how to detect a column's entry types?
P.S. I'm not too worried about a column possibly holding values of different data types
You seem to have multiple issues here, and your solution will be fragile; there's a better way. But let's first deal with column type detection. Power Query uses the any data type as its go-to data type. You can write a function that samples the rows of a column in a table, does a best-match data type detection, and then explicitly sets the data type of the column. This is messy and tricky, since you need to do it once per column. It might be workable for a fixed schema, but for a dynamic schema you'll run into a couple of things very quickly. First, you'll need to write some crazy PQ code to list all the columns and run your function on each. This will work the first time, but it may break on subsequent refreshes, because data model changes are not allowed during refresh. If you're using a tool like Power BI Desktop, you'll be able to fix things up; if you publish your report to the Power BI service, you'll just see refresh errors.
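To give a flavor of what that per-column function could look like, here's a minimal sketch (GuessType and ApplyDetectedTypes are made-up names; the detection rules are deliberately crude and sample only the first non-null value per column):

let
    // guess a type from a single sample value; fall back to text
    GuessType = (sample as any) as type =>
        if sample = null then type text
        else if Value.Is(sample, type number) then type number
        else if Value.Is(sample, type date) then type date
        else if (try Number.FromText(Text.From(sample)))[HasError] = false then type number
        else type text,
    // apply the guessed type to every column of a table
    ApplyDetectedTypes = (tbl as table) as table =>
        Table.TransformColumnTypes(
            tbl,
            List.Transform(
                Table.ColumnNames(tbl),
                (col) => {col, GuessType(List.First(List.RemoveNulls(Table.Column(tbl, col)), null))}
            )
        )
in
    ApplyDetectedTypes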
Dynamic Schemas will suffer the same data model change issue I mentioned above.
The alternative solution, which you won't have these problems with, is using a DirectQuery data source instead of Power Query. If you load your data into Azure SQL or a Tabular Model, the reporting layer will pick up the updated fields automatically, so you don't have to work around this in PQ.