DAX way to return summarised data - powerbi

I hope I'm not missing an easy solution; I am still getting used to DAX and can't yet find the appropriate logic.
I have a large dataset, >10m rows which I want to test. An identifier column "DocumentNumber" might occur on multiple rows and I want to find where the sum of "Value" over these rows for a given "DocumentNumber" is non-zero.
Tried to use PowerQuery > removed all but these two columns > Group By > DocumentNumber > Sum of Value. However, my 32-bit version of Excel appears to run out of memory performing this step: Expression.Error: Evaluation ran out of memory and can't continue.
Wrote a DAX measure > Sum of Values and dropped it into a pivot table with a view to filtering out the zero values, but when I try to drag DocumentNumber into rows there are more than a million rows, so the table won't render.
Is there a logic I should follow in DAX that would achieve step 2 before bringing it to the pivot table? Can DAX actually create a new table in the data model that holds the aggregated and filtered data, rather than using a pivot? I believe this is possible in Power BI but I'm not sure about the Excel environment.

Related

Table visual is unintuitively aggregating my data

I am loading an Excel file which has 43 rows, all of them identical. This is the only file I'm loading and there are no connections or relationships in the model whatsoever.
When I plot my data in a table visual and choose not to summarize any of my fields, Power BI still shows me one row. If I change any of the fields to a count, it correctly shows that I have 43 rows. I need to be able to see all 43 rows in my table.
Why is Power BI summarizing my data even if I command it not to do so?
Am I missing something simple?
Input table as seen in Power BI data tab:
The visual I'm trying to create:
The table visual in Power BI behaves similarly to a Pivot Table in Excel.
Without an aggregation defined, the "Values" fields behave like "Rows" in a Pivot Table and you will only see distinct items or distinct item combinations. You have 43 identical records, hence they are represented as one row in the visual.
With an aggregation defined (Sum, Count, ...) the field behaves like "Values" in a Pivot Table, and you get the result of that aggregation, filtered by the distinct items/combinations to the left, which is again one row in your case.
If you just want to show all the records in a table visual, you'll have to make them unique. The easiest way to achieve that is by adding an index column in PowerQuery and then also showing that index in the table visual.
However, this is not exactly what Power BI is made for, and you are probably better off switching to something like PowerPoint instead.
And by the way, never show screenshots on Stack Overflow; always provide sample data instead.

Create a pivot table in Power BI with dates instead of aggregate values?

I have a table of companies with descriptive data about where we are in the sales stage with the company, and the date we entered that specific stage. As can be seen below, the stages are rows in a Process Step column.
My objective is to pivot this column so each Process Step is a column, with a date below it, as shown in Excel:
I tried to edit the query and pivot the column upon loading it, but for the "aggregate value" column, no matter which column I use as the values column, it results in some form of error.
My advice would be not to pivot the table in the query, and to use measures to get the dates that you want. The benefit of not pivoting is that you are able to perform all sorts of other analytics. For instance, a Sankey chart would be hard to do properly with a pivoted table.
To get the pivot table you are showing from Excel, it's as simple as using a matrix visual in Power BI and putting Client code in Rows, Process Step in Columns, and Effective date in Values.
If you need to perform calculations between stages, it's also not too difficult. For instance, you can create a measure that shows only dates at a certain stage, another measure for another stage, and so on. For example:
Date uploaded = CALCULATE(MAX('Table'[Effective Date]), FILTER('Table', 'Table'[Process Step] = "Upload"))
Date exported = CALCULATE(MAX('Table'[Effective Date]), FILTER('Table', 'Table'[Process Step] = "Export"))
Time upload to export = DATEDIFF([Date uploaded], [Date exported], DAY)
These measures will work in the context of a client (with no Process Step in rows or columns), assuming there is only one date for the upload step. If another scenario is needed, perhaps a different approach could be taken.
Please let me know if that solves your problem.

Calculated column in Power BI that repeats different sums based on conditions in 2 other columns

I need a calculated column based on conditions in two columns (Business Unit Number in both tables and L1/Account Categories in 1st table and the second table) which sum and then repeat for several rows before the conditions change and a new sum is repeated for several rows and so on. The L1/Account Categories columns have different names because it's the raw data.
For example, any time ASSETS and 111 appear in the same row, I would want to use those as conditions and with the sum of all of the other matching rows in a new column and the sum would repeat each time both conditions appeared in the same row. Any time P/L and 111 appear in the same row, that would be a sum of all other P/L and 111 appearances in the dataset (about 1000 rows overall)... and so on.
I've tried formulas with DAX using FILTER, SUMX, nested IF statements and also tried the Power Query language among other attempts. Maybe I have to create one or more than one new table? If you need to take a look at a few of my attempts, just let me know.
The top image is how I imagine the output will look in the power query editor and the bottom image is a sample of the source data.
This last pic is from Tableau - I need to make a table in Power BI which is essentially a duplicate of this image. The last 2 columns are pulling from different tables.
This should be very simple to achieve with relationships and measures - no need for calculated columns or power query merges. You need to build a relationship between these two tables. In fact, I would introduce a third table in your model for Business unit.
The limitation of Power BI model relationships is that they can only be based on a single column. So to build a relationship between these two tables you would have to add a calculated column in both of them that would contain both a BU and the financial statement line, for instance: JoinCol = CONCATENATE([Business_Unit_Number], [L1]). Then you could create a relationship and do what you want.
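A variant of that join column, as a minimal sketch: concatenating the two values with no separator can in principle produce the same key for different pairs (for example 11 with "1X" and 1 with "11X"), so a delimiter is added here. The "|" separator is an assumption and should be a character that never occurs in either column. The second table would need the same column built from its own account category column, whose name differs in the raw data.

```dax
-- Calculated column, added to each of the two tables.
-- Column names follow the answer above; "|" is an assumed separator.
JoinCol = [Business_Unit_Number] & "|" & [L1]
```

The & operator converts the business unit number to text implicitly, so the result is a text key in both tables.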
The better approach (the one I would recommend) would be to separate Business Unit into its own table and have relationships built like this:
Then all you have to do in your visual is drag Business unit name from the BU table, L1 from the FS Lines table, and a measure that sums the amounts: Amount = SUM('Financial Data'[Rolled Up Detail]).
Here is a working sample: https://1drv.ms/u/s!AmqvMyRqhrBpgtUT5HKnZP1U3Gzc9w?e=en91dV

In Power BI how to create a measure giving Number of Rows in a Table Visual

I know similarly named topics exist, but I spent hours looking for the answer to something which I feel must be easy to do.
I have a table visual in Power BI and need to get a row count which would adjust as users set filters and use slicers.
The columns in the table don't have one level of hierarchy (otherwise a measure with DISTINCTCOUNT would do the trick), see an image of my table - I need to count rows in this table (21).
I couldn't find any way to directly refer to the table visual in DAX, so a simple COUNTROWS() wasn't useful. I tried to create a measure using various DAX expressions I found, e.g. recreating the table in DAX using CALCULATETABLE and having it use the active filters, but I failed there.
I also created an index variable on the physical/dataset table and tried to add it to the visual table and summarize it in some way (the column "Counter" in the table) - it didn't work, didn't give me the number of rows in the visual table but gave the number of rows in the physical/database table.
Could you please help me with how to do this?
You won't really be able to access the table visual in the way you're thinking, but as long as you know how it comes together, that should be just as good.
CountSummaryRows =
CALCULATE(
    COUNTROWS(SUMMARIZE(Table1, Table1[A], Table1[B], Table1[C], Table1[D])),
    ALL(Table1)
)
This measure will aggregate to the level of your A,B,C,D columns, then count the total. ALL(Table1) just ensures you get the grand total, so every row should show as 21, but at least you'll have a handle on that summary table count.
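Because ALL(Table1) removes every filter on Table1, including those coming from slicers, the count above would not change as users filter. A possible variant, sketched here under the assumption that slicers and visual filters act on Table1 columns, swaps in ALLSELECTED so the total follows the current selection. The measure name CountVisibleSummaryRows is hypothetical.

```dax
-- Same grouping as CountSummaryRows, but ALLSELECTED keeps filters applied
-- from outside the visual (slicers, page filters) and only clears the
-- per-row filter context inside the visual.
CountVisibleSummaryRows =
CALCULATE(
    COUNTROWS(SUMMARIZE(Table1, Table1[A], Table1[B], Table1[C], Table1[D])),
    ALLSELECTED(Table1)
)
```

Every row of the table visual would then show the same total, which should shrink or grow as slicer selections change.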

Is it possible to get a percentage of data, from an OData source, in Power BI?

I'm still very new to Power BI, so forgive my possible ignorance.
We need to do a quarterly check-up, on some data.
To get this data, we have an OData endpoint.
Some of the checks require us to get a random sample of data, from within a certain time.
The random sample of data could be something like: "a random 20% of all papers from 01-01-2020 to 01-02-2020".
I'm not sure if this is possible in Power BI.
If possible, I don't know if I need to adjust my query or do these calculations after getting all the data.
You may use RAND() in DAX as a calculated column or a measure (note that RAND is volatile, so a measure based on it can return different values each time it is recalculated).
You can add an index column to your table in power query then apply query changes to load the updated table.
Next step, create a measure:
random mymeasure = RAND()
and add the desired fields to a table visual, including the index column.
Then go to Visualizations pane > Fields > Visual level filters > select Top N under Filter type > Show Top 20 items > drag the random measure to By value.
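Note that a Top N filter with 20 items returns 20 rows rather than 20% of the rows. One way to get closer to a percentage sample, sketched here with calculated columns, is to flag rows whose random value falls below 0.2. The table name Papers and both column names are hypothetical, and the result is only approximately 20% of the rows, not an exact share.

```dax
-- Calculated column: RAND() returns a number >= 0 and < 1, evaluated per row.
-- Values change whenever the column is recalculated (for example on refresh).
RandomValue = RAND()

-- Second calculated column: 1 for roughly one row in five, 0 otherwise.
InSample = IF(Papers[RandomValue] < 0.2, 1, 0)
```

Filtering a visual to InSample = 1, together with a date filter for the period being checked, would then show the sampled rows.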