I have raw data like this:
There's other columns not shown but disease site is the only one that will change within a study.
In Power BI, I ultimately want the report to show colleagues a table with just one row per study. Because each study can have several distinct disease sites, I need to group those first, and I need some help doing so. What I'd like to show colleagues is a table like this:
Where a "study" has multiple disease sites associated with it, they are lumped together as "multi". I figure that means creating a custom disease site column containing 'multi' and then filtering to one row per study, but I'm having trouble with the details.
Do I do that in Power Query? Should I do it in Power BI after the query is imported? Any help would be appreciated, thank you!
Load your data into Power Query or similar.
Select the Study and Primary_Investigatory columns, right-click, choose Group By, and pick the All Rows operation.
Change the end of the grouping step in the formula bar (or in Home > Advanced Editor) from
{"Primary_Investigatory", "Study"}, {{"data", each _, ... })
to
{"Primary_Investigatory", "Study"}, {{"data", each if Table.RowCount(_) = 1 then [Disease_Site]{0} else "Multi"}})
Sample full code for the example image:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Primary_Investigatory", type text}, {"Study", type text}, {"Disease_Site", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Primary_Investigatory", "Study"}, {{"data", each if Table.RowCount(_) = 1 then [Disease_Site]{0} else "Multi"}})
in
    #"Grouped Rows"
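The grouping above treats any study with more than one row as "Multi". If a study could ever repeat the same disease site on several rows, a variation that counts distinct sites may be safer. This is only a sketch: the inline sample rows are made up for illustration, and the output column is named Disease_Site rather than data.

```powerquery
let
    // Hypothetical sample rows standing in for the real source table
    Source = #table(
        type table [Primary_Investigatory = text, Study = text, Disease_Site = text],
        {
            {"Dr. A", "Study 1", "Lung"},
            {"Dr. A", "Study 1", "Breast"},
            {"Dr. B", "Study 2", "Colon"},
            {"Dr. B", "Study 2", "Colon"}
        }
    ),
    // One row per study; "Multi" only when the study has more than one distinct site
    Grouped = Table.Group(
        Source,
        {"Primary_Investigatory", "Study"},
        {
            {
                "Disease_Site",
                (t) =>
                    let
                        sites = List.Distinct(t[Disease_Site])
                    in
                        if List.Count(sites) = 1 then sites{0} else "Multi",
                type text
            }
        }
    )
in
    Grouped
```

With these sample rows, Study 1 is reported as "Multi" and Study 2 keeps "Colon", because its two rows share one site.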
Related
I have an API data source that I refresh daily to gather Power BI activity. Each day the data returns a different number of columns: it might have 60 one day and 80 (20 more) on another.
When I try to refresh the dataset in the Power BI Service, it fails and states that the new columns cannot be found in the rowset.
I have explored several options, such as creating a combined table, but I do not know the names of all the columns that could arrive each day, so that static approach failed. Does anyone know of a way to handle these daily changes dynamically?
Many thanks
The only way to refresh a data source that has a changing schema is to unpivot that table and bring it into your model as key/value pairs.
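As a rough sketch of that unpivot, assuming the feed has one stable key column (called ActivityId here, a hypothetical name, as are the other sample columns):

```powerquery
let
    // Hypothetical sample standing in for the daily API result;
    // the real query would start from the API call instead
    Source = #table(
        {"ActivityId", "UserId", "Operation"},
        {
            {1, "u1", "ViewReport"},
            {2, "u2", "ExportReport"}
        }
    ),
    // Keep the key column fixed and turn every other column, whatever it is
    // called on a given day, into Attribute/Value rows
    Unpivoted = Table.UnpivotOtherColumns(Source, {"ActivityId"}, "Attribute", "Value")
in
    Unpivoted
```

The model then always sees the same three columns (ActivityId, Attribute, Value), however many columns the source returns. Note that unpivoting drops cells whose value is null.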
It depends on what you want to do.
If you're just trying to change all extra columns from type any to type number, you can try something like
let
Source = #"table with extra columns",
OtherColumnNames = List.RemoveItems(
Table.ColumnNames(Source),
#"List of known column names"
),
#"Changed Type" = Table.TransformColumnTypes(
Source,
List.Transform(
OtherColumnNames,
each {_, type number}
)
)
in
#"Changed Type"
or, if that's something you will be doing to multiple tables you could turn it into a function, like a query with a name of "fTransformOtherColumnTypes" with the following code.
(
#"List of known column names" as list,
#"table with extra columns" as table,
Type as type
) =>
let
Source = #"table with extra columns",
OtherColumnNames = List.RemoveItems(
Table.ColumnNames(Source),
#"List of known column names"
),
#"Changed Type" = Table.TransformColumnTypes(
Source,
List.Transform(
OtherColumnNames,
each {_, Type}
)
)
in
#"Changed Type"
and then your other queries can use it, e.g. fTransformOtherColumnTypes({"name","color","org", "alias"}, #"your source data", type number)
I have a matrix table in Power BI where the lowest level of the hierarchy has 2 users with the same product, but for their manager that product needs to be counted only once. How can I do that in the matrix table?
When I was pulling the hierarchy from one table and sales from another, Power BI handled this on its own. Now that sales is in the same table as the user hierarchy, it simply sums all the sales, when it should count a product only once if it is repeated for multiple users under the same manager.
As seen in the image, the manager's total should be 300 but Power BI sums it up to 400. How can I make sure that the manager's total is taken as 300? I'd really appreciate any help. Thank you
Simply put, you should remove the duplicate items related to manager "A" in the "Product" column. In the real scenario, you need to filter this way for each manager.
You can do this within Power Query:
(notice the table name 'SalesTable')
let
Source = Excel.CurrentWorkbook(){[Name="SalesTable"]}[Content],
#"Filtered Rows" = Table.SelectRows(Source, each [Manager] = "A"),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows",{{"Manager", type text}, {"User", type text}, {"Product", Int64.Type}, {"Sales", Int64.Type}}),
#"Duplicate Removed" = Table.Distinct(#"Changed Type", {"Product"}),
Sales = #"Duplicate Removed"[Sales],
CustomSUM = List.Sum(Sales)
in
CustomSUM
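To avoid repeating the filter for each manager, one possible generalisation is to remove duplicates on the Manager and Product pair and then group by Manager. This is a sketch with made-up sample rows, and it produces a separate per-manager totals table rather than changing the matrix visual itself.

```powerquery
let
    // Hypothetical sample: users U1 and U2 both carry product 1 under manager A
    Source = #table(
        type table [Manager = text, User = text, Product = Int64.Type, Sales = Int64.Type],
        {
            {"A", "U1", 1, 100},
            {"A", "U2", 1, 100},
            {"A", "U2", 2, 200}
        }
    ),
    // Keep one row per Manager/Product combination
    #"Duplicate Removed" = Table.Distinct(Source, {"Manager", "Product"}),
    // Sum the de-duplicated sales for every manager in one step
    #"Manager Totals" = Table.Group(
        #"Duplicate Removed",
        {"Manager"},
        {{"Sales", each List.Sum([Sales]), Int64.Type}}
    )
in
    #"Manager Totals"
```

For the sample rows, manager A ends up with 300 instead of the raw sum of 400.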
I'm trying to do a VLOOKUP in Power Query that also sums the multiple values found. I have 2 tables in Power BI that are connected by the Report Number, as shown below. I need to create a new column in Table B that holds the sum of Cost from Table A for each report number.
In Power Query I created a new column in Table B using the following code:
After that I was planning to simply create a new column summing the list result, but my list is empty and I can't figure out why. Can anyone help me understand why I'm not getting the results?
I can't do this using DAX, it should be in M
One way to add the column into TableB is:
= (i)=>List.Sum(Table.SelectRows(TableA, each [Report Num]=i[Report Num]) [Cost])
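For context, here is a sketch of how that function might be passed to Table.AddColumn in a full query. The two inline tables and the new column name "Total Cost" are assumptions for illustration; in the real model TableA and TableB would be the existing queries.

```powerquery
let
    // Hypothetical stand-ins for the real TableA and TableB queries
    TableA = #table(
        type table [#"Report Num" = Int64.Type, Cost = number],
        {{1, 10}, {1, 15}, {2, 7}}
    ),
    TableB = #table(
        type table [#"Report Num" = Int64.Type],
        {{1}, {2}, {3}}
    ),
    // For each row i of TableB, sum the Cost of the matching TableA rows
    #"Added Cost" = Table.AddColumn(
        TableB,
        "Total Cost",
        (i) => List.Sum(Table.SelectRows(TableA, each [Report Num] = i[Report Num])[Cost]),
        type nullable number
    )
in
    #"Added Cost"
```

A report number with no match in TableA (3 in the sample) gets null, because List.Sum of an empty list returns null.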
Another way is to Group TableA and merge it in. I tend to think this is a faster method for larger tables
let Source = Excel.CurrentWorkbook(){[Name="TableB"]}[Content],
#"Grouped Rows" = Table.Group(TableA, {"Report Num"}, {{"Cost", each List.Sum([Cost]), type number}}),
#"Merged Queries" = Table.NestedJoin(Source, {"Report Num"}, #"Grouped Rows", {"Report Num"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Cost"}, {"Cost"})
in #"Expanded Table1"
of course, if those are the only two columns in TableB, you could just create the whole table in one go
let Source = Table.Group(TableA, {"Report Num"}, {{"Cost", each List.Sum([Cost]), type number}})
in Source
I do Power BI for a logistics company. We want to show performance by stop location. The data is currently a table of all orders by Order ID, so -- ID, Rev $, Pickup Stop, Delivery Stop. Everything is a 2-stop load, fortunately.
What I am struggling with is building a calculated table that looks at the Pickup Stop AND the Delivery Stop at the same time while ALSO respecting filters set on the page. I would like the stops table to say something like: Stop Location, X Pickups, $X Pickup Revenue, X Deliveries, $X Delivery Revenue.
How would I go about this? I've tried a number of approaches but every time it either misses filters or can only handle one stop at a time.
Thanks!
Current data (call it Orders):
The calculated table I'm trying to make (call it Stops):
One method of creating your Stops, given your Orders is by using Power Query, accessed via Queries=>Transform Data on the Power BI Home Tab.
The Table.Group function is where the magic happens. Unfortunately, it needs to be done by coding in the Advanced Editor, as the UI does not provide for these custom aggregations.
When the PQ Editor opens: Home => Advanced Editor
Replace the first three lines with whatever steps you use to read in your own Orders table.
Paste the rest of the M code below in place of everything after your setup lines in your own query.
Read the comments and explore the Applied Steps to understand the algorithm.
M Code
let
//Input data and set datatypes
//These lines should be replaced with whatever you need to
//set up your data table
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bYzBCoMwEER/Zck5BxPRu1RaLJaW6qEQPIS4tEExoonQv+/a0oLQyw5vZnaUYuepxQmKnHFWO697uOKCQ0DiizVdGKHybiTKsbcLTs8PN1wxIZMooiR938z3evCawyFbKczeDhzq268qyBZpsg23f9+qJF+Skuwe1ui741CU/2djsmO53lJ3SFsth/3aPWrTzY7Kp4o1zQs=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
dataSource = Table.TransformColumnTypes(#"Promoted Headers",{
{"Order ID", Int64.Type}, {"Total Revenue", Int64.Type},
{"Pickup Stop", type text}, {"Delivery Stop", type text}}),
//Unpivot to get single column of Stops
#"Unpivoted Columns" = Table.UnpivotOtherColumns(dataSource, {"Order ID", "Total Revenue"}, "Attribute", "Stop"),
//Group by stop and do the aggregations
#"Grouped Rows" = Table.Group(#"Unpivoted Columns", {"Stop"}, {
{"Orders Picked Up", (t)=> List.Count(List.Select(t[Attribute], each _ = "Pickup Stop" )), Int64.Type},
{"Total Revenue Picked Up", (t)=> List.Sum(Table.SelectRows(t, each [Attribute]="Pickup Stop")[Total Revenue]), type number},
{"Orders Delivered", (t)=> List.Count(List.Select(t[Attribute], each _ = "Delivery Stop" )), Int64.Type},
{"Total Revenue Delivered", (t)=> List.Sum(Table.SelectRows(t, each [Attribute]="Delivery Stop")[Total Revenue]), type number}
})
in
#"Grouped Rows"
Orders
Stops
I have a calendar table in Power BI linked to two other tables, one with occupancy by date and another with predicted occupancy by date. The second table goes well into the future.
I want the report to have a rolling 15 day range, 7 days prior to today and 7 days into the future. I tried to create a custom column using:
ReportRange = IF(DATESBETWEEN (Calendar[SQL_Date], (TODAY()-7), (TODAY()+7)),1,0)
I get a response "No syntax errors have been detected."
But when I click "OK", I get a yellow bar/warning:
"Expression.Error: The name 'IF' wasn't recognized. Make sure it's spelled correctly."
Can anyone help with this?
Thanks!
You need to write custom columns in the query editor in M code, not DAX.
Something like this may work:
if Date.IsInPreviousNDays([SQL_Date], 7) or Date.IsInCurrentDay([SQL_Date]) or Date.IsInNextNDays([SQL_Date], 7)
then 1
else 0
You may prefer to use relative date filtering instead though.
You are attempting to use a DAX expression in the Power Query custom column editor. This will not work, because that editor expects M. You will need to create a measure in DAX to do the calculation.
In your DAX you can use DatesBetween, as per this example:
=CALCULATE(SUM(InternetSales_USD[SalesAmount_USD]), DATESBETWEEN(DateTime[DateKey],
DATE(2007,6,1),
DATE(2007,8,31)
))
You can use the TODAY() function to base calculations on the current date.
For "rolling" filters I found it very useful to build generic offset columns for day/week/month/quarter into the date table. With those you can easily filter a visual, e.g. with "week offset" >= -1 and "week offset" <= 1, to get a rolling view of the weeks around today. A detailed how-to can be found here: https://radacad.com/offset-columns-for-the-date-table-flexibility-in-relative-date-filtering-for-power-bi
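A minimal sketch of a day-level offset column in M follows. This is my own simplification rather than the linked article's code; the generated calendar and the column names Date and Day Offset are assumptions.

```powerquery
let
    Today = Date.From(DateTime.LocalNow()),
    // Hypothetical calendar: 30 days either side of today (61 dates in total)
    Dates = List.Dates(Date.AddDays(Today, -30), 61, #duration(1, 0, 0, 0)),
    Calendar = Table.TransformColumnTypes(
        Table.FromColumns({Dates}, {"Date"}),
        {{"Date", type date}}
    ),
    // Offset in whole days relative to today: negative = past, 0 = today, positive = future
    #"Added Offset" = Table.AddColumn(
        Calendar,
        "Day Offset",
        each Duration.Days([Date] - Today),
        Int64.Type
    ),
    // Rolling 15-day window: 7 days back, today, 7 days ahead
    #"Rolling Window" = Table.SelectRows(#"Added Offset", each [Day Offset] >= -7 and [Day Offset] <= 7)
in
    #"Rolling Window"
```

In practice the offset column would usually be kept in the date table and the range applied as a visual or page filter, so the last step is only there to show the condition. The offsets are computed at refresh time, so the dataset needs a daily refresh for the window to keep rolling.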
If you just want to filter the data coming in to the report, you can use a filter.
You can do this in the Advanced Editor, sample below.
let
    Source = BillingData,
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Date"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Date", type date}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type"),
    // Shift each date back 7 days, then test it against the previous 14 days:
    // this keeps dates from 7 days ago through 6 days ahead of today
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each Date.IsInPreviousNDays(Date.AddDays([Date], -7), 14))
in
    #"Filtered Rows"
I started with the standard date filter and customised it in the Advanced Editor: shifting each date by 7 days and checking it against a 14-day window moves the window so that it straddles today.