Connecting Datasets in PowerBi by several variables

Connecting Datasets in PowerBi by several variables - powerbi

I have a dashboard that looks like this in PowerBi:
Almost every slicer and visual on this page comes from the "visits" dataset. That dataset is 70,000+ rows, where each row stands for a single patient visit to the hospital. There are a few relevant columns for this question such as: "subject mrn", and "protocol_no" (the study they're on).
Well, elsewhere, I have a dataset called "Data Managers" that is the staff assigned to each protocol. It has relevant columns of "subject mrn", "protocol no" and "staff name"
I have these datasets in my power bi like this:
When I connect these datasets by dragging in between them, Power BI warns me that they are many-to-many relationships. This makes sense because:
Lets say staff member John is the data manager for patient 12345 on study x
Well patient 12345 might also be on study y, and on that study, staff member steve is the data manager.
Also, other patients on study x might have other data managers.
So I need to connect these datasets in a way that when I filter to John, I only get rows back from the visits data where John is the data manager for that combination of subject AND study.
When I just drag across from protocol no and subject mrn like this
it doesnt work. The dropdown appears to filter to lists for john, but when I check for accuracy, its people with totally different data managers. Any idea what to do?

If anybody is looking at this, theres probably a way to do it with managing multiple relationships, but I ended up creating a concatenated column in each dataset of "Protocol, subject_mrn" and then linking those new columns together.

Related

Power consumption in houses by date

I'm starting with Power Bi and I need some help to create the data model for this practice case.
I receive the data in a table that has the power consumption of different houses by hour. In the first column it has the date and hour and the rest of columns has the power consumption of each house (one column for each house).
My idea is to create a dashboard with a map as a filter to select the house and a date slicer to filter a period of time.
I dont know exactly what is the better way to face that, if create a subtable for each house or what.
I know that I also have to create another table with the information of each house with coordinates and relate it to the information table, but this information table has the name of the houses in the header of each column and the name is followed by the text "Power Cosumption".
I would appreciate your help, I think that this is not a complicate problem and you can give me some clues or examples to face it.
Thanks!
Regards

Best way to organize data from multiple companies in Power BI

Basically I have a big Excel dataset about 500x500 with economic information from various companies.
Each row is representing a different company and in columns we have the information. A little bit of it is qualitative like ZIP code, type, etc. But most of it is quantitative. For each of the quantitative info, we have info for 5 years, so we have one column for each year and for each information i.e. Debt 2019, Debt 2020, etc.
So my question is which is the best way to preprocess this data to work with it and how should it be done. Either doing the preprocessing with Excel, running a Script on PowerBI, using Query, SQL, ...
The objective is to have a report which will be accessible online and the user will type the name of the company and it will show them the dashboard with the information of that company (only that one), so they can navigate through it.
The structure and which information is shown is the same for each company, the only thing that changes is the "numbers" that each company has. So it has to be possible to change which data is showing (to use the one from the company they want).
It also needs to be able to show comparative data to other groups of companies or to the total.
I want to have it right from the start, because then changes get complicated.
I thought about doing sort of a "relational model" with one "table" for each company with the quantitative data (with one row for each year and each column one info point) and then a general table with the qualitative data (with rows being each company and the columns the info). But I am not really sure.
I know how to use Power BI but I have never used it for something this big. I would like to know which way to organize this data is better and some info on how to do it.
Many thanks to everyone.

I thought about doing sort of a "relational model" with one "table" for each company with the quantitative data (with one row for each year and each column one info point) and then a general table with the qualitative data (with rows being each company and the columns the info).
Yes, do that.
General guidance is to use Power Query in PowerBI to transform the data into a star schema model. See Understand star schema and the importance for Power BI
So that would typically result in one table that has the "dimension" data for each company, a date table, and a "fact" table at the grain of (CompanyId,Date) with the quantitative data.

Showing Side by Side Measures from Semi-Related Tables in Power BI

I have a data model in Power BI that, among other things, has the following tables
Employees (Dimension; employee ID/name)
Jobs (Dimension; contains details about the job, including job ID)
Employee history- Contains a record for each day an associate was in a job(snapshot table);
Job Budget History- Contains a record for each day a job was budgeted(snapshot table)
Calendar Table
The table is modeled like so (simplified version):
In Power BI, I am trying to make a simplified table view that contains measures based on both the budget history as well as the employee history for the most recent day in the dataset (simple count rows/distinct count of calendar table)
However, attempting to do so gives me the results below if I try to put both measures on the table. Basically it appears to be doing a cross join between each table and matching associates with jobs they don't have (this happens when the budget is added).
Of course, if I just do one of the singular measures everything works perfectly. I am fairly certain it is because there is no real connection between the 'employee' and the 'budget history' in this relationship, so it is just joining everything on the date without any context.
I have tried several things such as making inactive relationships with userelationship(), using visual level filters etc. but I'm not sure what the best option in this situation would be. (I am trying to avoid bidirectional relationships if possible)
Ideally this information should show on this date that Joe was present as President, Sally was present as an operator, and the Manager position had nobody, but all three were budgeted.
Any advice is appreciated. I have attached a simplified mockup pbix file for reference.
PBIX File

This is a complicated problem for many reasons. I was able to produce this report:
by removing field "Name" from the table and replacing it a measure:
Employee Name =
CALCULATE(
SELECTEDVALUE(Employees[Name]),
CROSSFILTER(Employees[Employee_ID], Employee_History[Employee_ID], BOTH)
)
It looks exactly like the report you want, but if you have additional requirements, you'll need to make sure that such approach works for you.
If this is acceptable, a brief explanation:
the root cause of the issue is missing Employee-Budget relationship. When you put Name in the table as a filter, it doesn't propagate to the budget table and causes a cartesian product.
Removing Name from the table eliminates the need for the filter propagation, but then you won't see employee names. I solved this by pulling employee names with the measure, where required propagation is forced by CROSSFILTER function (essentially, it's like a temporary bi-directional relation only when you need it, so it does not negatively affect the rest of the model).

Can't determine relationships between the fields

I've imported three tables into Power BI using a REST API, have added relationships between them, and am now trying to add fields from the various tables to a table on the canvas. The table names (from a Human Resources database) are named Employees, Job History, and Salary History.
Employees is joined to Job History using EmployeeID as a 1:Many relationship, and also to Salary History using EmployeeID on a 1:Many relationship.
I can add fields from the Employee table and EITHER the Salary History OR the Job History table to the table on the report canvas. However, if I try to add fields from all three tables, I'm seeing the error 'Can't display the data because Power BI can't determine the relationship between two or more fields'.
Could anyone advise where I'm going wrong? Many thanks.

If I understand correctly, you have a model like in this picture:
The way PBI filters work is: The 1: side table filters the N: side table. Filters propagate that way. So in this case, you can filter JobHistory with data from Employees, and SalaryHistory with data from Employees. But the 2 fact tables can't relate because the filters don't propagate that way.
Look into DAX measures like RELATED(), RELATEDTABLE() and USERELATIONSHIP() that might work for you.
Without that, I don't think you can use data from the 3 tables, since you have a model with 2 Fact Tables.
Someone correct me if I'm wrong.

Power BI - Display a Value based on two dimensions in the same visual

I'm trying to figure out how to structure my dataset such that someone can have a visual with the same value based on two dimensions.
Very simplified example:
Main table
Working Attorney (Person)
Billing Attorney (Person)
City
Revenue
Cost
etc...
There are many more Dimensions, values and measures, but this is enough to make my point.
I can create a visual that is:
WorkingAttorney City State Country sum(Revenue) sum(Cost)
And I can create one that is:
BillingAttorney City State Country sum(Revenue) sum(Cost)
However, I have been asked, since the same person can be the billing attorney in one situation and a working attorney in another, or even both at the same time, can we create a single visual that has both?
Essential, if these were SQL tables, I would just full outer join on the Working Attorney = Billing Attorney and then be able to display the 4 values.
Attorney City State Country sum(Revenue as Working Attorney) sum(Cost as Working Attorney) sum(Revenue as Billing Attorney) sum(Cost as Billing Attorney)
Hopefully this example is still a little more concrete?
Not sure the best way to do this. Perhaps there is a better way to structure my dataset?
What I think I want to avoid is having to create a measure for every value specific to each dimension, because that seems unmanageable.
Ideas?
Thanks!

I just posted a blog about doing exactly this. Select a name on a slicer and it filters your results on where that name could appear in multiple columns.
https://bisamurai.wordpress.com/2018/03/02/filtering-on-multiple-columns-using-one-slicer/
Hope this helps.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js