Power BI - Merging multiple customer tables - powerbi

Connection Type: Direct Query to multiple sources so limited DAX available especially in Power Query load.
Data Model Query: The Data model is not a perfect star schema but there is an attempt to separate tables into business processes and lookup tables. There are probably a few issues to discuss the current data model. I only have 1 question at this time.
My current goal is to generate a single summarised customer table to replace the current two tables that have some measure I need like the number of app customers, a number of total customers, date customer first accessed app etc.
So I cannot merge the 2 customer tables and add calculated columns and measures at the import stage as power bi does not support or allow it and sql is out as I am using direct query. My plan is to create a summarised customer table using DAX summarise function on front end visual page, that has only the app customers and then measures like the total number of customers etc. Is this best practice or is there a better way of approaching this? Understand you would ideally do in sql, or power query but in these circumstances, I think this is the best way but wanted a second view.

Is there a reason to use Direct Query over Import? If you are in Import mode, you can easily Append the two client tables together in PowerQuery.
Treb Gatte, Power BI MVP

Related

What is the approach to merge data from multiple databases (same schema) using Power BI?

I have 3 OLTP databases, all using the same database schema. Each db represents one department.
I am exploring Power BI as a solution for reporting at the company level, so all departments combined.
What is the approach to combine data from multiple dbs into a data warehouse? For example - do I need SSIS to combine the 3 dbs into 1 data warehouse?
Another option could be to have 1 shared dataset per db, and then the final report can connect and combine multiple live datasets? Or is there another way with Power BI like combining multiple live datasets?
Any reference link on how if someone has done this?
Or is there another way with Power BI
Yes. Simply create a single import model and load data from all three databases in it. So for each table in your Power BI model you would have three Power Queries set to not load into the model, and you would append them in a query that is used to load your model. See eg: https://learn.microsoft.com/en-us/power-query/append-queries
Best practice would be to:
Extract the data into a single database (DWH or reporting schema)
Build the necessary items there for your data model, be it reporting schema, or star/snowflake schemas
Connect Power BI to that schema.
Combining datasets is going to be tricky, you may have the same measures in each of the datasets. Combining in the database, with any added columns to indicate the department is the best option in terms of supporting updating/adding/removing items. For example, if the schema changes in the DB's you do it in one place, not three datasets. The toolset in DB/SSIS will be better suited to the heavy lifting of the data to a location.
You would use SSIS to extract the data if on-prem data, Azure Data Factory for Azure DB's. Extract to a staging schema, convert/transform the data into its final from, with a new schema to define what it is, facts/dimensions other schema names such as reporting can be used, depending on the data model you wish to build. Most of this is covered by the standard ETL pattern of OLTP to an OLAP database.

compare a measure to a value in a column

Power BI noob here, still thinking like a SQL coder, so please be patient.
How can I use the user name of the person running the report to filter the report?
As a convenience for my users, I want to provide a way for them to automatically filter to only see data related to their office or region. I have a Person table that includes details like their office location. If I can filter that based on the user name of the person running the report, and join it to the rest of the data, that would work.
Unfortunately, I don't see a way to get the user name in M.
Using the USERNAME() function in DAX, I don't see a way to compare this with individual values in a column. I get an error about being unable to compare a measure to multiple values.
It seems this would be a common request, so I'm sure somebody has solved this problem. But I haven't yet found the solution.
Use a RLS in your model. You can use function USERNAME(),USEROBJECTID (), USERPRINCIPALNAME (); The last one is useful if you have table with users and their email.
https://learn.microsoft.com/en-us/power-bi/admin/service-admin-rls
check also this GuyInCube video:
https://www.youtube.com/watch?v=MxU_FYSSnYU
First, be aware of the differences between USERNAME() and USERPRINCIPALNAME(). Most likely you will want to use the later one.
You can't use neither of these in M. Imagine your model is importing the data. The M code is executed once in the context of one user, then each other user accessing the published report will reuse the already loaded and calculated model. And of course, these are DAX functions, not M.
So in DAX you can use these to compare their values to columns from your model. You didn't gave any information about your model, but lets say there is a table Sales with columns Customer and Amount:
Customer
Amount
Bill Gates#Microsoft.local
100
Steve Ballmer#Microsoft.local
110
In this case, you can write a measure like this:
My Sales = CALCULATE(SUM('Sales'[Amount]), 'Sales'[Client] = USERPRINCIPALNAME())
When Bill Gates opens the report, he will see sales amount of 100, while if Steve Ballmer opens it he will see 110.
For diagnostic purpose, you can make a measure like this and show it somewhere in your report:
Who am I = USERPRINCIPALNAME()
If your goal is to build dynamic Row-Level Security within Power BI, it has some functionality which might help you, so take a look at these articles:
Row-level security (RLS) with Power BI
Restrict data access with row-level security (RLS) for Power BI Desktop
Row-level security (RLS) guidance in Power BI Desktop
Dynamic Row Level Security with Profiles and Users in Power BI : Many-to-Many Relationship

concatenation of data into superset

there are two tables, one collects facts on a daily basis, the other on a monthly basis with the same set of attributes (for example, region, city, technology).
I need to calculate the formula in a superset
SUM(t1.count_exp) / SUM(t2.count_base)
which will be correctly visualized when calculating by region, or by city, or by region + city + technology per month.
in other bi systems, the group by is performed first, then the join is executed and the formula above is calculated, which gives the desired result. How to achieve a similar result in a superset?
Assuming both tables are in same database, then you can write your own query joining the two tables in 'SQL Lab' and then visualize the query results using 'Explore' option available there.
Once you click on 'Explore' from SQL Lab, Superset will create a Virtual Dataset(Table) inside Superset from results of SQL query. Any filters/group by/limit applied on this virtual table from visualization will query over this query.
https://superset.apache.org/docs/frequently-asked-questions
A view is a simple logical layer that abstract an arbitrary SQL
queries as a virtual table. This can allow you to join and union
multiple tables, and to apply some transformation using arbitrary SQL
expressions. The limitation there is your database performance as
Superset effectively will run a query on top of your query (view). A
good practice may be to limit yourself to joining your main large
table to one or many small tables only, and avoid using GROUP BY where
possible as Superset will do its own GROUP BY and doing the work twice
might slow down performance.

Problems loading data in to Analysis Services Model

I’m building an model in Azure Analysis Services. The model should contain only data for the last 3 months and is processed every day.
I have a separate dimension for date that has a relation with a fact table using a datekey. I’m using a power query to only load the last 3 months in the date dimension. In the power query to load the fact table I used Table.nestedjoin to only load the rows that have a value in the date table.
When I do this, the processing of the model takes forever. After some troubleshooting I saw that the query Analysis Services is using to retrieve data from the SQL database retrieves all rows. So, Am I correct saying AS load all data before it merge the rows? Is there a way to change this? Or is there a better way to a chief my solution?
Kind regards,
Joins are super slow in Power Query. You should avoid them if you can do it in the datasource or use normal relationships in the data model.
Also, you can setup the date dimension in DAX and dynamically populate it to contain only dates present in the FACT table.
As for the load of all the data, it could be because the data is fetched as is, and only then power query applies the transformations (the join).
You can modify the query in the Power Query Editor / Advenced Editor to add a where clause direclty in the query

Can a data source have multiple servers connection string in Power BI?

I have multiple databases which have similar schemas. I need to combine the data from all these databases and do reporting over it.
For example -
Customer table in AdventureWorks in Server 1
Customer table in AdventureWorks in Server 2
Customer table in AdventureWorks in Server 3
Now in Power BI i will have a data set called Customer. The data for this needs to come from all the 3 servers mentioned above. I know I can do it using merge queries in Power BI but it means I will have to pull the data from different server as different datasets in power bi and merge which I want to avoid.
Do let me know if there is any other way to do this.
You must use Edit queries -> Append queries (or Append queries as new) to combine the data from these data sources into one table:
I find Append queries as new more meaningful in this case. It will create a new (fourth) table, which will contain all the rows from the other 3 tables. Select Three or more tables and select all customers tables from your data sources:
Then use this new table in the report.
You should not worry that duplicating the data will lead to increased size of the data. The data itself will be reused and not copied twice.
Your model is the right place for combining data from different data sources in one place. Maybe this should not be your report, but one dataset shared between your reports, or a SSAS model. Combining the data on SQL Server level (e.g. partitioned view over 3 databases) is not a good idea. Combining these in your model also gives you the option to combine data from different types of data sources, e.g. Customers 1 is in SQL Server, Customers 2 is .csv file in OneDrive and Customers 3 is coming from a web service call.