Without a Premium license, is it possible to simulate incremental refresh to speed up Power BI Desktop?
Say we keep all the data before a certain date in a local Access database and connect to the "live" database only for data after that date.
The question then is how to export the historical data from one or several .pbix files to Access. How can we do that?
Try doing it as a composite model. Load your archive data as one query using Import and your recent data as another query using DirectQuery. Then you can union those two tables as a DAX calculated table and use that for your report.
If you aren't using Direct Query for recent data or you need to be refreshing your model, then I believe you can uncheck "Include in report refresh" in the query editor (right-click on the query in the Queries pane) and it won't refresh that archive table unless you specifically ask it to.
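Both suggestions rely on splitting the data at a cutoff date so that the archive part and the recent part never overlap; their union then reproduces the full table without duplicates. Here is a minimal Python sketch of that logic, outside Power BI. The function names and the `order_date` and `amount` fields are made up for illustration.

```python
from datetime import date

def split_by_cutoff(rows, cutoff, date_key="order_date"):
    """Split rows into (archive, recent) so that no row lands in both parts."""
    archive = [r for r in rows if r[date_key] < cutoff]   # frozen part, loaded once
    recent = [r for r in rows if r[date_key] >= cutoff]   # part that is re-read on refresh
    return archive, recent

def union(archive, recent):
    """Append the recent rows to the archive rows, like a union of two tables."""
    return archive + recent

# Hypothetical sample data, ordered by date
rows = [
    {"order_date": date(2019, 5, 1), "amount": 10},
    {"order_date": date(2020, 12, 31), "amount": 20},
    {"order_date": date(2021, 1, 1), "amount": 30},
]
archive, recent = split_by_cutoff(rows, cutoff=date(2021, 1, 1))
full = union(archive, recent)
```

Using `<` on one side and `>=` on the other matters: if both filters included the cutoff date, rows on that date would be counted twice after the union.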
In Power BI, we first get the source data. Then we add multiple query steps to filter the data, remove columns, and so on. After that we add relationships and model the data.
We can have calculated columns, which are stored in the data, and measures, which are not stored but calculated on the fly.
Which data is stored in Power BI - the one after query or the one after modelling?
Power BI has 3 connection types for data access. They are import, direct query and live connections.
If we use Import as the connection type, the data is imported into the Power BI file by Power BI Desktop, so a copy of the data is always kept on disk. While you query or refresh, the data is held in the computer's memory, where it can be queried and modeled. When you save your work, Power BI writes a file with the .pbix extension, and the data is compressed and stored inside this file.
In DirectQuery mode, the data stays in the remote location and Power BI only connects to it. Each time we refresh or change a slicer, a request goes to the data source and brings the data back to Power BI. With this method the data itself is not stored in the file, but we can still create a data model.
Live connection is another method. It is supported for only a few data sources. With this method, the data is not stored in the computer's memory and we can't create a data model in Power BI Desktop.
Power BI is very well documented. Many of the questions you've recently asked are answered in that resource, so please take a look. I get the feeling that you are using this community because you don't want to read the manual. I strongly suggest you take a look at the documentation, because everything we write in answer to your questions has already been written and documented, and SO is not meant to be a shadow user guide for well documented systems.
Depending on the data source you use in Power BI Desktop, Power BI supports query folding, which pushes as much of the data processing as possible to the source (for example SQL Server).
If query folding is not possible because the source does not support it, then the source data is loaded before the query steps are applied.
Read more about query folding here: https://learn.microsoft.com/en-us/power-bi/guidance/power-query-folding
When you perform additional modelling after the Power Queries are loaded, i.e. creating tables with DAX, adding columns, etc., these will be performed when the PBIX file is published to the Power BI service, and they will be performed each time the data is refreshed with the data gateway.
I have a Power BI report which I need to update once in a while.
All its tables were loaded from some internet URL, but only one of them needs to be updated, all the others have static data.
How do I make Power BI stop trying to reload the static tables?
Or how do I copy the data from these tables into new "non-internet-loaded" tables?
In the Query Editor, right-click on the queries and toggle off Include in report refresh where appropriate.
(The Enable load toggle is useful for queries that you only use for staging and don't actually want to load into your model.)
We currently have a problem refreshing the report data from the DB: it has so many records that loading all the data takes forever. How can I load only the data from the last year, to avoid waiting so long for everything to load? As far as I can see, the connection dialog for Cosmos DB lets me enter an SQL query, but I don't know how to write one for this type of non-relational database.
Power BI has an incremental refresh feature. You should be able to refresh the current year only.
If that still doesn’t meet expectations I would look at a preview feature called Azure Synapse Link which automatically pulls all Cosmos DB updates out into analytical storage you can query much faster in Azure Synapse Analytics in order to refresh Power BI faster.
Depending on the volume of the data, you will hit a number of issues. The first is that you may exceed your request unit (RU) limit, which slows down the extraction of the data from Cosmos DB. The second is transforming the data from JSON into a structured format.
I would try to write a query to specify the fields and items that you need. That will reduce the time of processing and getting the data.
For SQL queries it will be something like
SELECT * FROM c WHERE c.partitionEntity = 'guid'
For more information on the Cosmos DB SQL API syntax, see the Cosmos DB documentation to get you started.
You can use the query window in Azure to run the SQL commands, or Azure Storage Explorer to test the query, then move it to Power BI.
What is highly recommended is to extract the data into a place where it can be transformed into a structured format like a table or CSV file.
For example, use Azure Databricks to extract the data, then turn the JSON into a table-formatted object.
You have the option of running Databricks notebook queries in Cosmos DB, or using Azure Databricks in its own instance. Another option would be to use the change feed with an Azure Function to shred the data and send it to Blob Storage, and then query it from there using Power BI, Databricks, Azure SQL Database, etc.
In the Source of your Query, you can make a select based on the CosmosDB _ts system property, like:
Query ="SELECT * FROM XYZ AS t WHERE t._ts > 1609455599"
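In this case, 1609455599 is the Unix timestamp (in seconds) that corresponds to 31.12.2020, 23:59:59 at UTC+1, which is 22:59:59 UTC. For a cutoff at midnight UTC, use 1609459199 instead. Because _ts records when a document was last modified, only documents written or updated from 2021 onward will be selected.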
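The cutoff value can be computed rather than typed by hand, which also makes the time zone explicit. Here is a small Python sketch. The helper names `cutoff_before_year` and `build_ts_query` are made up for illustration, and `XYZ` is the placeholder container name from the query above.

```python
from datetime import datetime, timedelta, timezone

def cutoff_before_year(year, utc_offset_hours=0):
    """Return the epoch second just before `year` starts at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    first_second = datetime(year, 1, 1, tzinfo=tz)
    return int(first_second.timestamp()) - 1

def build_ts_query(container, cutoff):
    """Build the query text that filters on the _ts system property."""
    return f"SELECT * FROM {container} AS t WHERE t._ts > {cutoff}"

cutoff_utc = cutoff_before_year(2021)                      # 31.12.2020 23:59:59 UTC
cutoff_cet = cutoff_before_year(2021, utc_offset_hours=1)  # 31.12.2020 23:59:59 UTC+1
query = build_ts_query("XYZ", cutoff_cet)
```

With an offset of one hour, the helper returns 1609455599, the value used in the query above. With no offset it returns 1609459199.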
When is data embedded in a saved or shared Power BI Desktop report?
As far as I understand, PBI Import Mode will always embed all input tables data into saved or shared .pbix report. Am I right?
Suppose you have table A, and then, based on it, aggregation table B. What data would be saved to the report if the report depended on table B?
Does PBI save any data with saved/shared report in DirectQuery mode?
When is data embedded in a saved or shared Power BI Desktop report?
Data is saved in the model when the dataset mode is Import or Composite (i.e. a mix of Import and DirectQuery tables, where individual tables can also use the Dual storage mode). For more information see Dataset modes in the Power BI service and Manage storage mode in Power BI Desktop.
As far as I understand, PBI Import Mode will always embed all input tables data into saved or shared .pbix report. Am I right?
Yes, the imported data (if any) is always in the .pbix file. When published, it is split into separate report and dataset.
Suppose you have table A, and then, based on it, aggregation table B. What data would be saved to the report if the report depended on table B?
It depends. There are options to reference or duplicate table. Also take a look at Use aggregations in Power BI Desktop.
Does PBI save any data with saved/shared report in DirectQuery mode?
No, in DirectQuery data is not imported, as noted in the documentation:
DirectQuery mode is an alternative to Import mode. Models developed in DirectQuery mode don't import data.
In this case queries are sent directly to the data source. There is some temporary caching though.
As far as I understand, PBI Import Mode will always embed all input
tables data into saved or shared .pbix report. Am I right?
Yes, import mode copies the data from the source into the pbix file.
Does PBI save any data with saved/shared report in Direct Query mode?
No, with direct query mode it only stores the connection details. If you create a new DAX calculated table based on the main Direct Query, it is evaluated and loaded into memory when the file is opened, so it only saves the query that generates the table, not the data in the table.
What you can do is change the .pbix file extension to .zip and look inside the archive to see what data is saved in the file.
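The same check can be scripted. Python's standard `zipfile` module looks at the file contents rather than the extension, so no renaming is needed. This is only a sketch for inspecting the archive; the function names are made up, and the entry names you see depend on the file.

```python
import zipfile

def list_pbix_parts(path):
    """Return (entry name, uncompressed size in bytes) pairs, largest first."""
    with zipfile.ZipFile(path) as archive:
        parts = [(info.filename, info.file_size) for info in archive.infolist()]
    return sorted(parts, key=lambda part: part[1], reverse=True)

def print_pbix_parts(path):
    """Print one line per archive entry, with the size right-aligned."""
    for name, size in list_pbix_parts(path):
        print(f"{size:>12,d}  {name}")
```

Calling `print_pbix_parts("report.pbix")` (a hypothetical file name) on an Import-mode report and on a DirectQuery report should make the difference in stored data visible from the entry sizes alone.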
Hope that helps
I'm new to Power BI (free version) and I have been asked to develop a reporting system which generates reports from an Excel sheet. The reports work well, but only for the data I have already collected.
My question is how to connect to the data directly in SQL Server, without needing to convert it to Excel and then import it into Power BI. I also want the data to be refreshed dynamically.
One of the solutions I tried is to add new dataset but I get the following message:
Refresh can't be scheduled because the data set doesn't contain any
data model connections, or is a worksheet or linked table. To schedule
refresh, the data must be loaded into the data model.
I have looked for many solutions but none has worked.
Am I missing a concept? Thank you.
If this data is stored in a SQL Server table, it is a pretty straightforward process.
When you create a new Power BI report (.pbix) you should see a prompt asking you if you want to "Get Data". You would select the 'SQL Server Database' option.
Then, you will be asked to enter the Server and Database name, and to specify either 'Import' or 'Direct Query' mode. If you choose 'Import', a copy of the data is loaded into the report and is updated only when a refresh runs, for example when you click 'Refresh' within a report session or when a scheduled refresh runs. If you choose the latter, the report queries the database whenever visuals are loaded, so any changes to the data in your database will be reflected in the report.
Once you get past this window, you will be asked to either specify credentials or use Windows authentication to access the database and server. After that you can either specify a query to pull in some data or select from a list of tables.
I hope this helps!!