Publish Power BI report with maximum size

I have a Power BI report connected to Dynamics 365 to report on contacts and accounts, but the dataset is more than 2 GB and I am not able to publish the report.
How can I decrease the size of the dataset so that, once published, I can refresh it and still get the full 2 GB of data?
One thing I tried was taking only the top 100 rows using one of the options on the dataset, but when I refresh the dataset after publishing, it still only gets the top 100 records.

A .pbix file itself has no size limit, but the Power BI service has a limit of 1 GB per dataset on a Pro subscription; higher-level subscriptions (Embedded/Premium) have higher limits.
Unless you can upgrade your subscription (which may not be worth it for a single report), you have to reduce the dataset size.
The suggestions are always the same and can be found on the Microsoft website; I had a look at it and have nothing to add.
Below are the points you will find there, in case the link stops working:
Remove unnecessary columns
Remove unnecessary rows
Group by and summarize > pre-aggregate the data to the level of detail you actually need
Optimize column data types > choose the right data type for each column
Preference for custom columns > create custom columns in Power Query (M); they compress better than DAX calculated columns
Disable Power Query query load > do not load tables you don't need (e.g. staging tables used for calculations but not needed in the model)
Disable auto date/time > disables the hidden calendar hierarchy PBI creates for every date column in your model
Switch to Mixed mode > a mix of Import and DirectQuery; you will find more info online about this (if you choose it, also have a look at the aggregations functionality)


Power BI Incremental Refresh without Service

I need help improving refresh times on a Power BI dashboard with about 20M rows of data and 80 columns, pulling from SQL Server. I cannot use the Power BI service in any capacity; this has to load into Power BI Desktop.
My refresh times on the raw data (virtually no transformations in Power Query) are about 3-4 hours.
Microsoft recommends incremental refresh to archive my historical data and only refresh the latest changes, but that requires the service, which I 100% cannot use.
Is there any other way to significantly improve my refresh times beyond the service's incremental refresh? If it were under an hour I'd be happy.
What I've tried:
Native Query to leverage the server
Reducing column selections
Removing all transformations
Splitting tables in Power Query and selectively turning off refresh on the historical tables - as soon as they get stacked/appended, Power Query triggers a refresh on all stacked tables regardless of which ones have refresh turned off.
Looking into Power Query (PQFL/M) code to control which tables refresh - I can't find any method/property for this in M.
Optimizing the SQL - I haven't gotten any significant improvements.
20 million rows should not take that long, especially with no transformations. Something else is going wrong, but without access to your data and hardware it is impossible to say what.
One possibility is to do an initial data load and then turn off refresh on that query. Add a new query for just the new data (which should be quick), but load it to a completely new table. In PBI you will then have two tables. Create a calculated table in DAX which is the union of your old, non-refreshed data and your new data. Refreshes should be very quick after the first load, but obviously you need to think about how this scales as your data grows.

Power BI Dashboard shows 'Load Data' message twice every time I click on the report link

I have a Power BI dashboard (deployed to a report server) where I have imported the data (query folding is in place), but every time I click on the report link it shows a 'Load data' message twice before it displays the report.
To test this on a simpler dashboard, I created a Power BI dashboard where I imported one view with only 3 columns and deployed it to the server, but I still get the 'Load data' message twice every time I click on the report link.
Why is Power BI showing this message twice, and is there a way to disable it, as it is causing delays when loading other reports?
Is it because of one of the following?
Data eviction
Is it due to how Import uses the VertiPaq engine to store data in memory?
Power BI wants to know the schema of the table before the query actually runs, so it asks Power Query to return the top 0 rows. Unfortunately, in this case query folding can’t take place and the top 0 filter can’t be pushed back to the database, so the entire query gets run once to get the schema and once to get the data.
Or is there another reason for this and is it possible to disable this?

PowerBI streaming dataset limitations

We have a requirement to generate reports in Power BI for real-time transactions. We have roughly 2,000,000 transactions flowing in per day, and we would like reports generated over at least that number of rows.
I understand that the push/streaming API has a limit of 200,000 rows for FIFO datasets and 5,000,000 rows for the "none retention policy" (link).
My questions are as follows:
If we create a streaming dataset via the push API in the Power BI service, what dataset is created by default in the background - FIFO or the none retention policy dataset?
For a none-retention-policy dataset, what happens when we cross the 5,000,000-row limit? If there is a failure, does that mean we need to delete old rows via an API call on a frequent basis? An example API call to do this would help. Deleting all rows is not an option, as the business would like reports such as KPIs over the last 24 hours.
If we use Azure Stream Analytics to push data to Power BI, what are the limitations on data storage in Power BI in that case?
I'm afraid you misunderstood the idea of Power BI. Power BI is not a database! Do not try to use it as such - there are better options out there. That's why you are having a hard time trying to work around these limitations.
What I'm trying to say is that you should store and process your data somewhere else and use Power BI only for visualizing it. In that case, even if you use real-time streaming updated every second, you only need to send 86,400 records per day (way below the 200,000-row limit of a FIFO dataset). And if you don't want a real-time streaming dashboard but a normal Power BI report, why are you looking at push datasets at all? So collect your data somewhere, aggregate the results, and push only the aggregated data to Power BI.
And to answer your questions anyway:
If you create a dataset using the Power BI REST API without specifying the retention policy, it will create a push dataset with the none retention policy - basicFIFO must be enabled explicitly.
If you reach the limit of 5M rows, you will get an error when trying to push more rows to the dataset. Your only option is to delete all rows - there is no way to delete only some of them, because Power BI is not a database. That's why your data should be stored somewhere else, and this is the idea behind the basicFIFO retention policy and Power BI's streaming datasets.
Power BI's limits don't change based on the data source. It doesn't matter whether you push the data through Azure Stream Analytics or a service written by you - the Power BI dataset is the same.
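To make the "delete all rows" point concrete: the Power BI REST API has a Delete Rows call that clears a push dataset table in one request. A minimal sketch in Python, assuming you already have an Azure AD access token with dataset write permissions; the dataset ID and table name below are placeholders, not anything from your setup.

    import requests

    # Assumptions: ACCESS_TOKEN is an Azure AD token with dataset write permissions;
    # DATASET_ID and TABLE_NAME are placeholders for an existing push dataset table.
    ACCESS_TOKEN = "<aad-access-token>"
    DATASET_ID = "<dataset-id>"
    TABLE_NAME = "Transactions"

    url = (f"https://api.powerbi.com/v1.0/myorg/datasets/{DATASET_ID}"
           f"/tables/{TABLE_NAME}/rows")

    # Delete Rows removes ALL rows from the table - there is no filtered delete,
    # which is why the answer above suggests keeping the detailed data elsewhere.
    resp = requests.delete(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
    resp.raise_for_status()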

PowerBI Auto Refresh

Good Day
A client I am working with wants a Power BI dashboard displayed in their call centre, with stats pulled from an Azure SQL Database.
Their specific requirement is that the dashboard automatically refreshes every minute during their operating hours (8am - 5pm).
I have been researching this a bit but can't find a definitive answer.
Is it possible for Power BI to automatically refresh every 1 minute?
Is it dependent on the type of license and/or the type of connection (DirectQuery vs Import)?
You can set a report to refresh against a DirectQuery source using the automatic page refresh feature.
https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-automatic-page-refresh
This will allow you to refresh the report every 1 minute or another defined interval. It applies to reports only, not dashboards, as it is configured in Power BI Desktop.
When publishing to the service you will be limited to a minimum refresh interval of 30 minutes unless you have a dedicated capacity. You could add an A1 Power BI Embedded SKU and turn it on and off around business hours to reduce the cost, which would work out at around £200 per month.
Another option for imported data would be a Logic App or Power Automate task that refreshes the dataset using an API call, at a lower frequency, say every 5 minutes; a sketch of that call follows below. It would be best to optimise your query to return a small amount of pre-aggregated data to the dataset.
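For reference, the REST call that a Logic App or Power Automate flow makes under the hood is the Refresh Dataset endpoint. A minimal sketch in Python, assuming you already have an Azure AD access token with permission to refresh the dataset; the workspace and dataset IDs are placeholders:

    import requests

    # Assumptions: ACCESS_TOKEN is an Azure AD token allowed to refresh the dataset;
    # GROUP_ID (workspace) and DATASET_ID are placeholders.
    ACCESS_TOKEN = "<aad-access-token>"
    GROUP_ID = "<workspace-id>"
    DATASET_ID = "<dataset-id>"

    url = (f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
           f"/datasets/{DATASET_ID}/refreshes")

    # Queue an asynchronous dataset refresh; a 202 response means it was accepted.
    resp = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
    resp.raise_for_status()

Scheduling this call every few minutes (and only between 8am and 5pm) is then up to the Logic App, Power Automate flow, or whatever scheduler you prefer.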
You can also use Power Automate to schedule refreshes of your dataset more than 48 times a day - it looks like you can trigger a refresh every minute that way, and other tools may let you refresh even more frequently.
Refreshing the data at a 1-minute frequency is not possible with Power BI scheduled refresh. If you are not using Power BI Premium you can schedule up to 8 refreshes a day, with a minimum gap of 15 minutes; if you are using Power BI Premium you are allowed 48 slots.
If you cannot live with those restrictions, it might be worth looking into Power BI reports over streaming datasets. But then again, there are some cons to that as well - for example, they work only with DirectQuery, etc.

Custom Realtime Data in PowerBI Dashboards

I am new to the Power BI platform and have the challenge of scoping/converting an old dashboard solution to Power BI.
The old dashboard solution is custom made and refreshes its data every minute.
Power BI lists its refresh rate as 8 times a day for Pro and 48 times a day for Enterprise. Does that mean there is no option to provide the same real-time (1-minute) updates in a dashboard using Power BI?
Can you embed iframes or anything else in a Power BI dashboard?
How can you do a real-time graph in Power BI if it only refreshes 8 times a day?
Refreshing data 8/48 times a day applies when you import data into a Power BI dataset. If you want more recent data, you can connect to your data source using DirectQuery mode instead. DirectQuery sends queries to your data source when rendering the report: if you apply a filter, the database gets a new query, and so on. Not every data source supports DirectQuery, e.g. you can't use it with flat files (obviously). You may want to take a look at the Data refresh in Power BI article.
For embedding iframes in a Power BI report you can use the HTML Viewer custom visual.
By default, when using DirectQuery mode, tiles pinned to a dashboard refresh automatically about every hour (Datasets in DirectQuery/LiveConnect mode), but you can reduce this interval to as little as 15 minutes:
A tile is a report visual pinned to a dashboard, and dashboard tile refreshes happen about every hour so that the tiles show recent results. You can change the schedule in the dataset settings, or force a dashboard update manually by using the Refresh Now option.
However, if you want to display real-time data in a Power BI dashboard, it is better to use push or streaming datasets. With push datasets you can programmatically push data to the dataset; the data is stored and can be used in reports. Streaming datasets are similar, but they keep only the last hour of pushed data and can't be used in reports - only pinned to a dashboard. There are also other options, like using PubNub or Microsoft Flow. For more information, take a look at Real-time streaming in Power BI.
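To show what "programmatically push data" looks like in practice, here is a minimal sketch of the Post Rows call from the Power BI REST API in Python. It assumes an existing push dataset whose table and columns match the sample row; the dataset ID, the "CallStats" table name and the column names are placeholders, not part of any real setup:

    import requests
    from datetime import datetime, timezone

    # Assumptions: ACCESS_TOKEN is an Azure AD token with dataset write permissions;
    # DATASET_ID and the "CallStats" table with these columns are placeholders.
    ACCESS_TOKEN = "<aad-access-token>"
    DATASET_ID = "<dataset-id>"

    url = (f"https://api.powerbi.com/v1.0/myorg/datasets/{DATASET_ID}"
           f"/tables/CallStats/rows")
    payload = {"rows": [
        {"Timestamp": datetime.now(timezone.utc).isoformat(),
         "CallsWaiting": 12,
         "AvgHandleSeconds": 245.0},
    ]}

    # Each POST appends the batch of rows; run it on whatever schedule you need.
    resp = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
                         json=payload)
    resp.raise_for_status()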