Tableau Desktop: Live Vs Extract and Publishing to Tableau Server - regex

I have a question that I can't seem to find a straightforward answer to. I am loading a roughly 20GB CSV file into Tableau Desktop to create some worksheets for future internal views. Where I am stumbling is whether to use an Extract or a Live data source. The data itself will not change; only the reports or worksheets generated from the data will. The computations within the worksheets in Tableau Desktop take forever to complete.
On to publishing to Tableau Server: I am under the assumption that I must upload my locally stored file to the server for future use. Would it be better to find a network drive for the Tableau Server data source to point to?
In short, what is the best method to manipulate a large dataset in Tableau Desktop and present it on Tableau Server? Additionally, what regex flavor does Tableau follow? It doesn't seem to follow the conventions I use in Python.
Thanks!

When publishing a workbook with a file-based data source, choose the Include external files option in the publish-to-server dialog box. This eliminates the need to have the file in a networked location accessible by the server.
This approach only works if the data doesn't change: it remains static and embedded in the workbook. Under this option, if the data changes and you want your viz to reflect the changes, you need to update the data in Desktop and republish.

If I understand your requirement correctly, you have to connect/load your 20GB CSV file into Tableau Desktop to build the visualizations.
Here are the steps:
Yes, you have to manually update the data in that CSV file whenever you want to show the latest data in Tableau (note: make sure the name of the CSV file and of its columns stay the same).
After that, when you open Tableau, click "Refresh data source" to see the latest data from the CSV file in your visualization.
In this case the connection type (Live / Extract) doesn't really play a role, but as I understand it you should use the Extract technique: the extract will update once you load the latest data.

Related

Access Mainframe (Z/OS) Flat files

I have been tasked with researching a Power BI connection to mainframe flat files (in some cases VSAM files).
This is needed to replace an existing legacy BI/reporting tool that connects to the mainframe.
Power BI does not have a direct connection to the mainframe, so what would be the best way to connect to these data sources (flat files, in some cases VSAM files; if we need to convert VSAM files to flat files, we will do it)?
Are there any third-party tools that can be used to bridge this gap between Power BI (or any other BI tool) and the mainframe data files (our shop already converts VSAM files to flat files)?
Thanks
You could do it my way, but it really only works for smaller, text-based files.
I have some code on the mainframe that converts the data in question into a sensible .csv format. I then compose and send an email on the mainframe to myself from a pre-defined email address (in my case I use "SAVE_TO_SHAREPOINT#<company_name>.com"), with the .csv data as an attachment.
I use Power Automate to pick up emails to myself from that email address and save the attachment to a SharePoint folder that is specified as the subject of the email.
This process can be automated to run whenever it is required. You can then use Power BI to pull in the .csv data from the specified SharePoint folder.
This won't work for everyone, but it currently works for me.
You could also use FTP, but you can't FTP to SharePoint, so you'd have to work out how you want the FTP'd data to get into Power BI.

Connect to Azure Data Lake Storage Gen 2 with SAS token Power BI

I'm trying to connect to an ADLS Gen 2 container with Power BI, but I've only found the option to connect with key1/key2 of the storage account (Active Directory is not an option in this case).
However, I don't want to use those keys, since they are stored in Power BI and can be seen by the people who will have the .pbix file.
Is there any way to connect to ADLS Gen 2 from Power BI using a Shared Access Signature (SAS), so I can grant read access only to what is really needed?
Thanks
As far as I know, the only way is to use the storage key. However, I don't think the key can be read or seen by the user after the storage data source is applied and saved. It can be changed, but the key itself is shown as a masked secret.
You can do it ;-)
I've tested it with Parquet files but CSV format should work as well. In PBI Desktop:
Select your source file type.
Construct the whole file path with the Advanced option. This gives you the opportunity to provide the path in more than one part.
Replace the "blob" part of the URL with "dfs".
Paste your SAS token into the second text box.
You should be ready to rock.
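For reference, the query those steps generate boils down to something like the following M (a minimal sketch: the account, container, file path and SAS token below are placeholders, and the exact generated code may differ slightly):

```
let
    // Placeholder file URL; note "dfs" instead of "blob" in the host name
    FileUrl = "https://myaccount.dfs.core.windows.net/mycontainer/data/sales.parquet",
    // Placeholder read-only SAS token, the value pasted into the second text box
    SasToken = "?sv=2021-06-08&sp=r&sig=REDACTED",
    // Fetch the file over HTTPS and parse it as Parquet
    Source = Parquet.Document(Web.Contents(FileUrl & SasToken))
in
    Source
```

Since the SAS token itself carries the authorization, anonymous access is typically what you end up selecting in the credentials prompt.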

Reuse a previously published datasource in a Power BI report

I have developed a Power BI report using Power BI Desktop, pointing to a private on-premises development database as the data source, so that I was able to develop and test it easily. Then, I published it from my Power BI Desktop pbix to my customer's workspace.
As a result, the workspace contains the published report and the dataset. Later, my customer changed the dataset so that it now points to their own on-premises production database. It works perfectly.
Now, I want to publish a new report for my customer using the previously published and reconfigured dataset. The problem is that I can't see any option in Power BI Desktop to have the report point to the published dataset, nor any option to avoid creating a new dataset each time I publish a report, nor any way to reconfigure, from the web portal, the newly published report to point to the same dataset as the first one.
Is there any way to do this, or any workaround for this scenario? I think the most reasonable solution would be to be able to change the dataset of any report, so that datasets would be interchangeable between reports.
Update:
I had already used connection-specific parameters, but I'm not given rights to change the published dataset, so that's a dead end.
Another thing I have found is that in Power BI Desktop you cannot change the connection parameter values to those of the production environment and publish the report if you can't access the target database from your computer: Power BI Desktop asks you to apply the changes first, and when it tries to apply the new values it attempts to connect to the corresponding database, which obviously ends with a network or timeout error, cancelling the changes and returning you to the starting point.
It's always a good practice to use connection-specific parameters to define the data source. This means that you do not enter the server name directly, but specify it indirectly using a parameter. The same goes for the database name, if applicable.
If you are about to make a new report, cancel the Get Data dialog, define the parameters as described below, and then specify the data source in Get Data using these parameters.
To modify an existing report, open Power Query Editor by clicking Edit Queries, and in Manage Parameters define two new text parameters; let's name them ServerName and DatabaseName.
Set their current values to point to one of your data sources, e.g. SQLSERVER2016 and AdventureWorks2016. Then right-click your query in the report, open Advanced Editor, and find the server name and database name in the M code.
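For a plain SQL Server source, the query will look roughly like this (a minimal sketch; the navigation step and table name are illustrative and will differ in your report):

```
let
    // Hard-coded server and database names
    Source = Sql.Database("SQLSERVER2016", "AdventureWorks2016"),
    // Illustrative navigation to a table in that database
    SalesOrders = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data]
in
    SalesOrders
```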
Replace them with the parameters defined above.
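After the substitution, the same sketch becomes:

```
let
    // ServerName and DatabaseName are the text parameters defined above
    Source = Sql.Database(ServerName, DatabaseName),
    SalesOrders = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data]
in
    SalesOrders
```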
Now you can close and apply the changes, and your report should work as before. When you want to change the data source, do it using Edit Parameters: change the server and/or database name to point to the other data source that you want to use for your report.
After changing the parameter values, Power BI Desktop will ask you to apply the changes and reload the data from the new data source. To change the parameter values (i.e. the data source) of a report published to the Power BI Service, go to the dataset's settings and enter the new server and/or database name.
If the server is on-premises, also check the Gateway connection to make sure that it is configured to use the right gateway. You may also want to check the available gateways in Manage gateways.
After changing the data source, refresh your dataset to get the data from the new source. With a Power BI Pro account you can do this 8 times per 24 hours, while if the dataset is in a dedicated capacity, this limit is raised to 48 times per 24 hours.
This is an easy way to make your reports "switchable", e.g. for switching one report from a DEV or QA environment to PROD, or, as part of your disaster recovery plan, to automate switching all reports in some workgroup to another DR server. In your case, this will allow you (or your customers) to easily switch the data source of the report.
I think the only correct answer is that it cannot be done, at least at this moment.
The closest you can get is with Live connections:
https://learn.microsoft.com/en-us/power-bi/desktop-report-lifecycle-datasets
But if you have already designed your report without the Live connection, using your own development environment and the corresponding connection parameters, then you are stuck. Your only options are to redo the whole report with a Live connection, or, as an odder workaround, to use an alias in your configuration that matches the database server name and to use the same database name as in the target production environment.

When I am working on a Power BI report using Power BI Desktop on my local machine, where is the data stored?

As I understand it, Power BI creates its own internal tabular model... but where?
Say I'm working with sensitive data, and the reports will be ultimately published to a Report Server on prem. The reports will only be accessible by selected Active Directory groups. If, during development, I save the pbix file to a network share, or internally e-mail it to a colleague, will it contain the sensitive data in an accessible way? I'm just thinking about ways the information could fall into the wrong hands. Is there an option to automatically save the pbix file with no data?
A PBIX file is really a ZIP archive (see this reference): if you open it as one, you can see that the data is stored in the DataModel file at the top folder level in a highly compressed format. Though it's compressed, I doubt it's encrypted, so it's likely that someone could theoretically decompress the data if they know what they're doing.
One option would be to export the report as a PBIT instead, which is designed to save only the report structure, relationships, queries and such, but not the actual data if it comes from external sources.

Output table in slack slash command

I want a slash command to output data in a table format.
I know that I will have to set up a custom integration for this. I did that using the GET method.
I can set up my own web service on an EC2 machine, but how should I make sure that the data comes out in table format?
My problem is: how should I make the data available in tabular format?
Unfortunately it's not possible to format Slack messages as a table in this way. You would need to resort to generating an image and referencing it in a message attachment. There is limited support for displaying simple fields and labels, but that may not quite meet your needs.
We had the same problem, so we made a Slack app and made it free for the public. Please feel free to check it out: https://rendreit.digital
After installing the app to your Slack, you can do /tableit and paste in CSV data or anything you copied from a spreadsheet (Excel or Google Sheets).
It also lets you preview the rendered table before you send it to the chat.